Re: [PR] [HUDI-7492] fix the issue of incorrect keygenerator specification when creating m… [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10840: URL: https://github.com/apache/hudi/pull/10840#issuecomment-1985190500 ## CI report: * ad19525993057e8f0152067fdae1fab2ff57dedc Azure:

Re: [PR] [HUDI-7491] Fixing handling null values of extra metadata in clean commit metadata [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10837: URL: https://github.com/apache/hudi/pull/10837#issuecomment-1985190458 ## CI report: * 0091a3c574639d0422bfe57a1ca236dd1f80e8dd Azure:

Re: [PR] [HUDI-7489] Avoid collecting WriteStatus to driver in row writer code path [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10836: URL: https://github.com/apache/hudi/pull/10836#issuecomment-1985190415 ## CI report: * 743f394ba5d3b6f7ebe79d399fb8d11d50a26a3b Azure:

[jira] [Updated] (HUDI-7492) When using Flinkcatalog to create hudi multiple partitions or multiple primary keys, the keygenerator generation is incorrect

2024-03-07 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7492: - Labels: pull-request-available (was: ) > When using Flinkcatalog to create hudi multiple

Re: [PR] [HUDI-7491] Fixing handling null values of extra metadata in clean commit metadata [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10837: URL: https://github.com/apache/hudi/pull/10837#issuecomment-1985182342 ## CI report: * 0091a3c574639d0422bfe57a1ca236dd1f80e8dd Azure:

Re: [PR] [HUDI-7492] fix the issue of incorrect keygenerator specification when creating m… [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10840: URL: https://github.com/apache/hudi/pull/10840#issuecomment-1985182375 ## CI report: * ad19525993057e8f0152067fdae1fab2ff57dedc UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

[jira] [Created] (HUDI-7492) When using Flinkcatalog to create hudi multiple partitions or multiple primary keys, the keygenerator generation is incorrect

2024-03-07 Thread Jira
陈磊 created HUDI-7492: Summary: When using Flinkcatalog to create hudi multiple partitions or multiple primary keys, the keygenerator generation is incorrect Key: HUDI-7492 URL: https://issues.apache.org/jira/browse/HUDI-7492

[PR] fix the issue of incorrect keygenerator specification when creating m… [hudi]

2024-03-07 Thread via GitHub
empcl opened a new pull request, #10840: URL: https://github.com/apache/hudi/pull/10840 …ulti partition or multi primary key tables ### Change Logs fix the issue of incorrect keygenerator specification when creating multi partition or multi primary key tables ### Impact

Re: [I] [SUPPORT] java.lang.NoClassDefFoundError: org/apache/hudi/com/fasterxml/jackson/module/scala/DefaultScalaModule$ when doing an Incremental CDC Query in 0.14.1 [hudi]

2024-03-07 Thread via GitHub
blackcheckren commented on issue #10590: URL: https://github.com/apache/hudi/issues/10590#issuecomment-1985138812 I encountered the same problem. Based on the error log and that friend's suggestion, the problem seemed to be that there was a configuration

Re: [PR] [HUDI-7489] Avoid collecting WriteStatus to driver in row writer code path [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10836: URL: https://github.com/apache/hudi/pull/10836#issuecomment-1985109727 ## CI report: * 743f394ba5d3b6f7ebe79d399fb8d11d50a26a3b Azure:

Re: [PR] [HUDI-7491] Fixing handling null values of extra metadata in clean commit metadata [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10837: URL: https://github.com/apache/hudi/pull/10837#issuecomment-1985103947 ## CI report: * 0091a3c574639d0422bfe57a1ca236dd1f80e8dd Azure:

Re: [PR] [HUDI-7489] Avoid collecting WriteStatus to driver in row writer code path [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10836: URL: https://github.com/apache/hudi/pull/10836#issuecomment-1985103920 ## CI report: * 743f394ba5d3b6f7ebe79d399fb8d11d50a26a3b UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] updated powered by logo [hudi]

2024-03-07 Thread via GitHub
nfarah86 commented on PR #10839: URL: https://github.com/apache/hudi/pull/10839#issuecomment-1985103812 @bhasudha please review https://github.com/apache/hudi/assets/5392555/97b916d9-e350-4296-893c-0f30161c762a -- This is an automated message from the Apache Git Service. To

[PR] updated powered by logo [hudi]

2024-03-07 Thread via GitHub
nfarah86 opened a new pull request, #10839: URL: https://github.com/apache/hudi/pull/10839 ### Change Logs updated powered by logo ### Impact none ### Risk level (write none, low medium or high below) none ### Documentation Update none

Re: [PR] [DOCS] Initial commit for blogs [hudi]

2024-03-07 Thread via GitHub
nfarah86 commented on code in PR #10825: URL: https://github.com/apache/hudi/pull/10825#discussion_r1517194855 ## website/blog/2024-02-27-Building-Data-Lakes-on-AWS-with-Kafka-Connect-Debezium-Apicurio-Registry-and-Apache-Hudi.mdx: ## @@ -0,0 +1,31 @@ +--- +title: "Building

[jira] [Updated] (HUDI-6037) Improve compaction docs

2024-03-07 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6037: - Labels: docs pull-request-available (was: docs) > Improve compaction docs >

Re: [PR] [HUDI-5886][DOCS] Improve File Sizing, Timeline, and Flink docs [hudi]

2024-03-07 Thread via GitHub
nfarah86 closed pull request #8093: [HUDI-5886][DOCS] Improve File Sizing, Timeline, and Flink docs URL: https://github.com/apache/hudi/pull/8093 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] [HUDI-6037]-[DOCS]-Improve-compaction-doc [hudi]

2024-03-07 Thread via GitHub
nfarah86 closed pull request #8381: [HUDI-6037]-[DOCS]-Improve-compaction-doc URL: https://github.com/apache/hudi/pull/8381 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] rewrote clustering doc [hudi]

2024-03-07 Thread via GitHub
nfarah86 closed pull request #8577: rewrote clustering doc URL: https://github.com/apache/hudi/pull/8577 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe,

Re: [PR] [HUDI-7491] Fixing handling null values of extra metadata in clean commit metadata [hudi]

2024-03-07 Thread via GitHub
the-other-tim-brown commented on code in PR #10837: URL: https://github.com/apache/hudi/pull/10837#discussion_r1517166950 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java: ## @@ -234,8 +234,8 @@ private List

Re: [PR] [HUDI-7491] Fixing handling null values of extra metadata in clean commit metadata [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10837: URL: https://github.com/apache/hudi/pull/10837#issuecomment-1985025119 ## CI report: * 0091a3c574639d0422bfe57a1ca236dd1f80e8dd Azure:

Re: [PR] [HUDI-7491] Fixing handling null values of extra metadata in clean commit metadata [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10837: URL: https://github.com/apache/hudi/pull/10837#issuecomment-1985020305 ## CI report: * 0091a3c574639d0422bfe57a1ca236dd1f80e8dd UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

[I] [SUPPORT] hudi0.14.0: Insert data into hudi with spark or create a new table exception [hudi]

2024-03-07 Thread via GitHub
SmyxBug opened a new issue, #10838: URL: https://github.com/apache/hudi/issues/10838 ## Hudi env - centos7 - hadoop3.1.3 - scala2.12.18 - spark3.3.0 ## Maven Project pom.xml ```xml org.apache.hudi hudi-spark3.3-bundle_2.12

[jira] [Updated] (HUDI-7491) Handle null extra metadata w/ clean commit metadata

2024-03-07 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7491: - Labels: pull-request-available (was: ) > Handle null extra metadata w/ clean commit metadata >

[jira] [Created] (HUDI-7491) Handle null extra metadata w/ clean commit metadata

2024-03-07 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-7491: - Summary: Handle null extra metadata w/ clean commit metadata Key: HUDI-7491 URL: https://issues.apache.org/jira/browse/HUDI-7491 Project: Apache Hudi

[PR] [HUDI-7491] Fixing handling null values of extra metadata in clean commit metadata [hudi]

2024-03-07 Thread via GitHub
nsivabalan opened a new pull request, #10837: URL: https://github.com/apache/hudi/pull/10837 ### Change Logs Fixing handling null values of extra metadata in clean commit metadata ### Impact Fixing handling null values of extra metadata in clean commit metadata

Re: [PR] [HUDI-7489] Avoid collecting WriteStatus to driver in row writer code path [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10836: URL: https://github.com/apache/hudi/pull/10836#issuecomment-1984976158 ## CI report: * 72a23b30a71d227e54ee63cf5684215fb3d2b2f5 Azure:

Re: [PR] [HUDI-7476] Incremental loading for archived timeline [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10807: URL: https://github.com/apache/hudi/pull/10807#issuecomment-1984971145 ## CI report: * cdfdefb4555ce5049bd3c750b61d05e4d1546ce1 Azure:

Re: [PR] [HUDI-7489] Avoid collecting WriteStatus to driver in row writer code path [hudi]

2024-03-07 Thread via GitHub
beyond1920 commented on code in PR #10836: URL: https://github.com/apache/hudi/pull/10836#discussion_r1517113136 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bucket/ConsistentBucketIndexUtils.java: ## @@ -210,7 +210,15 @@ private static void

Re: [PR] [HUDI-7489] Avoid collecting WriteStatus to driver in row writer code path [hudi]

2024-03-07 Thread via GitHub
beyond1920 commented on code in PR #10836: URL: https://github.com/apache/hudi/pull/10836#discussion_r1517113136 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bucket/ConsistentBucketIndexUtils.java: ## @@ -210,7 +210,15 @@ private static void

Re: [PR] [HUDI-7489] Avoid collecting WriteStatus to driver in row writer code path [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10836: URL: https://github.com/apache/hudi/pull/10836#issuecomment-1984938449 ## CI report: * f3a929e52bed04c7752a4540069614bd3d708d11 Azure:

Re: [PR] [HUDI-7489] Avoid collecting WriteStatus to driver in row writer code path [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10836: URL: https://github.com/apache/hudi/pull/10836#issuecomment-1984933674 ## CI report: * f3a929e52bed04c7752a4540069614bd3d708d11 Azure:

Re: [PR] force rollback [hudi]

2024-03-07 Thread via GitHub
jonvex closed pull request #8959: force rollback URL: https://github.com/apache/hudi/pull/8959 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] add validation to HoodieAvroUtils [hudi]

2024-03-07 Thread via GitHub
jonvex closed pull request #7974: add validation to HoodieAvroUtils URL: https://github.com/apache/hudi/pull/7974 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] [HUDI-7489] Avoid collecting WriteStatus to driver in row writer code path [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10836: URL: https://github.com/apache/hudi/pull/10836#issuecomment-1984928127 ## CI report: * f3a929e52bed04c7752a4540069614bd3d708d11 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-5418] Remove misleading line about mor precombine from quickstart spark-sql guide [hudi]

2024-03-07 Thread via GitHub
jonvex closed pull request #7514: [HUDI-5418] Remove misleading line about mor precombine from quickstart spark-sql guide URL: https://github.com/apache/hudi/pull/7514 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] [HUDI-5418] Remove misleading line about mor precombine from quickstart spark-sql guide [hudi]

2024-03-07 Thread via GitHub
jonvex commented on PR #7514: URL: https://github.com/apache/hudi/pull/7514#issuecomment-1984927882 quick start has been updated. No longer needed -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] [HUDI-7045] fix evolution by using legacy ff for reader [hudi]

2024-03-07 Thread via GitHub
jonvex closed pull request #10007: [HUDI-7045] fix evolution by using legacy ff for reader URL: https://github.com/apache/hudi/pull/10007 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

(hudi) branch asf-site updated: [HUDI-7482] Update schema evolution docs to explicitly state allowed type promotions (#10833)

2024-03-07 Thread jonvex
This is an automated email from the ASF dual-hosted git repository. jonvex pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 45d37ddc407 [HUDI-7482] Update schema

Re: [PR] [HUDI-7482] Update schema evolution docs to explicitly state allowed type promotions [hudi]

2024-03-07 Thread via GitHub
jonvex merged PR #10833: URL: https://github.com/apache/hudi/pull/10833 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] [HUDI-7488] The BigQuerySyncTool can't work well when the hudi table schema changed #10829 [hudi]

2024-03-07 Thread via GitHub
steve-xi-awx commented on PR #10830: URL: https://github.com/apache/hudi/pull/10830#issuecomment-1984919605 @danny0405 Got it, thanks a lot. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] [HUDI-7489] Avoid collecting WriteStatus to driver in row writer code path [hudi]

2024-03-07 Thread via GitHub
danny0405 commented on code in PR #10836: URL: https://github.com/apache/hudi/pull/10836#discussion_r1517073079 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/HoodieDatasetBulkInsertHelper.scala: ## @@ -149,53 +150,53 @@ object HoodieDatasetBulkInsertHelper

Re: [PR] [HUDI-7489] Avoid collecting WriteStatus to driver in row writer code path [hudi]

2024-03-07 Thread via GitHub
jonvex commented on code in PR #10836: URL: https://github.com/apache/hudi/pull/10836#discussion_r1517067137 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bucket/ConsistentBucketIndexUtils.java: ## @@ -210,7 +210,15 @@ private static void

Re: [PR] [HUDI-7489] Avoid collecting WriteStatus to driver in row writer code path [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10836: URL: https://github.com/apache/hudi/pull/10836#issuecomment-1984889599 ## CI report: * f3a929e52bed04c7752a4540069614bd3d708d11 Azure:

Re: [PR] [HUDI-7476] Incremental loading for archived timeline [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10807: URL: https://github.com/apache/hudi/pull/10807#issuecomment-1984889556 ## CI report: * e7ce757189d4a1de1e81b3866a16c795be410b95 Azure:

Re: [PR] [HUDI-7476] Incremental loading for archived timeline [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10807: URL: https://github.com/apache/hudi/pull/10807#issuecomment-1984883717 ## CI report: * e7ce757189d4a1de1e81b3866a16c795be410b95 Azure:

[jira] [Updated] (HUDI-7490) Fix archival guarding data files not yet cleaned up by cleaner when savepoint is removed

2024-03-07 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-7490: -- Description: We added a fix recently where cleaner will take care of cleaning up

[jira] [Updated] (HUDI-7490) Fix archival guarding data files not yet cleaned up by cleaner when savepoint is removed

2024-03-07 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-7490: -- Description: We added a fix recently where cleaner will take care of cleaning up

Re: [PR] initial commit for blogs [hudi]

2024-03-07 Thread via GitHub
bhasudha commented on code in PR #10825: URL: https://github.com/apache/hudi/pull/10825#discussion_r1517038603 ## website/blog/2024-02-27-Building-Data-Lakes-on-AWS-with-Kafka-Connect-Debezium-Apicurio-Registry-and-Apache-Hudi.mdx: ## @@ -0,0 +1,31 @@ +--- +title: "Building

[jira] [Created] (HUDI-7490) Fix archival guarding data files not yet cleaned up by cleaner when savepoint is removed

2024-03-07 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-7490: - Summary: Fix archival guarding data files not yet cleaned up by cleaner when savepoint is removed Key: HUDI-7490 URL: https://issues.apache.org/jira/browse/HUDI-7490

Re: [PR] [MINOR][HUDI-7466] Add parallel listing of existing partitions [hudi]

2024-03-07 Thread via GitHub
danny0405 commented on PR #10460: URL: https://github.com/apache/hudi/pull/10460#issuecomment-1984866026 cc @VitoMakarevich for updating and I think it is a valuable contribution. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] [HUDI-6472] fix spark sql does not ignore case [hudi]

2024-03-07 Thread via GitHub
danny0405 commented on PR #10826: URL: https://github.com/apache/hudi/pull/10826#issuecomment-1984864777 > SimpleAnalyzer is a test class that is hardcoded as case sensitive. I kind of think we should not introduce a test code snippet that is inconsistent with Spark's regular norms.

(hudi) branch master updated: [MINOR] Separate HoodieSparkWriterTestBase to reduce duplication (#10832)

2024-03-07 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 80c3f033b54 [MINOR] Separate

Re: [PR] [MINOR] Separate HoodieSparkWriterTestBase to reduce duplication [hudi]

2024-03-07 Thread via GitHub
danny0405 merged PR #10832: URL: https://github.com/apache/hudi/pull/10832 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] [HUDI-7476] Incremental loading for archived timeline [hudi]

2024-03-07 Thread via GitHub
danny0405 commented on code in PR #10807: URL: https://github.com/apache/hudi/pull/10807#discussion_r1517025133 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieDefaultTimeline.java: ## @@ -581,4 +596,43 @@ public HoodieDefaultTimeline

Re: [PR] [HUDI-7476] Incremental loading for archived timeline [hudi]

2024-03-07 Thread via GitHub
danny0405 commented on code in PR #10807: URL: https://github.com/apache/hudi/pull/10807#discussion_r1517025625 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/TimeGeneratorBase.java: ## @@ -79,24 +71,26 @@ public TimeGeneratorBase(HoodieTimeGeneratorConfig

Re: [PR] [HUDI-7476] Incremental loading for archived timeline [hudi]

2024-03-07 Thread via GitHub
danny0405 commented on code in PR #10807: URL: https://github.com/apache/hudi/pull/10807#discussion_r1517025133 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieDefaultTimeline.java: ## @@ -581,4 +596,43 @@ public HoodieDefaultTimeline

Re: [PR] [HUDI-7489] Avoid collecting WriteStatus to driver in row writer code path [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10836: URL: https://github.com/apache/hudi/pull/10836#issuecomment-1984840258 ## CI report: * 7f92d4eb347ab8a2fcb1aeb5ea278c64e730089a Azure:

[jira] [Closed] (HUDI-7488) The BigQuerySyncTool can't work well when the hudi table schema changed

2024-03-07 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-7488. Resolution: Fixed Fixed via master branch: 65a9d481249410b2b9c0ffc864025e0675d839b9 > The BigQuerySyncTool

Re: [I] The BigQuerySyncTool can't work well when the hudi table schema changed [SUPPORT] [hudi]

2024-03-07 Thread via GitHub
danny0405 closed issue #10829: The BigQuerySyncTool can't work well when the hudi table schema changed [SUPPORT] URL: https://github.com/apache/hudi/issues/10829 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

(hudi) branch master updated: [HUDI-7488] The BigQuerySyncTool can't work well when the hudi table schema changed (#10830)

2024-03-07 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 65a9d481249 [HUDI-7488] The BigQuerySyncTool

Re: [PR] [HUDI-7488] The BigQuerySyncTool can't work well when the hudi table schema changed #10829 [hudi]

2024-03-07 Thread via GitHub
danny0405 merged PR #10830: URL: https://github.com/apache/hudi/pull/10830 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] [HUDI-7488] The BigQuerySyncTool can't work well when the hudi table schema changed #10829 [hudi]

2024-03-07 Thread via GitHub
danny0405 commented on PR #10830: URL: https://github.com/apache/hudi/pull/10830#issuecomment-1984836192 It looks okay, all the tests passed except the supplement of the description context which is unnecessary. -- This is an automated message from the Apache Git Service. To respond to

Re: [PR] Update version-0.14.1-sidebars.json [hudi]

2024-03-07 Thread via GitHub
danny0405 commented on PR #10834: URL: https://github.com/apache/hudi/pull/10834#issuecomment-1984834022 Would you mind showing a screenshot of the new page here? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] [HUDI-7489] Avoid collecting WriteStatus to driver in row writer code path [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10836: URL: https://github.com/apache/hudi/pull/10836#issuecomment-1984832640 ## CI report: * 7f92d4eb347ab8a2fcb1aeb5ea278c64e730089a Azure:

Re: [PR] [HUDI-3921] Improve rewriteRecordWithNewSchema and refactor code [hudi]

2024-03-07 Thread via GitHub
danny0405 closed pull request #5393: [HUDI-3921] Improve rewriteRecordWithNewSchema and refactor code URL: https://github.com/apache/hudi/pull/5393 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] [HUDI-7163] fix not parsable text DateTimeParseException when compact [hudi]

2024-03-07 Thread via GitHub
danny0405 commented on PR #10220: URL: https://github.com/apache/hudi/pull/10220#issuecomment-1984829652 We can land it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] [HUDI-7457] Remove runtime shutdown hook from HoodieLogFormatWriter [hudi]

2024-03-07 Thread via GitHub
bvaradar commented on code in PR #10789: URL: https://github.com/apache/hudi/pull/10789#discussion_r1516956811 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFormatWriter.java: ## @@ -62,15 +61,14 @@ public class HoodieLogFormatWriter implements

(hudi) branch asf-site updated (49880163591 -> 568775d392e)

2024-03-07 Thread bhavanisudha
This is an automated email from the ASF dual-hosted git repository. bhavanisudha pushed a change to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git from 49880163591 GitHub Actions build asf-site add 568775d392e initial commit for hudi blogs (#10719) No new

Re: [PR] initial commit for hudi blogs [hudi]

2024-03-07 Thread via GitHub
bhasudha merged PR #10719: URL: https://github.com/apache/hudi/pull/10719 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] [HUDI-7489] Avoid collecting WriteStatus to driver in row writer code path [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10836: URL: https://github.com/apache/hudi/pull/10836#issuecomment-1984621241 ## CI report: * 7f92d4eb347ab8a2fcb1aeb5ea278c64e730089a Azure:

Re: [PR] [HUDI-7486] Classify schema exceptions when converting from avro to spark row representation [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10778: URL: https://github.com/apache/hudi/pull/10778#issuecomment-1984516210 ## CI report: * a5ccaf5573250350c960d33dd750d4d9c8a6e690 Azure:

Re: [PR] [HUDI-7489] Avoid collecting WriteStatus to driver in row writer code path [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10836: URL: https://github.com/apache/hudi/pull/10836#issuecomment-1984369583 ## CI report: * 7f92d4eb347ab8a2fcb1aeb5ea278c64e730089a Azure:

Re: [PR] [HUDI-7489] Avoid collecting WriteStatus to driver in row writer code path [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10836: URL: https://github.com/apache/hudi/pull/10836#issuecomment-1984358872 ## CI report: * 7f92d4eb347ab8a2fcb1aeb5ea278c64e730089a UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

[jira] [Updated] (HUDI-7489) Row writer clustering collects write statuses on the driver

2024-03-07 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7489: - Labels: pull-request-available (was: ) > Row writer clustering collects write statuses on the

[PR] [HUDI-7489] get rid of collect in row writer clustering [hudi]

2024-03-07 Thread via GitHub
jonvex opened a new pull request, #10836: URL: https://github.com/apache/hudi/pull/10836 ### Change Logs Collecting all the write statuses puts a large amount of data on the driver. We can avoid this because the data is parallelized again immediately after the collect. ### Impact

[jira] [Created] (HUDI-7489) Row writer clustering collects write statuses on the driver

2024-03-07 Thread Jonathan Vexler (Jira)
Jonathan Vexler created HUDI-7489: - Summary: Row writer clustering collects write statuses on the driver Key: HUDI-7489 URL: https://issues.apache.org/jira/browse/HUDI-7489 Project: Apache Hudi

Re: [PR] [HUDI-7482] Update schema evolution docs to explicitly state allowed type promotions [hudi]

2024-03-07 Thread via GitHub
bhasudha commented on PR #10833: URL: https://github.com/apache/hudi/pull/10833#issuecomment-1984287058 > @bhasudha, @nfarah86, and @xushiyan , could you check if this is clear? @jonvex Looks good. I feel one minor ambiguity in the row and column heading. Can we rename `Incoming

Re: [PR] [HUDI-7486] Classify schema exceptions when converting from avro to spark row representation [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10778: URL: https://github.com/apache/hudi/pull/10778#issuecomment-1984274667 ## CI report: * 63c818246106fb3efe7cde8a3293317efd6af202 Azure:

Re: [PR] [HUDI-7486] Classify schema exceptions when converting from avro to spark row representation [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10778: URL: https://github.com/apache/hudi/pull/10778#issuecomment-1984261909 ## CI report: * 63c818246106fb3efe7cde8a3293317efd6af202 Azure:

Re: [PR] [HUDI-3921] Improve rewriteRecordWithNewSchema and refactor code [hudi]

2024-03-07 Thread via GitHub
jonvex commented on PR #5393: URL: https://github.com/apache/hudi/pull/5393#issuecomment-1984112820 A lot of the changes in this pr are already in master. I think it's probably ok to close this -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [PR] [HUDI-4191] docker failure [hudi]

2024-03-07 Thread via GitHub
jonvex commented on PR #5756: URL: https://github.com/apache/hudi/pull/5756#issuecomment-1984099369 @yihua docker demo works for me, so I'm guessing this is not needed -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [PR] [HUDI-7476] Incremental loading for archived timeline [hudi]

2024-03-07 Thread via GitHub
codope commented on code in PR #10807: URL: https://github.com/apache/hudi/pull/10807#discussion_r1516523304 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/TimeGeneratorBase.java: ## @@ -79,24 +71,26 @@ public TimeGeneratorBase(HoodieTimeGeneratorConfig

Re: [PR] [HUDI-6472] fix spark sql does not ignore case [hudi]

2024-03-07 Thread via GitHub
jonvex commented on PR #10826: URL: https://github.com/apache/hudi/pull/10826#issuecomment-1983930177 @danny0405 the root cause of the case insensitive problem is that in `HoodieSpark32PlusAnalysis` I imported

Re: [PR] [MINOR] Changing the Properties to Load From Both Default Path and Enviorment [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10835: URL: https://github.com/apache/hudi/pull/10835#issuecomment-1983881230 ## CI report: * 9d96c9e8d1416544638bfec3e8e98a1a5b3018b7 Azure:

Re: [I] [SUPPORT] - Hudi 0.12.1 - production job slowing down [hudi]

2024-03-07 Thread via GitHub
joshhamann commented on issue #10822: URL: https://github.com/apache/hudi/issues/10822#issuecomment-1983870364 You can see the timestamps in the above screenshots from the Spark UI if that works. For instance, the test job, which is processing more data, goes from around 23:18 to 23:23

Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]

2024-03-07 Thread via GitHub
FranMorilloAWS commented on issue #10349: URL: https://github.com/apache/hudi/issues/10349#issuecomment-1983754269 Or if using a schema registry could help? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]

2024-03-07 Thread via GitHub
FranMorilloAWS commented on issue #10349: URL: https://github.com/apache/hudi/issues/10349#issuecomment-1983749818 With the MySQLSyncDatabaseAction they claim the following: Currently supported schema changes includes: Adding columns. Altering column types. More

Re: [I] [SUPPORT] - Hudi 0.12.1 - production job slowing down [hudi]

2024-03-07 Thread via GitHub
ad1happy2go commented on issue #10822: URL: https://github.com/apache/hudi/issues/10822#issuecomment-1983745782 @joshhamann That's the correct understanding. If we are not using global bloom, then if your incremental dataset only had data from very few partitions, then index lookup stage

Re: [PR] [MINOR] Changing the Properties to Load From Both Default Path and Enviorment [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10835: URL: https://github.com/apache/hudi/pull/10835#issuecomment-1983736188 ## CI report: * 9d96c9e8d1416544638bfec3e8e98a1a5b3018b7 Azure:

Re: [PR] [MINOR] Changing the Properties to Load From Both Default Path and Enviorment [hudi]

2024-03-07 Thread via GitHub
yihua commented on PR #10835: URL: https://github.com/apache/hudi/pull/10835#issuecomment-1983708727 @CTTY would you like to review this?

(hudi) branch master updated: [HUDI-5167] Reducing total test run time: reducing tests for virtual keys (#7153)

2024-03-07 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new d8e675d035a [HUDI-5167] Reducing total test run

Re: [PR] [HUDI-5167] Reducing total test run time: reducing tests for virtual keys [hudi]

2024-03-07 Thread via GitHub
yihua merged PR #7153: URL: https://github.com/apache/hudi/pull/7153

Re: [PR] [MINOR] Changing the Properties to Load From Both Default Path and Enviorment [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #10835: URL: https://github.com/apache/hudi/pull/10835#issuecomment-1983634301 ## CI report: * 9d96c9e8d1416544638bfec3e8e98a1a5b3018b7 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

[PR] [MINOR] Changing the Properties to Load From Both Default Path and Enviorme [hudi]

2024-03-07 Thread via GitHub
Amar1404 opened a new pull request, #10835: URL: https://github.com/apache/hudi/pull/10835 ### Change Logs These changes to the deltastreamer add support for reading the Global File from an environment variable; currently, on EMR, it is not able to read from the ENV variable ### Impact

Re: [I] Failed to create Marker file [hudi]

2024-03-07 Thread via GitHub
soumilshah1995 commented on issue #7909: URL: https://github.com/apache/hudi/issues/7909#issuecomment-1983565158 The version specified is quite old; we recommend upgrading to Hudi 0.14.0 or later. Please use the jar files and let us know if you have any issues.

Re: [I] [SUPPORT] cannot assign instance of java.lang.invoke.SerializedLambda [hudi]

2024-03-07 Thread via GitHub
parisni commented on issue #8340: URL: https://github.com/apache/hudi/issues/8340#issuecomment-1983524308 > I solved the above problem by building a new image and copying all packages in .ivy2/jars/* to /opt/spark/jars/ Same here on Kubernetes. Sounds like k8s does not work well with

Re: [I] [SUPPORT] Spark Read Hudi Tables with WARN Message [hudi]

2024-03-07 Thread via GitHub
ad1happy2go commented on issue #10828: URL: https://github.com/apache/hudi/issues/10828#issuecomment-1983184313 @michael1991 Not sure if this one is related to the warning.

[PR] Update version-0.14.1-sidebars.json [hudi]

2024-03-07 Thread via GitHub
SebastianoMeneghin opened a new pull request, #10834: URL: https://github.com/apache/hudi/pull/10834 The "faq" item should be removed from the sidebar, for two main reasons: 1. It is already included in the Frequently Asked Questions (FAQs) above 2. It is displayed as "Overview" and

Re: [PR] [HUDI-5167] Reducing total test run time: reducing tests for virtual keys [hudi]

2024-03-07 Thread via GitHub
hudi-bot commented on PR #7153: URL: https://github.com/apache/hudi/pull/7153#issuecomment-1983097839 ## CI report: * 688eeecfb21062a0f0ed64c5ea3e3e10eb3e83e1 Azure:
