Re: [PR] [HUDI-7146] [RFC-77] RFC for secondary index [hudi]

2024-03-19 Thread via GitHub
vinothchandar commented on code in PR #10814: URL: https://github.com/apache/hudi/pull/10814#discussion_r1531350640 ## rfc/README.md: ## @@ -111,4 +111,5 @@ The list of all RFCs can be found here. | 73 | [Multi-Table Transactions](./rfc-73/rfc-73.md)

[I] [SUPPORT] spark stuctrued streaming failed to update metadata with MDT [hudi]

2024-03-19 Thread via GitHub
xicm opened a new issue, #10891: URL: https://github.com/apache/hudi/issues/10891 **_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - Join the mailing list to engage in conversations and get faster support at

[jira] [Created] (HUDI-7519) Memory leaks in RocksDBDAO after rocksdbjni upgraded to 7

2024-03-19 Thread Kevin Lau (Jira)
Kevin Lau created HUDI-7519: --- Summary: Memory leaks in RocksDBDAO after rocksdbjni upgraded to 7 Key: HUDI-7519 URL: https://issues.apache.org/jira/browse/HUDI-7519 Project: Apache Hudi Issue

[jira] [Updated] (HUDI-7516) Put jdbc-h2 creds into static variables for hudi-utilities tests

2024-03-19 Thread Vova Kolmakov (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vova Kolmakov updated HUDI-7516: Priority: Minor (was: Major) > Put jdbc-h2 creds into static variables for hudi-utilities tests >

Re: [PR] [HUDI-7516] Put jdbc-h2 creds into static variables for hudi-utilities tests [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10889: URL: https://github.com/apache/hudi/pull/10889#issuecomment-2008479903 ## CI report: * 8c678704c75c3cdff4eff3962ed43324afe52540 Azure:

Re: [PR] [HUDI-7510] Loosen the compaction scheduling and rollback check for MDT [hudi]

2024-03-19 Thread via GitHub
danny0405 commented on code in PR #10874: URL: https://github.com/apache/hudi/pull/10874#discussion_r1531392517 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java: ## @@ -1410,35 +1388,19 @@ protected void

Re: [I] [SUPPORT] Change drop.partitionpath behavior [hudi]

2024-03-19 Thread via GitHub
danny0405 commented on issue #10878: URL: https://github.com/apache/hudi/issues/10878#issuecomment-2008449889 > because it's written into hoodie.properties file - if I change the contents of this file yeah, it's a table configuration instead of write config. So the impact is global

Re: [I] [SUPPORT] Spark snapshot query against MOR table data written by Flink gives an incorrect timestamp [hudi]

2024-03-19 Thread via GitHub
danny0405 commented on issue #10879: URL: https://github.com/apache/hudi/issues/10879#issuecomment-2008430919 Is the precision of the timestamp correct? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] [HUDI-7515] Fix partition metadata write failure [hudi]

2024-03-19 Thread via GitHub
danny0405 commented on code in PR #10886: URL: https://github.com/apache/hudi/pull/10886#discussion_r1531326627 ## hudi-common/src/main/java/org/apache/hudi/common/model/HoodiePartitionMetadata.java: ## @@ -92,11 +92,12 @@ public int getPartitionDepth() { /** * Write

Re: [PR] [HUDI-7516] Put jdbc-h2 creds into static variables for hudi-utilities tests [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10889: URL: https://github.com/apache/hudi/pull/10889#issuecomment-2008331336 ## CI report: * b741deda9e9a038050a93620b05084dad1ea8396 Azure:

Re: [PR] [HUDI-7516] Put jdbc-h2 creds into static variables for hudi-utilities tests [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10889: URL: https://github.com/apache/hudi/pull/10889#issuecomment-2008325862 ## CI report: * b741deda9e9a038050a93620b05084dad1ea8396 Azure:

[jira] [Updated] (HUDI-7518) Fix HoodieMetadataPayload merging logic around repeated deletes

2024-03-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7518: Description: When there are repeated deletes to the partition file list in files partition of the MDT, the

[jira] [Updated] (HUDI-7518) Fix HoodieMetadataPayload merging logic around repeated deletes

2024-03-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7518: Description: When there are repeated deletes to the partition file list in files partition of the MDT, the

[jira] [Updated] (HUDI-7518) Fix HoodieMetadataPayload merging logic around repeated deletes

2024-03-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7518: Description: When there are repeated deletes to the partition file list in files partition of the MDT, the

[jira] [Updated] (HUDI-7518) Fix HoodieMetadataPayload merging logic around repeated deletes

2024-03-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7518: Description: When there are repeated deletes to the partition file list in files partition of the MDT, the

[jira] [Updated] (HUDI-7518) Fix HoodieMetadataPayload merging logic around repeated deletes

2024-03-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7518: Description: When there are repeated deletes to the partition file list in files partition of the MDT, the

[jira] [Updated] (HUDI-7518) Fix HoodieMetadataPayload merging logic around repeated deletes

2024-03-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7518: Description: When there are repeated deletes to the partition file list in files partition of the MDT, the

[jira] [Assigned] (HUDI-7518) Fix HoodieMetadataPayload merging logic around repeated deletes

2024-03-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo reassigned HUDI-7518: --- Assignee: Ethan Guo > Fix HoodieMetadataPayload merging logic around repeated deletes >

[jira] [Updated] (HUDI-7518) Fix HoodieMetadataPayload merging logic around repeated deletes

2024-03-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7518: Fix Version/s: 0.15.0 1.0.0 > Fix HoodieMetadataPayload merging logic around repeated

[jira] [Created] (HUDI-7518) Fix HoodieMetadataPayload merging logic around repeated deletes

2024-03-19 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-7518: --- Summary: Fix HoodieMetadataPayload merging logic around repeated deletes Key: HUDI-7518 URL: https://issues.apache.org/jira/browse/HUDI-7518 Project: Apache Hudi

[jira] [Updated] (HUDI-7518) Fix HoodieMetadataPayload merging logic around repeated deletes

2024-03-19 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7518: Priority: Blocker (was: Major) > Fix HoodieMetadataPayload merging logic around repeated deletes >

Re: [PR] [HUDI-7517] Add ability to reset the checkpoint for kafka source [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10890: URL: https://github.com/apache/hudi/pull/10890#issuecomment-2007928202 ## CI report: * 71941da6eb713bc672a28f4c81ff814f2106acb2 UNKNOWN * 9cdd961e7583a51d8e94f78f22ff357b2d864516 Azure:

Re: [PR] [HUDI-7517] Add ability to reset the checkpoint for kafka source [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10890: URL: https://github.com/apache/hudi/pull/10890#issuecomment-2007839070 ## CI report: * 71941da6eb713bc672a28f4c81ff814f2106acb2 UNKNOWN * 9cdd961e7583a51d8e94f78f22ff357b2d864516 Azure:

Re: [PR] [HUDI-7517] Add ability to reset the checkpoint for kafka source [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10890: URL: https://github.com/apache/hudi/pull/10890#issuecomment-2007824816 ## CI report: * 71941da6eb713bc672a28f4c81ff814f2106acb2 UNKNOWN * 9cdd961e7583a51d8e94f78f22ff357b2d864516 UNKNOWN Bot commands @hudi-bot supports the

Re: [PR] [HUDI-7517] Add ability to reset the checkpoint for kafka source [hudi]

2024-03-19 Thread via GitHub
rmahindra123 commented on code in PR #10890: URL: https://github.com/apache/hudi/pull/10890#discussion_r1530819437 ## hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/helpers/TestKafkaOffsetGen.java: ## @@ -119,6 +122,28 @@ public void

Re: [PR] [HUDI-7517] Add ability to reset the checkpoint for kafka source [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10890: URL: https://github.com/apache/hudi/pull/10890#issuecomment-2007740495 ## CI report: * 71941da6eb713bc672a28f4c81ff814f2106acb2 UNKNOWN

[jira] [Updated] (HUDI-7517) Add ability to reset the checkpoint for kafka source

2024-03-19 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7517: - Labels: pull-request-available (was: ) > Add ability to reset the checkpoint for kafka source >

[PR] [HUDI-7517] Add ability to reset the checkpoint for kafka source [hudi]

2024-03-19 Thread via GitHub
sampan-s-nayak opened a new pull request, #10890: URL: https://github.com/apache/hudi/pull/10890 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any

[jira] [Created] (HUDI-7517) Add ability to reset the checkpoint for kafka source

2024-03-19 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7517: - Summary: Add ability to reset the checkpoint for kafka source Key: HUDI-7517 URL: https://issues.apache.org/jira/browse/HUDI-7517 Project: Apache Hudi

Re: [PR] [HUDI-7516] Put jdbc-h2 creds into static variables for hudi-utilities tests [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10889: URL: https://github.com/apache/hudi/pull/10889#issuecomment-2007574356 ## CI report: * b741deda9e9a038050a93620b05084dad1ea8396 Azure:

Re: [I] [SUPPORT] Duplicate data in base file of MOR table [hudi]

2024-03-19 Thread via GitHub
wqwl611 commented on issue #10882: URL: https://github.com/apache/hudi/issues/10882#issuecomment-2007549709 > If you are using multiwriter, which lock provider you are using? I cant see the lock configuration in the code. @ad1happy2go "hoodie.cleaner.policy.failed.writes" ->

Re: [PR] [HUDI-7504] replace expensive existence check with spark options [hudi]

2024-03-19 Thread via GitHub
bhat-vinay commented on code in PR #10865: URL: https://github.com/apache/hudi/pull/10865#discussion_r1529707163 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/S3EventsHoodieIncrSource.java: ## @@ -112,10 +110,15 @@ public S3EventsHoodieIncrSource(

Re: [PR] [HUDI-7516] Put jdbc-h2 creds into static variables for hudi-utilities tests [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10889: URL: https://github.com/apache/hudi/pull/10889#issuecomment-2007426028 ## CI report: * b741deda9e9a038050a93620b05084dad1ea8396 Azure:

(hudi) branch asf-site updated: [DOCS] Hardcode config names instead of params (#10888)

2024-03-19 Thread xushiyan
This is an automated email from the ASF dual-hosted git repository. xushiyan pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new f47da4bde7f [DOCS] Hardcode config names

Re: [PR] [DOCS] Hardcode config names instead of params [hudi]

2024-03-19 Thread via GitHub
xushiyan merged PR #10888: URL: https://github.com/apache/hudi/pull/10888

Re: [PR] [HUDI-7187] Fix integ test props to honor new streamer properties [hudi]

2024-03-19 Thread via GitHub
wombatu-kun commented on code in PR #10866: URL: https://github.com/apache/hudi/pull/10866#discussion_r1529606423 ## hudi-utilities/src/test/java/org/apache/hudi/utilities/deltastreamer/TestHoodieDeltaStreamer.java: ## @@ -2418,15 +2418,15 @@ public void testSqlSourceSource()

Re: [PR] [HUDI-7516] Put jdbc-h2 creds into static variables for hudi-utilities tests [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10889: URL: https://github.com/apache/hudi/pull/10889#issuecomment-2007310499 ## CI report: * b741deda9e9a038050a93620b05084dad1ea8396 UNKNOWN

[jira] [Updated] (HUDI-7516) Put jdbc-h2 creds into static variables for hudi-utilities tests

2024-03-19 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7516: - Labels: pull-request-available (was: ) > Put jdbc-h2 creds into static variables for

[PR] [HUDI-7516] Put jdbc-h2 creds into static variables for hudi-utilities tests [hudi]

2024-03-19 Thread via GitHub
wombatu-kun opened a new pull request, #10889: URL: https://github.com/apache/hudi/pull/10889 ### Change Logs From this discussion https://github.com/apache/hudi/pull/10866 hudi-utilities tests refactoring: put the JDBC user and password into static variables to avoid any

[jira] [Updated] (HUDI-7516) Put jdbc-h2 creds into static variables for hudi-utilities tests

2024-03-19 Thread Vova Kolmakov (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vova Kolmakov updated HUDI-7516: Status: In Progress (was: Open) > Put jdbc-h2 creds into static variables for hudi-utilities tests

Re: [I] [SUPPORT] Duplicate data in base file of MOR table [hudi]

2024-03-19 Thread via GitHub
ad1happy2go commented on issue #10882: URL: https://github.com/apache/hudi/issues/10882#issuecomment-2007250457 If you are using multiwriter, which lock provider you are using? I cant see the lock configuration in the code.

Re: [I] [SUPPORT] restore timeline [hudi]

2024-03-19 Thread via GitHub
ad1happy2go commented on issue #10887: URL: https://github.com/apache/hudi/issues/10887#issuecomment-2007244899 I dont think We can get .hoodie folder back. There will be no way to recreate the timeline. Only way is to check with cloud provider if bucket have some DR stuff.

Re: [I] [SUPPORT] Hudi cdc upserts stopped working after migrating from hudi 13.1 to 14.0 [hudi]

2024-03-19 Thread via GitHub
ad1happy2go commented on issue #10884: URL: https://github.com/apache/hudi/issues/10884#issuecomment-2007239157 Dont think it can be kafka version related issue as job is not failing. we need to know more logs to debug this.

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2007159791 ## CI report: * b802619f011c1d9ef5b334ecf67ab7df74964e08 Azure:

[jira] [Created] (HUDI-7516) Put jdbc-h2 creds into static variables for hudi-utilities tests

2024-03-19 Thread Vova Kolmakov (Jira)
Vova Kolmakov created HUDI-7516: --- Summary: Put jdbc-h2 creds into static variables for hudi-utilities tests Key: HUDI-7516 URL: https://issues.apache.org/jira/browse/HUDI-7516 Project: Apache Hudi

Re: [PR] [HUDI-7515] Fix partition metadata write failure [hudi]

2024-03-19 Thread via GitHub
wecharyu commented on code in PR #10886: URL: https://github.com/apache/hudi/pull/10886#discussion_r1530312027 ## hudi-common/src/main/java/org/apache/hudi/common/model/HoodiePartitionMetadata.java: ## @@ -92,11 +92,12 @@ public int getPartitionDepth() { /** * Write

Re: [I] [SUPPORT] Change drop.partitionpath behavior [hudi]

2024-03-19 Thread via GitHub
VitoMakarevich commented on issue #10878: URL: https://github.com/apache/hudi/issues/10878#issuecomment-2007105692 0.13.x solves this, I mean it's writing files with a partition column but it does not cause an exception for `upsert` path. Unfortunately we cannot upgrade, so I'm asking for

Re: [PR] [HUDI-7510] Loosen the compaction scheduling and rollback check for MDT [hudi]

2024-03-19 Thread via GitHub
codope commented on code in PR #10874: URL: https://github.com/apache/hudi/pull/10874#discussion_r1530294507 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java: ## @@ -1410,35 +1388,19 @@ protected void

Re: [PR] [DOCS] Hardcode config names instead of params [hudi]

2024-03-19 Thread via GitHub
bhasudha commented on PR #10888: URL: https://github.com/apache/hudi/pull/10888#issuecomment-2007058027 Tested locally. https://github.com/apache/hudi/assets/2179254/de479070-5d81-48dd-954a-dd41b214577d

[PR] [DOCS] Hardcode config names instead of params [hudi]

2024-03-19 Thread via GitHub
bhasudha opened a new pull request, #10888: URL: https://github.com/apache/hudi/pull/10888 ### Change Logs Hardcode all configs params in the website. ### Impact This allows for better readability and consistency. ### Risk level (write none, low medium or high

Re: [PR] [HUDI-7515] Fix partition metadata write failure [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10886: URL: https://github.com/apache/hudi/pull/10886#issuecomment-2007036084 ## CI report: * aadcb616ac338ef60c5799414bef660a19135c06 Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2007022918 ## CI report: * bd71699ccef3e28be182c2cd5f8093b0cb507694 Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2007009003 ## CI report: * bd71699ccef3e28be182c2cd5f8093b0cb507694 Azure:

[I] [SUPPORT] restore timeline [hudi]

2024-03-19 Thread via GitHub
clp007 opened a new issue, #10887: URL: https://github.com/apache/hudi/issues/10887

Re: [I] [SUPPORT] Spark snapshot query against MOR table data written by Flink gives an incorrect timestamp [hudi]

2024-03-19 Thread via GitHub
dderjugin commented on issue #10879: URL: https://github.com/apache/hudi/issues/10879#issuecomment-2006987430 > Did you specify the read options like `read.utc-timezone`, by default it is true, and recently we also support the write utc timezone option in: #10594 yes, both writer and

Re: [PR] [HUDI-7515] Fix partition metadata write failure [hudi]

2024-03-19 Thread via GitHub
danny0405 commented on code in PR #10886: URL: https://github.com/apache/hudi/pull/10886#discussion_r1530176270 ## hudi-common/src/main/java/org/apache/hudi/common/model/HoodiePartitionMetadata.java: ## @@ -92,11 +92,12 @@ public int getPartitionDepth() { /** * Write

Re: [PR] [HUDI-7515] Fix partition metadata write failure [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10886: URL: https://github.com/apache/hudi/pull/10886#issuecomment-2006901528 ## CI report: * aadcb616ac338ef60c5799414bef660a19135c06 Azure:

Re: [PR] [HUDI-7515] Fix partition metadata write failure [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10886: URL: https://github.com/apache/hudi/pull/10886#issuecomment-2006875566 ## CI report: * aadcb616ac338ef60c5799414bef660a19135c06 UNKNOWN

Re: [I] [SUPPORT] Hudi cdc upserts stopped working after migrating from hudi 13.1 to 14.0 [hudi]

2024-03-19 Thread via GitHub
ROOBALJINDAL commented on issue #10884: URL: https://github.com/apache/hudi/issues/10884#issuecomment-2006856626 @ad1happy2go need time to setup new cluster. Our aws msk kafka cluster uses kafka version=2.6.2, can you confirm is this fine or this can be an issue? Any specific supported

Re: [PR] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10851: URL: https://github.com/apache/hudi/pull/10851#issuecomment-2006850399 ## CI report: * 07a4b73ca51fb4811f3bddf49a78b580aa29bc66 Azure:

[jira] [Updated] (HUDI-7515) Fix partition metadata write failure

2024-03-19 Thread Wechar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wechar updated HUDI-7515: - Attachment: screenshot-1.png > Fix partition metadata write failure > > >

Re: [PR] [HUDI-7515] Fix partition metadata write failure [hudi]

2024-03-19 Thread via GitHub
wecharyu commented on PR #10886: URL: https://github.com/apache/hudi/pull/10886#issuecomment-2006795897 For https://github.com/apache/hudi/issues/10885, cc: @beyond1920 @boneanxs @danny0405

[jira] [Updated] (HUDI-7515) Fix partition metadata write failure

2024-03-19 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7515: - Labels: pull-request-available (was: ) > Fix partition metadata write failure >

[PR] [HUDI-7515] Fix partition metadata write failure [hudi]

2024-03-19 Thread via GitHub
wecharyu opened a new pull request, #10886: URL: https://github.com/apache/hudi/pull/10886 ### Change Logs When `spark.speculation` is enabled, if the write metadata operation become slow for some reason, a speculative will be started to write the same metadata file concurrently.
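The race described in this PR summary (a speculative task attempt writing the same metadata file as the original attempt) can be illustrated with a minimal sketch. This is not the change made in PR #10886, whose diff is not shown in this listing, and it is not Hudi code: the function name `write_marker_once` and the file name used below are hypothetical, and the sketch assumes a local filesystem that supports hard links (HDFS does not). Each attempt writes to its own temporary file and then publishes it under the final name only if that name does not exist yet.

```python
import os
import tempfile


def write_marker_once(directory: str, name: str, payload: bytes) -> bool:
    """Publish `name` in `directory` at most once across concurrent attempts.

    Hypothetical helper for illustration only. Returns True if this call
    published the file, False if an earlier attempt had already done so.
    """
    target = os.path.join(directory, name)
    # Each attempt gets a uniquely named temp file, so two attempts never
    # write to the same path at the same time.
    fd, tmp_path = tempfile.mkstemp(prefix=name + ".", suffix=".tmp", dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())
        try:
            # os.link raises FileExistsError when the target already exists,
            # so only the first attempt publishes the file.
            os.link(tmp_path, target)
            return True
        except FileExistsError:
            return False
    finally:
        # The temp name is always removed; a published target survives
        # because it is a separate hard link to the same data.
        os.remove(tmp_path)
```

With this shape, a losing attempt returns `False` instead of failing the task or corrupting the file, and readers never see a partially written target.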

Re: [I] [SUPPORT] Duplicate data in base file of MOR table [hudi]

2024-03-19 Thread via GitHub
wqwl611 commented on issue #10882: URL: https://github.com/apache/hudi/issues/10882#issuecomment-2006766143 > @danny0405 Can you give more details on how did you ingested this table? What writer configuration you used and did you changed index type for this table? @ad1happy2go I

[jira] [Closed] (HUDI-7514) Update Manifest file after the parquet writer closed in LSMTimelineWriter

2024-03-19 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-7514. Resolution: Fixed Fixed via master branch: 784af0e17867f249159d4b6040568f3139f91545 > Update Manifest file

[jira] [Updated] (HUDI-7514) Update Manifest file after the parquet writer closed in LSMTimelineWriter

2024-03-19 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7514: - Fix Version/s: 1.0.0 > Update Manifest file after the parquet writer closed in LSMTimelineWriter >

(hudi) branch master updated: [HUDI-7514] Update Manifest file after the parquet writer closed in LSMTimelineWriter (#10883)

2024-03-19 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 784af0e1786 [HUDI-7514] Update Manifest file

Re: [PR] [HUDI-7514] Update Manifest file after the parquet writer closed in LSMTimelineWriter [hudi]

2024-03-19 Thread via GitHub
danny0405 merged PR #10883: URL: https://github.com/apache/hudi/pull/10883

Re: [I] [SUPPORT] Duplicate data in base file of MOR table [hudi]

2024-03-19 Thread via GitHub
ad1happy2go commented on issue #10882: URL: https://github.com/apache/hudi/issues/10882#issuecomment-2006733479 @danny0405 Can you give more details on how did you ingested this table? What writer configuration you used and did you changed index type for this table?

Re: [I] [SUPPORT] Hudi cdc upserts stopped working after migrating from hudi 13.1 to 14.0 [hudi]

2024-03-19 Thread via GitHub
ad1happy2go commented on issue #10884: URL: https://github.com/apache/hudi/issues/10884#issuecomment-2006696281 @ROOBALJINDAL Is it possible to try the same on EMR so that you will get all the logs to look into this more. There is no known updates which can cause this for 0.14.0 upgrade.

[jira] [Updated] (HUDI-7515) Fix partition metadata write failure

2024-03-19 Thread Wechar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wechar updated HUDI-7515: - Summary: Fix partition metadata write failure (was: Fix partition metadata write fail when speculation enabled)

[jira] [Created] (HUDI-7515) Fix partition metadata write fail when speculation enabled

2024-03-19 Thread Wechar (Jira)
Wechar created HUDI-7515: Summary: Fix partition metadata write fail when speculation enabled Key: HUDI-7515 URL: https://issues.apache.org/jira/browse/HUDI-7515 Project: Apache Hudi Issue Type: Bug

Re: [PR] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10851: URL: https://github.com/apache/hudi/pull/10851#issuecomment-2006620197 ## CI report: * 91389889006dd1070145b1f777d68de124c91de7 Azure:

Re: [I] [SUPPORT] Partition query result is unexpected because some partitions missed .hoodie_partition_metadata file [hudi]

2024-03-19 Thread via GitHub
wecharyu commented on issue #10885: URL: https://github.com/apache/hudi/issues/10885#issuecomment-2006546726 Did you enable spark speculation? We have encountered this issue, will push a PR for discussion. cc: @boneanxs

Re: [I] [SUPPORT] Partition query result is unexpected because some partitions missed .hoodie_partition_metadata file [hudi]

2024-03-19 Thread via GitHub
beyond1920 commented on issue #10885: URL: https://github.com/apache/hudi/issues/10885#issuecomment-2006476822 Use 014 HUDI version. and Use HDFS as storage.

[I] [SUPPORT] Partition query result is unexpected because some partitions missed .hoodie_partition_metadata file [hudi]

2024-03-19 Thread via GitHub
beyond1920 opened a new issue, #10885: URL: https://github.com/apache/hudi/issues/10885 Dear community, I recently encountered an problem: The `.hoodie_partition_metadata` does not exist in leaf partition path. But the writer job completed successfully. It would result in incorrect

Re: [PR] [DOCS] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-19 Thread via GitHub
geserdugarov commented on code in PR #10856: URL: https://github.com/apache/hudi/pull/10856#discussion_r153376 ## website/docs/basic_configurations.md: ## @@ -101,15 +102,14 @@ Flink jobs using the SQL can be configured through the options in WITH clause. T |

Re: [PR] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-19 Thread via GitHub
geserdugarov commented on code in PR #10851: URL: https://github.com/apache/hudi/pull/10851#discussion_r1529998508 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCleanConfig.java: ## @@ -118,28 +120,32 @@ public class HoodieCleanConfig extends

Re: [PR] [HUDI-7514] Update Manifest file after the parquet writer closed in LSMTimelineWriter [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10883: URL: https://github.com/apache/hudi/pull/10883#issuecomment-2006453781 ## CI report: * ec4778998217fa76b34dd42b8610c40c5acc635f Azure:

Re: [PR] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10851: URL: https://github.com/apache/hudi/pull/10851#issuecomment-2006453276 ## CI report: * 91389889006dd1070145b1f777d68de124c91de7 Azure:

Re: [PR] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-19 Thread via GitHub
geserdugarov commented on code in PR #10851: URL: https://github.com/apache/hudi/pull/10851#discussion_r1529992613 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCleanConfig.java: ## @@ -168,15 +174,16 @@ public class HoodieCleanConfig extends

Re: [PR] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-19 Thread via GitHub
geserdugarov commented on code in PR #10851: URL: https://github.com/apache/hudi/pull/10851#discussion_r1529992613 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCleanConfig.java: ## @@ -168,15 +174,16 @@ public class HoodieCleanConfig extends

Re: [I] [SUPPORT] Hudi cdc upserts stopped working after migrating from hudi 13.1 to 14.0 [hudi]

2024-03-19 Thread via GitHub
ROOBALJINDAL commented on issue #10884: URL: https://github.com/apache/hudi/issues/10884#issuecomment-2006449206 @nsivabalan can you please check

Re: [PR] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-19 Thread via GitHub
geserdugarov commented on code in PR #10851: URL: https://github.com/apache/hudi/pull/10851#discussion_r1529994322 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCleanConfig.java: ## @@ -48,29 +48,31 @@ description = "Cleaning (reclamation of

Re: [PR] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-19 Thread via GitHub
geserdugarov commented on code in PR #10851: URL: https://github.com/apache/hudi/pull/10851#discussion_r1529992613 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCleanConfig.java: ## @@ -168,15 +174,16 @@ public class HoodieCleanConfig extends

[I] [SUPPORT] Hudi cdc upserts stopped working after migrating from hudi 13.1 to 14.0 [hudi]

2024-03-19 Thread via GitHub
ROOBALJINDAL opened a new issue, #10884: URL: https://github.com/apache/hudi/issues/10884 Issue: We have migrated from Hudi 0.13.0 to Hudi 0.14.0 and in this version, CDC events from Kafka upserts are not working. Table is created first time but afterwards, any new record

Re: [PR] [HUDI-7514] Update Manifest file after the parquet writer closed in LSMTimelineWriter [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10883: URL: https://github.com/apache/hudi/pull/10883#issuecomment-2006428483 ## CI report: * ec4778998217fa76b34dd42b8610c40c5acc635f Azure:

Re: [I] [SUPPORT] Change drop.partitionpath behavior [hudi]

2024-03-19 Thread via GitHub
VitoMakarevich commented on issue #10878: URL: https://github.com/apache/hudi/issues/10878#issuecomment-2006405417 It looks to be, and in this version(0.12.x) it causes Upsert unresolvable exception

Re: [PR] [HUDI-7514] Update Manifest file after the parquet writer closed in LSMTimelineWriter [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10883: URL: https://github.com/apache/hudi/pull/10883#issuecomment-2006402680 ## CI report: * ec4778998217fa76b34dd42b8610c40c5acc635f Azure:

Re: [I] [SUPPORT] Archived parquet file lenght is 0 when spark do streaming read [hudi]

2024-03-19 Thread via GitHub
danny0405 commented on issue #10881: URL: https://github.com/apache/hudi/issues/10881#issuecomment-2006256237 Fixed in https://github.com/apache/hudi/pull/10883/files

Re: [PR] [HUDI-7514] Update Manifest file after the parquet writer closed in LSMTimelineWriter [hudi]

2024-03-19 Thread via GitHub
danny0405 commented on code in PR #10883: URL: https://github.com/apache/hudi/pull/10883#discussion_r1529914716 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/timeline/LSMTimelineWriter.java: ## @@ -128,6 +128,11 @@ public void write( } catch

Re: [PR] [HUDI-7514] Update Manifest file after the parquet writer closed in LSMTimelineWriter [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10883: URL: https://github.com/apache/hudi/pull/10883#issuecomment-2006250432 ## CI report: * ec4778998217fa76b34dd42b8610c40c5acc635f UNKNOWN

Re: [PR] [HUDI-7493] Consistent naming of Cleaner configuration parameters [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10851: URL: https://github.com/apache/hudi/pull/10851#issuecomment-2006249940 ## CI report: * 91389889006dd1070145b1f777d68de124c91de7 Azure:

Re: [I] [SUPPORT] Duplicate data in base file of MOR table [hudi]

2024-03-19 Thread via GitHub
wqwl611 commented on issue #10882: URL: https://github.com/apache/hudi/issues/10882#issuecomment-2006238293 > can you also share the basic write configs @danny0405 updated

Re: [I] [SUPPORT] Archived parquet file lenght is 0 when spark do streaming read [hudi]

2024-03-19 Thread via GitHub
xicm closed issue #10881: [SUPPORT] Archived parquet file lenght is 0 when spark do streaming read URL: https://github.com/apache/hudi/issues/10881

Re: [I] [SUPPORT] Archived parquet file lenght is 0 when spark do streaming read [hudi]

2024-03-19 Thread via GitHub
xicm commented on issue #10881: URL: https://github.com/apache/hudi/issues/10881#issuecomment-2006235716 The cause is that we update the manifest and version file before we close the parquet writer.
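
The ordering problem described in this comment can be illustrated with a minimal sketch. This is not Hudi's LSMTimelineWriter code: the class `CloseThenPublishSketch`, the method `writeThenPublish`, the `manifest` file name, and the `name,length` entry format are all hypothetical, and a plain buffered stream stands in for the parquet writer. It only shows the general pattern of closing the data writer before recording the file in a manifest, so that a reader following the manifest does not see a file whose buffered bytes have not yet reached disk.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseThenPublishSketch {

  // Hypothetical helper: writes the payload to a data file and records the
  // file in a manifest only after the writer has been closed.
  static Path writeThenPublish(Path dir, String dataFileName, byte[] payload) throws IOException {
    Path dataFile = dir.resolve(dataFileName);

    // try-with-resources closes (and therefore flushes) the buffered stream
    // when this block exits, before any manifest update happens.
    try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(dataFile))) {
      out.write(payload);
    }

    // The writer is closed at this point, so the on-disk length is final.
    long length = Files.size(dataFile);

    // Publish the data file by writing a manifest entry (assumed format: name,length).
    Path manifest = dir.resolve("manifest");
    String entry = dataFileName + "," + length + "\n";
    Files.write(manifest, entry.getBytes(StandardCharsets.UTF_8));
    return manifest;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("lsm-sketch");
    Path manifest = writeThenPublish(dir, "data.bin", new byte[] {1, 2, 3});
    // Print the manifest entry that was recorded after the data file was closed.
    System.out.println(new String(Files.readAllBytes(manifest), StandardCharsets.UTF_8).trim());
  }
}
```

If the manifest were written inside the try block instead, the small payload could still be sitting in the stream's buffer, and a concurrent reader that trusted the manifest might observe a zero-length data file, which matches the symptom reported in the issue title.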

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2006227969 ## CI report: * bd71699ccef3e28be182c2cd5f8093b0cb507694 Azure:

[jira] [Updated] (HUDI-7514) Update Manifest file after the parquet writer closed in LSMTimelineWriter

2024-03-19 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7514: - Labels: pull-request-available (was: ) > Update Manifest file after the parquet writer closed in

[PR] [HUDI-7514] Update Manifest file after the parquet writer closed in LSMTimelineWriter [hudi]

2024-03-19 Thread via GitHub
xicm opened a new pull request, #10883: URL: https://github.com/apache/hudi/pull/10883 ### Change Logs In LSMTimelineWriter we should wait until the parquet writer is closed and then update the manifest and version file. ### Impact none ### Risk level (write none, low
