[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r605394316

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
## @@ -247,6 +266,32 @@ private Long delayOffsetCalculation(Option lastCheckpointStr, Set partitionInfoList, String topicName, Long timestamp) {

Review comment: Once the implementation plan is confirmed, I will quickly add tests.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r605393953

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
## @@ -553,6 +555,11 @@ public DeltaSyncService(Config cfg, JavaSparkContext jssc, FileSystem fs, Config
"'--filter-dupes' needs to be disabled when '--op' is 'UPSERT' to ensure updates are not missed.");
this.props = properties.get();
+ String kafkaCheckpointTimestamp = props.getString(KafkaOffsetGen.Config.KAFKA_CHECKPOINT_TIMESTAMP, "");

Review comment: I think `KAFKA_CHECKPOINT_TIMESTAMP` is just a way to make it easier for users to set the checkpoint.
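Note on the idea discussed above: turning a timestamp into a checkpoint amounts to finding, per partition, the earliest offset whose record timestamp is at or after the requested time. Kafka's consumer API offers `offsetsForTimes` for this lookup. The sketch below does not call Kafka or Hudi at all; it only models the lookup over in-memory lists of record timestamps. The class name, method names, the fallback to the end offset, and the `topic,partition:offset,...` checkpoint layout are assumptions made for illustration and are not taken from the PR.

```java
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

// Hypothetical illustration only; not code from PR #2438.
class TimestampCheckpointSketch {

    // Returns the earliest offset (list index) whose timestamp is >= targetTimestamp,
    // or -1 if no such record exists in this partition.
    static long earliestOffsetAtOrAfter(List<Long> recordTimestamps, long targetTimestamp) {
        for (int offset = 0; offset < recordTimestamps.size(); offset++) {
            if (recordTimestamps.get(offset) >= targetTimestamp) {
                return offset;
            }
        }
        return -1L;
    }

    // Builds a checkpoint string "topic,partition:offset,..." (assumed layout).
    // Partitions with no matching record fall back to their end offset (an assumption).
    static String toCheckpoint(String topic,
                               Map<Integer, List<Long>> timestampsByPartition,
                               long targetTimestamp) {
        StringJoiner checkpoint = new StringJoiner(",");
        checkpoint.add(topic);
        // TreeMap gives a stable, partition-ordered checkpoint string.
        for (Map.Entry<Integer, List<Long>> entry : new TreeMap<>(timestampsByPartition).entrySet()) {
            long offset = earliestOffsetAtOrAfter(entry.getValue(), targetTimestamp);
            if (offset < 0) {
                offset = entry.getValue().size();
            }
            checkpoint.add(entry.getKey() + ":" + offset);
        }
        return checkpoint.toString();
    }
}
```

With this shape, a user-supplied timestamp is only a convenience: it is resolved once into ordinary per-partition offsets, and everything after that works on a regular checkpoint.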
[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r605393270

## File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/HoodieDeltaStreamerWrapper.java
## @@ -65,7 +65,7 @@ public void scheduleCompact() throws Exception {
return upsert(WriteOperationType.UPSERT); }
- public Pair>> fetchSource() throws Exception {
+ public Pair>, Pair> fetchSource() throws Exception {

Review comment: Okay, I'll add this class to this PR.
[GitHub] [hudi] vingov commented on pull request #2747: [HUDI-1743] Added support for SqlFileBasedTransformer
vingov commented on pull request #2747:
URL: https://github.com/apache/hudi/pull/2747#issuecomment-811654731

@yanghua - I've fixed the build, can you please merge this code?
[GitHub] [hudi] nsivabalan closed pull request #1929: [HUDI-1160] Support update partial fields for CoW table
nsivabalan closed pull request #1929:
URL: https://github.com/apache/hudi/pull/1929
[GitHub] [hudi] nsivabalan commented on pull request #1929: [HUDI-1160] Support update partial fields for CoW table
nsivabalan commented on pull request #1929:
URL: https://github.com/apache/hudi/pull/1929#issuecomment-811627648

Closing in favor of https://github.com/apache/hudi/pull/2666
[GitHub] [hudi] codecov-io edited a comment on pull request #2754: [HUDI-1751] DeltaStreamer print many unnecessary warn log
codecov-io edited a comment on pull request #2754:
URL: https://github.com/apache/hudi/pull/2754#issuecomment-811625350

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2754?src=pr=h1) Report
> Merging [#2754](https://codecov.io/gh/apache/hudi/pull/2754?src=pr=desc) (dcc1fc4) into [master](https://codecov.io/gh/apache/hudi/commit/fe16d0de7c76105775c887b700751241bc82624c?el=desc) (fe16d0d) will **decrease** coverage by `5.27%`.
> The diff coverage is `0.00%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2754/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2754?src=pr=tree)

```diff
@@            Coverage Diff             @@
##            master    #2754     +/-   ##
- Coverage     52.04%   46.77%   -5.28%
+ Complexity     3625     3301     -324
  Files           479      479
  Lines         22804    22806       +2
  Branches       2415     2414       -1
- Hits          11868    10667    -1201
- Misses         9911    11231    +1320
+ Partials       1025      908     -117
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `37.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | |
| hudicommon | `50.92% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiflink | `56.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisparkdatasource | `70.87% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisync | `45.47% <ø> (ø)` | `0.00 <ø> (ø)` | |
| huditimelineservice | `64.36% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiutilities | `9.39% <0.00%> (-60.40%)` | `0.00 <0.00> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2754?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `0.00% <0.00%> (-87.04%)` | `0.00 <0.00> (-19.00)` | | | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | | | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | | | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | | |
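Note on reading the report above: the headline percentages appear to follow from the Hits, Misses and Partials rows, with Lines being their sum. The sketch below recomputes the two coverage figures under that assumption, reading the fused columns of the report as Hits 11868 and 10667, Misses 9911 and 11231, Partials 1025 and 908. The class and method names are hypothetical, and this is not Codecov's own code.

```java
import java.util.Locale;

// Hypothetical illustration of how the Coverage row relates to the Hits/Misses/Partials rows.
class CoverageSketch {

    // Coverage in percent, assuming Lines = Hits + Misses + Partials
    // and only full hits count as covered.
    static double coveragePercent(long hits, long misses, long partials) {
        long lines = hits + misses + partials;
        return lines == 0 ? 0.0 : 100.0 * hits / lines;
    }

    // Two-decimal formatting, matching the precision shown in the report.
    static String format(double percent) {
        return String.format(Locale.ROOT, "%.2f", percent);
    }
}
```

Under these assumptions the base commit gives 11868 / 22804 and the PR head gives 10667 / 22806, which matches the 52.04% and 46.77% shown in the Coverage row.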
[GitHub] [hudi] codecov-io commented on pull request #2754: [HUDI-1751] DeltaStreamer print many unnecessary warn log
codecov-io commented on pull request #2754:
URL: https://github.com/apache/hudi/pull/2754#issuecomment-811625350

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2754?src=pr=h1) Report
> Merging [#2754](https://codecov.io/gh/apache/hudi/pull/2754?src=pr=desc) (dcc1fc4) into [master](https://codecov.io/gh/apache/hudi/commit/fe16d0de7c76105775c887b700751241bc82624c?el=desc) (fe16d0d) will **decrease** coverage by `5.57%`.
> The diff coverage is `0.00%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2754/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2754?src=pr=tree)

```diff
@@            Coverage Diff             @@
##            master    #2754     +/-   ##
- Coverage     52.04%   46.47%   -5.58%
+ Complexity     3625     3111     -514
  Files           479      457      -22
  Lines         22804    21198    -1606
  Branches       2415     2259     -156
- Hits          11868     9851    -2017
- Misses         9911    10514     +603
+ Partials       1025      833     -192
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `37.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | |
| hudicommon | `50.92% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiflink | `56.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisparkdatasource | `70.87% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `9.39% <0.00%> (-60.40%)` | `0.00 <0.00> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2754?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `0.00% <0.00%> (-87.04%)` | `0.00 <0.00> (-19.00)` | | | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | | | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | | | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | | |
[GitHub] [hudi] codecov-io edited a comment on pull request #2651: [HUDI-1591] [RFC-26] Improve Hoodie Table Query Performance And Ease Of Use Fo…
codecov-io edited a comment on pull request #2651:
URL: https://github.com/apache/hudi/pull/2651#issuecomment-794945140

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=h1) Report
> Merging [#2651](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=desc) (bb49f77) into [master](https://codecov.io/gh/apache/hudi/commit/ce3e8ec87083ef4cd4f33de39b6697f66ff3f277?el=desc) (ce3e8ec) will **decrease** coverage by `42.38%`.
> The diff coverage is `0.00%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2651/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree)

```diff
@@            Coverage Diff             @@
##            master    #2651     +/-   ##
- Coverage     51.76%    9.38%  -42.39%
+ Complexity     3602       48    -3554
  Files           476       54     -422
  Lines         22579     1993   -20586
  Branches       2408      236    -2172
- Hits          11688      187   -11501
+ Misses         9874     1793    -8081
+ Partials       1017       13    -1004
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudiflink | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `9.38% <0.00%> (-60.41%)` | `0.00 <0.00> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.73%)` | `0.00 <0.00> (-56.00)` | | | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | | | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | | | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | | | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` |
[GitHub] [hudi] ssdong commented on issue #2707: [SUPPORT] insert_ovewrite_table failed on archiving
ssdong commented on issue #2707:
URL: https://github.com/apache/hudi/issues/2707#issuecomment-811619487

@jsbali To give some extra insight and detail, this is what @zherenyu831 posted at the beginning:
```
[20210323080718__replacecommit__COMPLETED]: size : 0
[20210323081449__replacecommit__COMPLETED]: size : 1
[20210323082046__replacecommit__COMPLETED]: size : 1
[20210323082758__replacecommit__COMPLETED]: size : 1
[20210323084004__replacecommit__COMPLETED]: size : 1
[20210323085044__replacecommit__COMPLETED]: size : 1
[20210323085823__replacecommit__COMPLETED]: size : 1
[20210323090550__replacecommit__COMPLETED]: size : 1
[20210323091700__replacecommit__COMPLETED]: size : 1
```
If we keep everything the same and let the archive logic handle everything, it fails because `20210323080718__replacecommit__COMPLETED` (the first item in the list above) has 0 `partitionToReplaceFileIds`; this is a known issue.
To make the archive work, we tried to _manually_ delete the first _empty_ commit file, which is `20210323080718__replacecommit__COMPLETED` (the first item in the list above). With that, the archive succeeded, but the job then failed with `User class threw exception: org.apache.hudi.exception.HoodieIOException: Could not read commit details from s3://xxx/data/.hoodie/20210323081449.replacecommit` (the second item in the list above).
Now, to reason through the underlying mechanism of this error: given that the archive was successful, a few commit files must have been placed in the `.archive` folder. Let's say
```
[20210323081449__replacecommit__COMPLETED]: size : 1
[20210323082046__replacecommit__COMPLETED]: size : 1
[20210323082758__replacecommit__COMPLETED]: size : 1
[20210323084004__replacecommit__COMPLETED]: size : 1
[20210323085044__replacecommit__COMPLETED]: size : 1
```
have been successfully moved and placed in `.archive`.
At this moment, the timeline has been updated and there are 3 remaining commit files, which are:
```
[20210323085823__replacecommit__COMPLETED]: size : 1
[20210323090550__replacecommit__COMPLETED]: size : 1
[20210323091700__replacecommit__COMPLETED]: size : 1
```
Now, pay attention to the stack trace behind `User class threw exception: org.apache.hudi.exception.HoodieIOException: Could not read commit details from s3://xxx/data/.hoodie/20210323081449.replacecommit`; I am pasting it again:
```
User class threw exception: org.apache.hudi.exception.HoodieIOException: Could not read commit details from s3://xxx/data/.hoodie/20210323081449.replacecommit
    at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.readDataFromPath(HoodieActiveTimeline.java:530)
    at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.getInstantDetails(HoodieActiveTimeline.java:194)
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.lambda$resetFileGroupsReplaced$8(AbstractTableFileSystemView.java:217)
    at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:269)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.resetFileGroupsReplaced(AbstractTableFileSystemView.java:228)
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.init(AbstractTableFileSystemView.java:106)
    at org.apache.hudi.common.table.view.HoodieTableFileSystemView.init(HoodieTableFileSystemView.java:106)
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.reset(AbstractTableFileSystemView.java:248)
    at org.apache.hudi.common.table.view.HoodieTableFileSystemView.close(HoodieTableFileSystemView.java:353)
    at java.util.concurrent.ConcurrentHashMap$ValuesView.forEach(ConcurrentHashMap.java:4707)
    at org.apache.hudi.common.table.view.FileSystemViewManager.close(FileSystemViewManager.java:118)
    at org.apache.hudi.timeline.service.TimelineService.close(TimelineService.java:179)
    at org.apache.hudi.client.embedded.EmbeddedTimelineService.stop(EmbeddedTimelineService.java:112)
```
After a `close` action is triggered on `TimelineService`, which is understandable, it propagates to `HoodieTableFileSystemView.close`, and there we find:
```
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.init(AbstractTableFileSystemView.java:106)
    at org.apache.hudi.common.table.view.HoodieTableFileSystemView.init(HoodieTableFileSystemView.java:106)
    at
```
[GitHub] [hudi] tooptoop4 commented on issue #2609: [SUPPORT] Presto hudi query slow when compared to parquet
tooptoop4 commented on issue #2609:
URL: https://github.com/apache/hudi/issues/2609#issuecomment-811614768

@rshanmugam1 see https://prestodb.io/blog/2020/08/04/prestodb-and-hudi and https://github.com/prestodb/presto/commit/9fd2459d98efd0809023b175ba53775466b74cc6 and https://github.com/prestodb/presto/commit/cfb2e7aa077954a02c048e81c97a47994d329852
[jira] [Updated] (HUDI-1751) DeltaStream print many unnecessary warn log
[ https://issues.apache.org/jira/browse/HUDI-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-1751:
    Labels: pull-request-available (was: )

> DeltaStream print many unnecessary warn log
>
> Key: HUDI-1751
> URL: https://issues.apache.org/jira/browse/HUDI-1751
> Project: Apache Hudi
> Issue Type: Improvement
> Reporter: lrz
> Priority: Minor
> Labels: pull-request-available
> Fix For: 0.9.0
>
> Because we put both Kafka parameters and Hudi configs in the same properties
> file, such as kafka-source.properties, some hoodie configs are also added when
> the kafkaParams object is created, which leads to the following warn log being printed:
> !https://wa.vision.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/2/15/qwx352829/76572ba9f4094fb29b018db91fbf1450/image.png!

-- This message was sent by Atlassian Jira (v8.3.4#803005)
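Note on the issue described above: one way to avoid handing Hudi settings to the Kafka client is to filter the shared properties by key prefix before building the Kafka parameter map. The sketch below shows that idea with plain `java.util.Properties`. The class name, the method name and the choice of the `hoodie.` prefix as the filter are assumptions for illustration; this is not necessarily the change made in the corresponding pull request, and the exact warning text is only visible in the linked image.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical illustration only; not the actual fix for HUDI-1751.
class KafkaParamsFilterSketch {

    // Assumed convention: every Hudi config key starts with this prefix.
    static final String HOODIE_PREFIX = "hoodie.";

    // Copies every property except the Hudi-prefixed ones into the Kafka parameter map,
    // so the Kafka client never sees keys it does not know.
    static Map<String, Object> kafkaParamsFrom(Properties props) {
        Map<String, Object> kafkaParams = new HashMap<>();
        for (String key : props.stringPropertyNames()) {
            if (!key.startsWith(HOODIE_PREFIX)) {
                kafkaParams.put(key, props.getProperty(key));
            }
        }
        return kafkaParams;
    }
}
```

For a properties file that mixes `bootstrap.servers` with `hoodie.deltastreamer.source.kafka.topic`, only `bootstrap.servers` would reach the Kafka parameter map, while the Hudi key stays available in the original `Properties` object.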
[GitHub] [hudi] codecov-io edited a comment on pull request #2651: [HUDI-1591] [RFC-26] Improve Hoodie Table Query Performance And Ease Of Use Fo…
codecov-io edited a comment on pull request #2651:
URL: https://github.com/apache/hudi/pull/2651#issuecomment-794945140

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=h1) Report
> Merging [#2651](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=desc) (201e4ff) into [master](https://codecov.io/gh/apache/hudi/commit/ce3e8ec87083ef4cd4f33de39b6697f66ff3f277?el=desc) (ce3e8ec) will **increase** coverage by `0.57%`.
> The diff coverage is `70.90%`.

> :exclamation: Current head 201e4ff differs from pull request most recent head 1735f33. Consider uploading reports for the commit 1735f33 to get more accurate results

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2651/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree)

```diff
@@            Coverage Diff             @@
##            master    #2651     +/-   ##
+ Coverage     51.76%   52.33%   +0.57%
+ Complexity     3602     3477     -125
  Files           476      460      -16
  Lines         22579    21425    -1154
  Branches       2408     2303     -105
- Hits          11688    11213     -475
+ Misses         9874     9224     -650
+ Partials       1017      988      -29
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `37.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | |
| hudicommon | `50.87% <0.00%> (-0.06%)` | `0.00 <0.00> (ø)` | |
| hudiflink | `56.01% <ø> (+1.73%)` | `0.00 <ø> (ø)` | |
| hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisparkdatasource | `71.29% <75.09%> (+0.41%)` | `0.00 <26.00> (ø)` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `69.74% <50.00%> (-0.04%)` | `0.00 <0.00> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...c/main/java/org/apache/hudi/common/fs/FSUtils.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL0ZTVXRpbHMuamF2YQ==) | `47.34% <0.00%> (-0.94%)` | `57.00 <0.00> (ø)` | | | [...rg/apache/hudi/common/table/HoodieTableConfig.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL0hvb2RpZVRhYmxlQ29uZmlnLmphdmE=) | `43.20% <0.00%> (-2.25%)` | `17.00 <0.00> (ø)` | | | [...pache/hudi/common/table/HoodieTableMetaClient.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL0hvb2RpZVRhYmxlTWV0YUNsaWVudC5qYXZh) | `66.66% <0.00%> (-1.65%)` | `43.00 <0.00> (ø)` | | | [...ecution/datasources/Spark2ParsePartitionUtil.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmsyL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvc3Bhcmsvc3FsL2V4ZWN1dGlvbi9kYXRhc291cmNlcy9TcGFyazJQYXJzZVBhcnRpdGlvblV0aWwuc2NhbGE=) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | | | [...ecution/datasources/Spark3ParsePartitionUtil.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvc3Bhcmsvc3FsL2V4ZWN1dGlvbi9kYXRhc291cmNlcy9TcGFyazNQYXJzZVBhcnRpdGlvblV0aWwuc2NhbGE=) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | | | [.../main/scala/org/apache/hudi/HoodieSparkUtils.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVNwYXJrVXRpbHMuc2NhbGE=) | `83.33% <33.33%> (-5.56%)` | `0.00 <0.00> (ø)` | | | 
[...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.42% <50.00%> (-0.30%)` | `56.00 <0.00> (ø)` | | | [...main/scala/org/apache/hudi/HoodieWriterUtils.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVdyaXRlclV0aWxzLnNjYWxh) | `81.63% <64.28%> (-6.94%)` | `0.00 <0.00> (ø)` | | |
[GitHub] [hudi] codecov-io edited a comment on pull request #2651: [HUDI-1591] [RFC-26] Improve Hoodie Table Query Performance And Ease Of Use Fo…
codecov-io edited a comment on pull request #2651:
URL: https://github.com/apache/hudi/pull/2651#issuecomment-794945140

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=h1) Report
> Merging [#2651](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=desc) (201e4ff) into [master](https://codecov.io/gh/apache/hudi/commit/ce3e8ec87083ef4cd4f33de39b6697f66ff3f277?el=desc) (ce3e8ec) will **increase** coverage by `18.71%`.
> The diff coverage is `74.71%`.

> :exclamation: Current head 201e4ff differs from pull request most recent head 1735f33. Consider uploading reports for the commit 1735f33 to get more accurate results

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2651/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree)

```diff
@@             Coverage Diff              @@
##            master    #2651      +/-   ##
============================================
+ Coverage     51.76%   70.47%   +18.71%
+ Complexity     3602      609     -2993
============================================
  Files           476       93      -383
  Lines         22579     3780    -18799
  Branches       2408      481     -1927
============================================
- Hits          11688     2664     -9024
+ Misses         9874      831     -9043
+ Partials       1017      285      -732
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudiflink | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| hudisparkdatasource | `71.29% <75.09%> (+0.41%)` | `0.00 <26.00> (ø)` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `69.74% <50.00%> (-0.04%)` | `0.00 <0.00> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...ecution/datasources/Spark2ParsePartitionUtil.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmsyL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvc3Bhcmsvc3FsL2V4ZWN1dGlvbi9kYXRhc291cmNlcy9TcGFyazJQYXJzZVBhcnRpdGlvblV0aWwuc2NhbGE=) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | | | [...ecution/datasources/Spark3ParsePartitionUtil.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvc3Bhcmsvc3FsL2V4ZWN1dGlvbi9kYXRhc291cmNlcy9TcGFyazNQYXJzZVBhcnRpdGlvblV0aWwuc2NhbGE=) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | | | [.../main/scala/org/apache/hudi/HoodieSparkUtils.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVNwYXJrVXRpbHMuc2NhbGE=) | `83.33% <33.33%> (-5.56%)` | `0.00 <0.00> (ø)` | | | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.42% <50.00%> (-0.30%)` | `56.00 <0.00> (ø)` | | | [...main/scala/org/apache/hudi/HoodieWriterUtils.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVdyaXRlclV0aWxzLnNjYWxh) | `81.63% <64.28%> (-6.94%)` | `0.00 <0.00> (ø)` | | | [...src/main/scala/org/apache/hudi/DefaultSource.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0RlZmF1bHRTb3VyY2Uuc2NhbGE=) | `78.78% <67.50%> (-5.36%)` | `31.00 <0.00> (+14.00)` | :arrow_down: | | 
[...c/main/scala/org/apache/hudi/HoodieFileIndex.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZUZpbGVJbmRleC5zY2FsYQ==) | `79.08% <79.08%> (ø)` | `24.00 <24.00> (?)` | | | [.../org/apache/hudi/MergeOnReadSnapshotRelation.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL01lcmdlT25SZWFkU25hcHNob3RSZWxhdGlvbi5zY2FsYQ==) | `90.00% <88.00%> (+0.86%)` | `18.00 <1.00> (+1.00)` | | |
[GitHub] [hudi] li36909 opened a new pull request #2754: [HUDI-1751] DeltaStreamer print many unnecessary warn log
li36909 opened a new pull request #2754: URL: https://github.com/apache/hudi/pull/2754

## *Tips*

- *Thank you very much for contributing to Apache Hudi.*
- *Please review https://hudi.apache.org/contributing.html before opening a pull request.*

## What is the purpose of the pull request

Reduce the unnecessary WARN log output printed by DeltaStreamer.

## Brief change log

*(for example:)*
- *Modify AnnotationLocation checkstyle rule in checkstyle.xml*

## Verify this pull request

Run the DeltaStreamer tests and check the log output.

## Committer checklist

- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
[GitHub] [hudi] codecov-io commented on pull request #2752: [HUDI-1749] Clean/Compaction/Rollback command maybe never exit when operation fail
codecov-io commented on pull request #2752: URL: https://github.com/apache/hudi/pull/2752#issuecomment-811610689

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2752?src=pr=h1) Report

> Merging [#2752](https://codecov.io/gh/apache/hudi/pull/2752?src=pr=desc) (2152760) into [master](https://codecov.io/gh/apache/hudi/commit/fe16d0de7c76105775c887b700751241bc82624c?el=desc) (fe16d0d) will **decrease** coverage by `0.00%`.
> The diff coverage is `0.00%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2752/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2752?src=pr=tree)

```diff
@@             Coverage Diff              @@
##             master    #2752      +/-   ##
- Coverage     52.04%   52.03%   -0.01%
+ Complexity     3625     3624       -1
  Files           479      479
  Lines         22804    22808       +4
  Branches       2415     2415
+ Hits          11868    11869       +1
- Misses         9911     9914       +3
  Partials       1025     1025
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `36.94% <0.00%> (-0.07%)` | `0.00 <0.00> (ø)` | |
| hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | |
| hudicommon | `50.94% <ø> (+0.01%)` | `0.00 <ø> (ø)` | |
| hudiflink | `56.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisparkdatasource | `70.87% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisync | `45.47% <ø> (ø)` | `0.00 <ø> (ø)` | |
| huditimelineservice | `64.36% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiutilities | `69.73% <ø> (-0.06%)` | `0.00 <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2752?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...n/java/org/apache/hudi/cli/commands/SparkMain.java](https://codecov.io/gh/apache/hudi/pull/2752/diff?src=pr=tree#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL2NvbW1hbmRzL1NwYXJrTWFpbi5qYXZh) | `6.72% <0.00%> (-0.12%)` | `4.00 <0.00> (ø)` | | | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2752/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.37% <0.00%> (-0.35%)` | `55.00% <0.00%> (-1.00%)` | | | [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2752/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `79.68% <0.00%> (+1.56%)` | `26.00% <0.00%> (ø%)` | | -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (HUDI-1751) DeltaStream print many unnecessary warn log
lrz created HUDI-1751:

Summary: DeltaStream print many unnecessary warn log
Key: HUDI-1751
URL: https://issues.apache.org/jira/browse/HUDI-1751
Project: Apache Hudi
Issue Type: Improvement
Reporter: lrz
Fix For: 0.9.0

We put both the Kafka parameters and the Hudi configs in the same properties file (for example, kafka-source.properties). As a result, when the kafkaParams object is created it also picks up some hoodie configs, which causes the following WARN log to be printed:

!https://wa.vision.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/2/15/qwx352829/76572ba9f4094fb29b018db91fbf1450/image.png!

-- This message was sent by Atlassian Jira (v8.3.4#803005)
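One way to picture the situation described in HUDI-1751 is a filter that keeps Hudi settings out of the map handed to the Kafka client. The sketch below is illustrative only and is not taken from the pull request: the class name `KafkaParamFilter`, the method `kafkaParams`, and the assumption that every Hudi key starts with `hoodie.` are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

final class KafkaParamFilter {
  // Assumption for this sketch: every Hudi setting starts with "hoodie.".
  static final String HOODIE_PREFIX = "hoodie.";

  /** Copies only the non-Hudi entries of a mixed properties file into a Kafka params map. */
  static Map<String, Object> kafkaParams(Properties props) {
    Map<String, Object> params = new HashMap<>();
    for (String key : props.stringPropertyNames()) {
      if (!key.startsWith(HOODIE_PREFIX)) {
        params.put(key, props.getProperty(key));
      }
    }
    return params;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("bootstrap.servers", "localhost:9092");
    props.setProperty("auto.offset.reset", "earliest");
    props.setProperty("hoodie.deltastreamer.source.kafka.topic", "example_topic");
    // Only the two Kafka keys remain; the hoodie.* key is filtered out.
    System.out.println(kafkaParams(props).keySet());
  }
}
```

With a filter of this kind the Kafka client never sees keys it does not recognize, so it has nothing to warn about.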
[GitHub] [hudi] codecov-io commented on pull request #2753: [HUDI-1750] Fail to load user's class if user move hudi-spark-bundle jar into spark classpath
codecov-io commented on pull request #2753: URL: https://github.com/apache/hudi/pull/2753#issuecomment-811608059

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2753?src=pr=h1) Report

> Merging [#2753](https://codecov.io/gh/apache/hudi/pull/2753?src=pr=desc) (43eb4f1) into [master](https://codecov.io/gh/apache/hudi/commit/fe16d0de7c76105775c887b700751241bc82624c?el=desc) (fe16d0d) will **decrease** coverage by `42.64%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2753/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2753?src=pr=tree)

```diff
@@              Coverage Diff              @@
##             master    #2753       +/-   ##
- Coverage     52.04%    9.40%   -42.65%
+ Complexity     3625       48     -3577
  Files           479       54      -425
  Lines         22804     1989    -20815
  Branches       2415      236     -2179
- Hits          11868      187    -11681
+ Misses         9911     1789     -8122
+ Partials       1025       13     -1012
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudiflink | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `9.40% <ø> (-60.39%)` | `0.00 <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2753?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | | | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | | | 
[...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | | | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | | | [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%>
[jira] [Updated] (HUDI-1750) Fail to load user's class if user move hudi-spark-bundle_2.11-0.7.0.jar into spark classpath
[ https://issues.apache.org/jira/browse/HUDI-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-1750:
Labels: pull-request-available (was: )

> Fail to load user's class if user move hudi-spark-bundle_2.11-0.7.0.jar into spark classpath
>
> Key: HUDI-1750
> URL: https://issues.apache.org/jira/browse/HUDI-1750
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: lrz
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.9.0
> Attachments: image-2021-04-01-10-55-43-760.png
>
> Hudi uses Class.forName(clazzName) to load the user's class, so the class is resolved by the caller's classloader; see here:
> !image-2021-04-01-10-55-43-760.png!
> If the user moves the hudi-spark-bundle jar into the Spark classpath and uses --jars to add custom jars, the caller's classloader is the AppClassLoader, while the custom jars are loaded by Spark's MutableURLClassLoader. This leads to a ClassNotFoundException.
[GitHub] [hudi] li36909 opened a new pull request #2753: [HUDI-1750] Fail to load user's class if user move hudi-spark-bundle jar into spark classpath
li36909 opened a new pull request #2753: URL: https://github.com/apache/hudi/pull/2753

## *Tips*

- *Thank you very much for contributing to Apache Hudi.*
- *Please review https://hudi.apache.org/contributing.html before opening a pull request.*

## What is the purpose of the pull request

Fix a classloader bug.

## Brief change log

*(for example:)*
- *Modify AnnotationLocation checkstyle rule in checkstyle.xml*

## Verify this pull request

Verify the fix by manually moving the hudi-spark-bundle jar into the Spark jars directory and running the tests.

## Committer checklist

- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
[GitHub] [hudi] garyli1019 commented on a change in pull request #2721: [HUDI-1720] when query incr view of mor table which has many delete records use sparksql/hive-beeline, StackOverflowError
garyli1019 commented on a change in pull request #2721: URL: https://github.com/apache/hudi/pull/2721#discussion_r605342627

## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/RealtimeCompactedRecordReader.java ##

@@ -95,15 +103,24 @@ public boolean next(NullWritable aVoid, ArrayWritable arrayWritable) throws IOEx
       // TODO(NA): Invoke preCombine here by converting arrayWritable to Avro. This is required since the
       // deltaRecord may not be a full record and needs values of columns from the parquet
       Option rec;
-      if (usesCustomPayload) {
-        rec = deltaRecordMap.get(key).getData().getInsertValue(getWriterSchema());
-      } else {
-        rec = deltaRecordMap.get(key).getData().getInsertValue(getReaderSchema());
+      rec = buildGenericRecordwithCustomPayload(deltaRecordMap.get(key));
+      // If the record is not present, this is a delete record using an empty payload so skip this base record
+      // and move to the next record
+      while (!rec.isPresent()) {
+        // if current parquet reader has no record, return false
+        if (!this.parquetReader.next(aVoid, arrayWritable)) {

Review comment: OK, I confused this with the Spark record reader iterator. There is no problem here.
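The loop in the diff above skips base records whose delta payload is empty, that is, records that were deleted. A minimal standalone sketch of that idea follows; it does not use Hudi classes, and `SkipDeletedRecords`, `nextLive`, and the use of `Optional<String>` as a stand-in for a record payload are assumptions made purely for illustration. Because it is a loop rather than a chain of nested calls, the stack depth stays constant however many consecutive deletes are encountered.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Optional;

final class SkipDeletedRecords {
  /**
   * Returns the next record that is still live, or empty when the base reader is exhausted.
   * A key that maps to an empty Optional in `deltas` represents a delete and is skipped.
   */
  static Optional<String> nextLive(Iterator<String> baseKeys, Map<String, Optional<String>> deltas) {
    while (baseKeys.hasNext()) {
      String key = baseKeys.next();
      Optional<String> delta = deltas.get(key);
      if (delta == null) {
        return Optional.of(key + "=base");          // no delta: keep the base record
      }
      if (delta.isPresent()) {
        return Optional.of(key + "=" + delta.get()); // update: use the delta value
      }
      // Empty payload means delete: fall through and try the next base record.
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    Map<String, Optional<String>> deltas = new HashMap<>();
    deltas.put("k1", Optional.empty());        // deleted
    deltas.put("k2", Optional.of("updated"));  // updated
    Iterator<String> baseKeys = Arrays.asList("k1", "k2", "k3").iterator();
    System.out.println(nextLive(baseKeys, deltas)); // Optional[k2=updated]
    System.out.println(nextLive(baseKeys, deltas)); // Optional[k3=base]
    System.out.println(nextLive(baseKeys, deltas)); // Optional.empty
  }
}
```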
[jira] [Created] (HUDI-1750) Fail to load user's class if user move hudi-spark-bundle_2.11-0.7.0.jar into spark classpath
lrz created HUDI-1750:

Summary: Fail to load user's class if user move hudi-spark-bundle_2.11-0.7.0.jar into spark classpath
Key: HUDI-1750
URL: https://issues.apache.org/jira/browse/HUDI-1750
Project: Apache Hudi
Issue Type: Bug
Reporter: lrz
Fix For: 0.9.0
Attachments: image-2021-04-01-10-55-43-760.png

Hudi uses Class.forName(clazzName) to load the user's class, so the class is resolved by the caller's classloader; see here:

!image-2021-04-01-10-55-43-760.png!

If the user moves the hudi-spark-bundle jar into the Spark classpath and uses --jars to add custom jars, the caller's classloader is the AppClassLoader, while the custom jars are loaded by Spark's MutableURLClassLoader. This leads to a ClassNotFoundException.
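A common way to sidestep this kind of mismatch is to resolve user classes through the thread's context classloader instead of the caller's own loader. The sketch below shows that general technique; it is not necessarily what pull request #2753 does, and `UserClassLoading` and `loadClass` are hypothetical names.

```java
final class UserClassLoading {
  /**
   * Loads a class by name through the thread context classloader, falling back to
   * this class's own loader when no context classloader is set.
   */
  static Class<?> loadClass(String className) throws ClassNotFoundException {
    ClassLoader loader = Thread.currentThread().getContextClassLoader();
    if (loader == null) {
      loader = UserClassLoading.class.getClassLoader();
    }
    // The three-argument form lets the caller choose the loader explicitly;
    // the one-argument Class.forName(name) always uses the caller's defining loader.
    return Class.forName(className, true, loader);
  }

  public static void main(String[] args) throws ClassNotFoundException {
    System.out.println(loadClass("java.util.ArrayList").getName()); // java.util.ArrayList
  }
}
```

Whether this helps in a given deployment depends on the framework setting the context classloader to the one that actually holds the user's jars.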
[GitHub] [hudi] li36909 commented on pull request #2752: [HUDI-1749] Clean/Compaction/Rollback command maybe never exit when operation fail
li36909 commented on pull request #2752: URL: https://github.com/apache/hudi/pull/2752#issuecomment-811596187

cc @nsivabalan Could you help take a look? Thank you.
[jira] [Updated] (HUDI-1749) Clean/Compaction/Rollback command maybe never exit when operation fail
[ https://issues.apache.org/jira/browse/HUDI-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-1749:
Labels: pull-request-available (was: )

> Clean/Compaction/Rollback command maybe never exit when operation fail
>
> Key: HUDI-1749
> URL: https://issues.apache.org/jira/browse/HUDI-1749
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: lrz
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.9.0
>
> There are two issues:
> 1) After a Clean/Compaction/Rollback command finishes, the YARN application always shows as failed, because the command exits directly without waiting for the SparkContext to stop.
> 2) When a Clean/Compaction/Rollback command fails because of an exception, the command never exits, because the SparkContext was not stopped. The Spark UI uses Jetty, which introduces a non-daemon thread; sparkContext.stop calls stopUI, which stops that non-daemon thread.
[GitHub] [hudi] li36909 opened a new pull request #2752: [HUDI-1749] Clean/Compaction/Rollback command maybe never exit when operation fail
li36909 opened a new pull request #2752: URL: https://github.com/apache/hudi/pull/2752

## *Tips*

- *Thank you very much for contributing to Apache Hudi.*
- *Please review https://hudi.apache.org/contributing.html before opening a pull request.*

## What is the purpose of the pull request

Fix the hang in the clean/compaction/rollback commands when the operation fails.

## Brief change log

*(for example:)*
- *Modify AnnotationLocation checkstyle rule in checkstyle.xml*

## Verify this pull request

Verify the fix manually.

## Committer checklist

- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
[GitHub] [hudi] pengzhiwei2018 commented on a change in pull request #2651: [HUDI-1591] [RFC-26] Improve Hoodie Table Query Performance And Ease Of Use Fo…
pengzhiwei2018 commented on a change in pull request #2651: URL: https://github.com/apache/hudi/pull/2651#discussion_r605337987

## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieWriterUtils.scala ##

@@ -81,4 +82,53 @@ object HoodieWriterUtils {
     params.foreach(kv => props.setProperty(kv._1, kv._2))
     props
   }
+
+  /**
+   * Get the partition columns to stored to hoodie.properties.
+   * @param parameters
+   * @return
+   */
+  def getPartitionColumns(parameters: Map[String, String]): Option[String] = {
+    val keyGenClass = parameters.getOrElse(KEYGENERATOR_CLASS_OPT_KEY,
+      DEFAULT_KEYGENERATOR_CLASS_OPT_VAL)
+    try {
+      val constructor = getClass.getClassLoader.loadClass(keyGenClass)
+        .getConstructor(classOf[TypedProperties])
+      constructor.setAccessible(true)
+      val props = new TypedProperties()
+      props.putAll(parameters.asJava)
+      val keyGen = constructor.newInstance(props)

Review comment: Reuse the KeyGenerator except the bootstrap method.
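The diff above builds a key generator reflectively from its class name and a properties object, then asks it for the partition columns. The same pattern can be sketched in plain Java as follows. Everything here is hypothetical and stands in for the Hudi types: `PartitionAware`, `SimpleKeyGen`, and the `example.partition.fields` key are invented for the illustration, and `java.util.Properties` replaces `TypedProperties`.

```java
import java.util.Optional;
import java.util.Properties;

final class ReflectiveKeyGen {
  /** Hypothetical stand-in for a key generator that knows its partition fields. */
  interface PartitionAware {
    String partitionFields();
  }

  /** Hypothetical key generator with the (Properties) constructor the loader expects. */
  public static final class SimpleKeyGen implements PartitionAware {
    private final String fields;

    public SimpleKeyGen(Properties props) {
      this.fields = props.getProperty("example.partition.fields", "");
    }

    @Override
    public String partitionFields() {
      return fields;
    }
  }

  /** Instantiates the named class via its (Properties) constructor and reads its partition fields. */
  static Optional<String> partitionColumns(String className, Properties props) {
    try {
      Object keyGen = Class.forName(className)
          .getConstructor(Properties.class)
          .newInstance(props);
      if (keyGen instanceof PartitionAware) {
        return Optional.of(((PartitionAware) keyGen).partitionFields());
      }
      return Optional.empty();
    } catch (ReflectiveOperationException e) {
      // Class missing, no matching constructor, or the constructor threw.
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("example.partition.fields", "year,month");
    System.out.println(partitionColumns(SimpleKeyGen.class.getName(), props)); // Optional[year,month]
  }
}
```

Returning an empty `Optional` on failure mirrors the `Option[String]` return type in the diff; a real implementation might prefer to surface the reflection error instead.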
[jira] [Created] (HUDI-1749) Clean/Compaction/Rollback command maybe never exit when operation fail
lrz created HUDI-1749:

Summary: Clean/Compaction/Rollback command maybe never exit when operation fail
Key: HUDI-1749
URL: https://issues.apache.org/jira/browse/HUDI-1749
Project: Apache Hudi
Issue Type: Bug
Reporter: lrz

There are two issues:

1) After a Clean/Compaction/Rollback command finishes, the YARN application always shows as failed, because the command exits directly without waiting for the SparkContext to stop.
2) When a Clean/Compaction/Rollback command fails because of an exception, the command never exits, because the SparkContext was not stopped. The Spark UI uses Jetty, which introduces a non-daemon thread; sparkContext.stop calls stopUI, which stops that non-daemon thread.
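Both issues come down to stopping the context on every exit path before the process ends. The sketch below illustrates that with a `finally` block; it has no Spark dependency, `FakeContext` is a hypothetical stand-in for a SparkContext that owns a non-daemon thread, and it is not the code from pull request #2752.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class StopOnExit {
  /** Hypothetical stand-in for a SparkContext: it owns a non-daemon worker thread. */
  static final class FakeContext {
    // The default thread factory creates non-daemon threads, which keep the JVM alive.
    private final ExecutorService ui = Executors.newSingleThreadExecutor();

    FakeContext() {
      ui.submit(() -> { });  // forces the worker thread to start
    }

    void stop() {
      ui.shutdown();         // lets the worker thread terminate
    }

    boolean isStopped() {
      return ui.isShutdown();
    }
  }

  /** Runs the command and always stops the context, whether the command succeeds or fails. */
  static int runCommand(FakeContext ctx, Runnable command) {
    try {
      command.run();
      return 0;
    } catch (RuntimeException e) {
      System.err.println("command failed: " + e.getMessage());
      return 1;
    } finally {
      ctx.stop();
    }
  }

  public static void main(String[] args) {
    FakeContext ctx = new FakeContext();
    int exitCode = runCommand(ctx, () -> {
      throw new IllegalStateException("simulated failure");
    });
    // Because stop() ran in the finally block, the JVM can exit after main returns.
    System.out.println("exit code: " + exitCode); // exit code: 1
  }
}
```

Without the `finally` block, the failing path would leave the worker thread running and `main` would return without the process ever terminating.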
[jira] [Updated] (HUDI-1749) Clean/Compaction/Rollback command maybe never exit when operation fail
[ https://issues.apache.org/jira/browse/HUDI-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lrz updated HUDI-1749:
Fix Version/s: 0.9.0

> Clean/Compaction/Rollback command maybe never exit when operation fail
>
> Key: HUDI-1749
> URL: https://issues.apache.org/jira/browse/HUDI-1749
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: lrz
> Priority: Major
> Fix For: 0.9.0
>
> There are two issues:
> 1) After a Clean/Compaction/Rollback command finishes, the YARN application always shows as failed, because the command exits directly without waiting for the SparkContext to stop.
> 2) When a Clean/Compaction/Rollback command fails because of an exception, the command never exits, because the SparkContext was not stopped. The Spark UI uses Jetty, which introduces a non-daemon thread; sparkContext.stop calls stopUI, which stops that non-daemon thread.
[GitHub] [hudi] codecov-io commented on pull request #2751: [HUDI-1748] Read operation will possiblity fail on mor table rt view when a write operations is concurrency running
codecov-io commented on pull request #2751: URL: https://github.com/apache/hudi/pull/2751#issuecomment-811590982

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2751?src=pr=h1) Report

> Merging [#2751](https://codecov.io/gh/apache/hudi/pull/2751?src=pr=desc) (2de7140) into [master](https://codecov.io/gh/apache/hudi/commit/fe16d0de7c76105775c887b700751241bc82624c?el=desc) (fe16d0d) will **increase** coverage by `17.69%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2751/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2751?src=pr=tree)

```diff
@@              Coverage Diff              @@
##             master    #2751       +/-   ##
=============================================
+ Coverage     52.04%   69.73%   +17.69%
+ Complexity     3625      371     -3254
=============================================
  Files           479       54      -425
  Lines         22804     1989    -20815
  Branches       2415      236     -2179
=============================================
- Hits          11868     1387    -10481
+ Misses         9911      471     -9440
+ Partials       1025      131      -894
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudiflink | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `69.73% <ø> (-0.06%)` | `0.00 <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2751?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.37% <0.00%> (-0.35%)` | `55.00% <0.00%> (-1.00%)` | | | [...sioning/clean/CleanMetadataV2MigrationHandler.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL3ZlcnNpb25pbmcvY2xlYW4vQ2xlYW5NZXRhZGF0YVYyTWlncmF0aW9uSGFuZGxlci5qYXZh) | | | | | [...org/apache/hudi/common/table/log/AppendResult.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9BcHBlbmRSZXN1bHQuamF2YQ==) | | | | | [...oning/compaction/CompactionV2MigrationHandler.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL3ZlcnNpb25pbmcvY29tcGFjdGlvbi9Db21wYWN0aW9uVjJNaWdyYXRpb25IYW5kbGVyLmphdmE=) | | | | | [...on/table/timeline/versioning/MetadataMigrator.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL3ZlcnNpb25pbmcvTWV0YWRhdGFNaWdyYXRvci5qYXZh) | | | | | [...e/hudi/common/table/timeline/dto/FileGroupDTO.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL2R0by9GaWxlR3JvdXBEVE8uamF2YQ==) | | | | | 
[...di/sink/partitioner/delta/DeltaBucketAssigner.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL2RlbHRhL0RlbHRhQnVja2V0QXNzaWduZXIuamF2YQ==) | | | | | [...che/hudi/metadata/TimelineMergedTableMetadata.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvVGltZWxpbmVNZXJnZWRUYWJsZU1ldGFkYXRhLmphdmE=) | | | | | [...pache/hudi/io/storage/HoodieFileReaderFactory.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW8vc3RvcmFnZS9Ib29kaWVGaWxlUmVhZGVyRmFjdG9yeS5qYXZh) | | | | | [...oning/compaction/CompactionV1MigrationHandler.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL3ZlcnNpb25pbmcvY29tcGFjdGlvbi9Db21wYWN0aW9uVjFNaWdyYXRpb25IYW5kbGVyLmphdmE=) | | | | | ... and [415 more](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree-more) | | -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above
[GitHub] [hudi] garyli1019 commented on pull request #2722: [HUDI-1722]hive beeline/spark-sql query specified field on mor table occur NPE
garyli1019 commented on pull request #2722: URL: https://github.com/apache/hudi/pull/2722#issuecomment-811589593

@xiarixiaoyao Thanks for your contribution. It looks like you were able to reproduce this problem in a unit test. Is it possible to add that unit test to this PR as well?
[GitHub] [hudi] codecov-io edited a comment on pull request #2325: [HUDI-699]Fix CompactionCommand and add unit test for CompactionCommand
codecov-io edited a comment on pull request #2325: URL: https://github.com/apache/hudi/pull/2325#issuecomment-742860619

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2325?src=pr=h1) Report

> Merging [#2325](https://codecov.io/gh/apache/hudi/pull/2325?src=pr=desc) (24790dc) into [master](https://codecov.io/gh/apache/hudi/commit/aa0da72c59cb3764205f90b025b24d1640727795?el=desc) (aa0da72) will **decrease** coverage by `42.65%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2325/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2325?src=pr=tree)

```diff
@@              Coverage Diff              @@
##             master    #2325       +/-   ##
- Coverage     52.06%    9.40%   -42.66%
+ Complexity     3625       48     -3577
  Files           479       54      -425
  Lines         22804     1989    -20815
  Branches       2415      236     -2179
- Hits          11872      187    -11685
+ Misses         9907     1789     -8118
+ Partials       1025       13     -1012
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudiflink | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `9.40% <ø> (-60.34%)` | `0.00 <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2325?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | | | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | | | 
[...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | | | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | | | [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%>
[GitHub] [hudi] codecov-io edited a comment on pull request #2651: [HUDI-1591] [RFC-26] Improve Hoodie Table Query Performance And Ease Of Use Fo…
codecov-io edited a comment on pull request #2651: URL: https://github.com/apache/hudi/pull/2651#issuecomment-794945140

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=h1) Report

> Merging [#2651](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=desc) (fd68e0f) into [master](https://codecov.io/gh/apache/hudi/commit/ce3e8ec87083ef4cd4f33de39b6697f66ff3f277?el=desc) (ce3e8ec) will **decrease** coverage by `42.38%`.
> The diff coverage is `0.00%`.

> :exclamation: Current head fd68e0f differs from pull request most recent head 4af89c9. Consider uploading reports for the commit 4af89c9 to get more accurate results

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2651/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree)

```diff
@@              Coverage Diff              @@
##             master    #2651       +/-   ##
- Coverage     51.76%    9.38%   -42.39%
+ Complexity     3602       48     -3554
  Files           476       54      -422
  Lines         22579     1993    -20586
  Branches       2408      236     -2172
- Hits          11688      187    -11501
+ Misses         9874     1793     -8081
+ Partials       1017       13     -1004
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudiflink | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `9.38% <0.00%> (-60.41%)` | `0.00 <0.00> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.73%)` | `0.00 <0.00> (-56.00)` | | | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | | | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | | | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | | |
[GitHub] [hudi] pengzhiwei2018 commented on a change in pull request #2651: [HUDI-1591] [RFC-26] Improve Hoodie Table Query Performance And Ease Of Use Fo…
pengzhiwei2018 commented on a change in pull request #2651: URL: https://github.com/apache/hudi/pull/2651#discussion_r605330589 ## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieWriterUtils.scala ## @@ -81,4 +82,53 @@ object HoodieWriterUtils { params.foreach(kv => props.setProperty(kv._1, kv._2)) props } + + /** + * Get the partition columns to stored to hoodie.properties. + * @param parameters + * @return + */ + def getPartitionColumns(parameters: Map[String, String]): Option[String] = { +val keyGenClass = parameters.getOrElse(KEYGENERATOR_CLASS_OPT_KEY, + DEFAULT_KEYGENERATOR_CLASS_OPT_VAL) +try { + val constructor = getClass.getClassLoader.loadClass(keyGenClass) +.getConstructor(classOf[TypedProperties]) + constructor.setAccessible(true) + val props = new TypedProperties() + props.putAll(parameters.asJava) + val keyGen = constructor.newInstance(props) Review comment: Hi @umehrot2, for bootstrap in `HoodieSparkSqlWriter` no KeyGenerator is created, so we need to create it here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] rshanmugam1 commented on issue #2609: [SUPPORT] Presto hudi query slow when compared to parquet
rshanmugam1 commented on issue #2609: URL: https://github.com/apache/hudi/issues/2609#issuecomment-811582823 Thanks very much, Sudha and team. I will look in that direction and confirm whether that is the cause. If it is, any pointers on how to fix it, or any reference to how it was fixed in the Facebook version of Presto, would be helpful, so that we can try the same in Trino. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] li36909 commented on pull request #2751: [HUDI-1748] Read operation will possiblity fail on mor table rt view when a write operations is concurrency running
li36909 commented on pull request #2751: URL: https://github.com/apache/hudi/pull/2751#issuecomment-811582115 cc @nsivabalan could you help to take a look, thank you -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (HUDI-1748) Read operation will possibility fail on mor table rt view when a write operations is concurrency running
[ https://issues.apache.org/jira/browse/HUDI-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1748: - Labels: pull-request-available (was: ) > Read operation will possibility fail on mor table rt view when a write > operations is concurrency running > > > Key: HUDI-1748 > URL: https://issues.apache.org/jira/browse/HUDI-1748 > Project: Apache Hudi > Issue Type: Bug >Reporter: lrz >Priority: Major > Labels: pull-request-available > Fix For: 0.9.0 > > > During a read operation, a new base file may be produced by a concurrent write > operation. The read can then hit an NPE in getSplit. Here > is the exception stack: > !https://wa.vision.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/2/15/qwx352829/7bacca8042104499b0991d50b4bc3f2a/image.png! > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] li36909 opened a new pull request #2751: [HUDI-1748] Read operation will possiblity fail on mor table rt view when a write operations is concurrency running
li36909 opened a new pull request #2751: URL: https://github.com/apache/hudi/pull/2751 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the pull request Fix a read/write concurrency bug on the MOR table RT view. ## Brief change log *(for example:)* - *Modify AnnotationLocation checkstyle rule in checkstyle.xml* ## Verify this pull request Probabilistic concurrency problems are difficult to test in a unit test. I added a sleep to the getRealtimeSplits method to reproduce the problem reliably and verify the fix. ## Committer checklist - [ ] Has a corresponding JIRA in PR title & commit - [ ] Commit message is descriptive of the change - [ ] CI is green - [ ] Necessary doc changes done or have another open PR - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (HUDI-1748) Read operation will possibility fail on mor table rt view when a write operations is concurrency running
lrz created HUDI-1748: - Summary: Read operation will possibility fail on mor table rt view when a write operations is concurrency running Key: HUDI-1748 URL: https://issues.apache.org/jira/browse/HUDI-1748 Project: Apache Hudi Issue Type: Bug Reporter: lrz Fix For: 0.9.0 During a read operation, a new base file may be produced by a concurrent write operation. The read can then hit an NPE in getSplit. Here is the exception stack: !https://wa.vision.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/2/15/qwx352829/7bacca8042104499b0991d50b4bc3f2a/image.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] zherenyu831 edited a comment on issue #2707: [SUPPORT] insert_ovewrite_table failed on archiving
zherenyu831 edited a comment on issue #2707: URL: https://github.com/apache/hudi/issues/2707#issuecomment-811565897 @jsbali To make everything clear: On archiving: the first replacecommit has 0 partitionToReplaceFileIds, so archiving failed at https://github.com/apache/hudi/blob/master/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/ReplaceArchivalHelper.java#L73 with the error `Positive number of partitions required` The solution is to manually delete the first replacecommit, or to skip deletion for a replacecommit with 0 partitionToReplaceFileIds in the code, as you mentioned in your ticket. After I deleted the first replacecommit, archiving finished successfully. But I hit another issue, https://github.com/apache/hudi/issues/2707#issuecomment-804831651 It is not related to what I deleted; as you can see, the instant time is different. Once the commits had been deleted by the archiving process, Hudi tried to load the timeline again without reloading the commits. I didn't debug further, since I want to ask the developers how they want to deal with replaceFile deletion. It seems someone wants to use the cleaner to handle it: https://github.com/apache/hudi/issues/2707#issuecomment-804849028 Unfortunately, I also found that the cleaner does not work well with insert_overwrite_table; it only keeps one file group. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] zherenyu831 edited a comment on issue #2707: [SUPPORT] insert_ovewrite_table failed on archiving
zherenyu831 edited a comment on issue #2707: URL: https://github.com/apache/hudi/issues/2707#issuecomment-811565897 @jsbali To make everything clear: On archiving: the first replacecommit has 0 partitionToReplaceFileIds, so archiving failed at https://github.com/apache/hudi/blob/master/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/ReplaceArchivalHelper.java#L73 with the error `Positive number of partitions required` The solution is to manually delete the first replacecommit, or to skip deletion for a replacecommit with 0 partitionToReplaceFileIds in the code, as you mentioned in your ticket. After I deleted the first replacecommit, archiving finished successfully. But I hit another issue, https://github.com/apache/hudi/issues/2707#issuecomment-804831651 It is not related to what I deleted; as you can see, the instant time is different. Once the commits had been deleted by the archiving process, Hudi tried to load the timeline again without refreshing it. I didn't debug further, since I want to ask the developers how they want to deal with replaceFile deletion. It seems someone wants to use the cleaner to handle it: https://github.com/apache/hudi/issues/2707#issuecomment-804849028 Unfortunately, I also found that the cleaner does not work well with insert_overwrite_table; it only keeps one file group. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] zherenyu831 commented on issue #2707: [SUPPORT] insert_ovewrite_table failed on archiving
zherenyu831 commented on issue #2707: URL: https://github.com/apache/hudi/issues/2707#issuecomment-811565897 @jsbali To make everything clear: On archiving: the first replacecommit has 0 partitionToReplaceFileIds, so archiving failed at https://github.com/apache/hudi/blob/master/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/ReplaceArchivalHelper.java#L73 with the error `Positive number of partitions required` The solution is to manually delete the first replacecommit, or to skip deletion for a replacecommit with 0 partitionToReplaceFileIds in the code, as you mentioned in your ticket. After I deleted the first replacecommit, archiving finished successfully. But I hit another issue, https://github.com/apache/hudi/issues/2707#issuecomment-804831651 It is not related to what I deleted; the commit time is different. Once the commits had been deleted by the archiving process, Hudi tried to load the timeline again without refreshing it. I didn't debug further, since I want to ask the developers how they want to deal with replaceFile deletion. It seems someone wants to use the cleaner to handle it: https://github.com/apache/hudi/issues/2707#issuecomment-804849028 Unfortunately, I also found that the cleaner does not work well with insert_overwrite_table; it only keeps one file group. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
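Editor's sketch for the comment above: `Positive number of partitions required` is the message Spark raises when a collection is parallelized into fewer than one slice, which would fit a replacecommit with 0 partitionToReplaceFileIds if the map size is used as the parallelism (an assumption; the archival code is not quoted in this thread). A minimal illustration of the "ignore deletion on 0 partitionToReplaceFileIds" idea follows. The class and both helper names are hypothetical and are not Hudi APIs.

```java
import java.util.List;
import java.util.Map;

// Hypothetical guard, not part of Hudi: illustrates skipping deletion for an
// empty replacecommit and never requesting fewer than one partition.
public class ReplaceArchivalGuardSketch {

    // True when the replacecommit replaced no file groups, so there is
    // nothing to delete for it during archiving.
    public static boolean hasNothingToDelete(Map<String, List<String>> partitionToReplaceFileIds) {
        return partitionToReplaceFileIds == null || partitionToReplaceFileIds.isEmpty();
    }

    // Clamp the requested parallelism so the result is always at least 1 and,
    // when maxParallelism is positive, never above maxParallelism.
    public static int safeParallelism(int partitionCount, int maxParallelism) {
        return Math.max(1, Math.min(partitionCount, maxParallelism));
    }
}
```

A caller would check `hasNothingToDelete(...)` first and return early, and otherwise pass `safeParallelism(map.size(), configuredMax)` to whatever distributes the deletion work.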
[jira] [Created] (HUDI-1747) Deltastreamer incremental read is not working on the MOR table
Vinoth Govindarajan created HUDI-1747: - Summary: Deltastreamer incremental read is not working on the MOR table Key: HUDI-1747 URL: https://issues.apache.org/jira/browse/HUDI-1747 Project: Apache Hudi Issue Type: Bug Components: Common Core Reporter: Vinoth Govindarajan I was trying to read the MOR HUDI table incrementally using delta streamer, while doing that I ran into this issue where it says: {code:java} Found recursive reference in Avro schema, which can not be processed by Spark:{code} Spark Version: 2.4 Hudi Version: 0.7.0-SNAPSHOT or the latest master Full Stack Trace: {code:java} Found recursive reference in Avro schema, which can not be processed by Spark: { "type" : "record", "name" : "meta", "fields" : [ { "name" : "verified", "type" : [ "null", "boolean" ], "default" : null }, { "name" : "zip", "type" : [ "null", "string" ], "default" : null }, { "name" : "lname", "type" : [ "null", "string" ], "default" : null }] } at org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:75) at org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:105) at org.apache.spark.sql.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:82) at org.apache.spark.sql.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:81) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.Iterator$class.foreach(Iterator.scala:891) at scala.collection.AbstractIterator.foreach(Iterator.scala:1334) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:81) at 
org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:105) at org.apache.spark.sql.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:82) at org.apache.spark.sql.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:81) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.Iterator$class.foreach(Iterator.scala:891) at scala.collection.AbstractIterator.foreach(Iterator.scala:1334) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:81) at org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:95) at org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:105) at org.apache.spark.sql.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:82) at org.apache.spark.sql.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:81) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.Iterator$class.foreach(Iterator.scala:891) at scala.collection.AbstractIterator.foreach(Iterator.scala:1334) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:81) at 
org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:105) at org.apache.spark.sql.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:82) at org.apache.spark.sql.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:81) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at
[GitHub] [hudi] nsivabalan closed pull request #2750: [HUDI-1754] [HOT_FIX] Revert "[HUDI-1526] Translate the api partitionBy in spark datasource (#2431) due to usage of unavailable apis with older sp
nsivabalan closed pull request #2750: URL: https://github.com/apache/hudi/pull/2750 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] nsivabalan closed issue #2748: Are we stopping support for spark <2.4.4 with hudi 0.8.0?
nsivabalan closed issue #2748: URL: https://github.com/apache/hudi/issues/2748 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] nsivabalan commented on issue #2748: Are we stopping support for spark <2.4.4 with hudi 0.8.0?
nsivabalan commented on issue #2748: URL: https://github.com/apache/hudi/issues/2748#issuecomment-811513554 @aditiwari01: I don't think we are looking to support Spark versions < 2.4.3. That has always been the case, and it's a documentation issue. We will fix it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] codecov-io edited a comment on pull request #2747: [HUDI-1743] Added support for SqlFileBasedTransformer
codecov-io edited a comment on pull request #2747: URL: https://github.com/apache/hudi/pull/2747#issuecomment-810800755 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] rubenssoto commented on issue #2294: [SUPPORT] java.lang.IllegalArgumentException: Can not create a Path from an empty string on non partitioned COW table
rubenssoto commented on issue #2294: URL: https://github.com/apache/hudi/issues/2294#issuecomment-811477227 @bvaradar Is it expected that this bug exists on 0.7.0? What problem does this bug cause? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] umehrot2 commented on a change in pull request #2651: [HUDI-1591] [RFC-26] Improve Hoodie Table Query Performance And Ease Of Use Fo…
umehrot2 commented on a change in pull request #2651: URL: https://github.com/apache/hudi/pull/2651#discussion_r605196012 ## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieWriterUtils.scala ## @@ -81,4 +82,53 @@ object HoodieWriterUtils { params.foreach(kv => props.setProperty(kv._1, kv._2)) props } + + /** + * Get the partition columns to stored to hoodie.properties. + * @param parameters + * @return + */ + def getPartitionColumns(parameters: Map[String, String]): Option[String] = { +val keyGenClass = parameters.getOrElse(KEYGENERATOR_CLASS_OPT_KEY, + DEFAULT_KEYGENERATOR_CLASS_OPT_VAL) +try { + val constructor = getClass.getClassLoader.loadClass(keyGenClass) +.getConstructor(classOf[TypedProperties]) + constructor.setAccessible(true) + val props = new TypedProperties() + props.putAll(parameters.asJava) + val keyGen = constructor.newInstance(props) Review comment: Can't we pass the KeyGenerator already created in `HoodieSparkSqlWriter` and `DeltaSync` instead of recreating it again here ? Both these places already create `KeyGenerator` using reflection. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
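Editor's sketch for the review comment above: the suggestion is to reuse a key generator the caller has already built and fall back to reflection only when none exists (the bootstrap case raised in the reply). The version below is a self-contained illustration under stated assumptions: `DemoKeyGenerator`, `resolve`, and the `demo.partition.fields` property are hypothetical stand-ins, and `java.util.Properties` is used in place of Hudi's `TypedProperties`.

```java
import java.lang.reflect.Constructor;
import java.util.Properties;

// Hypothetical illustration, not Hudi code: prefer an existing instance,
// construct by reflection only as a fallback.
public class KeyGenReuseSketch {

    // Stand-in for a key generator class with a (Properties) constructor.
    public static class DemoKeyGenerator {
        private final Properties props;

        public DemoKeyGenerator(Properties props) {
            this.props = props;
        }

        public String partitionFields() {
            return props.getProperty("demo.partition.fields", "");
        }
    }

    // Returns the existing instance when the caller has one; otherwise loads
    // className and invokes its public (Properties) constructor.
    public static DemoKeyGenerator resolve(DemoKeyGenerator existing, String className, Properties props) {
        if (existing != null) {
            return existing;
        }
        try {
            Constructor<?> ctor = Class.forName(className).getConstructor(Properties.class);
            return (DemoKeyGenerator) ctor.newInstance(props);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate " + className, e);
        }
    }
}
```

With this shape, call sites that already hold an instance pass it through, and a call site that has none passes `null` plus the configured class name.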
[GitHub] [hudi] codecov-io edited a comment on pull request #2747: [HUDI-1743] Added support for SqlFileBasedTransformer
codecov-io edited a comment on pull request #2747: URL: https://github.com/apache/hudi/pull/2747#issuecomment-810800755 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2747?src=pr=h1) Report > Merging [#2747](https://codecov.io/gh/apache/hudi/pull/2747?src=pr=desc) (61d0222) into [master](https://codecov.io/gh/apache/hudi/commit/aa0da72c59cb3764205f90b025b24d1640727795?el=desc) (aa0da72) will **decrease** coverage by `1.70%`. > The diff coverage is `n/a`. > :exclamation: Current head 61d0222 differs from pull request most recent head ee4c06f. Consider uploading reports for the commit ee4c06f to get more accurate results [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2747/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2747?src=pr=tree) ```diff @@ Coverage Diff @@ ## master #2747 +/- ## - Coverage 52.06% 50.35% -1.71% + Complexity 3625 3253 -372 Files 479 425 -54 Lines 22804 20815 -1989 Branches 2415 2179 -236 - Hits 11872 10482 -1390 + Misses 9907 9439 -468 + Partials 1025 894 -131 ``` | Flag | Coverage Δ | Complexity Δ | | |---|---|---|---| | hudicli | `37.01% <ø> (ø)` | `0.00 <ø> (ø)` | | | hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | | | hudicommon | `50.94% <ø> (-0.03%)` | `0.00 <ø> (ø)` | | | hudiflink | `56.01% <ø> (ø)` | `0.00 <ø> (ø)` | | | hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | | | hudisparkdatasource | `70.87% <ø> (ø)` | `0.00 <ø> (ø)` | | | hudisync | `45.47% <ø> (ø)` | `0.00 <ø> (ø)` | | | huditimelineservice | `64.36% <ø> (ø)` | `0.00 <ø> (ø)` | | | hudiutilities | `?` | `?` | | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more. 
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2747?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...ache/hudi/common/fs/inline/InMemoryFileSystem.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL2lubGluZS9Jbk1lbW9yeUZpbGVTeXN0ZW0uamF2YQ==) | `79.31% <0.00%> (-10.35%)` | `15.00% <0.00%> (-1.00%)` | | | [...ities/schema/NullTargetSchemaRegistryProvider.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9OdWxsVGFyZ2V0U2NoZW1hUmVnaXN0cnlQcm92aWRlci5qYXZh) | | | | | [...che/hudi/utilities/sources/HiveIncrPullSource.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSGl2ZUluY3JQdWxsU291cmNlLmphdmE=) | | | | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | | | | | [...ities/checkpointing/InitialCheckPointProvider.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2NoZWNrcG9pbnRpbmcvSW5pdGlhbENoZWNrUG9pbnRQcm92aWRlci5qYXZh) | | | | | [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | | | | | [.../org/apache/hudi/utilities/sources/InputBatch.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSW5wdXRCYXRjaC5qYXZh) | | | | | 
[...i/utilities/deser/KafkaAvroSchemaDeserializer.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2Rlc2VyL0thZmthQXZyb1NjaGVtYURlc2VyaWFsaXplci5qYXZh) | | | | | [...lities/checkpointing/KafkaConnectHdfsProvider.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2NoZWNrcG9pbnRpbmcvS2Fma2FDb25uZWN0SGRmc1Byb3ZpZGVyLmphdmE=) | | | | | [...in/java/org/apache/hudi/utilities/UtilHelpers.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL1V0aWxIZWxwZXJzLmphdmE=)
[GitHub] [hudi] codecov-io edited a comment on pull request #2747: [HUDI-1743] Added support for SqlFileBasedTransformer
codecov-io edited a comment on pull request #2747: URL: https://github.com/apache/hudi/pull/2747#issuecomment-810800755 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2747?src=pr=h1) Report > Merging [#2747](https://codecov.io/gh/apache/hudi/pull/2747?src=pr=desc) (61d0222) into [master](https://codecov.io/gh/apache/hudi/commit/aa0da72c59cb3764205f90b025b24d1640727795?el=desc) (aa0da72) will **decrease** coverage by `1.70%`. > The diff coverage is `n/a`. > :exclamation: Current head 61d0222 differs from pull request most recent head ee4c06f. Consider uploading reports for the commit ee4c06f to get more accurate results [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2747/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2747?src=pr=tree) ```diff @@ Coverage Diff @@ ## master #2747 +/- ## - Coverage 52.06% 50.35% -1.71% + Complexity 3625 3253 -372 Files 479 425 -54 Lines 22804 20815 -1989 Branches 2415 2179 -236 - Hits 11872 10482 -1390 + Misses 9907 9439 -468 + Partials 1025 894 -131 ``` | Flag | Coverage Δ | Complexity Δ | | |---|---|---|---| | hudicli | `37.01% <ø> (ø)` | `0.00 <ø> (ø)` | | | hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | | | hudicommon | `50.94% <ø> (-0.03%)` | `0.00 <ø> (ø)` | | | hudiflink | `56.01% <ø> (ø)` | `0.00 <ø> (ø)` | | | hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | | | hudisparkdatasource | `70.87% <ø> (ø)` | `0.00 <ø> (ø)` | | | hudisync | `45.47% <ø> (ø)` | `0.00 <ø> (ø)` | | | huditimelineservice | `64.36% <ø> (ø)` | `0.00 <ø> (ø)` | | | hudiutilities | `?` | `?` | | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more. 
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2747?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...ache/hudi/common/fs/inline/InMemoryFileSystem.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL2lubGluZS9Jbk1lbW9yeUZpbGVTeXN0ZW0uamF2YQ==) | `79.31% <0.00%> (-10.35%)` | `15.00% <0.00%> (-1.00%)` | | | [.../hudi/utilities/schema/SparkAvroPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TcGFya0F2cm9Qb3N0UHJvY2Vzc29yLmphdmE=) | | | | | [.../hudi/utilities/schema/SchemaRegistryProvider.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFSZWdpc3RyeVByb3ZpZGVyLmphdmE=) | | | | | [...ck/kafka/HoodieWriteCommitKafkaCallbackConfig.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2NhbGxiYWNrL2thZmthL0hvb2RpZVdyaXRlQ29tbWl0S2Fma2FDYWxsYmFja0NvbmZpZy5qYXZh) | | | | | [...he/hudi/utilities/transform/AWSDmsTransformer.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9BV1NEbXNUcmFuc2Zvcm1lci5qYXZh) | | | | | [...che/hudi/utilities/schema/SchemaPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQb3N0UHJvY2Vzc29yLmphdmE=) | | | | | [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | | | | | 
[...e/hudi/utilities/transform/ChainedTransformer.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9DaGFpbmVkVHJhbnNmb3JtZXIuamF2YQ==) | | | | | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | | | | | [...java/org/apache/hudi/utilities/sources/Source.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvU291cmNlLmphdmE=) | | | |
[GitHub] [hudi] codecov-io edited a comment on pull request #2749: [HUDI-1744][Rollback] rollback fail on mor table when the partition path hasn't any files
codecov-io edited a comment on pull request #2749: URL: https://github.com/apache/hudi/pull/2749#issuecomment-811147597 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2749?src=pr=h1) Report > Merging [#2749](https://codecov.io/gh/apache/hudi/pull/2749?src=pr=desc) (e97262f) into [master](https://codecov.io/gh/apache/hudi/commit/fe16d0de7c76105775c887b700751241bc82624c?el=desc) (fe16d0d) will **increase** coverage by `17.74%`. > The diff coverage is `n/a`. > :exclamation: Current head e97262f differs from pull request most recent head e0dcab7. Consider uploading reports for the commit e0dcab7 to get more accurate results [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2749/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2749?src=pr=tree) ```diff @@ Coverage Diff @@ ## master #2749 +/- ## = + Coverage 52.04% 69.78% +17.74% + Complexity 3625 372 -3253 = Files 479 54 -425 Lines 22804 1989 -20815 Branches 2415 236 -2179 = - Hits 11868 1388 -10480 + Misses 9911 471 -9440 + Partials 1025 130 -895 ``` | Flag | Coverage Δ | Complexity Δ | | |---|---|---|---| | hudicli | `?` | `?` | | | hudiclient | `?` | `?` | | | hudicommon | `?` | `?` | | | hudiflink | `?` | `?` | | | hudihadoopmr | `?` | `?` | | | hudisparkdatasource | `?` | `?` | | | hudisync | `?` | `?` | | | huditimelineservice | `?` | `?` | | | hudiutilities | `69.78% <ø> (ø)` | `0.00 <ø> (ø)` | | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more. 
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2749?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...i/hadoop/utils/HoodieRealtimeInputFormatUtils.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3V0aWxzL0hvb2RpZVJlYWx0aW1lSW5wdXRGb3JtYXRVdGlscy5qYXZh) | | | | | [...ache/hudi/sink/StreamWriteOperatorCoordinator.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL1N0cmVhbVdyaXRlT3BlcmF0b3JDb29yZGluYXRvci5qYXZh) | | | | | [...e/hudi/sink/transform/RowDataToHoodieFunction.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3RyYW5zZm9ybS9Sb3dEYXRhVG9Ib29kaWVGdW5jdGlvbi5qYXZh) | | | | | [.../org/apache/hudi/exception/HoodieKeyException.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0hvb2RpZUtleUV4Y2VwdGlvbi5qYXZh) | | | | | [...g/apache/hudi/exception/InvalidTableException.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0ludmFsaWRUYWJsZUV4Y2VwdGlvbi5qYXZh) | | | | | [...org/apache/hudi/common/util/SpillableMapUtils.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvU3BpbGxhYmxlTWFwVXRpbHMuamF2YQ==) | | | | | [...java/org/apache/hudi/common/lock/LockProvider.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2xvY2svTG9ja1Byb3ZpZGVyLmphdmE=) | | | | | 
[...cala/org/apache/hudi/HoodieBootstrapRelation.scala](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZUJvb3RzdHJhcFJlbGF0aW9uLnNjYWxh) | | | | | [.../hudi/hadoop/realtime/HoodieRealtimeFileSplit.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3JlYWx0aW1lL0hvb2RpZVJlYWx0aW1lRmlsZVNwbGl0LmphdmE=) | | | | | [...g/apache/hudi/sink/partitioner/BucketAssigner.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL0J1Y2tldEFzc2lnbmVyLmphdmE=) | | | | | ... and [415 more](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree-more) | | -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this
[GitHub] [hudi] jsbali commented on pull request #2678: Added support for replace commits in commit showpartitions, commit sh…
jsbali commented on pull request #2678: URL: https://github.com/apache/hudi/pull/2678#issuecomment-811284724 Created the jira https://issues.apache.org/jira/browse/HUDI-1746. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-1746) Added support for replace commits in hudi-cli
[ https://issues.apache.org/jira/browse/HUDI-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312583#comment-17312583 ] Jagmeet Bali commented on HUDI-1746: https://github.com/apache/hudi/pull/2678 > Added support for replace commits in hudi-cli > - > > Key: HUDI-1746 > URL: https://issues.apache.org/jira/browse/HUDI-1746 > Project: Apache Hudi > Issue Type: New Feature >Reporter: Jagmeet Bali >Priority: Minor > > Currently, hudi-cli doesn't support replace commits in the commit show* > functions. This adds the foundation for that. > This PR still doesn't support the extraMetadata of the replace commit which > will be added in subsequent PR's. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HUDI-1746) Added support for replace commits in hudi-cli
Jagmeet Bali created HUDI-1746:

Summary: Added support for replace commits in hudi-cli
Key: HUDI-1746
URL: https://issues.apache.org/jira/browse/HUDI-1746
Project: Apache Hudi
Issue Type: New Feature
Reporter: Jagmeet Bali

Currently, hudi-cli doesn't support replace commits in the commit show* functions. This adds the foundation for that. This PR still doesn't support the extraMetadata of the replace commit which will be added in subsequent PR's.
[GitHub] [hudi] aditiwari01 commented on issue #2748: Are we stopping support for spark <2.4.4 with hudi 0.8.0?
aditiwari01 commented on issue #2748: URL: https://github.com/apache/hudi/issues/2748#issuecomment-811280058 Yes. I faced the issue with version 2.4.0 itself.
[GitHub] [hudi] nsivabalan commented on issue #2748: Are we stopping support for spark <2.4.4 with hudi 0.8.0?
nsivabalan commented on issue #2748: URL: https://github.com/apache/hudi/issues/2748#issuecomment-811277730 @vinothchandar: I guess the issue is with Spark 2.4.1. It looks like the API is available from 2.4.2 onward but not in 2.4.1.
[GitHub] [hudi] kimberlyamandalu commented on issue #2696: Metadata and runtime exceptions in Hudi 0.7.0 on AWS Glue
kimberlyamandalu commented on issue #2696: URL: https://github.com/apache/hudi/issues/2696#issuecomment-811249839

> @kimberlyamandalu Can you try turning off the metadata table in hoodie to get your pipeline unblocked?
>
> ```
> hoodie.metadata.enable=false
> ```
>
> This looks like an exception in the metadata table. Without any more logs, it's hard to debug what may be going on. If you are OK to deploy a custom build, we can work on adding more logs to help surface the underlying issue.
>
> https://github.com/apache/hudi/blob/release-0.7.0/hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/FileSystemViewHandler.java#L378 this is where the exception is coming from. If we can add more logs to this function to see why a runtime exception is being thrown, it may help to find the root cause.

Hi @n3nash, thanks for looking into this. I can try disabling metadata for my workload. Can we toggle metadata on and off for a data set, or does it need to be enabled from the time the data is bootstrapped? Is it something we can turn on after we have been ingesting for a bit, and will it still work? I am willing to deploy a custom build so we can get more details on this issue. I will need your help with it, though, as I'm not very familiar with how to do this. Thanks!
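For readers following the workaround quoted above, here is a minimal sketch of overriding that one key in a set of writer properties. Only the key `hoodie.metadata.enable` comes from the thread; the class `MetadataToggle` and its method are hypothetical names for illustration, and how the resulting properties are handed to a Hudi writer (or a Glue job) is outside the sketch.

```java
import java.util.Properties;

/** Hypothetical helper, not part of Hudi: overrides one key in a copy of the writer properties. */
final class MetadataToggle {
  // Key taken from the suggestion quoted in the thread.
  static final String METADATA_ENABLE_KEY = "hoodie.metadata.enable";

  /** Returns a copy of {@code base} with the metadata table switched off; {@code base} is not modified. */
  static Properties withMetadataDisabled(Properties base) {
    Properties copy = new Properties();
    copy.putAll(base);
    copy.setProperty(METADATA_ENABLE_KEY, "false");
    return copy;
  }
}
```

Copying rather than mutating keeps the original configuration available if the setting needs to be restored later; whether the table can safely be re-enabled afterwards is the open question in the comment above.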
[GitHub] [hudi] vinothchandar commented on issue #2748: Are we stopping support for spark <2.4.4 with hudi 0.8.0?
vinothchandar commented on issue #2748: URL: https://github.com/apache/hudi/issues/2748#issuecomment-811224728 @aditiwari01 This is partly my bad. We have required Spark 2.4 for a while now (we need `spark-avro` from there). So as long as Spark 2.4 works and Spark 3.x works, we are good. Sorry for the false alarm.
[GitHub] [hudi] vinothchandar commented on issue #2748: Are we stopping support for spark <2.4.4 with hudi 0.8.0?
vinothchandar commented on issue #2748: URL: https://github.com/apache/hudi/issues/2748#issuecomment-811202865 @nsivabalan This would be a release blocker, and we need to fix the RC if Spark 2.x has issues. Agree?
[GitHub] [hudi] nsivabalan edited a comment on issue #2648: [SUPPORT] a NPE error when reading MOR table in spark datasource
nsivabalan edited a comment on issue #2648: URL: https://github.com/apache/hudi/issues/2648#issuecomment-811188216 @n3nash: Can you please add a severity label if you have a sense of it? Just a reminder.
[GitHub] [hudi] nsivabalan commented on pull request #2655: [WIP] [HUDI-1615] Fixing null schema for delete operation in spark datasource
nsivabalan commented on pull request #2655: URL: https://github.com/apache/hudi/pull/2655#issuecomment-811192144 Yeah, I haven't had a chance to fix it. Once it's ready, I will let you know.
[GitHub] [hudi] nsivabalan commented on issue #2717: [SUPPORT] run_sync_tool support hive3.1.2 on hadoop3.1.4
nsivabalan commented on issue #2717: URL: https://github.com/apache/hudi/issues/2717#issuecomment-811181956 Sure. Closing this out. We can continue the discussion in the PR. [Here](https://issues.apache.org/jira/browse/HUDI-1721) is the right Jira (I closed the other one as a duplicate).
[GitHub] [hudi] nsivabalan closed issue #2717: [SUPPORT] run_sync_tool support hive3.1.2 on hadoop3.1.4
nsivabalan closed issue #2717: URL: https://github.com/apache/hudi/issues/2717
[GitHub] [hudi] nsivabalan commented on issue #2609: [SUPPORT] Presto hudi query slow when compared to parquet
nsivabalan commented on issue #2609: URL: https://github.com/apache/hudi/issues/2609#issuecomment-811177897 Thanks, Sudha. If you feel you are done with your response, feel free to remove the "awaiting-community-help" label.
[jira] [Updated] (HUDI-1724) run_sync_tool support for hive3.1.2 on hadoop3.1.4
[ https://issues.apache.org/jira/browse/HUDI-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan updated HUDI-1724:
Labels: sev:critical user-support-issues (was: sev:triage user-support-issues)

> run_sync_tool support for hive3.1.2 on hadoop3.1.4
>
> Key: HUDI-1724
> URL: https://issues.apache.org/jira/browse/HUDI-1724
> Project: Apache Hudi
> Issue Type: Bug
> Components: Hive Integration
> Reporter: Balaji Varadarajan
> Priority: Major
> Labels: sev:critical, user-support-issues
>
> Context: https://github.com/apache/hudi/issues/2717
[GitHub] [hudi] nsivabalan commented on issue #2633: Empty File Slice causing application to fail in small files optimization code
nsivabalan commented on issue #2633: URL: https://github.com/apache/hudi/issues/2633#issuecomment-811179179 @n3nash: Can you please file a tracking Jira and close this out? Please add severity labels, as appropriate, to both the issue and the Jira.
[GitHub] [hudi] yanghua commented on pull request #2747: [HUDI-1743] Added support for SqlFileBasedTransformer
yanghua commented on pull request #2747: URL: https://github.com/apache/hudi/pull/2747#issuecomment-811163812 @vingov Thanks, but the CI has failed. Would you please check the reason? If you are not sure, you can push an empty commit to retrigger Travis.
[GitHub] [hudi] nsivabalan commented on issue #2748: Are we stopping support for spark <2.4.4 with hudi 0.8.0?
nsivabalan commented on issue #2748: URL: https://github.com/apache/hudi/issues/2748#issuecomment-811154406 Thanks a lot for raising this, @aditiwari01. We are looking into it. https://issues.apache.org/jira/browse/HUDI-1745
[GitHub] [hudi] codecov-io commented on pull request #2749: [HUDI-1744][Rollback] rollback fail on mor table when the partition path hasn't any files
codecov-io commented on pull request #2749: URL: https://github.com/apache/hudi/pull/2749#issuecomment-811147597

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2749?src=pr=h1) Report
> Merging [#2749](https://codecov.io/gh/apache/hudi/pull/2749?src=pr=desc) (e0dcab7) into [master](https://codecov.io/gh/apache/hudi/commit/fe16d0de7c76105775c887b700751241bc82624c?el=desc) (fe16d0d) will **decrease** coverage by `42.64%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2749/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2749?src=pr=tree)

```diff
@@             Coverage Diff              @@
##             master   #2749       +/-   ##
============================================
- Coverage     52.04%   9.40%   -42.65%
+ Complexity     3625      48     -3577
============================================
  Files           479      54      -425
  Lines         22804    1989    -20815
  Branches       2415     236     -2179
============================================
- Hits          11868     187    -11681
+ Misses         9911    1789     -8122
+ Partials       1025      13     -1012
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudiflink | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `9.40% <ø> (-60.39%)` | `0.00 <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2749?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | | | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | | | 
[...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | | | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | | | [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%>
[jira] [Updated] (HUDI-1745) Hudi compilation fails w/ spark version < 2.4.4 due to usage of unavailable spark api
[ https://issues.apache.org/jira/browse/HUDI-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan updated HUDI-1745:
Labels: sev:critical (was: )

> Hudi compilation fails w/ spark version < 2.4.4 due to usage of unavailable spark api
>
> Key: HUDI-1745
> URL: https://issues.apache.org/jira/browse/HUDI-1745
> Project: Apache Hudi
> Issue Type: Bug
> Components: Spark Integration
> Affects Versions: 0.8.0
> Reporter: sivabalan narayanan
> Priority: Major
> Labels: sev:critical
>
> [https://github.com/apache/hudi/issues/2748]
>
> PR thats of interest: https://github.com/apache/hudi/pull/2431
[jira] [Updated] (HUDI-1745) Hudi compilation fails w/ spark version < 2.4.4 due to usage of unavailable spark api
[ https://issues.apache.org/jira/browse/HUDI-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan updated HUDI-1745:
Description:

[https://github.com/apache/hudi/issues/2748]

PR that is of interest: [https://github.com/apache/hudi/pull/2431]

I see three options. Let me know if there are more.

Option 1: Similar to SparkRowSerDe, we might have to introduce an interface for translateSqlOptions and override it based on the Spark version. But we already have two submodules, for spark2 and spark3, and we might now have to add more such modules for different spark2 versions, which needs more thought to do elegantly.

Option 2: Since this feature was added only in 0.8.0 and is not a highly sought-after feature, we could revert this commit and unblock ourselves for 0.8.0. Once the release is complete, we can decide how to go about this and get the feature in for the next release.

Option 3: We say that Hudi 0.8.0 does not support Spark versions < 2.4.4. I don't think we can go this route, but I am listing it for completeness.

was: [https://github.com/apache/hudi/issues/2748] PR thats of interest: https://github.com/apache/hudi/pull/2431

> Hudi compilation fails w/ spark version < 2.4.4 due to usage of unavailable spark api
>
> Key: HUDI-1745
> URL: https://issues.apache.org/jira/browse/HUDI-1745
> Project: Apache Hudi
> Issue Type: Bug
> Components: Spark Integration
> Affects Versions: 0.8.0
> Reporter: sivabalan narayanan
> Priority: Major
> Labels: sev:critical
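The first option in the description above (an interface overridden per Spark version) could look roughly like the sketch below. This is an illustration under assumptions, not Hudi code: the interface `SqlOptionsTranslator`, both implementations, the factory, and the option keys `partitionColumns` and `partitionPathField` are hypothetical, and the minimum version that has the API is passed in as a parameter because the thread itself is not settled on it. Versions are assumed to be plain dotted numbers such as `2.4.1`.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical adapter: one implementation per group of Spark versions. */
interface SqlOptionsTranslator {
  Map<String, String> translate(Map<String, String> options);
}

/** For Spark versions without the newer API: options pass through unchanged. */
final class PassThroughTranslator implements SqlOptionsTranslator {
  @Override
  public Map<String, String> translate(Map<String, String> options) {
    return Collections.unmodifiableMap(new HashMap<>(options));
  }
}

/** For Spark versions with the API: copy a source option to a target key unless already set. */
final class KeyMappingTranslator implements SqlOptionsTranslator {
  private final String sourceKey;
  private final String targetKey;

  KeyMappingTranslator(String sourceKey, String targetKey) {
    this.sourceKey = sourceKey;
    this.targetKey = targetKey;
  }

  @Override
  public Map<String, String> translate(Map<String, String> options) {
    Map<String, String> copy = new HashMap<>(options);
    if (copy.containsKey(sourceKey) && !copy.containsKey(targetKey)) {
      copy.put(targetKey, copy.get(sourceKey));
    }
    return Collections.unmodifiableMap(copy);
  }
}

final class SqlOptionsTranslators {
  /** Compares dotted numeric versions component by component; missing components count as 0. */
  static int compareVersions(String a, String b) {
    String[] x = a.split("\\.");
    String[] y = b.split("\\.");
    for (int i = 0; i < Math.max(x.length, y.length); i++) {
      int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
      int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
      if (xi != yi) {
        return Integer.compare(xi, yi);
      }
    }
    return 0;
  }

  /** Picks the implementation at runtime; the option keys here are made up for the example. */
  static SqlOptionsTranslator forSparkVersion(String sparkVersion, String minVersionWithApi) {
    return compareVersions(sparkVersion, minVersionWithApi) >= 0
        ? new KeyMappingTranslator("partitionColumns", "partitionPathField")
        : new PassThroughTranslator();
  }
}
```

A runtime switch like this only helps if the version-specific implementation is compiled in a module built against a Spark version that has the API, which is the submodule cost the description points out.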
[jira] [Created] (HUDI-1745) Hudi compilation fails w/ spark version < 2.4.4 due to usage of unavailable spark api
sivabalan narayanan created HUDI-1745:

Summary: Hudi compilation fails w/ spark version < 2.4.4 due to usage of unavailable spark api
Key: HUDI-1745
URL: https://issues.apache.org/jira/browse/HUDI-1745
Project: Apache Hudi
Issue Type: Bug
Components: Spark Integration
Affects Versions: 0.8.0
Reporter: sivabalan narayanan

[https://github.com/apache/hudi/issues/2748]

PR thats of interest: https://github.com/apache/hudi/pull/2431
[GitHub] [hudi] nsivabalan opened a new pull request #2750: Revert "[HUDI-1526] Translate the api partitionBy in spark datasource (#2431) due to usage of incompatabile apis with older spark versions
nsivabalan opened a new pull request #2750: URL: https://github.com/apache/hudi/pull/2750 This reverts commit 26da4f546275e8ab6496537743efe73510cb723d.
[jira] [Updated] (HUDI-1744) [Rollback] rollback fail on mor table when the partition path hasn't any files
[ https://issues.apache.org/jira/browse/HUDI-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-1744:
Labels: pull-request-available (was: )

> [Rollback] rollback fail on mor table when the partition path hasn't any files
>
> Key: HUDI-1744
> URL: https://issues.apache.org/jira/browse/HUDI-1744
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: lrz
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.9.0
>
> When rolling back on a MOR table, if the partition path has no files, an exception is thrown because rdd.flatmap is called with 0 as numpartitions.
[GitHub] [hudi] li36909 commented on pull request #2749: [HUDI-1744][Rollback] rollback fail on mor table when the partition path hasn't any files
li36909 commented on pull request #2749: URL: https://github.com/apache/hudi/pull/2749#issuecomment-811060626 cc @n3nash Could you help take a look? Thank you.
[GitHub] [hudi] codecov-io edited a comment on pull request #2745: [HUDI-1737][hudi-client] Code Cleanup: Extract common method in HoodieCreateHandle & FlinkCreateHandle
codecov-io edited a comment on pull request #2745: URL: https://github.com/apache/hudi/pull/2745#issuecomment-810102368

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2745?src=pr=h1) Report
> Merging [#2745](https://codecov.io/gh/apache/hudi/pull/2745?src=pr=desc) (2d6a76c) into [master](https://codecov.io/gh/apache/hudi/commit/fe16d0de7c76105775c887b700751241bc82624c?el=desc) (fe16d0d) will **decrease** coverage by `42.64%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2745/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2745?src=pr=tree)

```diff
@@             Coverage Diff              @@
##             master   #2745       +/-   ##
============================================
- Coverage     52.04%   9.40%   -42.65%
+ Complexity     3625      48     -3577
============================================
  Files           479      54      -425
  Lines         22804    1989    -20815
  Branches       2415     236     -2179
============================================
- Hits          11868     187    -11681
+ Misses         9911    1789     -8122
+ Partials       1025      13     -1012
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudiflink | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `9.40% <ø> (-60.39%)` | `0.00 <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2745?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | | | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | | | 
[...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | | | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | | | [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%>
[GitHub] [hudi] li36909 opened a new pull request #2749: [Rollback] rollback fail on mor table when the partition path hasn't any files
li36909 opened a new pull request #2749: URL: https://github.com/apache/hudi/pull/2749

## What is the purpose of the pull request

For example, if a write into a MOR table fails and it is the first commit, the partition path is empty. The next time a write commit is submitted, it fails because the rollback fails. This PR fixes that bug.

## Verify this pull request

Added a new unit test.
[jira] [Created] (HUDI-1744) [Rollback] rollback fail on mor table when the partition path hasn't any files
lrz created HUDI-1744: Summary: [Rollback] rollback fail on mor table when the partition path hasn't any files Key: HUDI-1744 URL: https://issues.apache.org/jira/browse/HUDI-1744 Project: Apache Hudi Issue Type: Bug Reporter: lrz Fix For: 0.9.0 When rolling back a MOR table whose partition path contains no files, an exception is thrown because rdd.flatmap is called with 0 as numpartitions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
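The failure mode described in HUDI-1744 can be illustrated without Spark: a partition count derived from the number of files to roll back becomes 0 when the partition path is empty, and an API that requires a positive count then rejects it. The following is a minimal sketch of one possible guard, not the fix in the pull request; `RollbackParallelism`, `safeParallelism`, `nothingToRollback`, and the `configuredParallelism` parameter are hypothetical names introduced for illustration.

```java
import java.util.List;

// Hypothetical helper, not part of Hudi: clamps a derived partition count
// so that an empty work list never yields 0.
final class RollbackParallelism {

  private RollbackParallelism() {
  }

  // True when there is nothing to roll back, so the distributed step can be skipped.
  static boolean nothingToRollback(List<String> filesToRollback) {
    return filesToRollback == null || filesToRollback.isEmpty();
  }

  // Returns a partition count that is at least 1 and at most the configured
  // upper bound (the bound itself is also raised to 1 if it is 0 or negative).
  static int safeParallelism(List<String> filesToRollback, int configuredParallelism) {
    int upperBound = Math.max(1, configuredParallelism);
    int workItems = nothingToRollback(filesToRollback) ? 0 : filesToRollback.size();
    return Math.max(1, Math.min(workItems, upperBound));
  }
}
```

A caller could either return early when `nothingToRollback` is true or pass the result of `safeParallelism` wherever a partition count is required; which of the two is appropriate depends on code not shown in this thread.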
[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
nsivabalan commented on a change in pull request #2438: URL: https://github.com/apache/hudi/pull/2438#discussion_r604841403 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java ## @@ -247,6 +266,32 @@ private Long delayOffsetCalculation(Option lastCheckpointStr, Set partitionInfoList, String topicName, Long timestamp) { Review comment: Can we add tests for the new code that is added. I don't see any tests. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
nsivabalan commented on a change in pull request #2438: URL: https://github.com/apache/hudi/pull/2438#discussion_r604840433 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java ## @@ -553,6 +555,11 @@ public DeltaSyncService(Config cfg, JavaSparkContext jssc, FileSystem fs, Config "'--filter-dupes' needs to be disabled when '--op' is 'UPSERT' to ensure updates are not missed."); this.props = properties.get(); + String kafkaCheckpointTimestamp = props.getString(KafkaOffsetGen.Config.KAFKA_CHECKPOINT_TIMESTAMP, ""); Review comment: Let me think more on this. Wondering if we should just rely on existing "HoodieDeltaStreamer.Config.checkpoint" only and add another config named "checkpoint.type" or something which could be set to timestamp for this purpose. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
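The alternative raised in this comment, keeping a single checkpoint value and adding a type switch, could look roughly like the sketch below. It is only an illustration under stated assumptions: the property key `checkpoint.type`, the `CheckpointType` enum, and the `CheckpointSpec` class are hypothetical and do not reflect any agreed Hudi API.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical: how the single checkpoint string should be interpreted.
enum CheckpointType {
  OFFSETS,   // checkpoint holds topic/partition offsets (assumed default)
  TIMESTAMP  // checkpoint holds an epoch timestamp in milliseconds
}

// Hypothetical holder pairing the raw checkpoint string with its resolved type.
final class CheckpointSpec {
  // Assumed key name, borrowed from the "checkpoint.type or something" suggestion.
  static final String CHECKPOINT_TYPE_KEY = "checkpoint.type";

  final CheckpointType type;
  final String checkpoint;

  private CheckpointSpec(CheckpointType type, String checkpoint) {
    this.type = type;
    this.checkpoint = checkpoint;
  }

  // Reads the type from the properties, falling back to OFFSETS when the key is absent.
  static CheckpointSpec resolve(Properties props, String checkpoint) {
    String raw = props.getProperty(CHECKPOINT_TYPE_KEY, CheckpointType.OFFSETS.name());
    CheckpointType type = CheckpointType.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    String value = checkpoint.trim();
    if (type == CheckpointType.TIMESTAMP) {
      // Fail fast with NumberFormatException if the value is not a number.
      Long.parseLong(value);
    }
    return new CheckpointSpec(type, value);
  }

  // Returns the checkpoint as epoch milliseconds; only valid for TIMESTAMP checkpoints.
  long timestampMillis() {
    if (type != CheckpointType.TIMESTAMP) {
      throw new IllegalStateException("Checkpoint is not a timestamp: " + checkpoint);
    }
    return Long.parseLong(checkpoint);
  }
}
```

With this shape, existing users who set only the checkpoint keep the offset-based behaviour, and a timestamp start point becomes an explicit opt-in rather than a second, competing config.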
[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
nsivabalan commented on a change in pull request #2438: URL: https://github.com/apache/hudi/pull/2438#discussion_r604828288 ## File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/HoodieDeltaStreamerWrapper.java ## @@ -65,7 +65,7 @@ public void scheduleCompact() throws Exception { return upsert(WriteOperationType.UPSERT); } - public Pair>> fetchSource() throws Exception { + public Pair>, Pair> fetchSource() throws Exception { Review comment: Actually, my PR was closed as invalid, but [here](https://github.com/nsivabalan/hudi/blob/f7439e2e28748bf7b713fb72ba611f8af7bb97a1/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/ReadBatch.java) is the class that I added. Maybe you can add it in this patch itself.
[GitHub] [hudi] nsivabalan commented on pull request #2520: [HUDI-1446] Support skip bootstrapIndex's init in abstract fs view init
nsivabalan commented on pull request #2520: URL: https://github.com/apache/hudi/pull/2520#issuecomment-811001660 By the way, the CI build has failed. Please check that as well.
[GitHub] [hudi] nsivabalan commented on a change in pull request #2520: [HUDI-1446] Support skip bootstrapIndex's init in abstract fs view init
nsivabalan commented on a change in pull request #2520: URL: https://github.com/apache/hudi/pull/2520#discussion_r604825768 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java ## @@ -67,6 +68,7 @@ public static final String HOODIE_TIMELINE_LAYOUT_VERSION = "hoodie.timeline.layout.version"; public static final String HOODIE_PAYLOAD_CLASS_PROP_NAME = "hoodie.compaction.payload.class"; public static final String HOODIE_ARCHIVELOG_FOLDER_PROP_NAME = "hoodie.archivelog.folder"; + public static final String HOODIE_BOOTSTRAP_INDEX_ENABLE = "hoodie.bootstrap.index.enable"; Review comment: May I know where we set the default value of this config to false?
[GitHub] [hudi] Sugamber commented on issue #2637: [SUPPORT] - Partial Update : update few columns of a table
Sugamber commented on issue #2637: URL: https://github.com/apache/hudi/issues/2637#issuecomment-810995008 Thank you!!! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] RocMarshal commented on a change in pull request #2745: [HUDI-1737][hudi-client] Code Cleanup: Extract common method in HoodieCreateHandle & FlinkCreateHandle
RocMarshal commented on a change in pull request #2745: URL: https://github.com/apache/hudi/pull/2745#discussion_r604791723 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieCreateHandle.java ## @@ -179,29 +181,47 @@ public IOType getIOType() { Review comment: All of the differences above are caused by merging from the latest master branch.
[GitHub] [hudi] vingov edited a comment on pull request #2747: [HUDI_1743] Added support for SqlFileBasedTransformer
vingov edited a comment on pull request #2747: URL: https://github.com/apache/hudi/pull/2747#issuecomment-81095 @yanghua - I've already filed a Jira and linked it in the description: https://issues.apache.org/jira/browse/HUDI-1743 I've updated the title as well.
[GitHub] [hudi] RocMarshal commented on a change in pull request #2745: [HUDI-1737][hudi-client] Code Cleanup: Extract common method in HoodieCreateHandle & FlinkCreateHandle
RocMarshal commented on a change in pull request #2745: URL: https://github.com/apache/hudi/pull/2745#discussion_r604785572 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieCreateHandle.java ## @@ -180,28 +180,34 @@ public IOType getIOType() { fileWriter.close(); HoodieWriteStat stat = new HoodieWriteStat(); - stat.setPartitionPath(writeStatus.getPartitionPath()); - stat.setNumWrites(recordsWritten); - stat.setNumDeletes(recordsDeleted); - stat.setNumInserts(insertRecordsWritten); - stat.setPrevCommit(HoodieWriteStat.NULL_COMMIT); - stat.setFileId(writeStatus.getFileId()); - stat.setPath(new Path(config.getBasePath()), path); long fileSizeInBytes = FSUtils.getFileSize(fs, path); stat.setTotalWriteBytes(fileSizeInBytes); stat.setFileSizeInBytes(fileSizeInBytes); - stat.setTotalWriteErrors(writeStatus.getTotalErrorRecords()); Review comment: @danny0405 Thank you for your suggestions. I made some changes, please take a look. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] vingov edited a comment on pull request #2747: [MINOR] Added support for SqlFileBasedTransformer
vingov edited a comment on pull request #2747: URL: https://github.com/apache/hudi/pull/2747#issuecomment-81095 @yanghua - I've already filed a Jira and linked it in the description: https://issues.apache.org/jira/browse/HUDI-1743 Do you want me to update the title of this pull request?
[GitHub] [hudi] vingov commented on pull request #2747: [MINOR] Added support for SqlFileBasedTransformer
vingov commented on pull request #2747: URL: https://github.com/apache/hudi/pull/2747#issuecomment-81095 @yanghua - I've already filed a Jira and linked it in the description: https://issues.apache.org/jira/browse/HUDI-1743 Do you want me to update the title of this pull request? I will fix this build issue as well.
[GitHub] [hudi] codecov-io edited a comment on pull request #2742: [HUDI-1738] Emit deletes for flink MOR table streaming read
codecov-io edited a comment on pull request #2742: URL: https://github.com/apache/hudi/pull/2742#issuecomment-810101347

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2742?src=pr=h1) Report
> Merging [#2742](https://codecov.io/gh/apache/hudi/pull/2742?src=pr=desc) (e99b8c5) into [master](https://codecov.io/gh/apache/hudi/commit/050626ad6cb8bbd06d138456ccc00dddcff2a860?el=desc) (050626a) will **decrease** coverage by `42.64%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2742/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2742?src=pr=tree)

```diff
@@             Coverage Diff              @@
##             master    #2742      +/-   ##
- Coverage     52.04%    9.40%   -42.65%
+ Complexity     3625       48     -3577
  Files           479       54      -425
  Lines         22804     1989    -20815
  Branches       2415      236     -2179
- Hits          11868      187    -11681
+ Misses         9911     1789     -8122
+ Partials       1025       13     -1012
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudiflink | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `9.40% <ø> (-60.39%)` | `0.00 <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2742?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | | | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | | | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | | | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | | | 
[...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | | | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | | | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | | | [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%>
[GitHub] [hudi] codecov-io edited a comment on pull request #2325: [HUDI-699]Fix CompactionCommand and add unit test for CompactionCommand
codecov-io edited a comment on pull request #2325: URL: https://github.com/apache/hudi/pull/2325#issuecomment-742860619

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2325?src=pr=h1) Report
> Merging [#2325](https://codecov.io/gh/apache/hudi/pull/2325?src=pr=desc) (9cb7bf6) into [master](https://codecov.io/gh/apache/hudi/commit/aa0da72c59cb3764205f90b025b24d1640727795?el=desc) (aa0da72) will **increase** coverage by `0.25%`.
> The diff coverage is `27.58%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2325/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2325?src=pr=tree)

```diff
@@             Coverage Diff              @@
##             master    #2325      +/-   ##
+ Coverage     52.06%   52.31%    +0.25%
- Complexity     3625     3645       +20
  Files           479      479
  Lines         22804    22836       +32
  Branches       2415     2416        +1
+ Hits          11872    11946       +74
+ Misses         9907     9858       -49
- Partials       1025     1032        +7
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `40.37% <38.09%> (+3.35%)` | `0.00 <3.00> (ø)` | |
| hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | |
| hudicommon | `50.81% <0.00%> (-0.16%)` | `0.00 <0.00> (ø)` | |
| hudiflink | `56.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisparkdatasource | `70.87% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisync | `45.47% <ø> (ø)` | `0.00 <ø> (ø)` | |
| huditimelineservice | `64.36% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiutilities | `69.78% <ø> (+0.05%)` | `0.00 <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2325?src=pr=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...a/org/apache/hudi/cli/HoodieTableHeaderFields.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL0hvb2RpZVRhYmxlSGVhZGVyRmllbGRzLmphdmE=) | `0.00% <ø> (ø)` | `0.00 <0.00> (ø)` | | | [...n/java/org/apache/hudi/cli/commands/SparkMain.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL2NvbW1hbmRzL1NwYXJrTWFpbi5qYXZh) | `6.95% <0.00%> (+0.11%)` | `4.00 <0.00> (ø)` | | | [.../common/table/timeline/HoodieArchivedTimeline.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZUFyY2hpdmVkVGltZWxpbmUuamF2YQ==) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` | | | [...i/common/table/timeline/TimelineMetadataUtils.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL1RpbWVsaW5lTWV0YWRhdGFVdGlscy5qYXZh) | `70.17% <0.00%> (-2.56%)` | `17.00 <0.00> (ø)` | | | [...rg/apache/hudi/cli/commands/CompactionCommand.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL2NvbW1hbmRzL0NvbXBhY3Rpb25Db21tYW5kLmphdmE=) | `30.18% <52.17%> (+29.38%)` | `22.00 <3.00> (+20.00)` | | | [...ache/hudi/common/fs/inline/InMemoryFileSystem.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL2lubGluZS9Jbk1lbW9yeUZpbGVTeXN0ZW0uamF2YQ==) | `79.31% <0.00%> (-10.35%)` | `15.00% <0.00%> (-1.00%)` | | | 
[...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | `26.00% <0.00%> (ø%)` | | | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.72% <0.00%> (+0.34%)` | `56.00% <0.00%> (+1.00%)` | | -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] yanghua commented on pull request #2747: [MINOR] Added support for SqlFileBasedTransformer
yanghua commented on pull request #2747: URL: https://github.com/apache/hudi/pull/2747#issuecomment-810887489 @vingov Thanks for your contribution. IMO, this is a feature, it would be better to file a Jira ticket to track it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] yanghua merged pull request #2746: [MINOR] Delete useless UpsertPartitioner for flink integration
yanghua merged pull request #2746: URL: https://github.com/apache/hudi/pull/2746 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[hudi] branch master updated (aa0da72 -> fe16d0d)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from aa0da72 Preparation for Avro update (#2650) add fe16d0d [MINOR] Delete useless UpsertPartitioner for flink integration (#2746) No new revisions were added by this update. Summary of changes: .../table/action/commit/UpsertPartitioner.java | 319 - 1 file changed, 319 deletions(-) delete mode 100644 hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java
[GitHub] [hudi] liujinhui1994 commented on pull request #2710: [RFC-20][HUDI-648] Implement error log/table for Datasource/DeltaStreamer/WriteClient/Compaction writes
liujinhui1994 commented on pull request #2710: URL: https://github.com/apache/hudi/pull/2710#issuecomment-810842963 @lw309637554 Please help review -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2710: [RFC-20][HUDI-648] Implement error log/table for Datasource/DeltaStreamer/WriteClient/Compaction writes
liujinhui1994 commented on a change in pull request #2710: URL: https://github.com/apache/hudi/pull/2710#discussion_r604658298 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java ## @@ -1032,6 +1033,37 @@ public String getWriteMetaKeyPrefixes() { return props.getProperty(WRITE_META_KEY_PREFIXES_PROP); } + /** + * Error table configs. + */ + public boolean enableErrorTable() { +return Boolean.parseBoolean(props.getProperty(HoodieErrorTableConfig.ERROR_TABLE_ENABLE_PROP)); + } + + public String getErrorTableBasePath() { +return props.getProperty(HoodieErrorTableConfig.ERROR_TABLE_BASE_PATH_PROP); + } + + public String getErrorTableName() { +return props.getProperty(HoodieErrorTableConfig.ERROR_TABLE_NAME_PROP); + } + + public int getErrorTableInsertParallelism() { Review comment: I think it is reasonable here, you need to get these values from HoodieWriteConfig -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
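One detail of getters written this way is worth spelling out: `Boolean.parseBoolean(null)` returns `false`, so a missing enable flag quietly leaves the feature off, whereas `Integer.parseInt(null)` throws `NumberFormatException`, so an integer getter such as `getErrorTableInsertParallelism()` needs a default to be supplied somewhere. The sketch below uses plain `java.util.Properties`; the class `ErrorTablePropsSketch`, both property keys, and the default of 200 are hypothetical, since the real keys and defaults in `HoodieErrorTableConfig` are not shown in this thread.

```java
import java.util.Properties;

// Hypothetical, simplified stand-in for the error-table getters under review.
final class ErrorTablePropsSketch {
  // Assumed keys for illustration only.
  static final String ENABLE_KEY = "example.error.table.enable";
  static final String INSERT_PARALLELISM_KEY = "example.error.table.insert.parallelism";
  // Assumed fallback used when the parallelism key is absent.
  static final int DEFAULT_INSERT_PARALLELISM = 200;

  private final Properties props;

  ErrorTablePropsSketch(Properties props) {
    this.props = props;
  }

  // A missing key yields null, and Boolean.parseBoolean(null) is false.
  boolean enableErrorTable() {
    return Boolean.parseBoolean(props.getProperty(ENABLE_KEY));
  }

  // Integer.parseInt(null) would throw, so the absent case is handled explicitly.
  int getErrorTableInsertParallelism() {
    String raw = props.getProperty(INSERT_PARALLELISM_KEY);
    return raw == null ? DEFAULT_INSERT_PARALLELISM : Integer.parseInt(raw.trim());
  }
}
```

Whether the default belongs in the getter, as here, or is pre-populated by a config builder is a design choice the thread does not settle.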
[GitHub] [hudi] aditiwari01 commented on issue #2748: Are we stopping support for spark <2.4.4 with hudi 0.8.0?
aditiwari01 commented on issue #2748: URL: https://github.com/apache/hudi/issues/2748#issuecomment-810832656 @garyli1019 Yes. More specifically for the mentioned method, we are using a constant "SparkDataSourceUtils.PARTITIONING_COLUMNS_KEY". This was introduced in spark v2.4.2. Hence for versions <2.4.2 the compilation fails with object not found. For reference: spark v2.4.1 (https://github.com/apache/spark/blob/v2.4.1/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala) PARTITIONING_COLUMNS_KEY not present. spark v2.4.2 (https://github.com/apache/spark/blob/v2.4.2/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala) PARTITIONING_COLUMNS_KEY is present. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] danny0405 commented on a change in pull request #2745: [HUDI-1737][hudi-client] Code Cleanup: Extract common method in HoodieCreateHandle & FlinkCreateHandle
danny0405 commented on a change in pull request #2745: URL: https://github.com/apache/hudi/pull/2745#discussion_r604647518 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieCreateHandle.java ## @@ -180,28 +180,34 @@ public IOType getIOType() { fileWriter.close(); HoodieWriteStat stat = new HoodieWriteStat(); - stat.setPartitionPath(writeStatus.getPartitionPath()); - stat.setNumWrites(recordsWritten); - stat.setNumDeletes(recordsDeleted); - stat.setNumInserts(insertRecordsWritten); - stat.setPrevCommit(HoodieWriteStat.NULL_COMMIT); - stat.setFileId(writeStatus.getFileId()); - stat.setPath(new Path(config.getBasePath()), path); long fileSizeInBytes = FSUtils.getFileSize(fs, path); stat.setTotalWriteBytes(fileSizeInBytes); stat.setFileSizeInBytes(fileSizeInBytes); - stat.setTotalWriteErrors(writeStatus.getTotalErrorRecords()); Review comment: The only difference between `HoodieCreateHandle` and `FlinkCreateHandle` is how they fetch the file size. Can we extract that part out? Usually we override only the part that differs.
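The refactoring asked for here is essentially the template-method pattern: the shared stat bookkeeping stays in the base class, and only the file-size lookup is overridden. The sketch below shows the shape with simplified, hypothetical stand-ins (`WriteStatSketch`, `CreateHandleSketch`, `FsLookupCreateHandle`, `WriterTrackedCreateHandle`); it does not reproduce the real Hudi classes, and how the Flink handle actually obtains the size is an assumption made only for illustration.

```java
import java.util.function.LongSupplier;

// Hypothetical, simplified stand-in for HoodieWriteStat.
final class WriteStatSketch {
  long numWrites;
  long totalWriteBytes;
  long fileSizeInBytes;
}

// Hypothetical base class: the shared bookkeeping is written once here.
abstract class CreateHandleSketch {
  private final long recordsWritten;

  CreateHandleSketch(long recordsWritten) {
    this.recordsWritten = recordsWritten;
  }

  // Template method: identical for every engine except for the size lookup.
  final WriteStatSketch close() {
    WriteStatSketch stat = new WriteStatSketch();
    stat.numWrites = recordsWritten;
    long fileSizeInBytes = computeFileSizeInBytes();
    stat.totalWriteBytes = fileSizeInBytes;
    stat.fileSizeInBytes = fileSizeInBytes;
    return stat;
  }

  // The only engine-specific step.
  protected abstract long computeFileSizeInBytes();
}

// Assumed variant: asks a file-system lookup for the size when closing.
final class FsLookupCreateHandle extends CreateHandleSketch {
  private final LongSupplier fileSystemSize;

  FsLookupCreateHandle(long recordsWritten, LongSupplier fileSystemSize) {
    super(recordsWritten);
    this.fileSystemSize = fileSystemSize;
  }

  @Override
  protected long computeFileSizeInBytes() {
    return fileSystemSize.getAsLong();
  }
}

// Assumed variant: reports a byte count that was tracked while writing.
final class WriterTrackedCreateHandle extends CreateHandleSketch {
  private final long bytesWritten;

  WriterTrackedCreateHandle(long recordsWritten, long bytesWritten) {
    super(recordsWritten);
    this.bytesWritten = bytesWritten;
  }

  @Override
  protected long computeFileSizeInBytes() {
    return bytesWritten;
  }
}
```

Marking `close()` as `final` in the sketch keeps subclasses from re-duplicating the shared setters, which is the duplication the review comment is trying to remove.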