[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

2021-03-31 Thread GitBox


liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r605394316



##
File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##
@@ -247,6 +266,32 @@ private Long delayOffsetCalculation(Option 
lastCheckpointStr, Set partitionInfoList, String topicName, Long timestamp) {

Review comment:
   
   Once the implementation plan is confirmed, I will quickly add tests.
   





-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

2021-03-31 Thread GitBox


liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r605393953



##
File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##
@@ -553,6 +555,11 @@ public DeltaSyncService(Config cfg, JavaSparkContext jssc, 
FileSystem fs, Config
   "'--filter-dupes' needs to be disabled when '--op' is 'UPSERT' to 
ensure updates are not missed.");
 
   this.props = properties.get();
+  String kafkaCheckpointTimestamp = 
props.getString(KafkaOffsetGen.Config.KAFKA_CHECKPOINT_TIMESTAMP, "");

Review comment:
   I think KAFKA_CHECKPOINT_TIMESTAMP is just a way to make it easier for 
users to set the checkpoint.
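
   To make the idea concrete, here is a minimal sketch of how a timestamp could be turned into a starting checkpoint. It is not the code in this PR: the class and method names are hypothetical, the per-partition timestamps are kept in memory and assumed to be sorted in ascending order, offsets are assumed to start at 0, and the `topic,partition:offset,...` checkpoint layout is an assumption as well.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Hudi or of this pull request.
class TimestampCheckpointSketch {

  // Index of the first record whose timestamp is >= target, or the log end
  // (timestamps.length) when every record is older than the target.
  // Assumes the timestamps are sorted in ascending order.
  static long firstOffsetAtOrAfter(long[] timestamps, long target) {
    int lo = 0;
    int hi = timestamps.length;
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (timestamps[mid] < target) {
        lo = mid + 1;
      } else {
        hi = mid;
      }
    }
    return lo;
  }

  // Assumed checkpoint layout: "topic,partition:offset,partition:offset,...".
  static String toCheckpoint(String topic, Map<Integer, long[]> partitionTimestamps, long target) {
    String offsets = new TreeMap<>(partitionTimestamps).entrySet().stream()
        .map(e -> e.getKey() + ":" + firstOffsetAtOrAfter(e.getValue(), target))
        .collect(Collectors.joining(","));
    return topic + "," + offsets;
  }

  public static void main(String[] args) {
    Map<Integer, long[]> partitions = new TreeMap<>();
    partitions.put(0, new long[] {100L, 200L, 300L});
    partitions.put(1, new long[] {150L, 250L});
    System.out.println(toCheckpoint("events", partitions, 200L));
  }
}
```

   With a real broker, the per-partition lookup would be done by the Kafka client rather than by a binary search over an in-memory array; the sketch only shows why a timestamp is a convenient shorthand for a per-partition checkpoint.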








[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

2021-03-31 Thread GitBox


liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r605393270



##
File path: 
hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/HoodieDeltaStreamerWrapper.java
##
@@ -65,7 +65,7 @@ public void scheduleCompact() throws Exception {
 return upsert(WriteOperationType.UPSERT);
   }
 
-  public Pair>> 
fetchSource() throws Exception {
+  public Pair>, Pair> fetchSource() throws Exception {

Review comment:
   Okay, I'll add this class to this PR








[GitHub] [hudi] vingov commented on pull request #2747: [HUDI-1743] Added support for SqlFileBasedTransformer

2021-03-31 Thread GitBox


vingov commented on pull request #2747:
URL: https://github.com/apache/hudi/pull/2747#issuecomment-811654731


   @yanghua - I've fixed the build, can you please merge this code?






[GitHub] [hudi] nsivabalan closed pull request #1929: [HUDI-1160] Support update partial fields for CoW table

2021-03-31 Thread GitBox


nsivabalan closed pull request #1929:
URL: https://github.com/apache/hudi/pull/1929


   






[GitHub] [hudi] nsivabalan commented on pull request #1929: [HUDI-1160] Support update partial fields for CoW table

2021-03-31 Thread GitBox


nsivabalan commented on pull request #1929:
URL: https://github.com/apache/hudi/pull/1929#issuecomment-811627648


   Closing in favor of https://github.com/apache/hudi/pull/2666






[GitHub] [hudi] codecov-io edited a comment on pull request #2754: [HUDI-1751] DeltaStreamer print many unnecessary warn log

2021-03-31 Thread GitBox


codecov-io edited a comment on pull request #2754:
URL: https://github.com/apache/hudi/pull/2754#issuecomment-811625350


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2754?src=pr=h1) Report
   > Merging 
[#2754](https://codecov.io/gh/apache/hudi/pull/2754?src=pr=desc) (dcc1fc4) 
into 
[master](https://codecov.io/gh/apache/hudi/commit/fe16d0de7c76105775c887b700751241bc82624c?el=desc)
 (fe16d0d) will **decrease** coverage by `5.27%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2754/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2754?src=pr=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2754      +/-   ##
   
   - Coverage     52.04%   46.77%   -5.28%
   + Complexity     3625     3301     -324
   
     Files           479      479
     Lines         22804    22806       +2
     Branches       2415     2414       -1
   
   - Hits          11868    10667    -1201
   - Misses         9911    11231    +1320
   + Partials       1025      908     -117
   ```
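
   As a sanity check on the summary above, the coverage percentages can be recomputed as hits divided by lines. The sketch below is only an illustration; the class name is hypothetical, and the inputs are the Hits and Lines values from the report (11868 of 22804 on master, 10667 of 22806 for this pull request).

```java
import java.util.Locale;

// Hypothetical helper for re-checking the numbers in a coverage summary.
class CoverageCheckSketch {

  // Coverage as the percentage of covered lines, rounded to two decimals.
  static String coverage(long hits, long lines) {
    return String.format(Locale.ROOT, "%.2f%%", 100.0 * hits / lines);
  }

  public static void main(String[] args) {
    System.out.println("master: " + coverage(11868, 22804));
    System.out.println("#2754:  " + coverage(10667, 22806));
  }
}
```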
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `37.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | |
   | hudicommon | `50.92% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiflink | `56.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisparkdatasource | `70.87% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisync | `45.47% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | huditimelineservice | `64.36% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiutilities | `9.39% <0.00%> (-60.40%)` | `0.00 <0.00> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2754?src=pr=tree) | Coverage Δ 
| Complexity Δ | |
   |---|---|---|---|
   | 
[...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh)
 | `0.00% <0.00%> (-87.04%)` | `0.00 <0.00> (-19.00)` | |
   | 
[...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | 
[...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | 
[...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | 
[...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | 

[GitHub] [hudi] codecov-io commented on pull request #2754: [HUDI-1751] DeltaStreamer print many unnecessary warn log

2021-03-31 Thread GitBox


codecov-io commented on pull request #2754:
URL: https://github.com/apache/hudi/pull/2754#issuecomment-811625350


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2754?src=pr=h1) Report
   > Merging 
[#2754](https://codecov.io/gh/apache/hudi/pull/2754?src=pr=desc) (dcc1fc4) 
into 
[master](https://codecov.io/gh/apache/hudi/commit/fe16d0de7c76105775c887b700751241bc82624c?el=desc)
 (fe16d0d) will **decrease** coverage by `5.57%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2754/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2754?src=pr=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2754      +/-   ##
   
   - Coverage     52.04%   46.47%   -5.58%
   + Complexity     3625     3111     -514
   
     Files           479      457      -22
     Lines         22804    21198    -1606
     Branches       2415     2259     -156
   
   - Hits          11868     9851    -2017
   - Misses         9911    10514     +603
   + Partials       1025      833     -192
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `37.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | |
   | hudicommon | `50.92% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiflink | `56.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisparkdatasource | `70.87% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `9.39% <0.00%> (-60.40%)` | `0.00 <0.00> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2754?src=pr=tree) | Coverage Δ 
| Complexity Δ | |
   |---|---|---|---|
   | 
[...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh)
 | `0.00% <0.00%> (-87.04%)` | `0.00 <0.00> (-19.00)` | |
   | 
[...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | 
[...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | 
[...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | 
[...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2754/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | 

[GitHub] [hudi] codecov-io edited a comment on pull request #2651: [HUDI-1591] [RFC-26] Improve Hoodie Table Query Performance And Ease Of Use Fo…

2021-03-31 Thread GitBox


codecov-io edited a comment on pull request #2651:
URL: https://github.com/apache/hudi/pull/2651#issuecomment-794945140


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=h1) Report
   > Merging 
[#2651](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=desc) (bb49f77) 
into 
[master](https://codecov.io/gh/apache/hudi/commit/ce3e8ec87083ef4cd4f33de39b6697f66ff3f277?el=desc)
 (ce3e8ec) will **decrease** coverage by `42.38%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2651/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2651      +/-   ##
   
   - Coverage     51.76%    9.38%  -42.39%
   + Complexity     3602       48    -3554
   
     Files           476       54     -422
     Lines         22579     1993   -20586
     Branches       2408      236    -2172
   
   - Hits          11688      187   -11501
   + Misses         9874     1793    -8081
   + Partials       1017       13    -1004
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `9.38% <0.00%> (-60.41%)` | `0.00 <0.00> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree) | Coverage Δ 
| Complexity Δ | |
   |---|---|---|---|
   | 
[...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=)
 | `0.00% <0.00%> (-71.73%)` | `0.00 <0.00> (-56.00)` | |
   | 
[...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | 
[...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | 
[...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | 
[...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | 
[...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | 

[GitHub] [hudi] ssdong commented on issue #2707: [SUPPORT] insert_ovewrite_table failed on archiving

2021-03-31 Thread GitBox


ssdong commented on issue #2707:
URL: https://github.com/apache/hudi/issues/2707#issuecomment-811619487


   @jsbali To give some extra insight and detail, here is what @zherenyu831 
posted at the beginning:
   ```
   [20210323080718__replacecommit__COMPLETED]: size : 0
   [20210323081449__replacecommit__COMPLETED]: size : 1
   [20210323082046__replacecommit__COMPLETED]: size : 1
   [20210323082758__replacecommit__COMPLETED]: size : 1
   [20210323084004__replacecommit__COMPLETED]: size : 1
   [20210323085044__replacecommit__COMPLETED]: size : 1
   [20210323085823__replacecommit__COMPLETED]: size : 1
   [20210323090550__replacecommit__COMPLETED]: size : 1
   [20210323091700__replacecommit__COMPLETED]: size : 1
   ```
   If we keep everything the same and let the archive logic handle everything, 
it fails on the empty (size 0) `partitionToReplaceFileIds` of 
`20210323080718__replacecommit__COMPLETED` (the first item in the list above), 
and this is a known issue. 
   
   To make the archive work, we tried to _manually_ delete that first _empty_ 
commit file, `20210323080718__replacecommit__COMPLETED`. With that, the archive 
succeeded, but the job then failed with `User class threw exception: 
org.apache.hudi.exception.HoodieIOException: Could not read commit details from 
s3://xxx/data/.hoodie/20210323081449.replacecommit` (the second item in the 
list above).
   
   Now, to reason through the underlying mechanism of this error: the archive 
was successful, which means a few commit files have been placed in the 
`.archive` folder. Let's say 
   ```
   [20210323081449__replacecommit__COMPLETED]: size : 1
   [20210323082046__replacecommit__COMPLETED]: size : 1
   [20210323082758__replacecommit__COMPLETED]: size : 1
   [20210323084004__replacecommit__COMPLETED]: size : 1
   [20210323085044__replacecommit__COMPLETED]: size : 1
   ```
   have been successfully moved and placed in `.archive`. At this moment, the 
timeline has been updated and 3 commit files remain:
   ```
   [20210323085823__replacecommit__COMPLETED]: size : 1
   [20210323090550__replacecommit__COMPLETED]: size : 1
   [20210323091700__replacecommit__COMPLETED]: size : 1
   ```
   
   Now, look at the stack trace behind `User class threw exception: 
org.apache.hudi.exception.HoodieIOException: Could not read commit details from 
s3://xxx/data/.hoodie/20210323081449.replacecommit`, which I am pasting again:
   ```
   User class threw exception: org.apache.hudi.exception.HoodieIOException: Could not read commit details from s3://xxx/data/.hoodie/20210323081449.replacecommit
   at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.readDataFromPath(HoodieActiveTimeline.java:530)
   at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.getInstantDetails(HoodieActiveTimeline.java:194)
   at org.apache.hudi.common.table.view.AbstractTableFileSystemView.lambda$resetFileGroupsReplaced$8(AbstractTableFileSystemView.java:217)
   at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:269)
   at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
   at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
   at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
   at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
   at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
   at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
   at org.apache.hudi.common.table.view.AbstractTableFileSystemView.resetFileGroupsReplaced(AbstractTableFileSystemView.java:228)
   at org.apache.hudi.common.table.view.AbstractTableFileSystemView.init(AbstractTableFileSystemView.java:106)
   at org.apache.hudi.common.table.view.HoodieTableFileSystemView.init(HoodieTableFileSystemView.java:106)
   at org.apache.hudi.common.table.view.AbstractTableFileSystemView.reset(AbstractTableFileSystemView.java:248)
   at org.apache.hudi.common.table.view.HoodieTableFileSystemView.close(HoodieTableFileSystemView.java:353)
   at java.util.concurrent.ConcurrentHashMap$ValuesView.forEach(ConcurrentHashMap.java:4707)
   at org.apache.hudi.common.table.view.FileSystemViewManager.close(FileSystemViewManager.java:118)
   at org.apache.hudi.timeline.service.TimelineService.close(TimelineService.java:179)
   at org.apache.hudi.client.embedded.EmbeddedTimelineService.stop(EmbeddedTimelineService.java:112)
   ```
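
   One way to picture the failure in the trace above is the toy model below. It is only a sketch under an assumption that this thread does not confirm at this point: that the view re-reads instant details from a list captured before archiving ran. None of the class or method names are Hudi APIs; they are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model; none of these names are Hudi classes.
class StaleTimelineSketch {

  // Instant files still present in the active .hoodie folder.
  final Set<String> activeFiles = new TreeSet<>();

  // Instant list captured by a view before archiving ran (assumption).
  final List<String> cachedInstants = new ArrayList<>();

  void commit(String instant) {
    activeFiles.add(instant);
  }

  void snapshotView() {
    cachedInstants.clear();
    cachedInstants.addAll(activeFiles);
  }

  // Move every instant older than keepFrom out of the active folder.
  void archiveBefore(String keepFrom) {
    activeFiles.removeIf(instant -> instant.compareTo(keepFrom) < 0);
  }

  // Re-read details for each cached instant, as a reset during close would.
  void resetView() {
    for (String instant : cachedInstants) {
      if (!activeFiles.contains(instant)) {
        throw new IllegalStateException("Could not read commit details from " + instant);
      }
    }
  }

  public static void main(String[] args) {
    StaleTimelineSketch table = new StaleTimelineSketch();
    String[] instants = {"20210323081449", "20210323085044", "20210323085823", "20210323091700"};
    for (String ts : instants) {
      table.commit(ts);
    }
    table.snapshotView();
    table.archiveBefore("20210323085823");
    try {
      table.resetView();
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

   In this model the reset fails on the oldest archived instant, because the cached list still names a file that the archive step has already moved away.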
   
   After a `close` action is triggered on `TimelineService`, which is 
understandable, it propagates to `HoodieTableFileSystemView.close`, and there 
we have:
   ```
   at org.apache.hudi.common.table.view.AbstractTableFileSystemView.init(AbstractTableFileSystemView.java:106)
   at org.apache.hudi.common.table.view.HoodieTableFileSystemView.init(HoodieTableFileSystemView.java:106)
   at 

[GitHub] [hudi] tooptoop4 commented on issue #2609: [SUPPORT] Presto hudi query slow when compared to parquet

2021-03-31 Thread GitBox


tooptoop4 commented on issue #2609:
URL: https://github.com/apache/hudi/issues/2609#issuecomment-811614768


   @rshanmugam1 see https://prestodb.io/blog/2020/08/04/prestodb-and-hudi   and 
https://github.com/prestodb/presto/commit/9fd2459d98efd0809023b175ba53775466b74cc6
 and 
https://github.com/prestodb/presto/commit/cfb2e7aa077954a02c048e81c97a47994d329852






[jira] [Updated] (HUDI-1751) DeltaStream print many unnecessary warn log

2021-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-1751:
-
Labels: pull-request-available  (was: )

> DeltaStream print many unnecessary warn log
> ---
>
> Key: HUDI-1751
> URL: https://issues.apache.org/jira/browse/HUDI-1751
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: lrz
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Because we put both Kafka parameters and Hudi configs in the same properties 
> file, such as kafka-source.properties, the Hudi configs are also added when 
> the kafkaParams object is created, which leads to the following warn logs:
> !https://wa.vision.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/2/15/qwx352829/76572ba9f4094fb29b018db91fbf1450/image.png!
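
One possible way to avoid handing Hudi settings to the Kafka client is to drop them before building the Kafka parameter map. The sketch below is an illustration only, not the change made in the linked pull request; the class and method names are hypothetical, and it assumes every Hudi setting in the shared file starts with the `hoodie.` prefix.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not the change made in the linked pull request.
class KafkaParamsFilterSketch {

  // Assumption: every Hudi setting in the shared file starts with this prefix.
  static final String HOODIE_PREFIX = "hoodie.";

  // Copy only the non-Hudi entries into the map passed to the Kafka client.
  static Map<String, Object> kafkaParams(Properties props) {
    Map<String, Object> params = new HashMap<>();
    for (String key : props.stringPropertyNames()) {
      if (!key.startsWith(HOODIE_PREFIX)) {
        params.put(key, props.getProperty(key));
      }
    }
    return params;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("bootstrap.servers", "localhost:9092");
    props.setProperty("auto.offset.reset", "earliest");
    props.setProperty("hoodie.deltastreamer.source.kafka.topic", "events");
    System.out.println(kafkaParams(props).keySet());
  }
}
```

A prefix filter keeps the single properties file convenient for users while the Kafka client only sees keys it is expected to know.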



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [hudi] codecov-io edited a comment on pull request #2651: [HUDI-1591] [RFC-26] Improve Hoodie Table Query Performance And Ease Of Use Fo…

2021-03-31 Thread GitBox


codecov-io edited a comment on pull request #2651:
URL: https://github.com/apache/hudi/pull/2651#issuecomment-794945140


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=h1) Report
   > Merging 
[#2651](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=desc) (201e4ff) 
into 
[master](https://codecov.io/gh/apache/hudi/commit/ce3e8ec87083ef4cd4f33de39b6697f66ff3f277?el=desc)
 (ce3e8ec) will **increase** coverage by `0.57%`.
   > The diff coverage is `70.90%`.
   
   > :exclamation: Current head 201e4ff differs from pull request most recent 
head 1735f33. Consider uploading reports for the commit 1735f33 to get more 
accurate results
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2651/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2651      +/-   ##
   
   + Coverage     51.76%   52.33%   +0.57%
   + Complexity     3602     3477     -125
   
     Files           476      460      -16
     Lines         22579    21425    -1154
     Branches       2408     2303     -105
   
   - Hits          11688    11213     -475
   + Misses         9874     9224     -650
   + Partials       1017      988      -29
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `37.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | |
   | hudicommon | `50.87% <0.00%> (-0.06%)` | `0.00 <0.00> (ø)` | |
   | hudiflink | `56.01% <ø> (+1.73%)` | `0.00 <ø> (ø)` | |
   | hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisparkdatasource | `71.29% <75.09%> (+0.41%)` | `0.00 <26.00> (ø)` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `69.74% <50.00%> (-0.04%)` | `0.00 <0.00> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree) | Coverage Δ 
| Complexity Δ | |
   |---|---|---|---|
   | 
[...c/main/java/org/apache/hudi/common/fs/FSUtils.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL0ZTVXRpbHMuamF2YQ==)
 | `47.34% <0.00%> (-0.94%)` | `57.00 <0.00> (ø)` | |
   | 
[...rg/apache/hudi/common/table/HoodieTableConfig.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL0hvb2RpZVRhYmxlQ29uZmlnLmphdmE=)
 | `43.20% <0.00%> (-2.25%)` | `17.00 <0.00> (ø)` | |
   | 
[...pache/hudi/common/table/HoodieTableMetaClient.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL0hvb2RpZVRhYmxlTWV0YUNsaWVudC5qYXZh)
 | `66.66% <0.00%> (-1.65%)` | `43.00 <0.00> (ø)` | |
   | 
[...ecution/datasources/Spark2ParsePartitionUtil.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmsyL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvc3Bhcmsvc3FsL2V4ZWN1dGlvbi9kYXRhc291cmNlcy9TcGFyazJQYXJzZVBhcnRpdGlvblV0aWwuc2NhbGE=)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | |
   | 
[...ecution/datasources/Spark3ParsePartitionUtil.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvc3Bhcmsvc3FsL2V4ZWN1dGlvbi9kYXRhc291cmNlcy9TcGFyazNQYXJzZVBhcnRpdGlvblV0aWwuc2NhbGE=)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | |
   | 
[.../main/scala/org/apache/hudi/HoodieSparkUtils.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVNwYXJrVXRpbHMuc2NhbGE=)
 | `83.33% <33.33%> (-5.56%)` | `0.00 <0.00> (ø)` | |
   | 
[...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=)
 | `71.42% <50.00%> (-0.30%)` | `56.00 <0.00> (ø)` | |
   | 
[...main/scala/org/apache/hudi/HoodieWriterUtils.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVdyaXRlclV0aWxzLnNjYWxh)
 | `81.63% <64.28%> (-6.94%)` | `0.00 <0.00> (ø)` | |
   | 

[GitHub] [hudi] codecov-io edited a comment on pull request #2651: [HUDI-1591] [RFC-26] Improve Hoodie Table Query Performance And Ease Of Use Fo…

2021-03-31 Thread GitBox


codecov-io edited a comment on pull request #2651:
URL: https://github.com/apache/hudi/pull/2651#issuecomment-794945140


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=h1) Report
   > Merging 
[#2651](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=desc) (201e4ff) 
into 
[master](https://codecov.io/gh/apache/hudi/commit/ce3e8ec87083ef4cd4f33de39b6697f66ff3f277?el=desc)
 (ce3e8ec) will **increase** coverage by `18.71%`.
   > The diff coverage is `74.71%`.
   
   > :exclamation: Current head 201e4ff differs from pull request most recent 
head 1735f33. Consider uploading reports for the commit 1735f33 to get more 
accurate results
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2651/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree)
   
   ```diff
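   @@              Coverage Diff              @@
   ##             master    #2651       +/-   ##
   =============================================
   + Coverage     51.76%   70.47%   +18.71%
   + Complexity     3602      609     -2993
   =============================================
     Files           476       93      -383
     Lines         22579     3780    -18799
     Branches       2408      481     -1927
   =============================================
   - Hits          11688     2664     -9024
   + Misses         9874      831     -9043
   + Partials       1017      285      -732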
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `71.29% <75.09%> (+0.41%)` | `0.00 <26.00> (ø)` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `69.74% <50.00%> (-0.04%)` | `0.00 <0.00> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree) | Coverage Δ 
| Complexity Δ | |
   |---|---|---|---|
   | 
[...ecution/datasources/Spark2ParsePartitionUtil.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmsyL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvc3Bhcmsvc3FsL2V4ZWN1dGlvbi9kYXRhc291cmNlcy9TcGFyazJQYXJzZVBhcnRpdGlvblV0aWwuc2NhbGE=)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | |
   | 
[...ecution/datasources/Spark3ParsePartitionUtil.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvc3Bhcmsvc3FsL2V4ZWN1dGlvbi9kYXRhc291cmNlcy9TcGFyazNQYXJzZVBhcnRpdGlvblV0aWwuc2NhbGE=)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | |
   | 
[.../main/scala/org/apache/hudi/HoodieSparkUtils.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVNwYXJrVXRpbHMuc2NhbGE=)
 | `83.33% <33.33%> (-5.56%)` | `0.00 <0.00> (ø)` | |
   | 
[...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=)
 | `71.42% <50.00%> (-0.30%)` | `56.00 <0.00> (ø)` | |
   | 
[...main/scala/org/apache/hudi/HoodieWriterUtils.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVdyaXRlclV0aWxzLnNjYWxh)
 | `81.63% <64.28%> (-6.94%)` | `0.00 <0.00> (ø)` | |
   | 
[...src/main/scala/org/apache/hudi/DefaultSource.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0RlZmF1bHRTb3VyY2Uuc2NhbGE=)
 | `78.78% <67.50%> (-5.36%)` | `31.00 <0.00> (+14.00)` | :arrow_down: |
   | 
[...c/main/scala/org/apache/hudi/HoodieFileIndex.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZUZpbGVJbmRleC5zY2FsYQ==)
 | `79.08% <79.08%> (ø)` | `24.00 <24.00> (?)` | |
   | 
[.../org/apache/hudi/MergeOnReadSnapshotRelation.scala](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL01lcmdlT25SZWFkU25hcHNob3RSZWxhdGlvbi5zY2FsYQ==)
 | `90.00% <88.00%> (+0.86%)` | `18.00 <1.00> (+1.00)` | |
   | 

[GitHub] [hudi] li36909 opened a new pull request #2754: [HUDI-1751] DeltaStreamer print many unnecessary warn log

2021-03-31 Thread GitBox


li36909 opened a new pull request #2754:
URL: https://github.com/apache/hudi/pull/2754


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a 
pull request.*
   
   ## What is the purpose of the pull request
   
   Optimize log printing in DeltaStreamer.
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   Run the DeltaStreamer tests and check the log.
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] codecov-io commented on pull request #2752: [HUDI-1749] Clean/Compaction/Rollback command maybe never exit when operation fail

2021-03-31 Thread GitBox


codecov-io commented on pull request #2752:
URL: https://github.com/apache/hudi/pull/2752#issuecomment-811610689


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2752?src=pr=h1) Report
   > Merging 
[#2752](https://codecov.io/gh/apache/hudi/pull/2752?src=pr=desc) (2152760) 
into 
[master](https://codecov.io/gh/apache/hudi/commit/fe16d0de7c76105775c887b700751241bc82624c?el=desc)
 (fe16d0d) will **decrease** coverage by `0.00%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2752/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2752?src=pr=tree)
   
   ```diff
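   @@             Coverage Diff              @@
   ##             master    #2752      +/-   ##
   ============================================
   - Coverage     52.04%   52.03%   -0.01%
   + Complexity     3625     3624       -1
   ============================================
     Files           479      479
     Lines         22804    22808       +4
     Branches       2415     2415
   ============================================
   + Hits          11868    11869       +1
   - Misses         9911     9914       +3
     Partials       1025     1025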
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `36.94% <0.00%> (-0.07%)` | `0.00 <0.00> (ø)` | |
   | hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | |
   | hudicommon | `50.94% <ø> (+0.01%)` | `0.00 <ø> (ø)` | |
   | hudiflink | `56.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisparkdatasource | `70.87% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisync | `45.47% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | huditimelineservice | `64.36% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiutilities | `69.73% <ø> (-0.06%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2752?src=pr=tree) | Coverage Δ 
| Complexity Δ | |
   |---|---|---|---|
   | 
[...n/java/org/apache/hudi/cli/commands/SparkMain.java](https://codecov.io/gh/apache/hudi/pull/2752/diff?src=pr=tree#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL2NvbW1hbmRzL1NwYXJrTWFpbi5qYXZh)
 | `6.72% <0.00%> (-0.12%)` | `4.00 <0.00> (ø)` | |
   | 
[...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2752/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=)
 | `71.37% <0.00%> (-0.35%)` | `55.00% <0.00%> (-1.00%)` | |
   | 
[...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2752/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==)
 | `79.68% <0.00%> (+1.56%)` | `26.00% <0.00%> (ø%)` | |
   






[jira] [Created] (HUDI-1751) DeltaStream print many unnecessary warn log

2021-03-31 Thread lrz (Jira)
lrz created HUDI-1751:
-

 Summary: DeltaStream print many unnecessary warn log
 Key: HUDI-1751
 URL: https://issues.apache.org/jira/browse/HUDI-1751
 Project: Apache Hudi
  Issue Type: Improvement
Reporter: lrz
 Fix For: 0.9.0


Because we put both Kafka parameters and Hudi configs in the same properties 
file, such as kafka-source.properties, some hoodie configs are also added when 
the kafkaParams object is created, which leads to warning logs being printed:

!https://wa.vision.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/2/15/qwx352829/76572ba9f4094fb29b018db91fbf1450/image.png!
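
One way to avoid passing Hudi settings to the Kafka client is to filter the shared properties by key prefix before building the Kafka parameters. The following is a minimal sketch under stated assumptions, not the change made in the pull request: the class `KafkaParamsFilter` and the method `toKafkaParams` are hypothetical names, and the sketch assumes every Hudi setting uses the `hoodie.` key prefix.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class KafkaParamsFilter {
    // Assumption: Hudi's own settings share the "hoodie." key prefix.
    public static final String HOODIE_PREFIX = "hoodie.";

    /** Copies every property except those whose key starts with the Hudi prefix. */
    public static Map<String, Object> toKafkaParams(Properties props) {
        Map<String, Object> kafkaParams = new HashMap<>();
        for (String key : props.stringPropertyNames()) {
            if (!key.startsWith(HOODIE_PREFIX)) {
                kafkaParams.put(key, props.getProperty(key));
            }
        }
        return kafkaParams;
    }

    public static void main(String[] args) {
        // Mixed file contents, as in a shared kafka-source.properties.
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("auto.offset.reset", "earliest");
        props.setProperty("hoodie.datasource.write.recordkey.field", "id");
        // Only the two Kafka keys remain in the printed key set.
        System.out.println(toKafkaParams(props).keySet());
    }
}
```

Filtering by prefix keeps a single properties file usable for both consumers of the configuration, at the cost of relying on the naming convention.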



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [hudi] codecov-io commented on pull request #2753: [HUDI-1750] Fail to load user's class if user move hudi-spark-bundle jar into spark classpath

2021-03-31 Thread GitBox


codecov-io commented on pull request #2753:
URL: https://github.com/apache/hudi/pull/2753#issuecomment-811608059


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2753?src=pr=h1) Report
   > Merging 
[#2753](https://codecov.io/gh/apache/hudi/pull/2753?src=pr=desc) (43eb4f1) 
into 
[master](https://codecov.io/gh/apache/hudi/commit/fe16d0de7c76105775c887b700751241bc82624c?el=desc)
 (fe16d0d) will **decrease** coverage by `42.64%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2753/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2753?src=pr=tree)
   
   ```diff
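   @@             Coverage Diff              @@
   ##             master    #2753       +/-  ##
   ============================================
   - Coverage     52.04%    9.40%   -42.65%
   + Complexity     3625       48     -3577
   ============================================
     Files           479       54      -425
     Lines         22804     1989    -20815
     Branches       2415      236     -2179
   ============================================
   - Hits          11868      187    -11681
   + Misses         9911     1789     -8122
   + Partials       1025       13     -1012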
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `9.40% <ø> (-60.39%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2753?src=pr=tree) | Coverage Δ 
| Complexity Δ | |
   |---|---|---|---|
   | 
[...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | 
[...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | 
[...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | 
[...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | 
[...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
   | 
[...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2753/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=)
 | `0.00% <0.00%> 

[jira] [Updated] (HUDI-1750) Fail to load user's class if user move hudi-spark-bundle_2.11-0.7.0.jar into spark classpath

2021-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-1750:
-
Labels: pull-request-available  (was: )

> Fail to load user's class if user move hudi-spark-bundle_2.11-0.7.0.jar into 
> spark classpath
> 
>
> Key: HUDI-1750
> URL: https://issues.apache.org/jira/browse/HUDI-1750
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: lrz
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
> Attachments: image-2021-04-01-10-55-43-760.png
>
>
> Hudi uses Class.forName(clazzName) to load the user's class, so the class 
> loader is the same as the caller's; see here:
> !image-2021-04-01-10-55-43-760.png!
> If the user moves the hudi-spark-bundle jar into the Spark classpath and uses 
> --jars to add custom jars, the caller's class loader will be AppClassLoader 
> while the custom jars are loaded by Spark's MutableURLClassLoader, which leads 
> to a ClassNotFoundException.





[GitHub] [hudi] li36909 opened a new pull request #2753: [HUDI-1750] Fail to load user's class if user move hudi-spark-bundle jar into spark classpath

2021-03-31 Thread GitBox


li36909 opened a new pull request #2753:
URL: https://github.com/apache/hudi/pull/2753


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a 
pull request.*
   
   ## What is the purpose of the pull request
   
   fix classloader bug
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   Verify the fix by manually moving the hudi-spark-bundle jar into the Spark 
jars directory and running the tests.
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.






[GitHub] [hudi] garyli1019 commented on a change in pull request #2721: [HUDI-1720] when query incr view of mor table which has many delete records use sparksql/hive-beeline, StackOverflowError

2021-03-31 Thread GitBox


garyli1019 commented on a change in pull request #2721:
URL: https://github.com/apache/hudi/pull/2721#discussion_r605342627



##
File path: 
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/RealtimeCompactedRecordReader.java
##
@@ -95,15 +103,24 @@ public boolean next(NullWritable aVoid, ArrayWritable 
arrayWritable) throws IOEx
 // TODO(NA): Invoke preCombine here by converting arrayWritable to 
Avro. This is required since the
 // deltaRecord may not be a full record and needs values of columns 
from the parquet
 Option rec;
-if (usesCustomPayload) {
-  rec = 
deltaRecordMap.get(key).getData().getInsertValue(getWriterSchema());
-} else {
-  rec = 
deltaRecordMap.get(key).getData().getInsertValue(getReaderSchema());
+rec = buildGenericRecordwithCustomPayload(deltaRecordMap.get(key));
+// If the record is not present, this is a delete record using an 
empty payload so skip this base record
+// and move to the next record
+while (!rec.isPresent()) {
+  // if current parquet reader has no record, return false
+  if (!this.parquetReader.next(aVoid, arrayWritable)) {

Review comment:
   ok, I got confused by Spark Record Reader Iterator with this. There is 
no problem here.
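
The loop in the diff above skips base records whose delta payload is empty (a delete) and advances until a live record is found or the reader is exhausted. The following is a simplified, self-contained sketch of that pattern using plain Java collections; it is not the Hudi reader. The class `SkipDeletedRecords`, the method `nextLive`, and the use of `Optional.empty()` as a delete marker are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Optional;

public class SkipDeletedRecords {
    /**
     * Returns the next live value, or null when the base iterator is exhausted.
     * A key mapped to Optional.empty() in deltaRecords marks a delete.
     */
    public static String nextLive(Iterator<String> baseKeys,
                                  Map<String, Optional<String>> deltaRecords) {
        while (baseKeys.hasNext()) {
            String key = baseKeys.next();
            Optional<String> delta = deltaRecords.get(key);
            if (delta == null) {
                return key;          // no delta entry: keep the base record
            }
            if (delta.isPresent()) {
                return delta.get();  // updated record: return the delta value
            }
            // Empty payload: the record was deleted, so move to the next base record.
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Optional<String>> delta = new HashMap<>();
        delta.put("b", Optional.empty());
        delta.put("c", Optional.empty());
        delta.put("d", Optional.of("d-updated"));
        Iterator<String> base = Arrays.asList("a", "b", "c", "d").iterator();
        String value;
        while ((value = nextLive(base, delta)) != null) {
            System.out.println(value);
        }
    }
}
```

Because the skip is a loop rather than a recursive call, the stack depth stays constant no matter how many consecutive deletes appear.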








[jira] [Created] (HUDI-1750) Fail to load user's class if user move hudi-spark-bundle_2.11-0.7.0.jar into spark classpath

2021-03-31 Thread lrz (Jira)
lrz created HUDI-1750:
-

 Summary: Fail to load user's class if user move 
hudi-spark-bundle_2.11-0.7.0.jar into spark classpath
 Key: HUDI-1750
 URL: https://issues.apache.org/jira/browse/HUDI-1750
 Project: Apache Hudi
  Issue Type: Bug
Reporter: lrz
 Fix For: 0.9.0
 Attachments: image-2021-04-01-10-55-43-760.png

Hudi uses Class.forName(clazzName) to load the user's class, so the class 
loader is the same as the caller's; see here:

!image-2021-04-01-10-55-43-760.png!

If the user moves the hudi-spark-bundle jar into the Spark classpath and uses 
--jars to add custom jars, the caller's class loader will be AppClassLoader 
while the custom jars are loaded by Spark's MutableURLClassLoader, which leads 
to a ClassNotFoundException.
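
A common way around this kind of mismatch is to try the thread context class loader before falling back to the loader of the calling class. The following is a minimal sketch, not necessarily what the fix for this issue does: the class `UserClassLoading` and the method `loadUserClass` are hypothetical, and the sketch assumes Spark installs the loader that holds the `--jars` entries as the thread context class loader.

```java
public class UserClassLoading {
    /**
     * Tries the thread context class loader first, then falls back to the
     * loader that defined this class.
     */
    public static Class<?> loadUserClass(String className) throws ClassNotFoundException {
        ClassLoader contextLoader = Thread.currentThread().getContextClassLoader();
        if (contextLoader != null) {
            try {
                return Class.forName(className, true, contextLoader);
            } catch (ClassNotFoundException e) {
                // Fall through to the defining loader below.
            }
        }
        return Class.forName(className, true, UserClassLoading.class.getClassLoader());
    }

    public static void main(String[] args) throws Exception {
        Class<?> clazz = loadUserClass("java.util.ArrayList");
        System.out.println(clazz.getName()); // prints "java.util.ArrayList"
    }
}
```

The one-argument `Class.forName(name)` always resolves against the caller's loader, which is why the location of the bundle jar changes the outcome described above.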





[GitHub] [hudi] li36909 commented on pull request #2752: [HUDI-1749] Clean/Compaction/Rollback command maybe never exit when operation fail

2021-03-31 Thread GitBox


li36909 commented on pull request #2752:
URL: https://github.com/apache/hudi/pull/2752#issuecomment-811596187


   cc @nsivabalan could you help to take a look, thank you






[jira] [Updated] (HUDI-1749) Clean/Compaction/Rollback command maybe never exit when operation fail

2021-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-1749:
-
Labels: pull-request-available  (was: )

> Clean/Compaction/Rollback command maybe never exit when operation fail
> --
>
> Key: HUDI-1749
> URL: https://issues.apache.org/jira/browse/HUDI-1749
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: lrz
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> There are two issues:
> 1) After a Clean/Compaction/Rollback command finishes, the YARN application 
> always shows as failed because the command exits directly without waiting for 
> the SparkContext to stop.
> 2) When a Clean/Compaction/Rollback command fails because of an exception, 
> the command never exits because the SparkContext was not stopped. This is 
> because the Spark UI uses Jetty, which introduces a non-daemon thread, and 
> sparkContext.stop calls stopUI to stop that thread.





[GitHub] [hudi] li36909 opened a new pull request #2752: [HUDI-1749] Clean/Compaction/Rollback command maybe never exit when operation fail

2021-03-31 Thread GitBox


li36909 opened a new pull request #2752:
URL: https://github.com/apache/hudi/pull/2752


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a 
pull request.*
   
   ## What is the purpose of the pull request
   
   Fix hung bug for clean/compaction/rollback command when operation fail.
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   verify the fix manually
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.






[GitHub] [hudi] pengzhiwei2018 commented on a change in pull request #2651: [HUDI-1591] [RFC-26] Improve Hoodie Table Query Performance And Ease Of Use Fo…

2021-03-31 Thread GitBox


pengzhiwei2018 commented on a change in pull request #2651:
URL: https://github.com/apache/hudi/pull/2651#discussion_r605337987



##
File path: 
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieWriterUtils.scala
##
@@ -81,4 +82,53 @@ object HoodieWriterUtils {
 params.foreach(kv => props.setProperty(kv._1, kv._2))
 props
   }
+
+  /**
+   * Get the partition columns to stored to hoodie.properties.
+   * @param parameters
+   * @return
+   */
+  def getPartitionColumns(parameters: Map[String, String]): Option[String] = {
+val  keyGenClass = parameters.getOrElse(KEYGENERATOR_CLASS_OPT_KEY,
+  DEFAULT_KEYGENERATOR_CLASS_OPT_VAL)
+try {
+  val constructor = getClass.getClassLoader.loadClass(keyGenClass)
+.getConstructor(classOf[TypedProperties])
+  constructor.setAccessible(true)
+  val props = new TypedProperties()
+  props.putAll(parameters.asJava)
+  val keyGen = constructor.newInstance(props)

Review comment:
   Reuse the KeyGenerator except the bootstrap method.








[jira] [Created] (HUDI-1749) Clean/Compaction/Rollback command maybe never exit when operation fail

2021-03-31 Thread lrz (Jira)
lrz created HUDI-1749:
-

 Summary: Clean/Compaction/Rollback command maybe never exit when 
operation fail
 Key: HUDI-1749
 URL: https://issues.apache.org/jira/browse/HUDI-1749
 Project: Apache Hudi
  Issue Type: Bug
Reporter: lrz


There are two issues:

1) After a Clean/Compaction/Rollback command finishes, the YARN application 
always shows as failed because the command exits directly without waiting for 
the SparkContext to stop.

2) When a Clean/Compaction/Rollback command fails because of an exception, the 
command never exits because the SparkContext was not stopped. This is because 
the Spark UI uses Jetty, which introduces a non-daemon thread, and 
sparkContext.stop calls stopUI to stop that thread.
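
Both issues come down to stopping the context on every exit path and only then reporting the exit code. The following is a minimal sketch without any Spark dependency, not the actual fix: an `ExecutorService` (whose default threads are non-daemon) stands in for the SparkContext, and the names `StopContextOnFailure` and `runCommand` are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class StopContextOnFailure {
    /** Runs the operation and always shuts the pool down; returns 0 on success, 1 on failure. */
    public static int runCommand(ExecutorService context, Runnable operation) {
        int exitCode = 0;
        try {
            operation.run();
        } catch (RuntimeException e) {
            exitCode = 1;
        } finally {
            // Stand-in for sparkContext.stop(): release the non-daemon threads.
            context.shutdown();
            try {
                context.awaitTermination(10, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return exitCode;
    }

    public static void main(String[] args) {
        // Default executor threads are non-daemon, like the UI thread described above.
        ExecutorService context = Executors.newSingleThreadExecutor();
        context.submit(() -> { }); // start the worker thread
        int exitCode = runCommand(context, () -> {
            throw new IllegalStateException("simulated compaction failure");
        });
        // Exit only after the context has been stopped.
        System.exit(exitCode);
    }
}
```

Without the `finally` block the worker thread would keep the JVM alive after the failure, which mirrors the hang described in the second issue.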





[jira] [Updated] (HUDI-1749) Clean/Compaction/Rollback command maybe never exit when operation fail

2021-03-31 Thread lrz (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lrz updated HUDI-1749:
--
Fix Version/s: 0.9.0

> Clean/Compaction/Rollback command maybe never exit when operation fail
> --
>
> Key: HUDI-1749
> URL: https://issues.apache.org/jira/browse/HUDI-1749
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: lrz
>Priority: Major
> Fix For: 0.9.0
>
>
> There are two issues:
> 1) After a Clean/Compaction/Rollback command finishes, the YARN application 
> always shows as failed because the command exits directly without waiting for 
> the SparkContext to stop.
> 2) When a Clean/Compaction/Rollback command fails because of an exception, 
> the command never exits because the SparkContext was not stopped. This is 
> because the Spark UI uses Jetty, which introduces a non-daemon thread, and 
> sparkContext.stop calls stopUI to stop that thread.





[GitHub] [hudi] codecov-io commented on pull request #2751: [HUDI-1748] Read operation will possiblity fail on mor table rt view when a write operations is concurrency running

2021-03-31 Thread GitBox


codecov-io commented on pull request #2751:
URL: https://github.com/apache/hudi/pull/2751#issuecomment-811590982


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2751?src=pr=h1) Report
   > Merging 
[#2751](https://codecov.io/gh/apache/hudi/pull/2751?src=pr=desc) (2de7140) 
into 
[master](https://codecov.io/gh/apache/hudi/commit/fe16d0de7c76105775c887b700751241bc82624c?el=desc)
 (fe16d0d) will **increase** coverage by `17.69%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2751/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2751?src=pr=tree)
   
   ```diff
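   @@              Coverage Diff              @@
   ##             master    #2751       +/-   ##
   =============================================
   + Coverage     52.04%   69.73%   +17.69%
   + Complexity     3625      371     -3254
   =============================================
     Files           479       54      -425
     Lines         22804     1989    -20815
     Branches       2415      236     -2179
   =============================================
   - Hits          11868     1387    -10481
   + Misses         9911      471     -9440
   + Partials       1025      131      -894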
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `69.73% <ø> (-0.06%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2751?src=pr=tree) | Coverage Δ 
| Complexity Δ | |
   |---|---|---|---|
   | 
[...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=)
 | `71.37% <0.00%> (-0.35%)` | `55.00% <0.00%> (-1.00%)` | |
   | 
[...sioning/clean/CleanMetadataV2MigrationHandler.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL3ZlcnNpb25pbmcvY2xlYW4vQ2xlYW5NZXRhZGF0YVYyTWlncmF0aW9uSGFuZGxlci5qYXZh)
 | | | |
   | 
[...org/apache/hudi/common/table/log/AppendResult.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9BcHBlbmRSZXN1bHQuamF2YQ==)
 | | | |
   | 
[...oning/compaction/CompactionV2MigrationHandler.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL3ZlcnNpb25pbmcvY29tcGFjdGlvbi9Db21wYWN0aW9uVjJNaWdyYXRpb25IYW5kbGVyLmphdmE=)
 | | | |
   | 
[...on/table/timeline/versioning/MetadataMigrator.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL3ZlcnNpb25pbmcvTWV0YWRhdGFNaWdyYXRvci5qYXZh)
 | | | |
   | 
[...e/hudi/common/table/timeline/dto/FileGroupDTO.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL2R0by9GaWxlR3JvdXBEVE8uamF2YQ==)
 | | | |
   | 
[...di/sink/partitioner/delta/DeltaBucketAssigner.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL2RlbHRhL0RlbHRhQnVja2V0QXNzaWduZXIuamF2YQ==)
 | | | |
   | 
[...che/hudi/metadata/TimelineMergedTableMetadata.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvVGltZWxpbmVNZXJnZWRUYWJsZU1ldGFkYXRhLmphdmE=)
 | | | |
   | 
[...pache/hudi/io/storage/HoodieFileReaderFactory.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW8vc3RvcmFnZS9Ib29kaWVGaWxlUmVhZGVyRmFjdG9yeS5qYXZh)
 | | | |
   | 
[...oning/compaction/CompactionV1MigrationHandler.java](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL3ZlcnNpb25pbmcvY29tcGFjdGlvbi9Db21wYWN0aW9uVjFNaWdyYXRpb25IYW5kbGVyLmphdmE=)
 | | | |
   | ... and [415 
more](https://codecov.io/gh/apache/hudi/pull/2751/diff?src=pr=tree-more) | |
   



[GitHub] [hudi] garyli1019 commented on pull request #2722: [HUDI-1722]hive beeline/spark-sql query specified field on mor table occur NPE

2021-03-31 Thread GitBox


garyli1019 commented on pull request #2722:
URL: https://github.com/apache/hudi/pull/2722#issuecomment-811589593


   @xiarixiaoyao Thanks for your contribution. It looks like you were able to 
reproduce this problem in a unit test. Is it possible to add that unit test 
to this PR as well?






[GitHub] [hudi] codecov-io edited a comment on pull request #2325: [HUDI-699]Fix CompactionCommand and add unit test for CompactionCommand

2021-03-31 Thread GitBox


codecov-io edited a comment on pull request #2325:
URL: https://github.com/apache/hudi/pull/2325#issuecomment-742860619


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2325?src=pr=h1) Report
   > Merging 
[#2325](https://codecov.io/gh/apache/hudi/pull/2325?src=pr=desc) (24790dc) 
into 
[master](https://codecov.io/gh/apache/hudi/commit/aa0da72c59cb3764205f90b025b24d1640727795?el=desc)
 (aa0da72) will **decrease** coverage by `42.65%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2325/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2325?src=pr=tree)
   
   ```diff
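   @@             Coverage Diff              @@
   ##             master    #2325       +/-  ##
   ============================================
   - Coverage     52.06%    9.40%   -42.66%
   + Complexity     3625       48     -3577
   ============================================
     Files           479       54      -425
     Lines         22804     1989    -20815
     Branches       2415      236     -2179
   ============================================
   - Hits          11872      187    -11685
   + Misses         9907     1789     -8118
   + Partials       1025       13     -1012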
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `9.40% <ø> (-60.34%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2325?src=pr=tree) | Coverage Δ 
| Complexity Δ | |
   |---|---|---|---|
   | 
[...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | 
[...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | 
[...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | 
[...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | 
[...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
   | 
[...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=)
 | `0.00% <0.00%> 

[GitHub] [hudi] codecov-io edited a comment on pull request #2651: [HUDI-1591] [RFC-26] Improve Hoodie Table Query Performance And Ease Of Use Fo…

2021-03-31 Thread GitBox


codecov-io edited a comment on pull request #2651:
URL: https://github.com/apache/hudi/pull/2651#issuecomment-794945140


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=h1) Report
   > Merging 
[#2651](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=desc) (fd68e0f) 
into 
[master](https://codecov.io/gh/apache/hudi/commit/ce3e8ec87083ef4cd4f33de39b6697f66ff3f277?el=desc)
 (ce3e8ec) will **decrease** coverage by `42.38%`.
   > The diff coverage is `0.00%`.
   
   > :exclamation: Current head fd68e0f differs from pull request most recent 
head 4af89c9. Consider uploading reports for the commit 4af89c9 to get more 
accurate results
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2651/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #2651       +/-   ##
   
   - Coverage     51.76%   9.38%   -42.39%     
   + Complexity     3602      48     -3554     
   
     Files           476      54      -422     
     Lines         22579    1993    -20586     
     Branches       2408     236     -2172     
   
   - Hits          11688     187    -11501     
   + Misses         9874    1793     -8081     
   + Partials       1017      13     -1004     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `9.38% <0.00%> (-60.41%)` | `0.00 <0.00> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2651?src=pr=tree) | Coverage Δ 
| Complexity Δ | |
   |---|---|---|---|
   | 
[...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=)
 | `0.00% <0.00%> (-71.73%)` | `0.00 <0.00> (-56.00)` | |
   | 
[...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | 
[...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | 
[...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | 
[...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2651/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | 

[GitHub] [hudi] pengzhiwei2018 commented on a change in pull request #2651: [HUDI-1591] [RFC-26] Improve Hoodie Table Query Performance And Ease Of Use Fo…

2021-03-31 Thread GitBox


pengzhiwei2018 commented on a change in pull request #2651:
URL: https://github.com/apache/hudi/pull/2651#discussion_r605330589



##
File path: 
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieWriterUtils.scala
##
@@ -81,4 +82,53 @@ object HoodieWriterUtils {
     params.foreach(kv => props.setProperty(kv._1, kv._2))
     props
   }
+
+  /**
+   * Get the partition columns to stored to hoodie.properties.
+   * @param parameters
+   * @return
+   */
+  def getPartitionColumns(parameters: Map[String, String]): Option[String] = {
+    val  keyGenClass = parameters.getOrElse(KEYGENERATOR_CLASS_OPT_KEY,
+      DEFAULT_KEYGENERATOR_CLASS_OPT_VAL)
+    try {
+      val constructor = getClass.getClassLoader.loadClass(keyGenClass)
+        .getConstructor(classOf[TypedProperties])
+      constructor.setAccessible(true)
+      val props = new TypedProperties()
+      props.putAll(parameters.asJava)
+      val keyGen = constructor.newInstance(props)

Review comment:
   Hi @umehrot2, for bootstrap in `HoodieSparkSqlWriter`, no 
KeyGenerator is created, so we need to create it here.








[GitHub] [hudi] rshanmugam1 commented on issue #2609: [SUPPORT] Presto hudi query slow when compared to parquet

2021-03-31 Thread GitBox


rshanmugam1 commented on issue #2609:
URL: https://github.com/apache/hudi/issues/2609#issuecomment-811582823


   Thanks very much, Sudha and team. 
   
   I will look in that direction and confirm whether that is the cause. If that is the 
case, are there any pointers on how to fix it? Any reference to how it was fixed in the Facebook 
version of Presto would be helpful, so that we can try the same in Trino. 






[GitHub] [hudi] li36909 commented on pull request #2751: [HUDI-1748] Read operation will possiblity fail on mor table rt view when a write operations is concurrency running

2021-03-31 Thread GitBox


li36909 commented on pull request #2751:
URL: https://github.com/apache/hudi/pull/2751#issuecomment-811582115


   cc @nsivabalan could you help to take a look, thank you






[jira] [Updated] (HUDI-1748) Read operation will possibility fail on mor table rt view when a write operations is concurrency running

2021-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-1748:
-
Labels: pull-request-available  (was: )

> Read operation will possibility fail on mor table rt view when a write 
> operations is concurrency running
> 
>
> Key: HUDI-1748
> URL: https://issues.apache.org/jira/browse/HUDI-1748
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: lrz
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> During a read operation, a new base file may be produced by a concurrent write 
> operation. The read may then hit an NPE in getSplit. Here 
> is the exception stack:
> !https://wa.vision.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/2/15/qwx352829/7bacca8042104499b0991d50b4bc3f2a/image.png!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [hudi] li36909 opened a new pull request #2751: [HUDI-1748] Read operation will possiblity fail on mor table rt view when a write operations is concurrency running

2021-03-31 Thread GitBox


li36909 opened a new pull request #2751:
URL: https://github.com/apache/hudi/pull/2751


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a 
pull request.*
   
   ## What is the purpose of the pull request
   
   Solve read write concurrency bug on mor table rt view
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   Probabilistic concurrency problems are difficult to test in a UT. I added a sleep 
to the getRealtimeSplits method to reproduce the problem reliably and verify 
the fix.
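
The race described above can be illustrated deterministically, without a sleep. This is only a sketch under assumptions: `RealtimeSplitRaceSketch`, `buildSplits`, and the file and file-group names are hypothetical and are not Hudi classes, and the null guard shows one defensive pattern, not necessarily what this PR changes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the race: a lookup table is built from an early snapshot,
// a writer adds a base file, and a later listing contains a file the table lacks.
final class RealtimeSplitRaceSketch {

    // Builds "fileGroup:baseFile" entries. A base file that is missing from the
    // earlier snapshot is skipped instead of being dereferenced as null.
    static List<String> buildSplits(Map<String, String> fileGroupByBaseFile,
                                    List<String> listedBaseFiles) {
        List<String> splits = new ArrayList<>();
        for (String baseFile : listedBaseFiles) {
            String fileGroup = fileGroupByBaseFile.get(baseFile);
            if (fileGroup == null) {
                // The file appeared after the snapshot was taken; without this
                // guard, using fileGroup below would throw a NullPointerException.
                continue;
            }
            splits.add(fileGroup + ":" + baseFile);
        }
        return splits;
    }

    public static void main(String[] args) {
        // Snapshot taken before the concurrent write.
        Map<String, String> snapshot = new HashMap<>();
        snapshot.put("base_1.parquet", "fg-1");

        // Listing taken after the writer produced base_2.parquet.
        List<String> listed = List.of("base_1.parquet", "base_2.parquet");

        System.out.println(buildSplits(snapshot, listed)); // prints [fg-1:base_1.parquet]
    }
}
```

Feeding the two listings in explicitly makes the interleaving reproducible, which a sleep-based reproduction cannot guarantee.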
   
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.






[jira] [Created] (HUDI-1748) Read operation will possibility fail on mor table rt view when a write operations is concurrency running

2021-03-31 Thread lrz (Jira)
lrz created HUDI-1748:
-

 Summary: Read operation will possibility fail on mor table rt view 
when a write operations is concurrency running
 Key: HUDI-1748
 URL: https://issues.apache.org/jira/browse/HUDI-1748
 Project: Apache Hudi
  Issue Type: Bug
Reporter: lrz
 Fix For: 0.9.0


During a read operation, a new base file may be produced by a concurrent write 
operation. The read may then hit an NPE in getSplit. Here 
is the exception stack:

!https://wa.vision.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/2/15/qwx352829/7bacca8042104499b0991d50b4bc3f2a/image.png!

 





[GitHub] [hudi] zherenyu831 edited a comment on issue #2707: [SUPPORT] insert_ovewrite_table failed on archiving

2021-03-31 Thread GitBox


zherenyu831 edited a comment on issue #2707:
URL: https://github.com/apache/hudi/issues/2707#issuecomment-811565897


   @jsbali 
   To make everything clear:
   On archiving:
   The first replacecommit has 0 partitionToReplaceFileIds, so archiving 
failed at
   
https://github.com/apache/hudi/blob/master/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/ReplaceArchivalHelper.java#L73
   
   with the error `Positive number of partitions required`
   
   The solution is to 
   manually delete the first replacecommit, or to skip deletion for a replacecommit with 0 
partitionToReplaceFileIds in the code, as you mentioned in your 
ticket.
   
   After I deleted the first replacecommit, the archiving finished successfully. 
   But I got another issue: 
   https://github.com/apache/hudi/issues/2707#issuecomment-804831651
   
   This is not related to what I deleted; as you can see, the instant time is 
different.
   Once the commits have been deleted in the archiving process, Hudi tries to load 
the timeline again without reloading the commits.
   
   I didn't debug further, since I want to ask the developers how they want 
to deal with replaceFile deletion.
   It seems someone wants to use the cleaner to handle it.
   https://github.com/apache/hudi/issues/2707#issuecomment-804849028
   
   Unfortunately, I also found that the cleaner does not work well with 
insert_overwrite_table: 
   it keeps only one file group.
   
   






[GitHub] [hudi] zherenyu831 edited a comment on issue #2707: [SUPPORT] insert_ovewrite_table failed on archiving

2021-03-31 Thread GitBox


zherenyu831 edited a comment on issue #2707:
URL: https://github.com/apache/hudi/issues/2707#issuecomment-811565897


   @jsbali 
   To make everything clear:
   On archiving:
   The first replacecommit has 0 partitionToReplaceFileIds, so archiving 
failed at
   
https://github.com/apache/hudi/blob/master/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/ReplaceArchivalHelper.java#L73
   
   with the error `Positive number of partitions required`
   
   The solution is to 
   manually delete the first replacecommit, or to skip deletion for a replacecommit with 0 
partitionToReplaceFileIds in the code, as you mentioned in your 
ticket.
   
   After I deleted the first replacecommit, the archiving finished successfully. 
   But I got another issue: 
   https://github.com/apache/hudi/issues/2707#issuecomment-804831651
   
   This is not related to what I deleted; as you can see, the instant time is 
different.
   Once the commits have been deleted in the archiving process, Hudi tries to load 
the timeline again without refreshing it.
   
   I didn't debug further, since I want to ask the developers how they want 
to deal with replaceFile deletion.
   It seems someone wants to use the cleaner to handle it.
   https://github.com/apache/hudi/issues/2707#issuecomment-804849028
   
   Unfortunately, I also found that the cleaner does not work well with 
insert_overwrite_table: 
   it keeps only one file group.
   
   
   






[GitHub] [hudi] zherenyu831 commented on issue #2707: [SUPPORT] insert_ovewrite_table failed on archiving

2021-03-31 Thread GitBox


zherenyu831 commented on issue #2707:
URL: https://github.com/apache/hudi/issues/2707#issuecomment-811565897


   @jsbali 
   To make everything clear:
   On archiving:
   The first replacecommit has 0 partitionToReplaceFileIds, so archiving 
failed at
   
https://github.com/apache/hudi/blob/master/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/ReplaceArchivalHelper.java#L73
   
   with the error `Positive number of partitions required`
   
   The solution is to 
   manually delete the first replacecommit, or to skip deletion for a replacecommit with 0 
partitionToReplaceFileIds in the code, as you mentioned in your 
ticket.
   
   After I deleted the first replacecommit, the archiving finished successfully. 
   But I got another issue: 
   https://github.com/apache/hudi/issues/2707#issuecomment-804831651
   
   This is not related to what I deleted; the commit time is different.
   Once the commits have been deleted in the archiving process, Hudi tries to load 
the timeline again without refreshing it.
   
   I didn't debug further, since I want to ask the developers how they want 
to deal with replaceFile deletion.
   It seems someone wants to use the cleaner to handle it.
   https://github.com/apache/hudi/issues/2707#issuecomment-804849028
   
   Unfortunately, I also found that the cleaner does not work well with 
insert_overwrite_table: 
   it keeps only one file group.
   
   
   






[jira] [Created] (HUDI-1747) Deltastreamer incremental read is not working on the MOR table

2021-03-31 Thread Vinoth Govindarajan (Jira)
Vinoth Govindarajan created HUDI-1747:
-

 Summary: Deltastreamer incremental read is not working on the MOR 
table
 Key: HUDI-1747
 URL: https://issues.apache.org/jira/browse/HUDI-1747
 Project: Apache Hudi
  Issue Type: Bug
  Components: Common Core
Reporter: Vinoth Govindarajan


I was trying to read the MOR HUDI table incrementally using delta streamer, 
while doing that I ran into this issue where it says:
{code:java}
Found recursive reference in Avro schema, which can not be processed by 
Spark:{code}
Spark Version: 2.4

Hudi Version: 0.7.0-SNAPSHOT or the latest master
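
For context, a recursive-reference check of the kind named in the error message can be sketched as a depth-first walk that remembers the record names on the current path. This is a minimal illustration under assumptions: `RecursiveSchemaSketch` and `RecordNode` are hypothetical, the sketch does not use the Avro or Spark APIs, and it is not claimed to match Spark's exact implementation.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a path-based recursion check on a simplified schema model.
final class RecursiveSchemaSketch {

    // Minimal stand-in for a record schema: a full name plus nested record fields.
    static final class RecordNode {
        final String fullName;
        final List<RecordNode> fields;

        RecordNode(String fullName, List<RecordNode> fields) {
            this.fullName = fullName;
            this.fields = fields;
        }
    }

    static boolean hasRecursiveReference(RecordNode root) {
        return hasRecursiveReference(root, new HashSet<>());
    }

    // Reports true when a record's full name reappears underneath itself on one path.
    private static boolean hasRecursiveReference(RecordNode node, Set<String> namesOnPath) {
        if (namesOnPath.contains(node.fullName)) {
            return true;
        }
        Set<String> nextPath = new HashSet<>(namesOnPath);
        nextPath.add(node.fullName);
        for (RecordNode field : node.fields) {
            if (hasRecursiveReference(field, nextPath)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        RecordNode inner = new RecordNode("meta", List.of());
        RecordNode nested = new RecordNode("meta", List.of(inner));
        RecordNode siblings = new RecordNode("root",
            List.of(new RecordNode("meta", List.of()), new RecordNode("meta", List.of())));

        System.out.println(hasRecursiveReference(nested));   // prints true
        System.out.println(hasRecursiveReference(siblings)); // prints false
    }
}
```

A check of this shape flags any record whose full name reappears below itself, while two sibling records with the same name are accepted. Whether that explains the error for the schema in this report is not established here.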

 

Full Stack Trace:
{code:java}
Found recursive reference in Avro schema, which can not be processed by Spark:
{
  "type" : "record",
  "name" : "meta",
  "fields" : [ {
"name" : "verified",
"type" : [ "null", "boolean" ],
"default" : null
  }, {
"name" : "zip",
"type" : [ "null", "string" ],
"default" : null
  }, {
"name" : "lname",
"type" : [ "null", "string" ],
"default" : null
  }]
}
  
at 
org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:75)
at 
org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:105)
at 
org.apache.spark.sql.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:82)
at 
org.apache.spark.sql.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:81)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at 
org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:81)
at 
org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:105)
at 
org.apache.spark.sql.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:82)
at 
org.apache.spark.sql.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:81)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at 
org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:81)
at 
org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:95)
at 
org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:105)
at 
org.apache.spark.sql.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:82)
at 
org.apache.spark.sql.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:81)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at 
org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:81)
at 
org.apache.spark.sql.avro.SchemaConverters$.toSqlTypeHelper(SchemaConverters.scala:105)
at 
org.apache.spark.sql.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:82)
at 
org.apache.spark.sql.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:81)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 

[GitHub] [hudi] nsivabalan closed pull request #2750: [HUDI-1754] [HOT_FIX] Revert "[HUDI-1526] Translate the api partitionBy in spark datasource (#2431) due to usage of unavailable apis with older sp

2021-03-31 Thread GitBox


nsivabalan closed pull request #2750:
URL: https://github.com/apache/hudi/pull/2750


   






[GitHub] [hudi] nsivabalan closed issue #2748: Are we stopping support for spark <2.4.4 with hudi 0.8.0?

2021-03-31 Thread GitBox


nsivabalan closed issue #2748:
URL: https://github.com/apache/hudi/issues/2748


   






[GitHub] [hudi] nsivabalan commented on issue #2748: Are we stopping support for spark <2.4.4 with hudi 0.8.0?

2021-03-31 Thread GitBox


nsivabalan commented on issue #2748:
URL: https://github.com/apache/hudi/issues/2748#issuecomment-811513554


   @aditiwari01: I don't think we are looking to support Spark versions < 2.4.3. 
It's always been the case, and it's a documentation issue. We will fix it. 






[GitHub] [hudi] codecov-io edited a comment on pull request #2747: [HUDI-1743] Added support for SqlFileBasedTransformer

2021-03-31 Thread GitBox


codecov-io edited a comment on pull request #2747:
URL: https://github.com/apache/hudi/pull/2747#issuecomment-810800755










[GitHub] [hudi] rubenssoto commented on issue #2294: [SUPPORT] java.lang.IllegalArgumentException: Can not create a Path from an empty string on non partitioned COW table

2021-03-31 Thread GitBox


rubenssoto commented on issue #2294:
URL: https://github.com/apache/hudi/issues/2294#issuecomment-811477227


   @bvaradar is it expected that this bug exists on 0.7.0?
   What problem does this bug cause?






[GitHub] [hudi] umehrot2 commented on a change in pull request #2651: [HUDI-1591] [RFC-26] Improve Hoodie Table Query Performance And Ease Of Use Fo…

2021-03-31 Thread GitBox


umehrot2 commented on a change in pull request #2651:
URL: https://github.com/apache/hudi/pull/2651#discussion_r605196012



##
File path: 
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieWriterUtils.scala
##
@@ -81,4 +82,53 @@ object HoodieWriterUtils {
     params.foreach(kv => props.setProperty(kv._1, kv._2))
     props
   }
+
+  /**
+   * Get the partition columns to stored to hoodie.properties.
+   * @param parameters
+   * @return
+   */
+  def getPartitionColumns(parameters: Map[String, String]): Option[String] = {
+    val  keyGenClass = parameters.getOrElse(KEYGENERATOR_CLASS_OPT_KEY,
+      DEFAULT_KEYGENERATOR_CLASS_OPT_VAL)
+    try {
+      val constructor = getClass.getClassLoader.loadClass(keyGenClass)
+        .getConstructor(classOf[TypedProperties])
+      constructor.setAccessible(true)
+      val props = new TypedProperties()
+      props.putAll(parameters.asJava)
+      val keyGen = constructor.newInstance(props)

Review comment:
   Can't we pass the `KeyGenerator` already created in `HoodieSparkSqlWriter` 
and `DeltaSync` instead of recreating it here? Both of these places already 
create the `KeyGenerator` using reflection.
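
The difference between the two approaches can be sketched as follows. This is only an illustration: `DemoKeyGenerator`, `DemoSimpleKeyGenerator`, the `demo.partition.fields` property, and `KeyGeneratorReuseSketch` are hypothetical stand-ins written in Java, not Hudi's `KeyGenerator` API or the Scala code under review.

```java
import java.lang.reflect.Constructor;
import java.util.Properties;

// Hypothetical sketch: create a generator once by reflection, then pass the
// instance to helpers instead of instantiating it again from its class name.
final class KeyGeneratorReuseSketch {

    // Reflection path: looks up the class and its Properties constructor by name.
    static DemoKeyGenerator createByReflection(String className, Properties props) {
        try {
            Constructor<?> ctor = Class.forName(className).getConstructor(Properties.class);
            return (DemoKeyGenerator) ctor.newInstance(props);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate " + className, e);
        }
    }

    // Reuse path: the helper receives the already created generator.
    static String partitionColumns(DemoKeyGenerator keyGenerator) {
        return keyGenerator.partitionFields();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("demo.partition.fields", "region,day");

        // Created once, where the writer is set up.
        DemoKeyGenerator keyGenerator =
            createByReflection(DemoSimpleKeyGenerator.class.getName(), props);

        // Passed along afterwards; no second reflective construction is needed.
        System.out.println(partitionColumns(keyGenerator)); // prints region,day
    }
}

// Hypothetical interface standing in for a key generator.
interface DemoKeyGenerator {
    String partitionFields();
}

// Hypothetical implementation configured through a Properties constructor.
final class DemoSimpleKeyGenerator implements DemoKeyGenerator {
    private final Properties props;

    public DemoSimpleKeyGenerator(Properties props) {
        this.props = props;
    }

    @Override
    public String partitionFields() {
        return props.getProperty("demo.partition.fields", "");
    }
}
```

Passing the instance keeps the reflective lookup in one place, at the cost of threading one more parameter through the call sites that need the partition columns.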








[GitHub] [hudi] codecov-io edited a comment on pull request #2747: [HUDI-1743] Added support for SqlFileBasedTransformer

2021-03-31 Thread GitBox


codecov-io edited a comment on pull request #2747:
URL: https://github.com/apache/hudi/pull/2747#issuecomment-810800755


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2747?src=pr=h1) Report
   > Merging 
[#2747](https://codecov.io/gh/apache/hudi/pull/2747?src=pr=desc) (61d0222) 
into 
[master](https://codecov.io/gh/apache/hudi/commit/aa0da72c59cb3764205f90b025b24d1640727795?el=desc)
 (aa0da72) will **decrease** coverage by `1.70%`.
   > The diff coverage is `n/a`.
   
   > :exclamation: Current head 61d0222 differs from pull request most recent 
head ee4c06f. Consider uploading reports for the commit ee4c06f to get more 
accurate results
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2747/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2747?src=pr=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2747      +/-   ##
   
   - Coverage     52.06%   50.35%   -1.71%     
   + Complexity     3625     3253     -372     
   
     Files           479      425      -54     
     Lines         22804    20815    -1989     
     Branches       2415     2179     -236     
   
   - Hits          11872    10482    -1390     
   + Misses         9907     9439     -468     
   + Partials       1025      894     -131     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `37.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | |
   | hudicommon | `50.94% <ø> (-0.03%)` | `0.00 <ø> (ø)` | |
   | hudiflink | `56.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisparkdatasource | `70.87% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisync | `45.47% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | huditimelineservice | `64.36% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiutilities | `?` | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2747?src=pr=tree) | Coverage Δ 
| Complexity Δ | |
   |---|---|---|---|
   | 
[...ache/hudi/common/fs/inline/InMemoryFileSystem.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL2lubGluZS9Jbk1lbW9yeUZpbGVTeXN0ZW0uamF2YQ==)
 | `79.31% <0.00%> (-10.35%)` | `15.00% <0.00%> (-1.00%)` | |
   | 
[...ities/schema/NullTargetSchemaRegistryProvider.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9OdWxsVGFyZ2V0U2NoZW1hUmVnaXN0cnlQcm92aWRlci5qYXZh)
 | | | |
   | 
[...che/hudi/utilities/sources/HiveIncrPullSource.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSGl2ZUluY3JQdWxsU291cmNlLmphdmE=)
 | | | |
   | 
[...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=)
 | | | |
   | 
[...ities/checkpointing/InitialCheckPointProvider.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2NoZWNrcG9pbnRpbmcvSW5pdGlhbENoZWNrUG9pbnRQcm92aWRlci5qYXZh)
 | | | |
   | 
[...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==)
 | | | |
   | 
[.../org/apache/hudi/utilities/sources/InputBatch.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSW5wdXRCYXRjaC5qYXZh)
 | | | |
   | 
[...i/utilities/deser/KafkaAvroSchemaDeserializer.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2Rlc2VyL0thZmthQXZyb1NjaGVtYURlc2VyaWFsaXplci5qYXZh)
 | | | |
   | 
[...lities/checkpointing/KafkaConnectHdfsProvider.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2NoZWNrcG9pbnRpbmcvS2Fma2FDb25uZWN0SGRmc1Byb3ZpZGVyLmphdmE=)
 | | | |
   | 
[...in/java/org/apache/hudi/utilities/UtilHelpers.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL1V0aWxIZWxwZXJzLmphdmE=)

[GitHub] [hudi] codecov-io edited a comment on pull request #2747: [HUDI-1743] Added support for SqlFileBasedTransformer

2021-03-31 Thread GitBox


codecov-io edited a comment on pull request #2747:
URL: https://github.com/apache/hudi/pull/2747#issuecomment-810800755


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2747?src=pr=h1) Report
   > Merging 
[#2747](https://codecov.io/gh/apache/hudi/pull/2747?src=pr=desc) (61d0222) 
into 
[master](https://codecov.io/gh/apache/hudi/commit/aa0da72c59cb3764205f90b025b24d1640727795?el=desc)
 (aa0da72) will **decrease** coverage by `1.70%`.
   > The diff coverage is `n/a`.
   
   > :exclamation: Current head 61d0222 differs from pull request most recent 
head ee4c06f. Consider uploading reports for the commit ee4c06f to get more 
accurate results
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2747/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2747?src=pr=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##            master    #2747     +/-   ##
   
   - Coverage   52.06%   50.35%   -1.71%
   + Complexity   3625     3253     -372
   
     Files         479      425      -54
     Lines       22804    20815    -1989
     Branches     2415     2179     -236
   
   - Hits        11872    10482    -1390
   + Misses       9907     9439     -468
   + Partials     1025      894     -131
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `37.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | |
   | hudicommon | `50.94% <ø> (-0.03%)` | `0.00 <ø> (ø)` | |
   | hudiflink | `56.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisparkdatasource | `70.87% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisync | `45.47% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | huditimelineservice | `64.36% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiutilities | `?` | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2747?src=pr=tree) | Coverage Δ 
| Complexity Δ | |
   |---|---|---|---|
   | 
[...ache/hudi/common/fs/inline/InMemoryFileSystem.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL2lubGluZS9Jbk1lbW9yeUZpbGVTeXN0ZW0uamF2YQ==)
 | `79.31% <0.00%> (-10.35%)` | `15.00% <0.00%> (-1.00%)` | |
   | 
[.../hudi/utilities/schema/SparkAvroPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TcGFya0F2cm9Qb3N0UHJvY2Vzc29yLmphdmE=)
 | | | |
   | 
[.../hudi/utilities/schema/SchemaRegistryProvider.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFSZWdpc3RyeVByb3ZpZGVyLmphdmE=)
 | | | |
   | 
[...ck/kafka/HoodieWriteCommitKafkaCallbackConfig.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2NhbGxiYWNrL2thZmthL0hvb2RpZVdyaXRlQ29tbWl0S2Fma2FDYWxsYmFja0NvbmZpZy5qYXZh)
 | | | |
   | 
[...he/hudi/utilities/transform/AWSDmsTransformer.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9BV1NEbXNUcmFuc2Zvcm1lci5qYXZh)
 | | | |
   | 
[...che/hudi/utilities/schema/SchemaPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQb3N0UHJvY2Vzc29yLmphdmE=)
 | | | |
   | 
[...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh)
 | | | |
   | 
[...e/hudi/utilities/transform/ChainedTransformer.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9DaGFpbmVkVHJhbnNmb3JtZXIuamF2YQ==)
 | | | |
   | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh)
 | | | |
   | 
[...java/org/apache/hudi/utilities/sources/Source.java](https://codecov.io/gh/apache/hudi/pull/2747/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvU291cmNlLmphdmE=)
 | | | |
   

[GitHub] [hudi] codecov-io edited a comment on pull request #2749: [HUDI-1744][Rollback] rollback fail on mor table when the partition path hasn't any files

2021-03-31 Thread GitBox


codecov-io edited a comment on pull request #2749:
URL: https://github.com/apache/hudi/pull/2749#issuecomment-811147597


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2749?src=pr=h1) Report
   > Merging 
[#2749](https://codecov.io/gh/apache/hudi/pull/2749?src=pr=desc) (e97262f) 
into 
[master](https://codecov.io/gh/apache/hudi/commit/fe16d0de7c76105775c887b700751241bc82624c?el=desc)
 (fe16d0d) will **increase** coverage by `17.74%`.
   > The diff coverage is `n/a`.
   
   > :exclamation: Current head e97262f differs from pull request most recent 
head e0dcab7. Consider uploading reports for the commit e0dcab7 to get more 
accurate results
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2749/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2749?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##            master    #2749      +/-   ##
   ========================================
   + Coverage   52.04%   69.78%   +17.74%
   + Complexity   3625      372     -3253
   ========================================
     Files         479       54      -425
     Lines       22804     1989    -20815
     Branches     2415      236     -2179
   ========================================
   - Hits        11868     1388    -10480
   + Misses       9911      471     -9440
   + Partials     1025      130      -895
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `69.78% <ø> (ø)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2749?src=pr=tree) | Coverage Δ 
| Complexity Δ | |
   |---|---|---|---|
   | 
[...i/hadoop/utils/HoodieRealtimeInputFormatUtils.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3V0aWxzL0hvb2RpZVJlYWx0aW1lSW5wdXRGb3JtYXRVdGlscy5qYXZh)
 | | | |
   | 
[...ache/hudi/sink/StreamWriteOperatorCoordinator.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL1N0cmVhbVdyaXRlT3BlcmF0b3JDb29yZGluYXRvci5qYXZh)
 | | | |
   | 
[...e/hudi/sink/transform/RowDataToHoodieFunction.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3RyYW5zZm9ybS9Sb3dEYXRhVG9Ib29kaWVGdW5jdGlvbi5qYXZh)
 | | | |
   | 
[.../org/apache/hudi/exception/HoodieKeyException.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0hvb2RpZUtleUV4Y2VwdGlvbi5qYXZh)
 | | | |
   | 
[...g/apache/hudi/exception/InvalidTableException.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0ludmFsaWRUYWJsZUV4Y2VwdGlvbi5qYXZh)
 | | | |
   | 
[...org/apache/hudi/common/util/SpillableMapUtils.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvU3BpbGxhYmxlTWFwVXRpbHMuamF2YQ==)
 | | | |
   | 
[...java/org/apache/hudi/common/lock/LockProvider.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2xvY2svTG9ja1Byb3ZpZGVyLmphdmE=)
 | | | |
   | 
[...cala/org/apache/hudi/HoodieBootstrapRelation.scala](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZUJvb3RzdHJhcFJlbGF0aW9uLnNjYWxh)
 | | | |
   | 
[.../hudi/hadoop/realtime/HoodieRealtimeFileSplit.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3JlYWx0aW1lL0hvb2RpZVJlYWx0aW1lRmlsZVNwbGl0LmphdmE=)
 | | | |
   | 
[...g/apache/hudi/sink/partitioner/BucketAssigner.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL0J1Y2tldEFzc2lnbmVyLmphdmE=)
 | | | |
   | ... and [415 
more](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree-more) | |
   



[GitHub] [hudi] jsbali commented on pull request #2678: Added support for replace commits in commit showpartitions, commit sh…

2021-03-31 Thread GitBox


jsbali commented on pull request #2678:
URL: https://github.com/apache/hudi/pull/2678#issuecomment-811284724


   Created the jira https://issues.apache.org/jira/browse/HUDI-1746.






[jira] [Commented] (HUDI-1746) Added support for replace commits in hudi-cli

2021-03-31 Thread Jagmeet Bali (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312583#comment-17312583
 ] 

Jagmeet Bali commented on HUDI-1746:


https://github.com/apache/hudi/pull/2678

> Added support for replace commits in hudi-cli
> -
>
> Key: HUDI-1746
> URL: https://issues.apache.org/jira/browse/HUDI-1746
> Project: Apache Hudi
>  Issue Type: New Feature
>Reporter: Jagmeet Bali
>Priority: Minor
>
> Currently, hudi-cli doesn't support replace commits in the commit show* 
> functions. This adds the foundation for that.
> This PR does not yet support the extraMetadata of the replace commit, which 
> will be added in subsequent PRs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HUDI-1746) Added support for replace commits in hudi-cli

2021-03-31 Thread Jagmeet Bali (Jira)
Jagmeet Bali created HUDI-1746:
--

 Summary: Added support for replace commits in hudi-cli
 Key: HUDI-1746
 URL: https://issues.apache.org/jira/browse/HUDI-1746
 Project: Apache Hudi
  Issue Type: New Feature
Reporter: Jagmeet Bali


Currently, hudi-cli doesn't support replace commits in the commit show* 
functions. This adds the foundation for that.
This PR does not yet support the extraMetadata of the replace commit, which 
will be added in subsequent PRs.





[GitHub] [hudi] aditiwari01 commented on issue #2748: Are we stopping support for spark <2.4.4 with hudi 0.8.0?

2021-03-31 Thread GitBox


aditiwari01 commented on issue #2748:
URL: https://github.com/apache/hudi/issues/2748#issuecomment-811280058


   Yes. I faced the issue with version 2.4.0 itself.






[GitHub] [hudi] nsivabalan commented on issue #2748: Are we stopping support for spark <2.4.4 with hudi 0.8.0?

2021-03-31 Thread GitBox


nsivabalan commented on issue #2748:
URL: https://github.com/apache/hudi/issues/2748#issuecomment-811277730


   @vinothchandar: I guess the issue is with Spark 2.4.1. It looks like the API 
is available from 2.4.2 onward and not available in 2.4.1.






[GitHub] [hudi] kimberlyamandalu commented on issue #2696: Metadata and runtime exceptions in Hudi 0.7.0 on AWS Glue

2021-03-31 Thread GitBox


kimberlyamandalu commented on issue #2696:
URL: https://github.com/apache/hudi/issues/2696#issuecomment-811249839


   > @kimberlyamandalu Can you try turning off the metadata table in hoodie to 
get your pipeline unblocked ?
   > 
   > ```
   > hoodie.metadata.enable=false
   > ```
   > 
   > This looks like an exception in the metadata table. Without any more logs, 
it's hard to debug what may be going on. If you are OK to deploy a custom 
build, we can work on adding more logs to help surface the underlying issue.
   > 
   > 
https://github.com/apache/hudi/blob/release-0.7.0/hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/FileSystemViewHandler.java#L378
 this is where the exception is coming from. If we can add more logs to this 
function to see why a runtime exception is being thrown, it may help to find 
the root cause.
   
   Hi @n3nash,
   Thanks for looking into this.
   I can try disabling metadata for my workload. Can we toggle metadata on and 
off for a data set, or does it need to be enabled from the time the data is 
bootstrapped? Is it something we can turn on after we have been ingesting for a 
while, and will it still work?
   
   I am willing to deploy a custom build so we can get more details on 
this issue. I will need your help with it, though, as I'm not very familiar with 
how to do this.
   
   Thanks!!!
   






[GitHub] [hudi] vinothchandar commented on issue #2748: Are we stopping support for spark <2.4.4 with hudi 0.8.0?

2021-03-31 Thread GitBox


vinothchandar commented on issue #2748:
URL: https://github.com/apache/hudi/issues/2748#issuecomment-811224728


   @aditiwari01 This is partly my bad. We have actually required Spark 2.4 
for a while now (we need `spark-avro` from there). So as long as Spark 2.4 
works and Spark 3.x works, we are good. 
   
   Sorry for the false alarms
   






[GitHub] [hudi] vinothchandar commented on issue #2748: Are we stopping support for spark <2.4.4 with hudi 0.8.0?

2021-03-31 Thread GitBox


vinothchandar commented on issue #2748:
URL: https://github.com/apache/hudi/issues/2748#issuecomment-811202865


   @nsivabalan This would be a release blocker, and we need to fix the RC if 
Spark 2.x has issues. Agree?






[GitHub] [hudi] nsivabalan edited a comment on issue #2648: [SUPPORT] a NPE error when reading MOR table in spark datasource

2021-03-31 Thread GitBox


nsivabalan edited a comment on issue #2648:
URL: https://github.com/apache/hudi/issues/2648#issuecomment-811188216


   @n3nash: Can you please add a severity label if you have a sense of it? Just 
a reminder. 






[GitHub] [hudi] nsivabalan commented on pull request #2655: [WIP] [HUDI-1615] Fixing null schema for delete operation in spark datasource

2021-03-31 Thread GitBox


nsivabalan commented on pull request #2655:
URL: https://github.com/apache/hudi/pull/2655#issuecomment-811192144


   Yeah, I haven't had a chance to fix it. Once it's ready, I will let you know. 






[GitHub] [hudi] nsivabalan commented on issue #2717: [SUPPORT] run_sync_tool support hive3.1.2 on hadoop3.1.4

2021-03-31 Thread GitBox


nsivabalan commented on issue #2717:
URL: https://github.com/apache/hudi/issues/2717#issuecomment-811181956


   Sure, closing this out. We can continue the discussion in the PR. 
[Here](https://issues.apache.org/jira/browse/HUDI-1721) is the right Jira 
(closed the other one as a duplicate).






[GitHub] [hudi] nsivabalan closed issue #2717: [SUPPORT] run_sync_tool support hive3.1.2 on hadoop3.1.4

2021-03-31 Thread GitBox


nsivabalan closed issue #2717:
URL: https://github.com/apache/hudi/issues/2717


   






[GitHub] [hudi] nsivabalan commented on issue #2648: [SUPPORT] a NPE error when reading MOR table in spark datasource

2021-03-31 Thread GitBox


nsivabalan commented on issue #2648:
URL: https://github.com/apache/hudi/issues/2648#issuecomment-811188216


   @n3nash: Can you please add a severity label if you have a sense of it? Just 
a reminder. 






[GitHub] [hudi] nsivabalan commented on issue #2609: [SUPPORT] Presto hudi query slow when compared to parquet

2021-03-31 Thread GitBox


nsivabalan commented on issue #2609:
URL: https://github.com/apache/hudi/issues/2609#issuecomment-811177897


   Thanks, Sudha. If you feel you are done with your response, feel free to 
remove the "awaiting-community-help" label. 






[jira] [Updated] (HUDI-1724) run_sync_tool support for hive3.1.2 on hadoop3.1.4

2021-03-31 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-1724:
--
Labels: sev:critical user-support-issues  (was: sev:triage 
user-support-issues)

> run_sync_tool support for hive3.1.2 on hadoop3.1.4
> --
>
> Key: HUDI-1724
> URL: https://issues.apache.org/jira/browse/HUDI-1724
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Hive Integration
>Reporter: Balaji Varadarajan
>Priority: Major
>  Labels: sev:critical, user-support-issues
>
> Context: https://github.com/apache/hudi/issues/2717





[GitHub] [hudi] nsivabalan commented on issue #2633: Empty File Slice causing application to fail in small files optimization code

2021-03-31 Thread GitBox


nsivabalan commented on issue #2633:
URL: https://github.com/apache/hudi/issues/2633#issuecomment-811179179


   @n3nash: Can you please file a tracking Jira and close this out? Please add 
severity labels as appropriate to both the issue and the Jira. 






[GitHub] [hudi] yanghua commented on pull request #2747: [HUDI-1743] Added support for SqlFileBasedTransformer

2021-03-31 Thread GitBox


yanghua commented on pull request #2747:
URL: https://github.com/apache/hudi/pull/2747#issuecomment-811163812


   @vingov Thanks, but the CI has failed. Would you please check the reason? If 
you are not sure, you can push an empty commit to retrigger Travis.






[GitHub] [hudi] nsivabalan commented on issue #2748: Are we stopping support for spark <2.4.4 with hudi 0.8.0?

2021-03-31 Thread GitBox


nsivabalan commented on issue #2748:
URL: https://github.com/apache/hudi/issues/2748#issuecomment-811154406


   Thanks a lot for raising this, @aditiwari01. We are looking into it. 
   https://issues.apache.org/jira/browse/HUDI-1745
   






[GitHub] [hudi] codecov-io commented on pull request #2749: [HUDI-1744][Rollback] rollback fail on mor table when the partition path hasn't any files

2021-03-31 Thread GitBox


codecov-io commented on pull request #2749:
URL: https://github.com/apache/hudi/pull/2749#issuecomment-811147597


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2749?src=pr=h1) Report
   > Merging 
[#2749](https://codecov.io/gh/apache/hudi/pull/2749?src=pr=desc) (e0dcab7) 
into 
[master](https://codecov.io/gh/apache/hudi/commit/fe16d0de7c76105775c887b700751241bc82624c?el=desc)
 (fe16d0d) will **decrease** coverage by `42.64%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2749/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2749?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##            master    #2749      +/-   ##
   
   - Coverage   52.04%    9.40%   -42.65%
   + Complexity   3625       48     -3577
   
     Files         479       54      -425
     Lines       22804     1989    -20815
     Branches     2415      236     -2179
   
   - Hits        11868      187    -11681
   + Misses       9911     1789     -8122
   + Partials     1025       13     -1012
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `9.40% <ø> (-60.39%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2749?src=pr=tree) | Coverage Δ 
| Complexity Δ | |
   |---|---|---|---|
   | 
[...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | 
[...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | 
[...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | 
[...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | 
[...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
   | 
[...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2749/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=)
 | `0.00% <0.00%> 

[jira] [Updated] (HUDI-1745) Hudi compilation fails w/ spark version < 2.4.4 due to usage of unavailable spark api

2021-03-31 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-1745:
--
Labels: sev:critical  (was: )

> Hudi compilation fails w/ spark version < 2.4.4 due to usage of unavailable 
> spark api
> -
>
> Key: HUDI-1745
> URL: https://issues.apache.org/jira/browse/HUDI-1745
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Spark Integration
>Affects Versions: 0.8.0
>Reporter: sivabalan narayanan
>Priority: Major
>  Labels: sev:critical
>
> [https://github.com/apache/hudi/issues/2748]
>  
> PR that's of interest: https://github.com/apache/hudi/pull/2431





[jira] [Updated] (HUDI-1745) Hudi compilation fails w/ spark version < 2.4.4 due to usage of unavailable spark api

2021-03-31 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-1745:
--
Description: 
[https://github.com/apache/hudi/issues/2748]

 

PR that's of interest: [https://github.com/apache/hudi/pull/2431]

 

I see three options. Let me know if there are more.
Option 1:
Similar to SparkRowSerDe, we could introduce an interface for 
translateSqlOptions and override it per Spark version. But we already have 
two submodules for spark2 and spark3, and we might now have to add more such 
modules for different spark2 versions, which needs more thought to do 
elegantly.
Option 2:
Since this feature was added only in 0.8.0 and is not a highly sought-after 
feature, we could revert this commit and unblock ourselves for 0.8.0. Once the 
release is complete, we can decide how to go about this and get the 
feature in for the next release.
Option 3:
We state that Hudi 0.8.0 does not support Spark versions < 2.4.4. I don't think 
we can go this route, but I am listing it for completeness.

  was:
[https://github.com/apache/hudi/issues/2748]

 

PR thats of interest: https://github.com/apache/hudi/pull/2431


> Hudi compilation fails w/ spark version < 2.4.4 due to usage of unavailable 
> spark api
> -
>
> Key: HUDI-1745
> URL: https://issues.apache.org/jira/browse/HUDI-1745
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Spark Integration
>Affects Versions: 0.8.0
>Reporter: sivabalan narayanan
>Priority: Major
>  Labels: sev:critical
>
> [https://github.com/apache/hudi/issues/2748]
>  
> PR that's of interest: [https://github.com/apache/hudi/pull/2431]
>  
> I see three options. Let me know if there are more.
> Option1:
> Similar to SparkRowSerDe, we could introduce an interface for 
> translateSqlOptions and override it per Spark version. But we already have 
> two submodules, for spark2 and spark3, and we might now have to add more such 
> modules for different spark2 versions, which needs more thought to do 
> elegantly.
> Option2:
> Since this feature is new in 0.8.0 and is not a highly sought-after 
> feature, we could revert this commit and unblock ourselves for 0.8.0. Once the 
> release is complete, we can decide how to go about this and get the 
> feature into the next release.
> Option3:
> We declare that Hudi 0.8.0 does not support Spark versions < 2.4.4. I don't 
> think we can go this route, but I'm listing it for completeness.
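
Option1 could look roughly like the sketch below. This is only an illustration of choosing an implementation by Spark version, not Hudi's actual code: the names SqlOptionsTranslator, LegacySparkTranslator, ModernSparkTranslator and SqlOptionsTranslators are hypothetical, the option keys are made up, and the translation bodies are placeholders. Only the 2.4.4 cut-off comes from the issue title.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical interface for Option1; not Hudi's actual API. */
interface SqlOptionsTranslator {
  Map<String, String> translateSqlOptions(Map<String, String> options);
}

/** For Spark versions that lack the newer API: pass the options through untouched. */
final class LegacySparkTranslator implements SqlOptionsTranslator {
  @Override
  public Map<String, String> translateSqlOptions(Map<String, String> options) {
    return new HashMap<>(options);
  }
}

/** For Spark versions that have the newer API. The key names are made up for illustration. */
final class ModernSparkTranslator implements SqlOptionsTranslator {
  @Override
  public Map<String, String> translateSqlOptions(Map<String, String> options) {
    Map<String, String> translated = new HashMap<>(options);
    String partitionColumns = options.get("example.partition.columns");
    if (partitionColumns != null) {
      // Copy the value to the (made-up) target key unless the user already set it.
      translated.putIfAbsent("example.partitionpath.field", partitionColumns);
    }
    return translated;
  }
}

final class SqlOptionsTranslators {
  private SqlOptionsTranslators() {}

  /** Picks an implementation from a dotted, purely numeric version string such as "2.4.3". */
  static SqlOptionsTranslator forSparkVersion(String sparkVersion) {
    return compareVersions(sparkVersion, "2.4.4") >= 0
        ? new ModernSparkTranslator()
        : new LegacySparkTranslator();
  }

  /** Compares dotted numeric versions component by component; missing components count as 0. */
  static int compareVersions(String left, String right) {
    String[] l = left.split("\\.");
    String[] r = right.split("\\.");
    for (int i = 0; i < Math.max(l.length, r.length); i++) {
      int li = i < l.length ? Integer.parseInt(l[i]) : 0;
      int ri = i < r.length ? Integer.parseInt(r[i]) : 0;
      if (li != ri) {
        return Integer.compare(li, ri);
      }
    }
    return 0;
  }
}
```

A runtime switch like this does not by itself fix a compilation failure: the class that references the newer Spark API would still have to live in a module compiled against a Spark version that has it, which is the extra-submodule cost the description mentions.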





[jira] [Created] (HUDI-1745) Hudi compilation fails w/ spark version < 2.4.4 due to usage of unavailable spark api

2021-03-31 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-1745:
-

 Summary: Hudi compilation fails w/ spark version < 2.4.4 due to 
usage of unavailable spark api
 Key: HUDI-1745
 URL: https://issues.apache.org/jira/browse/HUDI-1745
 Project: Apache Hudi
  Issue Type: Bug
  Components: Spark Integration
Affects Versions: 0.8.0
Reporter: sivabalan narayanan


[https://github.com/apache/hudi/issues/2748]

 

PR that's of interest: https://github.com/apache/hudi/pull/2431





[GitHub] [hudi] nsivabalan opened a new pull request #2750: Revert "[HUDI-1526] Translate the api partitionBy in spark datasource (#2431) due to usage of incompatabile apis with older spark versions

2021-03-31 Thread GitBox


nsivabalan opened a new pull request #2750:
URL: https://github.com/apache/hudi/pull/2750


   
   This reverts commit 26da4f546275e8ab6496537743efe73510cb723d.
   
   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a 
pull request.*
   
   ## What is the purpose of the pull request
   
   *(For example: This pull request adds quick-start document.)*
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test 
coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please 
describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
 - *Added integration tests for end-to-end.*
 - *Added HoodieClientWriteTest to verify the change.*
 - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (HUDI-1744) [Rollback] rollback fail on mor table when the partition path hasn't any files

2021-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-1744:
-
Labels: pull-request-available  (was: )

> [Rollback] rollback fail on mor table when the partition path hasn't any files
> --
>
> Key: HUDI-1744
> URL: https://issues.apache.org/jira/browse/HUDI-1744
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: lrz
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> When rolling back on a MOR table, if the partition path has no files, an 
> exception is thrown because rdd.flatmap is called with 0 as numpartitions.





[GitHub] [hudi] li36909 commented on pull request #2749: [HUDI-1744][Rollback] rollback fail on mor table when the partition path hasn't any files

2021-03-31 Thread GitBox


li36909 commented on pull request #2749:
URL: https://github.com/apache/hudi/pull/2749#issuecomment-811060626


   cc @n3nash Could you help take a look? Thank you.






[GitHub] [hudi] codecov-io edited a comment on pull request #2745: [HUDI-1737][hudi-client] Code Cleanup: Extract common method in HoodieCreateHandle & FlinkCreateHandle

2021-03-31 Thread GitBox


codecov-io edited a comment on pull request #2745:
URL: https://github.com/apache/hudi/pull/2745#issuecomment-810102368


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2745?src=pr=h1) Report
   > Merging 
[#2745](https://codecov.io/gh/apache/hudi/pull/2745?src=pr=desc) (2d6a76c) 
into 
[master](https://codecov.io/gh/apache/hudi/commit/fe16d0de7c76105775c887b700751241bc82624c?el=desc)
 (fe16d0d) will **decrease** coverage by `42.64%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2745/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2745?src=pr=tree)
   
   ```diff
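   @@             Coverage Diff              @@
   ##             master   #2745       +/-   ##
   ============================================
   - Coverage     52.04%   9.40%   -42.65%
   + Complexity     3625      48     -3577
   ============================================
     Files           479      54      -425
     Lines         22804    1989    -20815
     Branches       2415     236     -2179
   ============================================
   - Hits          11868     187    -11681
   + Misses         9911    1789     -8122
   + Partials       1025      13     -1012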
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `9.40% <ø> (-60.39%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2745?src=pr=tree) | Coverage Δ 
| Complexity Δ | |
   |---|---|---|---|
   | 
[...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | 
[...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | 
[...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | 
[...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | 
[...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
   | 
[...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2745/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=)
 | `0.00% <0.00%> 

[GitHub] [hudi] li36909 opened a new pull request #2749: [Rollback] rollback fail on mor table when the partition path hasn't any files

2021-03-31 Thread GitBox


li36909 opened a new pull request #2749:
URL: https://github.com/apache/hudi/pull/2749


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a 
pull request.*
   
   ## What is the purpose of the pull request
   
   For example, if a write to a MOR table fails and it was the first commit, 
the partition path is left empty. The next time we submit a new write commit, 
it fails because the rollback fails.
   This PR fixes that bug.
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   Added a new unit test.
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.






[jira] [Created] (HUDI-1744) [Rollback] rollback fail on mor table when the partition path hasn't any files

2021-03-31 Thread lrz (Jira)
lrz created HUDI-1744:
-

 Summary: [Rollback] rollback fail on mor table when the partition 
path hasn't any files
 Key: HUDI-1744
 URL: https://issues.apache.org/jira/browse/HUDI-1744
 Project: Apache Hudi
  Issue Type: Bug
Reporter: lrz
 Fix For: 0.9.0


When rolling back on a MOR table, if the partition path has no files, an 
exception is thrown because rdd.flatmap is called with 0 as numpartitions.
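
One way to avoid a zero partition count is to clamp the parallelism before handing it to Spark. The sketch below shows only that clamping step in plain Java; RollbackParallelism and safeParallelism are hypothetical names, and this is an assumption about the shape of a fix, not the actual change in the linked pull request.

```java
import java.util.Collections;
import java.util.List;

final class RollbackParallelism {
  private RollbackParallelism() {}

  /**
   * Returns a partition count that is never below 1, so that an empty list of
   * rollback requests cannot produce numPartitions == 0.
   */
  static int safeParallelism(int numRequests, int configuredParallelism) {
    return Math.max(1, Math.min(numRequests, configuredParallelism));
  }

  public static void main(String[] args) {
    // A partition path with no files yields an empty request list.
    List<String> filesInPartition = Collections.emptyList();
    System.out.println(safeParallelism(filesInPartition.size(), 200));
  }
}
```

An alternative with the same effect would be to return early when the request list is empty and skip the Spark job altogether.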





[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

2021-03-31 Thread GitBox


nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r604841403



##
File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##
@@ -247,6 +266,32 @@ private Long delayOffsetCalculation(Option 
lastCheckpointStr, Set partitionInfoList, String topicName, Long timestamp) {

Review comment:
   Can we add tests for the new code? I don't see any. 








[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

2021-03-31 Thread GitBox


nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r604840433



##
File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##
@@ -553,6 +555,11 @@ public DeltaSyncService(Config cfg, JavaSparkContext jssc, 
FileSystem fs, Config
   "'--filter-dupes' needs to be disabled when '--op' is 'UPSERT' to 
ensure updates are not missed.");
 
   this.props = properties.get();
+  String kafkaCheckpointTimestamp = 
props.getString(KafkaOffsetGen.Config.KAFKA_CHECKPOINT_TIMESTAMP, "");

Review comment:
   Let me think more on this. I wonder whether we should rely only on the 
existing "HoodieDeltaStreamer.Config.checkpoint" and add another config, named 
"checkpoint.type" or similar, which could be set to timestamp for this 
purpose. 
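
A minimal sketch of that idea, for discussion only: the key "checkpoint.type" is the name floated above and does not exist as a Hudi config here, CheckpointTypeSketch and timestampCheckpoint are hypothetical names, and treating the checkpoint string as epoch milliseconds is an assumption.

```java
import java.util.OptionalLong;
import java.util.Properties;

final class CheckpointTypeSketch {
  /** Hypothetical key, following the "checkpoint.type" name floated in the review. */
  static final String CHECKPOINT_TYPE_KEY = "checkpoint.type";

  private CheckpointTypeSketch() {}

  /**
   * Returns the checkpoint as epoch milliseconds when the type is "timestamp".
   * Returns empty for any other type, meaning the string should go through the
   * existing checkpoint handling unchanged.
   */
  static OptionalLong timestampCheckpoint(Properties props, String checkpoint) {
    String type = props.getProperty(CHECKPOINT_TYPE_KEY, "").trim();
    if (!"timestamp".equalsIgnoreCase(type) || checkpoint == null) {
      return OptionalLong.empty();
    }
    // Assumption: the checkpoint string holds a plain number of milliseconds.
    return OptionalLong.of(Long.parseLong(checkpoint.trim()));
  }
}
```

With this shape, the existing checkpoint option stays the single entry point. A resolved timestamp could then be mapped to per-partition offsets, for instance with Kafka's KafkaConsumer.offsetsForTimes; that step is left out of the sketch.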








[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

2021-03-31 Thread GitBox


nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r604828288



##
File path: 
hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/HoodieDeltaStreamerWrapper.java
##
@@ -65,7 +65,7 @@ public void scheduleCompact() throws Exception {
 return upsert(WriteOperationType.UPSERT);
   }
 
-  public Pair>> 
fetchSource() throws Exception {
+  public Pair>, Pair> fetchSource() throws Exception {

Review comment:
   Actually, my PR was closed as invalid. But 
[here](https://github.com/nsivabalan/hudi/blob/f7439e2e28748bf7b713fb72ba611f8af7bb97a1/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/ReadBatch.java)
 is the class that I added. Maybe you can add it in this patch itself. 








[GitHub] [hudi] nsivabalan commented on pull request #2520: [HUDI-1446] Support skip bootstrapIndex's init in abstract fs view init

2021-03-31 Thread GitBox


nsivabalan commented on pull request #2520:
URL: https://github.com/apache/hudi/pull/2520#issuecomment-811001660


   btw, the CI build has failed. Please check that as well.






[GitHub] [hudi] nsivabalan commented on a change in pull request #2520: [HUDI-1446] Support skip bootstrapIndex's init in abstract fs view init

2021-03-31 Thread GitBox


nsivabalan commented on a change in pull request #2520:
URL: https://github.com/apache/hudi/pull/2520#discussion_r604825768



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java
##
@@ -67,6 +68,7 @@
   public static final String HOODIE_TIMELINE_LAYOUT_VERSION = 
"hoodie.timeline.layout.version";
   public static final String HOODIE_PAYLOAD_CLASS_PROP_NAME = 
"hoodie.compaction.payload.class";
   public static final String HOODIE_ARCHIVELOG_FOLDER_PROP_NAME = 
"hoodie.archivelog.folder";
+  public static final String HOODIE_BOOTSTRAP_INDEX_ENABLE = 
"hoodie.bootstrap.index.enable";

Review comment:
   May I know where we set the default value of this config to false? 








[GitHub] [hudi] Sugamber commented on issue #2637: [SUPPORT] - Partial Update : update few columns of a table

2021-03-31 Thread GitBox


Sugamber commented on issue #2637:
URL: https://github.com/apache/hudi/issues/2637#issuecomment-810995008


   Thank you!!!






[GitHub] [hudi] RocMarshal commented on a change in pull request #2745: [HUDI-1737][hudi-client] Code Cleanup: Extract common method in HoodieCreateHandle & FlinkCreateHandle

2021-03-31 Thread GitBox


RocMarshal commented on a change in pull request #2745:
URL: https://github.com/apache/hudi/pull/2745#discussion_r604791723



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieCreateHandle.java
##
@@ -179,29 +181,47 @@ public IOType getIOType() {
 

Review comment:
   All of the differences above are caused by merging from the latest master 
branch.








[GitHub] [hudi] vingov edited a comment on pull request #2747: [HUDI_1743] Added support for SqlFileBasedTransformer

2021-03-31 Thread GitBox


vingov edited a comment on pull request #2747:
URL: https://github.com/apache/hudi/pull/2747#issuecomment-81095


   @yanghua - I've already filed a Jira and linked it in the description 
https://issues.apache.org/jira/browse/HUDI-1743
   
   Updated the title as well. 






[GitHub] [hudi] RocMarshal commented on a change in pull request #2745: [HUDI-1737][hudi-client] Code Cleanup: Extract common method in HoodieCreateHandle & FlinkCreateHandle

2021-03-31 Thread GitBox


RocMarshal commented on a change in pull request #2745:
URL: https://github.com/apache/hudi/pull/2745#discussion_r604785572



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieCreateHandle.java
##
@@ -180,28 +180,34 @@ public IOType getIOType() {
   fileWriter.close();
 
   HoodieWriteStat stat = new HoodieWriteStat();
-  stat.setPartitionPath(writeStatus.getPartitionPath());
-  stat.setNumWrites(recordsWritten);
-  stat.setNumDeletes(recordsDeleted);
-  stat.setNumInserts(insertRecordsWritten);
-  stat.setPrevCommit(HoodieWriteStat.NULL_COMMIT);
-  stat.setFileId(writeStatus.getFileId());
-  stat.setPath(new Path(config.getBasePath()), path);
   long fileSizeInBytes = FSUtils.getFileSize(fs, path);
   stat.setTotalWriteBytes(fileSizeInBytes);
   stat.setFileSizeInBytes(fileSizeInBytes);
-  stat.setTotalWriteErrors(writeStatus.getTotalErrorRecords());

Review comment:
   @danny0405 Thank you for your suggestions. I made some changes, please 
take a look.








[GitHub] [hudi] vingov edited a comment on pull request #2747: [MINOR] Added support for SqlFileBasedTransformer

2021-03-31 Thread GitBox


vingov edited a comment on pull request #2747:
URL: https://github.com/apache/hudi/pull/2747#issuecomment-81095


   @yanghua - I've already filed a Jira and linked it in the description 
https://issues.apache.org/jira/browse/HUDI-1743
   
   Do you want me to update the title of this pull request?






[GitHub] [hudi] vingov commented on pull request #2747: [MINOR] Added support for SqlFileBasedTransformer

2021-03-31 Thread GitBox


vingov commented on pull request #2747:
URL: https://github.com/apache/hudi/pull/2747#issuecomment-81095


   @yanghua - I've already filed a Jira and linked it in the description 
https://issues.apache.org/jira/browse/HUDI-1743
   
   Do you want me to update the title of this pull request?
   
   I will fix this build issue as well. 






[GitHub] [hudi] codecov-io edited a comment on pull request #2742: [HUDI-1738] Emit deletes for flink MOR table streaming read

2021-03-31 Thread GitBox


codecov-io edited a comment on pull request #2742:
URL: https://github.com/apache/hudi/pull/2742#issuecomment-810101347


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2742?src=pr=h1) Report
   > Merging 
[#2742](https://codecov.io/gh/apache/hudi/pull/2742?src=pr=desc) (e99b8c5) 
into 
[master](https://codecov.io/gh/apache/hudi/commit/050626ad6cb8bbd06d138456ccc00dddcff2a860?el=desc)
 (050626a) will **decrease** coverage by `42.64%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2742/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2742?src=pr=tree)
   
   ```diff
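   @@             Coverage Diff              @@
   ##             master   #2742       +/-   ##
   ============================================
   - Coverage     52.04%   9.40%   -42.65%
   + Complexity     3625      48     -3577
   ============================================
     Files           479      54      -425
     Lines         22804    1989    -20815
     Branches       2415     236     -2179
   ============================================
   - Hits          11868     187    -11681
   + Misses         9911    1789     -8122
   + Partials       1025      13     -1012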
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `9.40% <ø> (-60.39%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2742?src=pr=tree) | Coverage Δ 
| Complexity Δ | |
   |---|---|---|---|
   | 
[...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | 
[...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | 
[...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | 
[...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | 
[...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | 
[...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | 
[...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
   | 
[...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2742/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=)
 | `0.00% <0.00%> 

[GitHub] [hudi] codecov-io edited a comment on pull request #2325: [HUDI-699]Fix CompactionCommand and add unit test for CompactionCommand

2021-03-31 Thread GitBox


codecov-io edited a comment on pull request #2325:
URL: https://github.com/apache/hudi/pull/2325#issuecomment-742860619


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2325?src=pr=h1) Report
   > Merging 
[#2325](https://codecov.io/gh/apache/hudi/pull/2325?src=pr=desc) (9cb7bf6) 
into 
[master](https://codecov.io/gh/apache/hudi/commit/aa0da72c59cb3764205f90b025b24d1640727795?el=desc)
 (aa0da72) will **increase** coverage by `0.25%`.
   > The diff coverage is `27.58%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/2325/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2325?src=pr=tree)
   
   ```diff
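   @@             Coverage Diff              @@
   ##             master    #2325      +/-   ##
   ============================================
   + Coverage     52.06%   52.31%    +0.25%
   - Complexity     3625     3645       +20
   ============================================
     Files           479      479
     Lines         22804    22836       +32
     Branches       2415     2416        +1
   ============================================
   + Hits          11872    11946       +74
   + Misses         9907     9858       -49
   - Partials       1025     1032        +7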
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `40.37% <38.09%> (+3.35%)` | `0.00 <3.00> (ø)` | |
   | hudiclient | `∅ <ø> (∅)` | `0.00 <ø> (ø)` | |
   | hudicommon | `50.81% <0.00%> (-0.16%)` | `0.00 <0.00> (ø)` | |
   | hudiflink | `56.01% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudihadoopmr | `33.44% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisparkdatasource | `70.87% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisync | `45.47% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | huditimelineservice | `64.36% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiutilities | `69.78% <ø> (+0.05%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/2325?src=pr=tree) | Coverage Δ 
| Complexity Δ | |
   |---|---|---|---|
   | 
[...a/org/apache/hudi/cli/HoodieTableHeaderFields.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL0hvb2RpZVRhYmxlSGVhZGVyRmllbGRzLmphdmE=)
 | `0.00% <ø> (ø)` | `0.00 <0.00> (ø)` | |
| [...n/java/org/apache/hudi/cli/commands/SparkMain.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL2NvbW1hbmRzL1NwYXJrTWFpbi5qYXZh) | `6.95% <0.00%> (+0.11%)` | `4.00 <0.00> (ø)` | |
| [.../common/table/timeline/HoodieArchivedTimeline.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZUFyY2hpdmVkVGltZWxpbmUuamF2YQ==) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` | |
| [...i/common/table/timeline/TimelineMetadataUtils.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL1RpbWVsaW5lTWV0YWRhdGFVdGlscy5qYXZh) | `70.17% <0.00%> (-2.56%)` | `17.00 <0.00> (ø)` | |
| [...rg/apache/hudi/cli/commands/CompactionCommand.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL2NvbW1hbmRzL0NvbXBhY3Rpb25Db21tYW5kLmphdmE=) | `30.18% <52.17%> (+29.38%)` | `22.00 <3.00> (+20.00)` | |
| [...ache/hudi/common/fs/inline/InMemoryFileSystem.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL2lubGluZS9Jbk1lbW9yeUZpbGVTeXN0ZW0uamF2YQ==) | `79.31% <0.00%> (-10.35%)` | `15.00% <0.00%> (-1.00%)` | |
| [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | `26.00% <0.00%> (ø%)` | |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2325/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.72% <0.00%> (+0.34%)` | `56.00% <0.00%> (+1.00%)` | |
   






[GitHub] [hudi] yanghua commented on pull request #2747: [MINOR] Added support for SqlFileBasedTransformer

2021-03-31 Thread GitBox


yanghua commented on pull request #2747:
URL: https://github.com/apache/hudi/pull/2747#issuecomment-810887489


   @vingov Thanks for your contribution. IMO, this is a feature, so it would be 
better to file a Jira ticket to track it.






[GitHub] [hudi] yanghua merged pull request #2746: [MINOR] Delete useless UpsertPartitioner for flink integration

2021-03-31 Thread GitBox


yanghua merged pull request #2746:
URL: https://github.com/apache/hudi/pull/2746


   






[hudi] branch master updated (aa0da72 -> fe16d0d)

2021-03-31 Thread vinoyang
This is an automated email from the ASF dual-hosted git repository.

vinoyang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git.


from aa0da72  Preparation for Avro update (#2650)
 add fe16d0d  [MINOR] Delete useless UpsertPartitioner for flink 
integration (#2746)

No new revisions were added by this update.

Summary of changes:
 .../table/action/commit/UpsertPartitioner.java | 319 -
 1 file changed, 319 deletions(-)
 delete mode 100644 
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java


[GitHub] [hudi] liujinhui1994 commented on pull request #2710: [RFC-20][HUDI-648] Implement error log/table for Datasource/DeltaStreamer/WriteClient/Compaction writes

2021-03-31 Thread GitBox


liujinhui1994 commented on pull request #2710:
URL: https://github.com/apache/hudi/pull/2710#issuecomment-810842963


   @lw309637554  Please help review






[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2710: [RFC-20][HUDI-648] Implement error log/table for Datasource/DeltaStreamer/WriteClient/Compaction writes

2021-03-31 Thread GitBox


liujinhui1994 commented on a change in pull request #2710:
URL: https://github.com/apache/hudi/pull/2710#discussion_r604658298



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
##
@@ -1032,6 +1033,37 @@ public String getWriteMetaKeyPrefixes() {
     return props.getProperty(WRITE_META_KEY_PREFIXES_PROP);
   }
 
+  /**
+   * Error table configs.
+   */
+  public boolean enableErrorTable() {
+    return Boolean.parseBoolean(props.getProperty(HoodieErrorTableConfig.ERROR_TABLE_ENABLE_PROP));
+  }
+
+  public String getErrorTableBasePath() {
+    return props.getProperty(HoodieErrorTableConfig.ERROR_TABLE_BASE_PATH_PROP);
+  }
+
+  public String getErrorTableName() {
+    return props.getProperty(HoodieErrorTableConfig.ERROR_TABLE_NAME_PROP);
+  }
+
+  public int getErrorTableInsertParallelism() {

Review comment:
   I think it is reasonable here; you need to get these values from 
HoodieWriteConfig.
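
   For illustration only, a minimal sketch of property-backed getters in the style of the hunk above. The class name, property keys, and default parallelism are hypothetical; the real `HoodieErrorTableConfig` constants are not shown in this thread. It mainly shows that a missing boolean key parses to `false`, while an integer key needs a default.

```java
import java.util.Properties;

// Hypothetical sketch; not the actual HoodieWriteConfig or HoodieErrorTableConfig.
final class ErrorTableConfigSketch {
  // Assumed keys and default, chosen only for this example.
  static final String ENABLE_KEY = "example.error.table.enable";
  static final String INSERT_PARALLELISM_KEY = "example.error.table.insert.parallelism";
  static final int DEFAULT_INSERT_PARALLELISM = 200;

  private final Properties props;

  ErrorTableConfigSketch(Properties props) {
    this.props = props;
  }

  boolean enableErrorTable() {
    // Boolean.parseBoolean(null) returns false, so a missing key means "disabled".
    return Boolean.parseBoolean(props.getProperty(ENABLE_KEY));
  }

  int getErrorTableInsertParallelism() {
    // Integer.parseInt(null) throws NumberFormatException, so fall back to a default.
    return Integer.parseInt(
        props.getProperty(INSERT_PARALLELISM_KEY, String.valueOf(DEFAULT_INSERT_PARALLELISM)));
  }
}
```

   With an empty `Properties` object, `enableErrorTable()` is `false` and the parallelism falls back to the assumed default.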
   








[GitHub] [hudi] aditiwari01 commented on issue #2748: Are we stopping support for spark <2.4.4 with hudi 0.8.0?

2021-03-31 Thread GitBox


aditiwari01 commented on issue #2748:
URL: https://github.com/apache/hudi/issues/2748#issuecomment-810832656


   @garyli1019  Yes. More specifically, the mentioned method uses the constant 
"SparkDataSourceUtils.PARTITIONING_COLUMNS_KEY". 
   
   This constant was introduced in Spark v2.4.2, so for versions <2.4.2 
compilation fails with "object not found".
   
   For reference:
   spark v2.4.1 
(https://github.com/apache/spark/blob/v2.4.1/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala)
  PARTITIONING_COLUMNS_KEY not present.
   
   spark v2.4.2 
(https://github.com/apache/spark/blob/v2.4.2/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala)
 PARTITIONING_COLUMNS_KEY is present.
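
   As a rough illustration of the version boundary described above, here is a small sketch that checks whether a given Spark version string is at or above 2.4.2. `SparkVersionGate` and its methods are hypothetical and not part of Hudi or Spark, and the parser assumes purely numeric dotted versions (no `-SNAPSHOT` suffixes). A runtime check like this does not fix the compile error itself; it could only be used to fail fast or to select a build profile.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Hudi or Spark.
final class SparkVersionGate {
  // First release reported above to define PARTITIONING_COLUMNS_KEY.
  static final String MIN_VERSION = "2.4.2";

  // Compares dotted numeric versions component by component; missing components count as 0.
  static int compareVersions(String a, String b) {
    int[] x = parse(a);
    int[] y = parse(b);
    int n = Math.max(x.length, y.length);
    for (int i = 0; i < n; i++) {
      int xi = i < x.length ? x[i] : 0;
      int yi = i < y.length ? y[i] : 0;
      if (xi != yi) {
        return Integer.compare(xi, yi);
      }
    }
    return 0;
  }

  static boolean hasPartitioningColumnsKey(String sparkVersion) {
    return compareVersions(sparkVersion, MIN_VERSION) >= 0;
  }

  // Assumes every component is a plain integer.
  private static int[] parse(String version) {
    return Arrays.stream(version.split("\\.")).mapToInt(Integer::parseInt).toArray();
  }
}
```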






[GitHub] [hudi] danny0405 commented on a change in pull request #2745: [HUDI-1737][hudi-client] Code Cleanup: Extract common method in HoodieCreateHandle & FlinkCreateHandle

2021-03-31 Thread GitBox


danny0405 commented on a change in pull request #2745:
URL: https://github.com/apache/hudi/pull/2745#discussion_r604647518



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieCreateHandle.java
##
@@ -180,28 +180,34 @@ public IOType getIOType() {
   fileWriter.close();
 
   HoodieWriteStat stat = new HoodieWriteStat();
-  stat.setPartitionPath(writeStatus.getPartitionPath());
-  stat.setNumWrites(recordsWritten);
-  stat.setNumDeletes(recordsDeleted);
-  stat.setNumInserts(insertRecordsWritten);
-  stat.setPrevCommit(HoodieWriteStat.NULL_COMMIT);
-  stat.setFileId(writeStatus.getFileId());
-  stat.setPath(new Path(config.getBasePath()), path);
   long fileSizeInBytes = FSUtils.getFileSize(fs, path);
   stat.setTotalWriteBytes(fileSizeInBytes);
   stat.setFileSizeInBytes(fileSizeInBytes);
-  stat.setTotalWriteErrors(writeStatus.getTotalErrorRecords());

Review comment:
   The only difference between `HoodieCreateHandle` and `FlinkCreateHandle` 
is how they fetch the file size. Can we extract that part out? Usually we 
override only the part that differs.
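
   A minimal sketch of what that extraction could look like, assuming a template-method style. All class, field, and method names below are simplified stand-ins rather than the real Hudi classes, and the way the Flink variant obtains the size is invented for the example.

```java
// Simplified stand-in for HoodieWriteStat; only the fields needed here.
final class WriteStatSketch {
  String partitionPath;
  long numWrites;
  long totalWriteBytes;
  long fileSizeInBytes;
}

// Stand-in for HoodieCreateHandle: the stat population is shared by all subclasses.
class CreateHandleSketch {
  protected final String partitionPath;
  protected final long recordsWritten;
  private final long bytesOnStorage; // pretend result of a file system lookup

  CreateHandleSketch(String partitionPath, long recordsWritten, long bytesOnStorage) {
    this.partitionPath = partitionPath;
    this.recordsWritten = recordsWritten;
    this.bytesOnStorage = bytesOnStorage;
  }

  // Common logic lives in one place and calls the overridable hook.
  final WriteStatSketch buildWriteStat() {
    WriteStatSketch stat = new WriteStatSketch();
    stat.partitionPath = partitionPath;
    stat.numWrites = recordsWritten;
    long size = computeFileSizeInBytes();
    stat.totalWriteBytes = size;
    stat.fileSizeInBytes = size;
    return stat;
  }

  // The only engine-specific step.
  protected long computeFileSizeInBytes() {
    return bytesOnStorage;
  }
}

// Stand-in for FlinkCreateHandle: overrides only how the size is fetched.
class FlinkCreateHandleSketch extends CreateHandleSketch {
  private final long bytesReportedByWriter; // hypothetical alternative source

  FlinkCreateHandleSketch(String partitionPath, long recordsWritten, long bytesReportedByWriter) {
    super(partitionPath, recordsWritten, 0L);
    this.bytesReportedByWriter = bytesReportedByWriter;
  }

  @Override
  protected long computeFileSizeInBytes() {
    return bytesReportedByWriter;
  }
}
```

   With this shape the subclass carries only the differing step, and the shared setters cannot drift apart between the two handles.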







