[jira] [Assigned] (HUDI-1181) Decimal type display issue for record key field

2020-08-11 Thread Wenning Ding (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenning Ding reassigned HUDI-1181: -- Assignee: Wenning Ding > Decimal type display issue for record key field >

[GitHub] [hudi] zhedoubushishi opened a new pull request #1953: [HUDI-1181] Fix decimal type display issue for record key field

2020-08-11 Thread GitBox
zhedoubushishi opened a new pull request #1953: URL: https://github.com/apache/hudi/pull/1953 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of

[jira] [Updated] (HUDI-1181) Decimal type display issue for record key field

2020-08-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1181: - Labels: pull-request-available (was: ) > Decimal type display issue for record key field >

[jira] [Updated] (HUDI-1181) Decimal type display issue for record key field

2020-08-11 Thread Wenning Ding (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenning Ding updated HUDI-1181: --- Description: When using *fixed_len_byte_array* decimal type as Hudi record key, Hudi would not

[jira] [Updated] (HUDI-1181) Decimal type display issue for record key field

2020-08-11 Thread Wenning Ding (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenning Ding updated HUDI-1181: --- Description: When using *fixed_len_byte_array* decimal type as Hudi record key, Hudi would not

[jira] [Updated] (HUDI-1181) Decimal type display issue for record key field

2020-08-11 Thread Wenning Ding (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenning Ding updated HUDI-1181: --- Description: When using *fixed_len_byte_array* decimal type as Hudi record key, Hudi would not

[jira] [Updated] (HUDI-1181) Decimal type display issue for record key field

2020-08-11 Thread Wenning Ding (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenning Ding updated HUDI-1181: --- Description: When using ```fixed_len_byte_array``` decimal type as Hudi record key, Hudi would not

[jira] [Updated] (HUDI-1181) Decimal type display issue for record key field

2020-08-11 Thread Wenning Ding (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenning Ding updated HUDI-1181: --- Description: When using ```fixed_len_byte_array``` decimal type as Hudi record key, Hudi would not

[jira] [Created] (HUDI-1181) Decimal type display issue for record key field

2020-08-11 Thread Wenning Ding (Jira)
Wenning Ding created HUDI-1181: -- Summary: Decimal type display issue for record key field Key: HUDI-1181 URL: https://issues.apache.org/jira/browse/HUDI-1181 Project: Apache Hudi Issue Type:

[GitHub] [hudi] tooptoop4 commented on issue #1813: ERROR HoodieDeltaStreamer: Got error running delta sync once.

2020-08-11 Thread GitBox
tooptoop4 commented on issue #1813: URL: https://github.com/apache/hudi/issues/1813#issuecomment-672595024 @bschell which PR fixes it? This is an automated message from the Apache Git Service. To respond to the message,

[jira] [Created] (HUDI-1180) Upgrade HBase to 2.3.3

2020-08-11 Thread Wenning Ding (Jira)
Wenning Ding created HUDI-1180: -- Summary: Upgrade HBase to 2.3.3 Key: HUDI-1180 URL: https://issues.apache.org/jira/browse/HUDI-1180 Project: Apache Hudi Issue Type: Improvement

Build failed in Jenkins: hudi-snapshot-deployment-0.5 #367

2020-08-11 Thread Apache Jenkins Server
See Changes: -- [...truncated 2.58 KB...] cdi-api-1.0.jar cdi-api.license commons-cli-1.4.jar commons-cli.license commons-io-2.5.jar commons-io.license

[GitHub] [hudi] vinothchandar commented on issue #1837: [SUPPORT]S3 file listing causing compaction to get eventually slow

2020-08-11 Thread GitBox
vinothchandar commented on issue #1837: URL: https://github.com/apache/hudi/issues/1837#issuecomment-672560925 and this is the last such place (cleaner, rollback are all incremental now). cc @prashantwason to the rescue ;)

[GitHub] [hudi] bvaradar commented on issue #827: java.lang.ClassNotFoundException: com.uber.hoodie.hadoop.HoodieInputFormat

2020-08-11 Thread GitBox
bvaradar commented on issue #827: URL: https://github.com/apache/hudi/issues/827#issuecomment-672558523 @saumyasuhagiya : This is a very old ticket about the hudi 0.4.x version. Are you using 0.4.x or 0.5.x? If it is newer, please open a new ticket with complete context.

[GitHub] [hudi] saumyasuhagiya edited a comment on issue #827: java.lang.ClassNotFoundException: com.uber.hoodie.hadoop.HoodieInputFormat

2020-08-11 Thread GitBox
saumyasuhagiya edited a comment on issue #827: URL: https://github.com/apache/hudi/issues/827#issuecomment-672555379 @malanb5 @n3nash I have tried that as well; it's still failing. I am using the hudi spark bundle and the above dependency on a databricks cluster.

[GitHub] [hudi] saumyasuhagiya commented on issue #827: java.lang.ClassNotFoundException: com.uber.hoodie.hadoop.HoodieInputFormat

2020-08-11 Thread GitBox
saumyasuhagiya commented on issue #827: URL: https://github.com/apache/hudi/issues/827#issuecomment-672555379 @malanb5 @n3nash I have tried that as well; it's still failing. I am using the hudi spark bundle and the above dependency.

[GitHub] [hudi] bvaradar opened a new pull request #1952: [Not For Merging] Debug integ Tests

2020-08-11 Thread GitBox
bvaradar opened a new pull request #1952: URL: https://github.com/apache/hudi/pull/1952

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468975269 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/TimestampBasedKeyGenerator.java ## @@ -125,49 +130,58 @@ public

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468884350 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/TimestampBasedKeyGenerator.java ## @@ -177,4 +191,26 @@ private long

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468883660 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/SimpleKeyGenerator.java ## @@ -55,21 +51,22 @@ public SimpleKeyGenerator(TypedProperties

[GitHub] [hudi] bvaradar commented on issue #1813: ERROR HoodieDeltaStreamer: Got error running delta sync once.

2020-08-11 Thread GitBox
bvaradar commented on issue #1813: URL: https://github.com/apache/hudi/issues/1813#issuecomment-672433664 @jcunhafonte : @bschell confirmed it works in master. Can you try using master or wait for 0.6? (The release should happen in a week's time.)

[GitHub] [hudi] bvaradar closed issue #1813: ERROR HoodieDeltaStreamer: Got error running delta sync once.

2020-08-11 Thread GitBox
bvaradar closed issue #1813: URL: https://github.com/apache/hudi/issues/1813

[jira] [Closed] (HUDI-1146) DeltaStreamer fails to start when No updated records + schemaProvider not supplied

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan closed HUDI-1146. > DeltaStreamer fails to start when No updated records + schemaProvider not > supplied >

[jira] [Closed] (HUDI-1091) Handle empty input batch gracefully in ParquetDFSSource

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan closed HUDI-1091. > Handle empty input batch gracefully in ParquetDFSSource >

[jira] [Resolved] (HUDI-1146) DeltaStreamer fails to start when No updated records + schemaProvider not supplied

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan resolved HUDI-1146. -- Resolution: Fixed > DeltaStreamer fails to start when No updated records +

[jira] [Updated] (HUDI-1146) DeltaStreamer fails to start when No updated records + schemaProvider not supplied

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1146: - Status: In Progress (was: Open) > DeltaStreamer fails to start when No updated records +

[jira] [Updated] (HUDI-1146) DeltaStreamer fails to start when No updated records + schemaProvider not supplied

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1146: - Status: Open (was: New) > DeltaStreamer fails to start when No updated records +

[jira] [Assigned] (HUDI-1146) DeltaStreamer fails to start when No updated records + schemaProvider not supplied

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan reassigned HUDI-1146: Assignee: Balaji Varadarajan > DeltaStreamer fails to start when No updated

[jira] [Commented] (HUDI-1091) Handle empty input batch gracefully in ParquetDFSSource

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175918#comment-17175918 ] Balaji Varadarajan commented on HUDI-1091: -- [~bschell] confirmed this is resolved in master.

[jira] [Resolved] (HUDI-1091) Handle empty input batch gracefully in ParquetDFSSource

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan resolved HUDI-1091. -- Resolution: Fixed > Handle empty input batch gracefully in ParquetDFSSource >

[jira] [Updated] (HUDI-1091) Handle empty input batch gracefully in ParquetDFSSource

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1091: - Status: In Progress (was: Open) > Handle empty input batch gracefully in

[GitHub] [hudi] bvaradar commented on issue #1837: [SUPPORT]S3 file listing causing compaction to get eventually slow

2020-08-11 Thread GitBox
bvaradar commented on issue #1837: URL: https://github.com/apache/hudi/issues/1837#issuecomment-672430487 Thanks @steveloughran : Good to know. We are looking at an approach using consolidated metadata to avoid file listing (RFC-15) in the first place. @umehrot2 : What are your thoughts

[GitHub] [hudi] bvaradar commented on issue #1902: [SUPPORT] Hudi dont put the same day in the same file

2020-08-11 Thread GitBox
bvaradar commented on issue #1902: URL: https://github.com/apache/hudi/issues/1902#issuecomment-672406725 With bulk insert, the parallelism configuration determines the lower bound on the number of files. Since you started with bulk insert, you are seeing that many files. Hudi

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468934922 ## File path: hudi-client/src/main/java/org/apache/hudi/io/storage/HoodieRowParquetWriteSupport.java ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468932578 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] rubenssoto commented on issue #1902: [SUPPORT] Hudi dont put the same day in the same file

2020-08-11 Thread GitBox
rubenssoto commented on issue #1902: URL: https://github.com/apache/hudi/issues/1902#issuecomment-672378827 Hi, With bulk_insert my data was organized very well, so I started a streaming job with upsert on the same data.

[GitHub] [hudi] umehrot2 commented on issue #1936: Hudi Query Error

2020-08-11 Thread GitBox
umehrot2 commented on issue #1936: URL: https://github.com/apache/hudi/issues/1936#issuecomment-672374251 @harishchanderramesh What is the configured s3 path of your hudi table? Does it start with `s3a://` or `s3://`? If it starts with `s3a`, you may want to try using `s3://` once.

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468901418 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468900813 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468900813 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] rubenssoto commented on issue #1902: [SUPPORT] Hudi dont put the same day in the same file

2020-08-11 Thread GitBox
rubenssoto commented on issue #1902: URL: https://github.com/apache/hudi/issues/1902#issuecomment-672306115 Thank you so much for your help, it worked. Last question: Hudi organized the data very well by files but created some small files. Is there any way to solve this?

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468879180 ## File path: hudi-client/src/main/java/org/apache/hudi/client/model/HoodieInternalRow.java ## @@ -0,0 +1,243 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468885750 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468885195 ## File path: hudi-client/src/main/java/org/apache/hudi/io/storage/HoodieRowParquetWriteSupport.java ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468884350 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/TimestampBasedKeyGenerator.java ## @@ -177,4 +191,26 @@ private long

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468883660 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/SimpleKeyGenerator.java ## @@ -55,21 +51,22 @@ public SimpleKeyGenerator(TypedProperties

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468882653 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468881817 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468879180 ## File path: hudi-client/src/main/java/org/apache/hudi/client/model/HoodieInternalRow.java ## @@ -0,0 +1,243 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] vinothchandar commented on pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
vinothchandar commented on pull request #1944: URL: https://github.com/apache/hudi/pull/1944#issuecomment-672288068 In some sense, due to the bundling changes, this feels very last-minute to validate further. We have to rely on 1-2 rounds of RC testing to weed things out.

[GitHub] [hudi] vinothchandar commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
vinothchandar commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468873425 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -159,5 +172,24 @@ avro compile + + Review comment: ah. did

[jira] [Updated] (HUDI-1179) Add Row tests to all key generator test classes

2020-08-11 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1179: - Fix Version/s: 0.6.1 > Add Row tests to all key generator test classes >

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468868307 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software

[jira] [Created] (HUDI-1179) Add Row tests to all key generator test classes

2020-08-11 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-1179: - Summary: Add Row tests to all key generator test classes Key: HUDI-1179 URL: https://issues.apache.org/jira/browse/HUDI-1179 Project: Apache Hudi

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468867821 ## File path: hudi-spark/src/test/scala/org/apache/hudi/TestDataSourceDefaults.scala ## @@ -34,13 +36,28 @@ import org.scalatest.Assertions.fail class

[jira] [Assigned] (HUDI-1179) Add Row tests to all key generator test classes

2020-08-11 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-1179: - Assignee: sivabalan narayanan > Add Row tests to all key generator test classes

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468866921 ## File path: hudi-client/src/main/java/org/apache/hudi/keygen/KeyGenerator.java ## @@ -51,4 +53,32 @@ protected KeyGenerator(TypedProperties config) {

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468866012 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -76,7 +76,12 @@ org.apache.hbase:hbase-common

[GitHub] [hudi] bvaradar commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
bvaradar commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468857201 ## File path: hudi-client/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java ## @@ -670,7 +670,9 @@ public Builder withPath(String basePath) {

[GitHub] [hudi] nsivabalan edited a comment on pull request #1760: [HUDI-1040] Update apis for spark3 compatibility

2020-08-11 Thread GitBox
nsivabalan edited a comment on pull request #1760: URL: https://github.com/apache/hudi/pull/1760#issuecomment-672262343 @bschell @vinothchandar : I gave it a shot on this. I don't have permission to push to your branch to update this PR. Diff1: adding support to spark 3. but does not

[GitHub] [hudi] nsivabalan edited a comment on pull request #1760: [HUDI-1040] Update apis for spark3 compatibility

2020-08-11 Thread GitBox
nsivabalan edited a comment on pull request #1760: URL: https://github.com/apache/hudi/pull/1760#issuecomment-672262343 @bschell : I gave it a shot on this. I don't have permission to push to your branch to update this PR. Diff1: adding support to spark 3. but does not upgrade spark

[GitHub] [hudi] nsivabalan edited a comment on pull request #1760: [HUDI-1040] Update apis for spark3 compatibility

2020-08-11 Thread GitBox
nsivabalan edited a comment on pull request #1760: URL: https://github.com/apache/hudi/pull/1760#issuecomment-672262343 @bschell : I gave it a shot on this. I don't have permission to push to your branch to update this PR. Diff1: adding support to spark 3. but does not upgrade spark

[GitHub] [hudi] nsivabalan commented on pull request #1760: [HUDI-1040] Update apis for spark3 compatibility

2020-08-11 Thread GitBox
nsivabalan commented on pull request #1760: URL: https://github.com/apache/hudi/pull/1760#issuecomment-672262343 @bschell : I gave it a shot on this. I don't have permission to push to your branch to update this PR. Diff1: adding support to spark 3. but does not upgrade spark version

[GitHub] [hudi] nsivabalan edited a comment on pull request #1951: [WIP HUDI 1040 Part2] Upgrading to spark 3.0.0

2020-08-11 Thread GitBox
nsivabalan edited a comment on pull request #1951: URL: https://github.com/apache/hudi/pull/1951#issuecomment-672261179 Compilation error: ``` [ERROR]

[GitHub] [hudi] nsivabalan commented on pull request #1951: [WIP HUDI 1040 Part2] Upgrading to spark 3.0.0

2020-08-11 Thread GitBox
nsivabalan commented on pull request #1951: URL: https://github.com/apache/hudi/pull/1951#issuecomment-672261179 Compilation error: ``` [INFO] BUILD FAILURE [INFO] [INFO] Total time: 36.866 s [INFO]

[GitHub] [hudi] nsivabalan opened a new pull request #1951: [WIP HUDI 1040 Part2] Upgrading to spark 3.0.0

2020-08-11 Thread GitBox
nsivabalan opened a new pull request #1951: URL: https://github.com/apache/hudi/pull/1951

[GitHub] [hudi] nsivabalan opened a new pull request #1950: [WIP HUDI 1040] Supporting Spark 3

2020-08-11 Thread GitBox
nsivabalan opened a new pull request #1950: URL: https://github.com/apache/hudi/pull/1950

[jira] [Commented] (HUDI-1146) DeltaStreamer fails to start when No updated records + schemaProvider not supplied

2020-08-11 Thread Brandon Scheller (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175818#comment-17175818 ] Brandon Scheller commented on HUDI-1146: Fixed by https://github.com/apache/hudi/pull/1921 >

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468819625 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -76,7 +76,12 @@ org.apache.hbase:hbase-common

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468818565 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -159,5 +172,24 @@ avro compile + + Review comment: Here is the

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468799779 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -76,7 +76,12 @@ org.apache.hbase:hbase-common

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468799403 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -76,7 +76,12 @@ org.apache.hbase:hbase-common

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468797799 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -159,5 +172,24 @@ avro compile + + Review comment: I remember its

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468797076 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -76,7 +76,12 @@ org.apache.hbase:hbase-common

[GitHub] [hudi] vinothchandar commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
vinothchandar commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468745957 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468783171 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieHiveUtils.java ## @@ -38,6 +40,9 @@ public static final String

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468782242 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieParquetInputFormat.java ## @@ -63,6 +62,7 @@ * that does not correspond to a

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468782242 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieParquetInputFormat.java ## @@ -63,6 +62,7 @@ * that does not correspond to a

[GitHub] [hudi] bhasudha opened a new pull request #1949: [MINOR] Fix release script for onetime uploading of gpgkeys

2020-08-11 Thread GitBox
bhasudha opened a new pull request #1949: URL: https://github.com/apache/hudi/pull/1949

[GitHub] [hudi] bvaradar commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
bvaradar commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468723660 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -76,7 +76,12 @@ org.apache.hbase:hbase-common

[GitHub] [hudi] bschell commented on pull request #1760: [HUDI-1040] Update apis for spark3 compatibility

2020-08-11 Thread GitBox
bschell commented on pull request #1760: URL: https://github.com/apache/hudi/pull/1760#issuecomment-672067962 @vinothchandar While this works, the reflection does hurt performance as this is a frequently used path. I was looking into better options to work around the performance hit.

[GitHub] [hudi] bvaradar commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
bvaradar commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468702061 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -76,7 +76,12 @@ org.apache.hbase:hbase-common

[GitHub] [hudi] vinothchandar commented on pull request #1760: [HUDI-1040] Update apis for spark3 compatibility

2020-08-11 Thread GitBox
vinothchandar commented on pull request #1760: URL: https://github.com/apache/hudi/pull/1760#issuecomment-672053354 @bschell is this tested and ready to go? would like to get it into the RC if possible

[GitHub] [hudi] vinothchandar commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
vinothchandar commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468693275 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -159,5 +172,24 @@ avro compile + + Review comment: why are we

[GitHub] [hudi] tooptoop4 opened a new issue #1948: [SUPPORT] DMS example complains about dfs-source.properties

2020-08-11 Thread GitBox
tooptoop4 opened a new issue #1948: URL: https://github.com/apache/hudi/issues/1948 /home/ec2-user/spark_home/bin/spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer --jars "/home/ec2-user/spark-avro_2.11-2.4.6.jar" --master spark://redact:7077 --deploy-mode

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468656221 ## File path: hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala ## @@ -105,6 +104,22 @@ private[hudi] object HoodieSparkSqlWriter {

[GitHub] [hudi] wangxianghu commented on pull request #1901: [HUDI-532]Add java doc for hudi test suite test classes

2020-08-11 Thread GitBox
wangxianghu commented on pull request #1901: URL: https://github.com/apache/hudi/pull/1901#issuecomment-671970849 @yanghua this pr is ready for review now :) This is an automated message from the Apache Git Service. To

[GitHub] [hudi] wangxianghu commented on pull request #1900: [HUDI-531]Add java doc for hudi test suite general classes

2020-08-11 Thread GitBox
wangxianghu commented on pull request #1900: URL: https://github.com/apache/hudi/pull/1900#issuecomment-671971132 @yanghua this pr is ready for review now :)

[GitHub] [hudi] vinothchandar closed pull request #1512: [HUDI-763] Add hoodie.table.base.file.format option to hoodie.properties file

2020-08-11 Thread GitBox
vinothchandar closed pull request #1512: URL: https://github.com/apache/hudi/pull/1512 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [hudi] vinothchandar commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
vinothchandar commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468565839 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,163 @@ +/* + * Licensed to the Apache Software

[jira] [Assigned] (HUDI-1178) Test Flakiness in CI

2020-08-11 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-1178: - Assignee: Balaji Varadarajan > Test Flakiness in CI > > >

[jira] [Updated] (HUDI-1178) Test Flakiness in CI (ITTestHoodieSanity.testRunHoodieJavaAppOnSinglePartitionKeyCOWTable)

2020-08-11 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1178: -- Summary: Test Flakiness in CI

[jira] [Created] (HUDI-1178) Test Flakiness in CI

2020-08-11 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-1178: - Summary: Test Flakiness in CI Key: HUDI-1178 URL: https://issues.apache.org/jira/browse/HUDI-1178 Project: Apache Hudi Issue Type: Bug

[GitHub] [hudi] nsivabalan edited a comment on pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan edited a comment on pull request #1834: URL: https://github.com/apache/hudi/pull/1834#issuecomment-671892573 https://github.com/apache/hudi/pull/1834#discussion_r461939866 : because this is for Row, whereas the existing WriteStats is for HoodieRecords. Guess we should have

[GitHub] [hudi] nsivabalan commented on pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on pull request #1834: URL: https://github.com/apache/hudi/pull/1834#issuecomment-671892573 https://github.com/apache/hudi/pull/1834#discussion_r461939866 : because this is for Row, whereas the existing WriteStats is for HoodieRecords.

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468512749 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/GlobalDeleteKeyGenerator.java ## @@ -54,12 +51,17 @@ public String

[jira] [Updated] (HUDI-1177) fix key generator bug

2020-08-11 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui updated HUDI-1177: Affects Version/s: 0.6.0 > fix key generator bug > -- > > Key:

[jira] [Updated] (HUDI-1177) fix key generator bug

2020-08-11 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui updated HUDI-1177: Status: Open (was: New) > fix key generator bug > -- > > Key:

[jira] [Updated] (HUDI-1177) fix key generator bug

2020-08-11 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui updated HUDI-1177: Status: In Progress (was: Open) > fix key generator bug > -- > > Key:
