[GitHub] [hudi] bhasudha opened a new pull request #1949: [MINOR] Fix release script for onetime uploading of gpgkeys

2020-08-11 Thread GitBox
bhasudha opened a new pull request #1949: URL: https://github.com/apache/hudi/pull/1949 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468799779 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -76,7 +76,12 @@ org.apache.hbase:hbase-common

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468866921 ## File path: hudi-client/src/main/java/org/apache/hudi/keygen/KeyGenerator.java ## @@ -51,4 +53,32 @@ protected KeyGenerator(TypedProperties config) {

[jira] [Assigned] (HUDI-1179) Add Row tests to all key generator test classes

2020-08-11 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-1179: - Assignee: sivabalan narayanan > Add Row tests to all key generator test classes

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468883660 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/SimpleKeyGenerator.java ## @@ -55,21 +51,22 @@ public SimpleKeyGenerator(TypedProperties

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468882653 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468783171 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieHiveUtils.java ## @@ -38,6 +40,9 @@ public static final String

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468797799 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -159,5 +172,24 @@ avro compile + + Review comment: I remember its

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468797076 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -76,7 +76,12 @@ org.apache.hbase:hbase-common

[GitHub] [hudi] bvaradar commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
bvaradar commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468723660 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -76,7 +76,12 @@ org.apache.hbase:hbase-common

[GitHub] [hudi] vinothchandar commented on pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
vinothchandar commented on pull request #1944: URL: https://github.com/apache/hudi/pull/1944#issuecomment-672288068 In some sense, due to the bundling changes, this feels very last-minute to validate further. We have to rely on 1-2 rounds of RC testing to weed things out.

[GitHub] [hudi] vinothchandar commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
vinothchandar commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468873425 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -159,5 +172,24 @@ avro compile + + Review comment: ah. did

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468881817 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468799403 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -76,7 +76,12 @@ org.apache.hbase:hbase-common

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468818565 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -159,5 +172,24 @@ avro compile + + Review comment: Here is the

[GitHub] [hudi] nsivabalan opened a new pull request #1950: [WIP HUDI 1040] Supporting Spark 3

2020-08-11 Thread GitBox
nsivabalan opened a new pull request #1950: URL: https://github.com/apache/hudi/pull/1950 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[GitHub] [hudi] bvaradar commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
bvaradar commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468857201 ## File path: hudi-client/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java ## @@ -670,7 +670,9 @@ public Builder withPath(String basePath) {

[jira] [Updated] (HUDI-1179) Add Row tests to all key generator test classes

2020-08-11 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1179: - Fix Version/s: 0.6.1 > Add Row tests to all key generator test classes >

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468819625 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -76,7 +76,12 @@ org.apache.hbase:hbase-common

[GitHub] [hudi] nsivabalan opened a new pull request #1951: [WIP HUDI 1040 Part2] Upgrading to spark 3.0.0

2020-08-11 Thread GitBox
nsivabalan opened a new pull request #1951: URL: https://github.com/apache/hudi/pull/1951 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[GitHub] [hudi] nsivabalan commented on pull request #1951: [WIP HUDI 1040 Part2] Upgrading to spark 3.0.0

2020-08-11 Thread GitBox
nsivabalan commented on pull request #1951: URL: https://github.com/apache/hudi/pull/1951#issuecomment-672261179 Compilation error: ``` [INFO] BUILD FAILURE [INFO] [INFO] Total time: 36.866 s [INFO]

[GitHub] [hudi] nsivabalan edited a comment on pull request #1951: [WIP HUDI 1040 Part2] Upgrading to spark 3.0.0

2020-08-11 Thread GitBox
nsivabalan edited a comment on pull request #1951: URL: https://github.com/apache/hudi/pull/1951#issuecomment-672261179 Compilation error: ``` [ERROR]

[GitHub] [hudi] nsivabalan edited a comment on pull request #1760: [HUDI-1040] Update apis for spark3 compatibility

2020-08-11 Thread GitBox
nsivabalan edited a comment on pull request #1760: URL: https://github.com/apache/hudi/pull/1760#issuecomment-672262343 @bschell : I gave it a shot on this. I don't have permission to push to your branch to update this PR. Diff1: adding support to spark 3. but does not upgrade spark

[GitHub] [hudi] nsivabalan edited a comment on pull request #1760: [HUDI-1040] Update apis for spark3 compatibility

2020-08-11 Thread GitBox
nsivabalan edited a comment on pull request #1760: URL: https://github.com/apache/hudi/pull/1760#issuecomment-672262343 @bschell @vinothchandar : I gave it a shot on this. I don't have permission to push to your branch to update this PR. Diff1: adding support to spark 3. but does not

[GitHub] [hudi] nsivabalan commented on pull request #1760: [HUDI-1040] Update apis for spark3 compatibility

2020-08-11 Thread GitBox
nsivabalan commented on pull request #1760: URL: https://github.com/apache/hudi/pull/1760#issuecomment-672262343 @bschell : I gave it a shot on this. I don't have permission to push to your branch to update this PR. Diff1: adding support to spark 3. but does not upgrade spark version

[GitHub] [hudi] nsivabalan edited a comment on pull request #1760: [HUDI-1040] Update apis for spark3 compatibility

2020-08-11 Thread GitBox
nsivabalan edited a comment on pull request #1760: URL: https://github.com/apache/hudi/pull/1760#issuecomment-672262343 @bschell : I gave it a shot on this. I don't have permission to push to your branch to update this PR. Diff1: adding support to spark 3. but does not upgrade spark

[jira] [Created] (HUDI-1179) Add Row tests to all key generator test classes

2020-08-11 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-1179: - Summary: Add Row tests to all key generator test classes Key: HUDI-1179 URL: https://issues.apache.org/jira/browse/HUDI-1179 Project: Apache Hudi

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468868307 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468867821 ## File path: hudi-spark/src/test/scala/org/apache/hudi/TestDataSourceDefaults.scala ## @@ -34,13 +36,28 @@ import org.scalatest.Assertions.fail class

[GitHub] [hudi] vinothchandar commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
vinothchandar commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468745957 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468879180 ## File path: hudi-client/src/main/java/org/apache/hudi/client/model/HoodieInternalRow.java ## @@ -0,0 +1,243 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468782242 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieParquetInputFormat.java ## @@ -63,6 +62,7 @@ * that does not correspond to a

[jira] [Commented] (HUDI-1146) DeltaStreamer fails to start when No updated records + schemaProvider not supplied

2020-08-11 Thread Brandon Scheller (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175818#comment-17175818 ] Brandon Scheller commented on HUDI-1146: Fixed by https://github.com/apache/hudi/pull/1921 >

[GitHub] [hudi] umehrot2 commented on a change in pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-11 Thread GitBox
umehrot2 commented on a change in pull request #1944: URL: https://github.com/apache/hudi/pull/1944#discussion_r468866012 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -76,7 +76,12 @@ org.apache.hbase:hbase-common

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468932578 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] bvaradar commented on issue #827: java.lang.ClassNotFoundException: com.uber.hoodie.hadoop.HoodieInputFormat

2020-08-11 Thread GitBox
bvaradar commented on issue #827: URL: https://github.com/apache/hudi/issues/827#issuecomment-672558523 @saumyasuhagiya : This is a very old ticket about Hudi version 0.4.x. Are you using 0.4.x or 0.5.x? If it is newer, please open a new ticket with complete context.

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468879180 ## File path: hudi-client/src/main/java/org/apache/hudi/client/model/HoodieInternalRow.java ## @@ -0,0 +1,243 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468885750 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] bvaradar commented on issue #1837: [SUPPORT]S3 file listing causing compaction to get eventually slow

2020-08-11 Thread GitBox
bvaradar commented on issue #1837: URL: https://github.com/apache/hudi/issues/1837#issuecomment-672430487 Thanks @steveloughran : Good to know. We are looking at an approach using consolidated metadata to avoid file listing (RFC-15) in the first place. @umehrot2 : What are your thoughts

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468885195 ## File path: hudi-client/src/main/java/org/apache/hudi/io/storage/HoodieRowParquetWriteSupport.java ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468884350 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/TimestampBasedKeyGenerator.java ## @@ -177,4 +191,26 @@ private long

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468900813 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] bvaradar commented on issue #1902: [SUPPORT] Hudi dont put the same day in the same file

2020-08-11 Thread GitBox
bvaradar commented on issue #1902: URL: https://github.com/apache/hudi/issues/1902#issuecomment-672406725 With bulk insert, the parallelism configuration determines the lower bound on the number of files. Since you started with bulk insert, you are seeing that many files. Hudi

[jira] [Closed] (HUDI-1091) Handle empty input batch gracefully in ParquetDFSSource

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan closed HUDI-1091. > Handle empty input batch gracefully in ParquetDFSSource >

[jira] [Resolved] (HUDI-1146) DeltaStreamer fails to start when No updated records + schemaProvider not supplied

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan resolved HUDI-1146. -- Resolution: Fixed > DeltaStreamer fails to start when No updated records +

[jira] [Updated] (HUDI-1146) DeltaStreamer fails to start when No updated records + schemaProvider not supplied

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1146: - Status: In Progress (was: Open) > DeltaStreamer fails to start when No updated records +

[jira] [Updated] (HUDI-1146) DeltaStreamer fails to start when No updated records + schemaProvider not supplied

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1146: - Status: Open (was: New) > DeltaStreamer fails to start when No updated records +

[jira] [Assigned] (HUDI-1146) DeltaStreamer fails to start when No updated records + schemaProvider not supplied

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan reassigned HUDI-1146: Assignee: Balaji Varadarajan > DeltaStreamer fails to start when No updated

[jira] [Closed] (HUDI-1146) DeltaStreamer fails to start when No updated records + schemaProvider not supplied

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan closed HUDI-1146. > DeltaStreamer fails to start when No updated records + schemaProvider not > supplied >

[GitHub] [hudi] bvaradar closed issue #1813: ERROR HoodieDeltaStreamer: Got error running delta sync once.

2020-08-11 Thread GitBox
bvaradar closed issue #1813: URL: https://github.com/apache/hudi/issues/1813 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [hudi] bvaradar commented on issue #1813: ERROR HoodieDeltaStreamer: Got error running delta sync once.

2020-08-11 Thread GitBox
bvaradar commented on issue #1813: URL: https://github.com/apache/hudi/issues/1813#issuecomment-672433664 @jcunhafonte : @bschell confirmed it works in master. Can you try using master or wait for 0.6? (The release should happen in a week's time.)

[GitHub] [hudi] rubenssoto commented on issue #1902: [SUPPORT] Hudi dont put the same day in the same file

2020-08-11 Thread GitBox
rubenssoto commented on issue #1902: URL: https://github.com/apache/hudi/issues/1902#issuecomment-672378827 Hi, With bulk_insert my data was organized very well, so I started a streaming job with upsert on the same data.

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468883660 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/SimpleKeyGenerator.java ## @@ -55,21 +51,22 @@ public SimpleKeyGenerator(TypedProperties

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468884350 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/TimestampBasedKeyGenerator.java ## @@ -177,4 +191,26 @@ private long

[GitHub] [hudi] saumyasuhagiya edited a comment on issue #827: java.lang.ClassNotFoundException: com.uber.hoodie.hadoop.HoodieInputFormat

2020-08-11 Thread GitBox
saumyasuhagiya edited a comment on issue #827: URL: https://github.com/apache/hudi/issues/827#issuecomment-672555379 @malanb5 @n3nash I have tried that as well; it is still failing. I am using the hudi spark bundle and the above dependency on a Databricks cluster.

[GitHub] [hudi] saumyasuhagiya commented on issue #827: java.lang.ClassNotFoundException: com.uber.hoodie.hadoop.HoodieInputFormat

2020-08-11 Thread GitBox
saumyasuhagiya commented on issue #827: URL: https://github.com/apache/hudi/issues/827#issuecomment-672555379 @malanb5 @n3nash I have tried that as well; it is still failing. I am using the hudi spark bundle and the above dependency.

[GitHub] [hudi] umehrot2 commented on issue #1936: Hudi Query Error

2020-08-11 Thread GitBox
umehrot2 commented on issue #1936: URL: https://github.com/apache/hudi/issues/1936#issuecomment-672374251 @harishchanderramesh What is the configured S3 path of your Hudi table? Does it start with `s3a://` or `s3://`? If it starts with `s3a://`, you may want to try using `s3://` once.

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468934922 ## File path: hudi-client/src/main/java/org/apache/hudi/io/storage/HoodieRowParquetWriteSupport.java ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] bvaradar opened a new pull request #1952: [Not For Merging] Debug integ Tests

2020-08-11 Thread GitBox
bvaradar opened a new pull request #1952: URL: https://github.com/apache/hudi/pull/1952

[GitHub] [hudi] rubenssoto commented on issue #1902: [SUPPORT] Hudi dont put the same day in the same file

2020-08-11 Thread GitBox
rubenssoto commented on issue #1902: URL: https://github.com/apache/hudi/issues/1902#issuecomment-672306115 Thank you so much for your help, it worked. Last question: Hudi organized the data very well by files but created some small files. Is there any way to solve this?

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468901418 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -0,0 +1,156 @@ +/* + * Licensed to the Apache Software

[jira] [Commented] (HUDI-1091) Handle empty input batch gracefully in ParquetDFSSource

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175918#comment-17175918 ] Balaji Varadarajan commented on HUDI-1091: -- [~bschell] confirmed this is resolved in master.

[jira] [Resolved] (HUDI-1091) Handle empty input batch gracefully in ParquetDFSSource

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan resolved HUDI-1091. -- Resolution: Fixed > Handle empty input batch gracefully in ParquetDFSSource >

[jira] [Updated] (HUDI-1091) Handle empty input batch gracefully in ParquetDFSSource

2020-08-11 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1091: - Status: In Progress (was: Open) > Handle empty input batch gracefully in

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-11 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468975269 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/TimestampBasedKeyGenerator.java ## @@ -125,49 +130,58 @@ public

[jira] [Updated] (HUDI-1181) Decimal type display issue for record key field

2020-08-11 Thread Wenning Ding (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenning Ding updated HUDI-1181: --- Description: When using ```fixed_len_byte_array``` decimal type as Hudi record key, Hudi would not

[jira] [Updated] (HUDI-1181) Decimal type display issue for record key field

2020-08-11 Thread Wenning Ding (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenning Ding updated HUDI-1181: --- Description: When using ```fixed_len_byte_array``` decimal type as Hudi record key, Hudi would not

[jira] [Assigned] (HUDI-1181) Decimal type display issue for record key field

2020-08-11 Thread Wenning Ding (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenning Ding reassigned HUDI-1181: -- Assignee: Wenning Ding > Decimal type display issue for record key field >

Build failed in Jenkins: hudi-snapshot-deployment-0.5 #367

2020-08-11 Thread Apache Jenkins Server
See Changes: -- [...truncated 2.58 KB...] cdi-api-1.0.jar cdi-api.license commons-cli-1.4.jar commons-cli.license commons-io-2.5.jar commons-io.license

[GitHub] [hudi] vinothchandar commented on issue #1837: [SUPPORT]S3 file listing causing compaction to get eventually slow

2020-08-11 Thread GitBox
vinothchandar commented on issue #1837: URL: https://github.com/apache/hudi/issues/1837#issuecomment-672560925 And this is the last such place (cleaner and rollback are all incremental now). cc @prashantwason to the rescue ;)

[GitHub] [hudi] tooptoop4 commented on issue #1813: ERROR HoodieDeltaStreamer: Got error running delta sync once.

2020-08-11 Thread GitBox
tooptoop4 commented on issue #1813: URL: https://github.com/apache/hudi/issues/1813#issuecomment-672595024 @bschell which PR fixes it?

[jira] [Updated] (HUDI-1181) Decimal type display issue for record key field

2020-08-11 Thread Wenning Ding (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenning Ding updated HUDI-1181: --- Description: When using *fixed_len_byte_array* decimal type as Hudi record key, Hudi would not

[jira] [Updated] (HUDI-1181) Decimal type display issue for record key field

2020-08-11 Thread Wenning Ding (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenning Ding updated HUDI-1181: --- Description: When using *fixed_len_byte_array* decimal type as Hudi record key, Hudi would not

[jira] [Updated] (HUDI-1181) Decimal type display issue for record key field

2020-08-11 Thread Wenning Ding (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenning Ding updated HUDI-1181: --- Description: When using *fixed_len_byte_array* decimal type as Hudi record key, Hudi would not

[jira] [Created] (HUDI-1181) Decimal type display issue for record key field

2020-08-11 Thread Wenning Ding (Jira)
Wenning Ding created HUDI-1181: -- Summary: Decimal type display issue for record key field Key: HUDI-1181 URL: https://issues.apache.org/jira/browse/HUDI-1181 Project: Apache Hudi Issue Type:

[jira] [Updated] (HUDI-1181) Decimal type display issue for record key field

2020-08-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1181: - Labels: pull-request-available (was: ) > Decimal type display issue for record key field >

[GitHub] [hudi] zhedoubushishi opened a new pull request #1953: [HUDI-1181] Fix decimal type display issue for record key field

2020-08-11 Thread GitBox
zhedoubushishi opened a new pull request #1953: URL: https://github.com/apache/hudi/pull/1953 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of

[jira] [Created] (HUDI-1180) Upgrade HBase to 2.3.3

2020-08-11 Thread Wenning Ding (Jira)
Wenning Ding created HUDI-1180: -- Summary: Upgrade HBase to 2.3.3 Key: HUDI-1180 URL: https://issues.apache.org/jira/browse/HUDI-1180 Project: Apache Hudi Issue Type: Improvement

[jira] [Created] (HUDI-1177) fix key generator bug

2020-08-11 Thread liujinhui (Jira)
liujinhui created HUDI-1177: --- Summary: fix key generator bug Key: HUDI-1177 URL: https://issues.apache.org/jira/browse/HUDI-1177 Project: Apache Hudi Issue Type: Bug Reporter:

[GitHub] [hudi] hddong opened a new pull request #1946: [HUDI-1176]Support log4j2 config

2020-08-11 Thread GitBox
hddong opened a new pull request #1946: URL: https://github.com/apache/hudi/pull/1946 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[jira] [Updated] (HUDI-1176) Support log4j2 config

2020-08-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1176: - Labels: pull-request-available (was: ) > Support log4j2 config > - > >

[jira] [Resolved] (HUDI-1173) fix hudi-prometheus pom dependency

2020-08-11 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui resolved HUDI-1173. - Assignee: liujinhui Resolution: Fixed Has been merged into master > fix hudi-prometheus pom

[jira] [Closed] (HUDI-1173) fix hudi-prometheus pom dependency

2020-08-11 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui closed HUDI-1173. --- Has been merged into master > fix hudi-prometheus pom dependency > -- > >

[GitHub] [hudi] bvaradar commented on a change in pull request #1870: [HUDI-808] Support cleaning bootstrap source data

2020-08-11 Thread GitBox
bvaradar commented on a change in pull request #1870: URL: https://github.com/apache/hudi/pull/1870#discussion_r468375136 ## File path: hudi-client/src/main/java/org/apache/hudi/table/action/clean/CleanActionExecutor.java ## @@ -82,40 +83,45 @@ HoodieCleanerPlan

[jira] [Updated] (HUDI-1173) fix hudi-prometheus pom dependency

2020-08-11 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui updated HUDI-1173: Status: In Progress (was: Open) > fix hudi-prometheus pom dependency > --

[jira] [Issue Comment Deleted] (HUDI-1173) fix hudi-prometheus pom dependency

2020-08-11 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui updated HUDI-1173: Comment: was deleted (was: Has been merged into master) > fix hudi-prometheus pom dependency >

[jira] [Updated] (HUDI-1173) fix hudi-prometheus pom dependency

2020-08-11 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui updated HUDI-1173: Status: Open (was: New) > fix hudi-prometheus pom dependency > -- > >

[GitHub] [hudi] bvaradar commented on a change in pull request #1870: [HUDI-808] Support cleaning bootstrap source data

2020-08-11 Thread GitBox
bvaradar commented on a change in pull request #1870: URL: https://github.com/apache/hudi/pull/1870#discussion_r468382842 ## File path: hudi-client/src/main/java/org/apache/hudi/table/action/clean/CleanActionExecutor.java ## @@ -82,40 +83,45 @@ HoodieCleanerPlan

[GitHub] [hudi] bvaradar commented on a change in pull request #1870: [HUDI-808] Support cleaning bootstrap source data

2020-08-11 Thread GitBox
bvaradar commented on a change in pull request #1870: URL: https://github.com/apache/hudi/pull/1870#discussion_r468383431 ## File path: hudi-client/src/main/java/org/apache/hudi/config/HoodieCompactionConfig.java ## @@ -52,6 +52,8 @@ public static final String

[jira] [Commented] (HUDI-1177) fix key generator bug

2020-08-11 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175419#comment-17175419 ] liujinhui commented on HUDI-1177: - {code:java} Exception in thread "main" org.apache.spark.SparkException:

[hudi] branch master updated: [HUDI-808] Support cleaning bootstrap source data (#1870)

2020-08-11 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 8b928e9 [HUDI-808] Support cleaning bootstrap

[GitHub] [hudi] bvaradar merged pull request #1870: [HUDI-808] Support cleaning bootstrap source data

2020-08-11 Thread GitBox
bvaradar merged pull request #1870: URL: https://github.com/apache/hudi/pull/1870 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[jira] [Commented] (HUDI-1177) fix key generator bug

2020-08-11 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175416#comment-17175416 ] liujinhui commented on HUDI-1177: - Exception in thread "main" org.apache.spark.SparkException: Task not

[jira] [Issue Comment Deleted] (HUDI-1177) fix key generator bug

2020-08-11 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui updated HUDI-1177: Comment: was deleted (was: Exception in thread "main" org.apache.spark.SparkException: Task not

[jira] [Commented] (HUDI-1177) fix key generator bug

2020-08-11 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175418#comment-17175418 ] liujinhui commented on HUDI-1177: - {code:java} // code placeholder {code} Exception in thread "main"

[jira] [Issue Comment Deleted] (HUDI-1177) fix key generator bug

2020-08-11 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui updated HUDI-1177: Comment: was deleted (was: {code:java} // code placeholder {code} Exception in thread "main"

[GitHub] [hudi] cun123 opened a new issue #1947: datadog monitor hudi

2020-08-11 Thread GitBox
cun123 opened a new issue #1947: URL: https://github.com/apache/hudi/issues/1947 Hi: this is my configuration: "dataFrameWrite.option("hoodie.metrics.on",true). option("hoodie.metrics.reporter.type","DATADOG").

[jira] [Updated] (HUDI-1177) fix key generator bug

2020-08-11 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui updated HUDI-1177: Status: Open (was: New) > fix key generator bug > -- > > Key:
