[GitHub] [incubator-hudi] yanghua commented on pull request #1572: [HUDI-836] Implement datadog metrics reporter

2020-05-17 Thread GitBox
yanghua commented on pull request #1572: URL: https://github.com/apache/incubator-hudi/pull/1572#issuecomment-629960658 Will do a final check. This is an automated message from the Apache Git Service. To respond to the

[jira] [Updated] (HUDI-906) Sudha: Create gpg key and add to KEYS file

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-906: --- Fix Version/s: (was: 0.5.0) > Sudha: Create gpg key and add to KEYS file >

[jira] [Assigned] (HUDI-906) Sudha: Create gpg key and add to KEYS file

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha reassigned HUDI-906: -- Assignee: Bhavani Sudha (was: Nishith Agarwal) > Sudha: Create gpg key and add to KEYS file >

[GitHub] [incubator-hudi] bvaradar commented on pull request #1639: [MINOR] Fix apache-rat violations

2020-05-17 Thread GitBox
bvaradar commented on pull request #1639: URL: https://github.com/apache/incubator-hudi/pull/1639#issuecomment-629957097 For the other classes, the underlying problem is that apache-rat was not enabled for the hudi-utilities bundle

[GitHub] [incubator-hudi] leesf commented on pull request #1095: [HUDI-210] Implement prometheus metrics reporter

2020-05-17 Thread GitBox
leesf commented on pull request #1095: URL: https://github.com/apache/incubator-hudi/pull/1095#issuecomment-629941260 > What is the status of this PR? Is it ready to merge? Hi @piyushrl, thanks for your interest. @xushiyan is now taking over the PR.

[jira] [Created] (HUDI-905) Support native filter pushdown for Spark Datasource

2020-05-17 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-905: --- Summary: Support native filter pushdown for Spark Datasource Key: HUDI-905 URL: https://issues.apache.org/jira/browse/HUDI-905 Project: Apache Hudi (incubating)

[jira] [Commented] (HUDI-890) Prepare for 0.5.3 patch release

2020-05-17 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109805#comment-17109805 ] Yanjia Gary Li commented on HUDI-890: - Hi [~bhavanisudha], #1602 HUDI-494 fix incorrect record size

Build failed in Jenkins: hudi-snapshot-deployment-0.5 #281

2020-05-17 Thread Apache Jenkins Server
See Changes: -- [...truncated 2.35 KB...] /home/jenkins/tools/maven/apache-maven-3.5.4/conf: logging settings.xml toolchains.xml

[jira] [Updated] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-05-17 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-494: Fix Version/s: (was: 0.5.3) > [DEBUGGING] Huge amount of tasks when writing files into HDFS >

[GitHub] [incubator-hudi] piyushrl commented on pull request #1095: [HUDI-210] Implement prometheus metrics reporter

2020-05-17 Thread GitBox
piyushrl commented on pull request #1095: URL: https://github.com/apache/incubator-hudi/pull/1095#issuecomment-629926575 What is the status of this PR? Is it ready to merge?

[GitHub] [incubator-hudi] hddong commented on pull request #1558: [HUDI-796]: added deduping logic for upserts case

2020-05-17 Thread GitBox
hddong commented on pull request #1558: URL: https://github.com/apache/incubator-hudi/pull/1558#issuecomment-629921612 @yanghua : Sure, I'll discuss with @pratyakshsharma to make it succeed.

[GitHub] [incubator-hudi] yanghua commented on pull request #1558: [HUDI-796]: added deduping logic for upserts case

2020-05-17 Thread GitBox
yanghua commented on pull request #1558: URL: https://github.com/apache/incubator-hudi/pull/1558#issuecomment-629915198 > @yanghua I am unable to run integration tests defined in hudi-cli package on my local. One of the tests from ITTestRepairsCommand is continuously failing in travis

[incubator-hudi] branch hudi_test_suite_refactor updated (e9ee88c -> c9f6aa6)

2020-05-17 Thread nagarwal
This is an automated email from the ASF dual-hosted git repository. nagarwal pushed a change to branch hudi_test_suite_refactor in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git. discard e9ee88c [HUDI-394] Provide a basic implementation of test suite add c9f6aa6

[incubator-hudi] branch master updated: [HUDI-407] Adding Simple Index to Hoodie. (#1402)

2020-05-17 Thread vinoth
vinoth pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git The following commit(s) were added to refs/heads/master by this push: new 29edf4b [HUDI-407] Adding Simple Index

[GitHub] [incubator-hudi] vinothchandar merged pull request #1402: [HUDI-407] Adding Simple Index

2020-05-17 Thread GitBox
vinothchandar merged pull request #1402: URL: https://github.com/apache/incubator-hudi/pull/1402

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1602: [HUDI-494] fix incorrect record size estimation

2020-05-17 Thread GitBox
vinothchandar commented on pull request #1602: URL: https://github.com/apache/incubator-hudi/pull/1602#issuecomment-629898227 Let’s push this down the line for 0.6.0. @garyli1019, we probably need an alternative strategy here that is more aggressive. But I see bloom filter as part of

[GitHub] [incubator-hudi] leesf commented on pull request #1633: [HUDI-858] Allow multiple operations to be executed within a single commit

2020-05-17 Thread GitBox
leesf commented on pull request #1633: URL: https://github.com/apache/incubator-hudi/pull/1633#issuecomment-629897867 > Merging #1633 into master will decrease coverage by 55.07%. The diff coverage is 41.17%. Please take a look, @bvaradar.

[GitHub] [incubator-hudi] vinothchandar merged pull request #1636: [HUDI-895] Remove unnecessary listing .hoodie folder when using timeline server

2020-05-17 Thread GitBox
vinothchandar merged pull request #1636: URL: https://github.com/apache/incubator-hudi/pull/1636

[incubator-hudi] branch master updated: [HUDI-895] Remove unnecessary listing .hoodie folder when using timeline server (#1636)

2020-05-17 Thread vinoth
vinoth pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git The following commit(s) were added to refs/heads/master by this push: new 3c9da2e [HUDI-895] Remove unnecessary

[GitHub] [incubator-hudi] garyli1019 commented on pull request #1602: [HUDI-494] fix incorrect record size estimation

2020-05-17 Thread GitBox
garyli1019 commented on pull request #1602: URL: https://github.com/apache/incubator-hudi/pull/1602#issuecomment-629887094 We can push this to 0.6.0 if you prefer to have more discussion. If there is anything I can help with, please let me know.

[GitHub] [incubator-hudi] nsivabalan commented on pull request #1602: [HUDI-494] fix incorrect record size estimation

2020-05-17 Thread GitBox
nsivabalan commented on pull request #1602: URL: https://github.com/apache/incubator-hudi/pull/1602#issuecomment-629884272 @vinothchandar : Are we proceeding with the patch? I haven't started looking at it yet, but if we plan to get it into 0.5.3, we need to get this resolved ASAP.

[GitHub] [incubator-hudi] n3nash edited a comment on pull request #1484: [HUDI-316] : Hbase qps repartition writestatus

2020-05-17 Thread GitBox
n3nash edited a comment on pull request #1484: URL: https://github.com/apache/incubator-hudi/pull/1484#issuecomment-629876523 @v3nkatesh The rate limiter looks good to me, but it's still inspired by Guava. I'll let @vinothchandar comment since he felt strongly about implementing our own.

[GitHub] [incubator-hudi] n3nash commented on pull request #1484: [HUDI-316] : Hbase qps repartition writestatus

2020-05-17 Thread GitBox
n3nash commented on pull request #1484: URL: https://github.com/apache/incubator-hudi/pull/1484#issuecomment-629876523 @v3nkatesh The rate limiter looks good to me, but it's still inspired by Guava. I'll let @vinothchandar comment since he felt strongly about implementing our own.

[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #1433: [HUDI-728]: Implement custom key generator

2020-05-17 Thread GitBox
nsivabalan commented on a change in pull request #1433: URL: https://github.com/apache/incubator-hudi/pull/1433#discussion_r426313249 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/CustomKeyGenerator.java ## @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache

[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #1433: [HUDI-728]: Implement custom key generator

2020-05-17 Thread GitBox
nsivabalan commented on a change in pull request #1433: URL: https://github.com/apache/incubator-hudi/pull/1433#discussion_r426313186 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/CustomKeyGenerator.java ## @@ -0,0 +1,128 @@ +/* + * Licensed to the Apache

[GitHub] [incubator-hudi] n3nash commented on pull request #1100: [HUDI-289] Implement a test suite to support long running test for Hudi writing and querying end-end

2020-05-17 Thread GitBox
n3nash commented on pull request #1100: URL: https://github.com/apache/incubator-hudi/pull/1100#issuecomment-629869173 @yanghua addressed comments, rebased.

[incubator-hudi] branch hudi_test_suite_refactor updated (bbd4429 -> e9ee88c)

2020-05-17 Thread nagarwal
nagarwal pushed a change to branch hudi_test_suite_refactor in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git. discard bbd4429 [HUDI-394] Provide a basic implementation of test suite add 25e0b75

[jira] [Updated] (HUDI-890) Prepare for 0.5.3 patch release

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-890: --- Description: The following commits are included in this release. * #1372 HUDI-652 Decouple

[incubator-hudi] branch hudi_test_suite_refactor updated (6f4547d -> bbd4429)

2020-05-17 Thread nagarwal
nagarwal pushed a change to branch hudi_test_suite_refactor in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git. omit 6f4547d [MINOR] Code cleanup for dag package omit 33590b7 [MINOR] Code cleanup for

[jira] [Updated] (HUDI-890) Prepare for 0.5.3 patch release

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-890: --- Description: The following commits are included in this release. * #1372 HUDI-652 Decouple

[jira] [Updated] (HUDI-889) Writer supports useJdbc configuration when hive synchronization is enabled

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-889: --- Fix Version/s: 0.5.3 0.6.0 > Writer supports useJdbc configuration when hive

[jira] [Updated] (HUDI-894) Allow ability to use hive metastore thrift connection to register tables

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-894: --- Fix Version/s: 0.6.0 > Allow ability to use hive metastore thrift connection to register tables >

[jira] [Updated] (HUDI-789) Adjust logic of upsert in HDFSParquetImporter

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-789: --- Fix Version/s: 0.5.3 > Adjust logic of upsert in HDFSParquetImporter >

[jira] [Updated] (HUDI-528) Incremental Pull fails when latest commit is empty

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-528: --- Fix Version/s: 0.6.0 > Incremental Pull fails when latest commit is empty >

[jira] [Updated] (HUDI-902) Avoid exception for getting SchemaProvider when no new input data

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-902: --- Fix Version/s: 0.6.0 > Avoid exception for getting SchemaProvider when no new input data >

[jira] [Updated] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-494: --- Fix Version/s: 0.6.0 > [DEBUGGING] Huge amount of tasks when writing files into HDFS >

[jira] [Updated] (HUDI-863) nested structs containing decimal types lead to null pointer exception

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-863: --- Fix Version/s: 0.5.3 > nested structs containing decimal types lead to null pointer exception >

[jira] [Updated] (HUDI-616) Parquet files not getting created on DFS docker instance but on local FS in TestHoodieDeltaStreamer

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-616: --- Fix Version/s: 0.5.3 > Parquet files not getting created on DFS docker instance but on local FS in >

[jira] [Updated] (HUDI-742) Fix java.lang.NoSuchMethodError: java.lang.Math.floorMod(JI)I

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-742: --- Fix Version/s: 0.5.3 > Fix java.lang.NoSuchMethodError: java.lang.Math.floorMod(JI)I >

[jira] [Updated] (HUDI-852) Add validation to check Table name when Append Mode is used in DataSource writer

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-852: --- Fix Version/s: 0.5.3 > Add validation to check Table name when Append Mode is used in DataSource >

[jira] [Updated] (HUDI-400) Add more checks to TestCompactionUtils#testUpgradeDowngrade

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-400: --- Fix Version/s: 0.5.3 0.6.0 > Add more checks to

[jira] [Updated] (HUDI-716) Exception: Not an Avro data file when running HoodieCleanClient.runClean

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-716: --- Fix Version/s: 0.5.3 > Exception: Not an Avro data file when running HoodieCleanClient.runClean >

[jira] [Updated] (HUDI-782) Add support for aliyun OSS

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-782: --- Fix Version/s: 0.5.3 > Add support for aliyun OSS > -- > >

[jira] [Updated] (HUDI-607) Hive sync fails to register tables partitioned by Date Type column

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-607: --- Fix Version/s: 0.5.3 > Hive sync fails to register tables partitioned by Date Type column >

[jira] [Updated] (HUDI-539) RO Path filter does not pick up hadoop configs from the spark context

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-539: --- Fix Version/s: 0.5.3 > RO Path filter does not pick up hadoop configs from the spark context >

[jira] [Updated] (HUDI-850) Avoid unnecessary listings in incremental cleaning mode

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-850: --- Fix Version/s: 0.5.3 > Avoid unnecessary listings in incremental cleaning mode >

[jira] [Updated] (HUDI-724) Parallelize GetSmallFiles For Partitions

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-724: --- Fix Version/s: 0.5.3 > Parallelize GetSmallFiles For Partitions >

[jira] [Updated] (HUDI-713) Datasource Writer throws error on resolving array of struct fields

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-713: --- Fix Version/s: 0.5.3 > Datasource Writer throws error on resolving array of struct fields >

[jira] [Updated] (HUDI-738) Add error msg in DeltaStreamer if `filterDupes=true` is enabled for `operation=UPSERT`.

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-738: --- Fix Version/s: 0.5.3 > Add error msg in DeltaStreamer if `filterDupes=true` is enabled for >

[jira] [Updated] (HUDI-799) DeltaStreamer must use appropriate FS when loading configs

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-799: --- Fix Version/s: 0.5.3 > DeltaStreamer must use appropriate FS when loading configs >

[jira] [Updated] (HUDI-681) Remove the dependency of EmbeddedTimelineService from HoodieReadClient

2020-05-17 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha updated HUDI-681: --- Fix Version/s: 0.5.3 0.6.0 > Remove the dependency of EmbeddedTimelineService from

[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1639: [MINOR] Fix apache-rat violations

2020-05-17 Thread GitBox
codecov-io edited a comment on pull request #1639: URL: https://github.com/apache/incubator-hudi/pull/1639#issuecomment-629859704 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1639?src=pr=h1) Report > Merging

[GitHub] [incubator-hudi] codecov-io commented on pull request #1639: [MINOR] Fix apache-rat violations

2020-05-17 Thread GitBox
codecov-io commented on pull request #1639: URL: https://github.com/apache/incubator-hudi/pull/1639#issuecomment-629859704 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1639?src=pr=h1) Report > Merging

[GitHub] [incubator-hudi] jfrazee opened a new pull request #1639: [MINOR] Fix apache-rat violations

2020-05-17 Thread GitBox
jfrazee opened a new pull request #1639: URL: https://github.com/apache/incubator-hudi/pull/1639 This fixes a few apache-rat violations and adds exclusions for GitHub PR template-type files. Note there is already a general attribution for Twitter in the NOTICE so I don't think we need

[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1633: [HUDI-858] Allow multiple operations to be executed within a single commit

2020-05-17 Thread GitBox
codecov-io edited a comment on pull request #1633: URL: https://github.com/apache/incubator-hudi/pull/1633#issuecomment-629848790 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1633?src=pr=h1) Report > Merging

[GitHub] [incubator-hudi] codecov-io commented on pull request #1633: [HUDI-858] Allow multiple operations to be executed within a single commit

2020-05-17 Thread GitBox
codecov-io commented on pull request #1633: URL: https://github.com/apache/incubator-hudi/pull/1633#issuecomment-629848790 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1633?src=pr=h1) Report > Merging

[GitHub] [incubator-hudi] bvaradar commented on pull request #1633: [HUDI-858] Allow multiple operations to be executed within a single commit

2020-05-17 Thread GitBox
bvaradar commented on pull request #1633: URL: https://github.com/apache/incubator-hudi/pull/1633#issuecomment-629844386 @leesf : Addressed review comments.

[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1633: [HUDI-858] Allow multiple operations to be executed within a single commit

2020-05-17 Thread GitBox
bvaradar commented on a change in pull request #1633: URL: https://github.com/apache/incubator-hudi/pull/1633#discussion_r426293139 ## File path: hudi-client/src/test/java/org/apache/hudi/client/TestHoodieClientOnCopyOnWriteStorage.java ## @@ -988,6 +988,44 @@ public void

[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1633: [HUDI-858] Allow multiple operations to be executed within a single commit

2020-05-17 Thread GitBox
bvaradar commented on a change in pull request #1633: URL: https://github.com/apache/incubator-hudi/pull/1633#discussion_r426290846 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java ## @@ -322,7 +327,15 @@ private void

[jira] [Updated] (HUDI-110) Better defaults for Partition extractor for Spark DataSOurce and DeltaStreamer

2020-05-17 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-110: Status: In Progress (was: Open) > Better defaults for Partition extractor for Spark DataSOurce and

[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1624: [HUDI-706]Add unit test for SavepointsCommand

2020-05-17 Thread GitBox
codecov-io edited a comment on pull request #1624: URL: https://github.com/apache/incubator-hudi/pull/1624#issuecomment-629604493 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1624?src=pr=h1) Report > Merging

[GitHub] [incubator-hudi] xushiyan commented on pull request #1592: [Hudi-69] Spark Datasource for MOR table

2020-05-17 Thread GitBox
xushiyan commented on pull request #1592: URL: https://github.com/apache/incubator-hudi/pull/1592#issuecomment-629812714 @garyli1019 Sad to see this weird NPE persist. It would be helpful to have debug mode in Travis, SSH into the container, and then investigate.

[GitHub] [incubator-hudi] pratyakshsharma commented on pull request #1565: [HUDI-73]: implemented vanilla AvroKafkaSource

2020-05-17 Thread GitBox
pratyakshsharma commented on pull request #1565: URL: https://github.com/apache/incubator-hudi/pull/1565#issuecomment-629811530 Handling schema evolution without a schema registry is going to be really tricky. I searched around this topic and found the two links below. These might

[GitHub] [incubator-hudi] hddong commented on a change in pull request #1624: [HUDI-706]Add unit test for SavepointsCommand

2020-05-17 Thread GitBox
hddong commented on a change in pull request #1624: URL: https://github.com/apache/incubator-hudi/pull/1624#discussion_r426268423 ## File path: hudi-cli/src/main/java/org/apache/hudi/cli/commands/SparkMain.java ## @@ -281,6 +292,19 @@ private static int

[GitHub] [incubator-hudi] nsivabalan edited a comment on issue #1625: [SUPPORT] MOR upsert table grows in size when ingesting same records

2020-05-17 Thread GitBox
nsivabalan edited a comment on issue #1625: URL: https://github.com/apache/incubator-hudi/issues/1625#issuecomment-629790932 @rolandjohann : I couldn't reproduce the ever-growing Hudi table. Maybe I am missing something. Can you try my code below and let us know what you see? @bvaradar

[jira] [Closed] (HUDI-794) Add support for maintaining separate table wise configs in HoodieDeltaStreamer similar to HoodieMultiTableDeltaStreamer

2020-05-17 Thread Pratyaksh Sharma (Jira)
[ https://issues.apache.org/jira/browse/HUDI-794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pratyaksh Sharma closed HUDI-794. - Resolution: Fixed This feature only overcomplicates the flow. Closing it. > Add support for

[GitHub] [incubator-hudi] nsivabalan commented on issue #1625: [SUPPORT] MOR upsert table grows in size when ingesting same records

2020-05-17 Thread GitBox
nsivabalan commented on issue #1625: URL: https://github.com/apache/incubator-hudi/issues/1625#issuecomment-629790932 @bvaradar : I tried to reproduce this locally and couldn't. Is there a chance of some data skew? @rolandjohann : I couldn't reproduce the ever-growing Hudi table. Maybe I am

[jira] [Commented] (HUDI-859) Improve documentation around key generators

2020-05-17 Thread Pratyaksh Sharma (Jira)
[ https://issues.apache.org/jira/browse/HUDI-859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109452#comment-17109452 ] Pratyaksh Sharma commented on HUDI-859: --- [~hongdongdong] Do you want to work on this, or should I get

[GitHub] [incubator-hudi] pratyakshsharma commented on pull request #1597: [WIP] Added a MultiFormatTimestampBasedKeyGenerator that allows for multipl…

2020-05-17 Thread GitBox
pratyakshsharma commented on pull request #1597: URL: https://github.com/apache/incubator-hudi/pull/1597#issuecomment-629788395 I have included the changes from this PR in https://github.com/apache/incubator-hudi/pull/1433. I guess we can close this now. @bhasudha @vinothchandar

[GitHub] [incubator-hudi] pratyakshsharma commented on pull request #1433: [HUDI-728]: Implement custom key generator

2020-05-17 Thread GitBox
pratyakshsharma commented on pull request #1433: URL: https://github.com/apache/incubator-hudi/pull/1433#issuecomment-629787509 @nsivabalan I have tried to include the changes from https://github.com/apache/incubator-hudi/pull/1597 as well in this. Please take a pass.

[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #1469: [HUDI-686] Implement BloomIndexV2 that does not depend on memory caching

2020-05-17 Thread GitBox
nsivabalan commented on a change in pull request #1469: URL: https://github.com/apache/incubator-hudi/pull/1469#discussion_r426250288 ## File path: hudi-client/src/main/java/org/apache/hudi/index/bloom/HoodieGlobalBloomIndexV2.java ## @@ -0,0 +1,223 @@ +/* + * Licensed to the

[GitHub] [incubator-hudi] bhasudha commented on pull request #1592: [Hudi-69] Spark Datasource for MOR table

2020-05-17 Thread GitBox
bhasudha commented on pull request #1592: URL: https://github.com/apache/incubator-hudi/pull/1592#issuecomment-629781283 @garyli1019 Taking a look at the PR. Will get back soon.

[jira] [Resolved] (HUDI-714) Add javadoc, comments to hudi write method link

2020-05-17 Thread leesf (Jira)
[ https://issues.apache.org/jira/browse/HUDI-714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] leesf resolved HUDI-714. Fix Version/s: 0.6.0 Resolution: Fixed Fixed via master: 25a0080b2f6ddce0e528b2a72aea33a565f0e565 > Add

[jira] [Commented] (HUDI-859) Improve documentation around key generators

2020-05-17 Thread hong dongdong (Jira)
[ https://issues.apache.org/jira/browse/HUDI-859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109353#comment-17109353 ] hong dongdong commented on HUDI-859: [~shivnarayan]: I'll discuss this with [~Pratyaksh]. >