[GitHub] [hudi] garyli1019 commented on issue #1771: [SUPPORT] https://hudi.apache.org/docs/configurations.html does not mention GLOBAL_BLOOM in hoodie.index.type section

2020-06-29 Thread GitBox
garyli1019 commented on issue #1771: URL: https://github.com/apache/hudi/issues/1771#issuecomment-651537179 This answer will probably help: https://github.com/apache/hudi/issues/1745#issuecomment-646581422

[GitHub] [hudi] garyli1019 commented on issue #1773: [SUPPORT] NPE about MOR config?

2020-06-29 Thread GitBox
garyli1019 commented on issue #1773: URL: https://github.com/apache/hudi/issues/1773#issuecomment-651535976 Hi, would you share more info about this issue? A stack trace and your Hudi config would help.

[jira] [Updated] (HUDI-703) Add unit test for HoodieSyncCommand

2020-06-29 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-703: Labels: pull-request-available (was: ) > Add unit test for HoodieSyncCommand >

[GitHub] [hudi] garyli1019 commented on a change in pull request #1722: [HUDI-69] Support Spark Datasource for MOR table

2020-06-29 Thread GitBox
garyli1019 commented on a change in pull request #1722: URL: https://github.com/apache/hudi/pull/1722#discussion_r447407892 ## File path: hudi-spark/src/main/scala/org/apache/hudi/DataSourceOptions.scala ## @@ -65,7 +66,7 @@ object DataSourceReadOptions { * This eases

[GitHub] [hudi] hddong opened a new pull request #1774: [HUDI-703]Add unit test for HoodieSyncCommand

2020-06-29 Thread GitBox
hddong opened a new pull request #1774: URL: https://github.com/apache/hudi/pull/1774 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

Build failed in Jenkins: hudi-snapshot-deployment-0.5 #324

2020-06-29 Thread Apache Jenkins Server
See Changes: -- [...truncated 2.29 KB...] /home/jenkins/tools/maven/apache-maven-3.5.4/conf: logging settings.xml toolchains.xml

[jira] [Updated] (HUDI-839) Implement rollbacks using marker files instead of relying on commit metadata

2020-06-29 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-839: Status: Patch Available (was: In Progress) > Implement rollbacks using marker files instead of

[jira] [Created] (HUDI-1061) Hudi CLI savepoint command fail because of spark conf loading issue

2020-06-29 Thread Wenning Ding (Jira)
Wenning Ding created HUDI-1061: -- Summary: Hudi CLI savepoint command fail because of spark conf loading issue Key: HUDI-1061 URL: https://issues.apache.org/jira/browse/HUDI-1061 Project: Apache Hudi

[GitHub] [hudi] umehrot2 commented on pull request #1702: Bootstrap datasource changes

2020-06-29 Thread GitBox
umehrot2 commented on pull request #1702: URL: https://github.com/apache/hudi/pull/1702#issuecomment-651401659 @garyli1019 thank you for your inputs. Sorry, I had been busy with oncall and other projects. Let me try to catch up and process your comments.

[GitHub] [hudi] umehrot2 commented on pull request #1702: Bootstrap datasource changes

2020-06-29 Thread GitBox
umehrot2 commented on pull request #1702: URL: https://github.com/apache/hudi/pull/1702#issuecomment-651401043 > @umehrot2 does this PR have some of @bvaradar's changes included? @vinothchandar yes it does. I had put some stuff in just for ease of reviewing because this utilizes some of

[GitHub] [hudi] umehrot2 commented on a change in pull request #1768: [HUDI-1054][Peformance] Several performance fixes during finalizing writes

2020-06-29 Thread GitBox
umehrot2 commented on a change in pull request #1768: URL: https://github.com/apache/hudi/pull/1768#discussion_r447264716 ## File path: hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java ## @@ -199,16 +201,40 @@ public static String getRelativePartitionPath(Path

[GitHub] [hudi] umehrot2 commented on a change in pull request #1768: [HUDI-1054][Peformance] Several performance fixes during finalizing writes

2020-06-29 Thread GitBox
umehrot2 commented on a change in pull request #1768: URL: https://github.com/apache/hudi/pull/1768#discussion_r447264262 ## File path: hudi-common/pom.xml ## @@ -147,6 +147,16 @@ test + + + org.apache.spark + spark-core_${scala.binary.version}

[jira] [Commented] (HUDI-739) HoodieIOException: Could not delete in-flight instant

2020-06-29 Thread t oo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148123#comment-17148123 ] t oo commented on HUDI-739: --- I am facing a similar error on Hudi 0.5.3 and Spark 2.4.6 with S3; in this case the

[jira] [Updated] (HUDI-1060) Create plugin for bootstrapping iceberg, delta and hudi tables

2020-06-29 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1060: - Status: Open (was: New) > Create plugin for bootstrapping iceberg, delta and hudi tables

[jira] [Created] (HUDI-1060) Create plugin for bootstrapping iceberg, delta and hudi tables

2020-06-29 Thread Balaji Varadarajan (Jira)
Balaji Varadarajan created HUDI-1060: Summary: Create plugin for bootstrapping iceberg, delta and hudi tables Key: HUDI-1060 URL: https://issues.apache.org/jira/browse/HUDI-1060 Project: Apache

[jira] [Commented] (HUDI-983) Add Metrics section to asf-site

2020-06-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148082#comment-17148082 ] Raymond Xu commented on HUDI-983: - [~shenhong] Thank you for the PR. Are you referring to adding screenshot

[GitHub] [hudi] afeldman1 commented on a change in pull request #1761: [MINOR] Add documentation for using multi-column table keys and for n…

2020-06-29 Thread GitBox
afeldman1 commented on a change in pull request #1761: URL: https://github.com/apache/hudi/pull/1761#discussion_r447178056 ## File path: docs/_docs/2_2_writing_data.md ## @@ -176,15 +176,49 @@ In some cases, you may want to migrate your existing table into Hudi beforehand.

[jira] [Updated] (HUDI-684) Introduce abstraction for writing and reading and compacting from FileGroups

2020-06-29 Thread Prashant Wason (Jira)
[ https://issues.apache.org/jira/browse/HUDI-684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Wason updated HUDI-684: Status: Closed (was: Patch Available) > Introduce abstraction for writing and reading and

[GitHub] [hudi] afeldman1 commented on a change in pull request #1761: [MINOR] Add documentation for using multi-column table keys and for n…

2020-06-29 Thread GitBox
afeldman1 commented on a change in pull request #1761: URL: https://github.com/apache/hudi/pull/1761#discussion_r447175985 ## File path: docs/_docs/2_2_writing_data.md ## @@ -176,15 +176,49 @@ In some cases, you may want to migrate your existing table into Hudi beforehand.

[jira] [Resolved] (HUDI-959) HoodieTable abstraction with pluggable Base and Log formats

2020-06-29 Thread Prashant Wason (Jira)
[ https://issues.apache.org/jira/browse/HUDI-959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Wason resolved HUDI-959. - Resolution: Fixed > HoodieTable abstraction with pluggable Base and Log formats >

[GitHub] [hudi] afeldman1 commented on a change in pull request #1761: [MINOR] Add documentation for using multi-column table keys and for n…

2020-06-29 Thread GitBox
afeldman1 commented on a change in pull request #1761: URL: https://github.com/apache/hudi/pull/1761#discussion_r447174840 ## File path: docs/_docs/2_2_writing_data.md ## @@ -176,15 +176,49 @@ In some cases, you may want to migrate your existing table into Hudi beforehand.

[GitHub] [hudi] tooptoop4 edited a comment on issue #1243: [SUPPORT]Caused by: org.apache.hudi.exception.HoodieException: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: org.apac

2020-06-29 Thread GitBox
tooptoop4 edited a comment on issue #1243: URL: https://github.com/apache/hudi/issues/1243#issuecomment-651281125 I face this issue in 0.5.3: my custom jar imports hudi-client, which brings in hbase-client/hadoop* and then the old avro 1.7. [INFO] +-

[GitHub] [hudi] afeldman1 commented on a change in pull request #1761: [MINOR] Add documentation for using multi-column table keys and for n…

2020-06-29 Thread GitBox
afeldman1 commented on a change in pull request #1761: URL: https://github.com/apache/hudi/pull/1761#discussion_r447174171 ## File path: docs/_docs/2_2_writing_data.md ## @@ -176,15 +176,49 @@ In some cases, you may want to migrate your existing table into Hudi beforehand.

[GitHub] [hudi] tooptoop4 commented on issue #1243: [SUPPORT]Caused by: org.apache.hudi.exception.HoodieException: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: org.apache.parq

2020-06-29 Thread GitBox
tooptoop4 commented on issue #1243: URL: https://github.com/apache/hudi/issues/1243#issuecomment-651281125 I face this issue in 0.5.3: hudi-client is bringing in hbase-client/hadoop* and then the old avro 1.7.

[jira] [Commented] (HUDI-1058) Make delete marker configurable

2020-06-29 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148035#comment-17148035 ] Raymond Xu commented on HUDI-1058: -- [~shenhong] Please go ahead. Thanks! cc [~shivnarayan] > Make

[GitHub] [hudi] tooptoop4 opened a new issue #1773: [SUPPORT] NPE about MOR config?

2020-06-29 Thread GitBox
tooptoop4 opened a new issue #1773: URL: https://github.com/apache/hudi/issues/1773 I'm running Spark 2.4.6 and Hudi 0.5.3 and hit the error below: Exception in thread "main" java.lang.NullPointerException at

[jira] [Created] (HUDI-1059) Add test to verify partition path gets updated with global bloom

2020-06-29 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-1059: - Summary: Add test to verify partition path gets updated with global bloom Key: HUDI-1059 URL: https://issues.apache.org/jira/browse/HUDI-1059 Project:

[jira] [Updated] (HUDI-1059) Add test to verify partition path gets updated with global bloom

2020-06-29 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1059: -- Fix Version/s: 0.6.0 > Add test to verify partition path gets updated with global bloom

[jira] [Updated] (HUDI-1059) Add test to verify partition path gets updated with global bloom

2020-06-29 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1059: -- Status: Open (was: New) > Add test to verify partition path gets updated with global

[jira] [Assigned] (HUDI-1059) Add test to verify partition path gets updated with global bloom

2020-06-29 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-1059: - Assignee: sivabalan narayanan > Add test to verify partition path gets updated

[GitHub] [hudi] zuyanton commented on pull request #1765: [HUDI-1049] 0.5.3 Patch - In inline compaction mode, previously failed compactions needs to be retried before new compactions

2020-06-29 Thread GitBox
zuyanton commented on pull request #1765: URL: https://github.com/apache/hudi/pull/1765#issuecomment-651199789 I was running this bug fix on two large tables updated every 10 minutes for 3 days. I don't see any lingering compactions that are in INFLIGHT mode. I also ran this code change on

[jira] [Updated] (HUDI-978) Specify version information for each component separately

2020-06-29 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-978: Labels: pull-request-available (was: ) > Specify version information for each component separately

[GitHub] [hudi] hddong opened a new pull request #1772: [HUDI-978]Specify version information for each component separately

2020-06-29 Thread GitBox
hddong opened a new pull request #1772: URL: https://github.com/apache/hudi/pull/1772 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[jira] [Commented] (HUDI-1058) Make delete marker configurable

2020-06-29 Thread Hong Shen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147826#comment-17147826 ] Hong Shen commented on HUDI-1058: - If you haven't started yet, I can take this. > Make delete marker

[GitHub] [hudi] leesf commented on a change in pull request #1761: [MINOR] Add documentation for using multi-column table keys and for n…

2020-06-29 Thread GitBox
leesf commented on a change in pull request #1761: URL: https://github.com/apache/hudi/pull/1761#discussion_r446967990 ## File path: docs/_docs/2_2_writing_data.md ## @@ -176,15 +176,49 @@ In some cases, you may want to migrate your existing table into Hudi beforehand. ##

[GitHub] [hudi] leesf commented on a change in pull request #1761: [MINOR] Add documentation for using multi-column table keys and for n…

2020-06-29 Thread GitBox
leesf commented on a change in pull request #1761: URL: https://github.com/apache/hudi/pull/1761#discussion_r446966748 ## File path: docs/_docs/2_2_writing_data.md ## @@ -176,15 +176,49 @@ In some cases, you may want to migrate your existing table into Hudi beforehand. ##

[GitHub] [hudi] leesf commented on a change in pull request #1761: [MINOR] Add documentation for using multi-column table keys and for n…

2020-06-29 Thread GitBox
leesf commented on a change in pull request #1761: URL: https://github.com/apache/hudi/pull/1761#discussion_r446965939 ## File path: docs/_docs/2_3_querying_data.md ## @@ -136,6 +136,16 @@ The Spark Datasource API is a popular way of authoring Spark ETL pipelines. Hudi

[GitHub] [hudi] leesf commented on a change in pull request #1761: [MINOR] Add documentation for using multi-column table keys and for n…

2020-06-29 Thread GitBox
leesf commented on a change in pull request #1761: URL: https://github.com/apache/hudi/pull/1761#discussion_r446965016 ## File path: docs/_docs/2_2_writing_data.md ## @@ -176,15 +176,49 @@ In some cases, you may want to migrate your existing table into Hudi beforehand. ##

[GitHub] [hudi] leesf commented on a change in pull request #1761: [MINOR] Add documentation for using multi-column table keys and for n…

2020-06-29 Thread GitBox
leesf commented on a change in pull request #1761: URL: https://github.com/apache/hudi/pull/1761#discussion_r446961857 ## File path: docs/_docs/2_2_writing_data.md ## @@ -176,15 +176,49 @@ In some cases, you may want to migrate your existing table into Hudi beforehand. ##

[GitHub] [hudi] tooptoop4 opened a new issue #1771: [SUPPORT] https://hudi.apache.org/docs/configurations.html does not mention GLOBAL_BLOOM in hoodie.index.type section

2020-06-29 Thread GitBox
tooptoop4 opened a new issue #1771: URL: https://github.com/apache/hudi/issues/1771 The hoodie.index.type section does not mention the global or bucket options in its enum list. Side note: if I have a COW table that was written with BLOOM, can I in future start writing new inserts/updates to it