[GitHub] [incubator-hudi] leesf commented on issue #928: [HUDI-265] Failed to delete tmp dirs created in unit tests
leesf commented on issue #928: [HUDI-265] Failed to delete tmp dirs created in unit tests URL: https://github.com/apache/incubator-hudi/pull/928#issuecomment-537838407 Are there any other concerns before this can be merged? @vinothchandar This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (HUDI-286) Remove or hide tags from Hudi official web site
[ https://issues.apache.org/jira/browse/HUDI-286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943415#comment-16943415 ] vinoyang commented on HUDI-286: --- [~vinoth] Yes, we do not actively use tags in the docs. However, a "getting_started" tag is shown on the first page of the web site. When visitors click it, the URL redirects to a nonexistent page. IMO, it would be better to remove or hide the element from the page. > Remove or hide tags from Hudi official web site > --- > > Key: HUDI-286 > URL: https://issues.apache.org/jira/browse/HUDI-286 > Project: Apache Hudi (incubating) > Issue Type: Wish > Components: Docs >Reporter: vinoyang >Assignee: vinoyang >Priority: Major > > Currently, Hudi's docs do not provide a tag HTML page, yet we provide a > hyperlink to a nonexistent URL, e.g. > [getting_started|http://hudi.apache.org/tag_getting_started.html] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-288) Add support for ingesting multiple kafka streams in a single DeltaStreamer deployment
[ https://issues.apache.org/jira/browse/HUDI-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943424#comment-16943424 ] leesf commented on HUDI-288: Yes, and I will give feedback after a closer look at the details of the current data flow from Kafka to Hudi. > Add support for ingesting multiple kafka streams in a single DeltaStreamer > deployment > - > > Key: HUDI-288 > URL: https://issues.apache.org/jira/browse/HUDI-288 > Project: Apache Hudi (incubating) > Issue Type: Improvement > Components: deltastreamer >Reporter: Vinoth Chandar >Priority: Major > > https://lists.apache.org/thread.html/3a69934657c48b1c0d85cba223d69cb18e18cd8aaa4817c9fd72cef6@ > has all the context
[jira] [Closed] (HUDI-217) Provide a unified resource management class to standardize the resource allocation and release for hudi client test cases
[ https://issues.apache.org/jira/browse/HUDI-217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-217. - Resolution: Done Done via master: 895d732a1423f3a113d9630aa3c56e4b0837effd > Provide a unified resource management class to standardize the resource > allocation and release for hudi client test cases > - > > Key: HUDI-217 > URL: https://issues.apache.org/jira/browse/HUDI-217 > Project: Apache Hudi (incubating) > Issue Type: Sub-task > Components: Testing >Reporter: vinoyang >Assignee: vinoyang >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Currently, resource allocation and release are handled inconsistently across the > test cases of the hudi client module. We should provide a unified class to manage the > resources.
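As an illustration of the idea behind HUDI-217, here is a minimal sketch of a unified test-resource manager: allocation happens in the constructor, release in close(), so tests can use try-with-resources instead of ad-hoc setup/teardown. The TestResourceManager name and shape below are hypothetical for this example, not the actual class introduced in commit 895d732.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: one class owns allocation and release of test resources.
class TestResourceManager implements AutoCloseable {
    private final Path tmpDir;

    TestResourceManager() throws IOException {
        // All allocation in one place.
        tmpDir = Files.createTempDirectory("hudi-client-test");
    }

    Path tmpDir() {
        return tmpDir;
    }

    @Override
    public void close() throws IOException {
        // All release in one place; try-with-resources guarantees it runs.
        Files.deleteIfExists(tmpDir);
    }
}

public class ResourceDemo {
    public static void main(String[] args) throws IOException {
        Path seen;
        try (TestResourceManager res = new TestResourceManager()) {
            seen = res.tmpDir();
            System.out.println(Files.exists(seen)); // directory exists while the test runs
        }
        System.out.println(Files.exists(seen)); // released after the try block
    }
}
```

A test that uses try-with-resources this way cannot leak its temp directory on the success path, which is exactly the kind of confusion the ticket describes.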
[jira] [Created] (HUDI-290) Normalize Test class name of HoodieWriteConfigTest
vinoyang created HUDI-290: - Summary: Normalize Test class name of HoodieWriteConfigTest Key: HUDI-290 URL: https://issues.apache.org/jira/browse/HUDI-290 Project: Apache Hudi (incubating) Issue Type: Sub-task Components: Testing Reporter: vinoyang Assignee: vinoyang In general, a test class name starts with {{Test}}. It would be better to rename {{HoodieWriteConfigTest}} to {{TestHoodieWriteConfig}}.
[jira] [Commented] (HUDI-290) Normalize Test class name of HoodieWriteConfigTest
[ https://issues.apache.org/jira/browse/HUDI-290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943430#comment-16943430 ] vinoyang commented on HUDI-290: --- [~vinoth] WDYT? > Normalize Test class name of HoodieWriteConfigTest > -- > > Key: HUDI-290 > URL: https://issues.apache.org/jira/browse/HUDI-290 > Project: Apache Hudi (incubating) > Issue Type: Sub-task > Components: Testing >Reporter: vinoyang >Assignee: vinoyang >Priority: Major > > In general, a test class name starts with {{Test}}. It would be better to > rename {{HoodieWriteConfigTest}} to {{TestHoodieWriteConfig}}.
[jira] [Commented] (HUDI-64) Estimation of compression ratio & other dynamic storage knobs based on historical stats
[ https://issues.apache.org/jira/browse/HUDI-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943434#comment-16943434 ] vinoyang commented on HUDI-64: -- [~vinoth] I'd like to take this ticket. I am on China's National Day holiday, and I may have time after October 8th. > Estimation of compression ratio & other dynamic storage knobs based on > historical stats > --- > > Key: HUDI-64 > URL: https://issues.apache.org/jira/browse/HUDI-64 > Project: Apache Hudi (incubating) > Issue Type: New Feature > Components: Storage Management, Write Client >Reporter: Vinoth Chandar >Assignee: Vinoth Chandar >Priority: Major > > Roughly along the lines of [https://github.com/uber/hudi/issues/270]
[jira] [Commented] (HUDI-289) Implement a long running test for Hudi writing and querying end-end
[ https://issues.apache.org/jira/browse/HUDI-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943435#comment-16943435 ] vinoyang commented on HUDI-289: --- [~vinoth] OK, I'd like to take this ticket. > Implement a long running test for Hudi writing and querying end-end > --- > > Key: HUDI-289 > URL: https://issues.apache.org/jira/browse/HUDI-289 > Project: Apache Hudi (incubating) > Issue Type: Test > Components: Usability >Reporter: Vinoth Chandar >Assignee: Vinoth Chandar >Priority: Major > > We would need an equivalent of an end-to-end test which runs some workload for > a few hours at least, triggers various actions like commit, deltacommit, > rollback, compaction, and ensures correctness of the code before every release. > P.S: Learn from all the CSS issues managing compaction.
[GitHub] [incubator-hudi] leesf opened a new pull request #936: [HUDI-285] Implement HoodieStorageWriter based on actual file type
leesf opened a new pull request #936: [HUDI-285] Implement HoodieStorageWriter based on actual file type URL: https://github.com/apache/incubator-hudi/pull/936 see jira: https://jira.apache.org/jira/projects/HUDI/issues/HUDI-285 CC @vinothchandar
[jira] [Updated] (HUDI-285) Implement HoodieStorageWriter based on actual file type
[ https://issues.apache.org/jira/browse/HUDI-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-285: Labels: pull-request-available (was: ) > Implement HoodieStorageWriter based on actual file type > --- > > Key: HUDI-285 > URL: https://issues.apache.org/jira/browse/HUDI-285 > Project: Apache Hudi (incubating) > Issue Type: Improvement > Components: Write Client >Reporter: leesf >Assignee: leesf >Priority: Major > Labels: pull-request-available > Fix For: 0.5.1 > > > Currently the _getStorageWriter_ method in HoodieStorageWriterFactory is > hard-coded to return HoodieParquetWriter, since only parquet is currently > supported for HoodieStorageWriter. However, it would be better to choose the > HoodieStorageWriter implementation based on the actual file type, for extensibility. > cc [~vinoth] [~vbalaji]
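For illustration, a minimal sketch of the file-type dispatch HUDI-285 proposes: the factory inspects the file extension instead of always returning the Parquet writer. The names below (StorageWriter, ParquetWriter, StorageWriterFactory.forPath) are hypothetical stand-ins for this example, not Hudi's actual API.

```java
import java.util.Locale;

// Hypothetical writer abstraction standing in for HoodieStorageWriter.
interface StorageWriter {
    String format();
}

// Stand-in for HoodieParquetWriter.
class ParquetWriter implements StorageWriter {
    public String format() {
        return "parquet";
    }
}

class StorageWriterFactory {
    // Dispatch on the actual file type rather than hard-coding one writer.
    static StorageWriter forPath(String path) {
        String p = path.toLowerCase(Locale.ROOT);
        if (p.endsWith(".parquet")) {
            return new ParquetWriter();
        }
        // Additional formats (e.g. ORC, HFile) would be added as new branches here.
        throw new IllegalArgumentException("Unsupported file type: " + path);
    }
}

public class FactoryDemo {
    public static void main(String[] args) {
        System.out.println(StorageWriterFactory.forPath("data/part-0001.parquet").format());
    }
}
```

The extensibility win is that supporting a new format becomes a new branch (or registry entry) in one factory method, with no change at the call sites.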
[jira] [Assigned] (HUDI-288) Add support for ingesting multiple kafka streams in a single DeltaStreamer deployment
[ https://issues.apache.org/jira/browse/HUDI-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] leesf reassigned HUDI-288: -- Assignee: leesf > Add support for ingesting multiple kafka streams in a single DeltaStreamer > deployment > - > > Key: HUDI-288 > URL: https://issues.apache.org/jira/browse/HUDI-288 > Project: Apache Hudi (incubating) > Issue Type: Improvement > Components: deltastreamer >Reporter: Vinoth Chandar >Assignee: leesf >Priority: Major > > https://lists.apache.org/thread.html/3a69934657c48b1c0d85cba223d69cb18e18cd8aaa4817c9fd72cef6@ > has all the context
[jira] [Commented] (HUDI-290) Normalize Test class name of HoodieWriteConfigTest
[ https://issues.apache.org/jira/browse/HUDI-290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943558#comment-16943558 ] leesf commented on HUDI-290: +1 for renaming to TestHoodieWriteConfig; I see many UT names already start with Test... Also, please check other UT names that do not start with Test in the project. Thanks. > Normalize Test class name of HoodieWriteConfigTest > -- > > Key: HUDI-290 > URL: https://issues.apache.org/jira/browse/HUDI-290 > Project: Apache Hudi (incubating) > Issue Type: Sub-task > Components: Testing >Reporter: vinoyang >Assignee: vinoyang >Priority: Major > > In general, a test class name starts with {{Test}}. It would be better to > rename {{HoodieWriteConfigTest}} to {{TestHoodieWriteConfig}}.
[GitHub] [incubator-hudi] tweise commented on issue #935: [HUDI-287] Remove LICENSE and NOTICE files in hoodie child modules.
tweise commented on issue #935: [HUDI-287] Remove LICENSE and NOTICE files in hoodie child modules. URL: https://github.com/apache/incubator-hudi/pull/935#issuecomment-537975855 Did you verify that the files are automatically included into the jar?
[jira] [Comment Edited] (HUDI-290) Normalize Test class name of HoodieWriteConfigTest
[ https://issues.apache.org/jira/browse/HUDI-290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943558#comment-16943558 ] leesf edited comment on HUDI-290 at 10/3/19 2:53 PM: - +1 for renaming to TestHoodieWriteConfig; I see many UT names already start with Test... Also, please check other UT names that do not start with Test in the project. Thanks. was (Author: xleesf): +1 rename to TestHoodieWriteConfig, and i see many UTs start already start with Test... Also please check other UTs name not started with Test in the project. Thanks. > Normalize Test class name of HoodieWriteConfigTest > -- > > Key: HUDI-290 > URL: https://issues.apache.org/jira/browse/HUDI-290 > Project: Apache Hudi (incubating) > Issue Type: Sub-task > Components: Testing >Reporter: vinoyang >Assignee: vinoyang >Priority: Major > > In general, a test class name starts with {{Test}}. It would be better to > rename {{HoodieWriteConfigTest}} to {{TestHoodieWriteConfig}}.
[GitHub] [incubator-hudi] bvaradar commented on issue #935: [HUDI-287] Remove LICENSE and NOTICE files in hoodie child modules.
bvaradar commented on issue #935: [HUDI-287] Remove LICENSE and NOTICE files in hoodie child modules. URL: https://github.com/apache/incubator-hudi/pull/935#issuecomment-538009259 > Did you verify that the files are automatically included into the jar? Yes, they are present in the jars. For example: varadarb-C02SH0P1G8WL:target varadarb$ jar tf hudi-common-0.5.1-SNAPSHOT.jar | grep META-INF META-INF/ META-INF/LICENSE META-INF/NOTICE
[GitHub] [incubator-hudi] bvaradar merged pull request #935: [HUDI-287] Remove LICENSE and NOTICE files in hoodie child modules.
bvaradar merged pull request #935: [HUDI-287] Remove LICENSE and NOTICE files in hoodie child modules. URL: https://github.com/apache/incubator-hudi/pull/935
[jira] [Resolved] (HUDI-287) Remove LICENSE and NOTICE files in hoodie child modules
[ https://issues.apache.org/jira/browse/HUDI-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan resolved HUDI-287. - Resolution: Fixed > Remove LICENSE and NOTICE files in hoodie child modules > --- > > Key: HUDI-287 > URL: https://issues.apache.org/jira/browse/HUDI-287 > Project: Apache Hudi (incubating) > Issue Type: Sub-task > Components: asf-migration >Reporter: Balaji Varadarajan >Assignee: Balaji Varadarajan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > This was earlier added to ensure LICENSE and NOTICE files are present in > generated jars. In the earlier pom setup, the hudi parent pom was not linked to > the apache parent pom. There was no "generate-resource-bundle" plugin in the > parent hudi pom to automatically generate LICENSE and NOTICE files in jars, > so we resorted to manually storing the LICENSE/NOTICE files in each > submodule. > > With the Apache parent pom, the NOTICE and LICENSE files are automatically set up > for each jar. This means that we can safely remove all LICENSE > and NOTICE files from each submodule.
[jira] [Closed] (HUDI-287) Remove LICENSE and NOTICE files in hoodie child modules
[ https://issues.apache.org/jira/browse/HUDI-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan closed HUDI-287. --- > Remove LICENSE and NOTICE files in hoodie child modules > --- > > Key: HUDI-287 > URL: https://issues.apache.org/jira/browse/HUDI-287 > Project: Apache Hudi (incubating) > Issue Type: Sub-task > Components: asf-migration >Reporter: Balaji Varadarajan >Assignee: Balaji Varadarajan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > This was earlier added to ensure LICENSE and NOTICE files are present in > generated jars. In the earlier pom setup, the hudi parent pom was not linked to > the apache parent pom. There was no "generate-resource-bundle" plugin in the > parent hudi pom to automatically generate LICENSE and NOTICE files in jars, > so we resorted to manually storing the LICENSE/NOTICE files in each > submodule. > > With the Apache parent pom, the NOTICE and LICENSE files are automatically set up > for each jar. This means that we can safely remove all LICENSE > and NOTICE files from each submodule.
[incubator-hudi] branch master updated: Update Release notes
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git The following commit(s) were added to refs/heads/master by this push: new e78ba59 Update Release notes e78ba59 is described below commit e78ba598c549d51a3e7ce4231acebe46f9828001 Author: Balaji Varadarajan AuthorDate: Thu Oct 3 09:11:37 2019 -0700 Update Release notes --- RELEASE_NOTES.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/RELEASE_NOTES.md b/RELEASE_NOTES.md index 89beac2..3e356af 100644 --- a/RELEASE_NOTES.md +++ b/RELEASE_NOTES.md @@ -26,7 +26,9 @@ Release 0.5.0-incubating * Bug fixes in query side integration, hive-sync, deltaStreamer, compaction, rollbacks, restore ### Full PR List - * **Balaji Varadarajan** HUDI-121 : Address comments during RC2 voting + * **Balaji Varadarajan** [HUDI-287] Address comments during review of release candidate. Remove LICENSE and NOTICE files in hoodie child modules. + * **Balaji Varadarajan** [HUDI-121] Fix bugs in Release Scripts found during RC creation + * **Balaji Varadarajan** [HUDI-121] : Address comments during RC2 voting * **Bhavani Sudha Saktheeswaran** [HUDI-271] Create QuickstartUtils for simplifying quickstart guide * **vinoyang** [HUDI-247] Unify the re-initialization of HoodieTableMetaClient in test for hoodie-client module (#930) * **Balaji Varadarajan** [HUDI-279] Fix regression in Schema Evolution due to PR-755
[incubator-hudi] 03/03: [HUDI-121] Preparing for Release 0.5.0-incubating-rc4
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a commit to branch release-0.5.0 in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git commit fb053bf44dc7ed862e86e63fa35de0e7e43aa234 Author: Balaji Varadarajan AuthorDate: Thu Oct 3 09:15:39 2019 -0700 [HUDI-121] Preparing for Release 0.5.0-incubating-rc4 --- docker/hoodie/hadoop/base/pom.xml | 2 +- docker/hoodie/hadoop/datanode/pom.xml | 2 +- docker/hoodie/hadoop/historyserver/pom.xml| 2 +- docker/hoodie/hadoop/hive_base/pom.xml| 2 +- docker/hoodie/hadoop/namenode/pom.xml | 2 +- docker/hoodie/hadoop/pom.xml | 2 +- docker/hoodie/hadoop/prestobase/pom.xml | 2 +- docker/hoodie/hadoop/spark_base/pom.xml | 2 +- docker/hoodie/hadoop/sparkadhoc/pom.xml | 2 +- docker/hoodie/hadoop/sparkmaster/pom.xml | 2 +- docker/hoodie/hadoop/sparkworker/pom.xml | 2 +- hudi-cli/pom.xml | 2 +- hudi-client/pom.xml | 2 +- hudi-common/pom.xml | 2 +- hudi-hadoop-mr/pom.xml| 2 +- hudi-hive/pom.xml | 2 +- hudi-integ-test/pom.xml | 2 +- hudi-spark/pom.xml| 2 +- hudi-timeline-service/pom.xml | 2 +- hudi-utilities/pom.xml| 2 +- packaging/hudi-hadoop-mr-bundle/pom.xml | 2 +- packaging/hudi-hive-bundle/pom.xml| 2 +- packaging/hudi-presto-bundle/pom.xml | 2 +- packaging/hudi-spark-bundle/pom.xml | 2 +- packaging/hudi-timeline-server-bundle/pom.xml | 2 +- packaging/hudi-utilities-bundle/pom.xml | 2 +- pom.xml | 2 +- 27 files changed, 27 insertions(+), 27 deletions(-) diff --git a/docker/hoodie/hadoop/base/pom.xml b/docker/hoodie/hadoop/base/pom.xml index 7c0a7e8..5921337 100644 --- a/docker/hoodie/hadoop/base/pom.xml +++ b/docker/hoodie/hadoop/base/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.0-incubating-rc3 +0.5.0-incubating-rc4 4.0.0 pom diff --git a/docker/hoodie/hadoop/datanode/pom.xml b/docker/hoodie/hadoop/datanode/pom.xml index e41e80a..50e8de6 100644 --- a/docker/hoodie/hadoop/datanode/pom.xml +++ b/docker/hoodie/hadoop/datanode/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker 
org.apache.hudi -0.5.0-incubating-rc3 +0.5.0-incubating-rc4 4.0.0 pom diff --git a/docker/hoodie/hadoop/historyserver/pom.xml b/docker/hoodie/hadoop/historyserver/pom.xml index f4cc5f9..42f0fa7 100644 --- a/docker/hoodie/hadoop/historyserver/pom.xml +++ b/docker/hoodie/hadoop/historyserver/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.0-incubating-rc3 +0.5.0-incubating-rc4 4.0.0 pom diff --git a/docker/hoodie/hadoop/hive_base/pom.xml b/docker/hoodie/hadoop/hive_base/pom.xml index 7dbc0fa..f565444 100644 --- a/docker/hoodie/hadoop/hive_base/pom.xml +++ b/docker/hoodie/hadoop/hive_base/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.0-incubating-rc3 +0.5.0-incubating-rc4 4.0.0 pom diff --git a/docker/hoodie/hadoop/namenode/pom.xml b/docker/hoodie/hadoop/namenode/pom.xml index fca0ed1..0f356bb 100644 --- a/docker/hoodie/hadoop/namenode/pom.xml +++ b/docker/hoodie/hadoop/namenode/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.0-incubating-rc3 +0.5.0-incubating-rc4 4.0.0 pom diff --git a/docker/hoodie/hadoop/pom.xml b/docker/hoodie/hadoop/pom.xml index f93e4a3..453bdda 100644 --- a/docker/hoodie/hadoop/pom.xml +++ b/docker/hoodie/hadoop/pom.xml @@ -19,7 +19,7 @@ hudi org.apache.hudi -0.5.0-incubating-rc3 +0.5.0-incubating-rc4 ../../../pom.xml 4.0.0 diff --git a/docker/hoodie/hadoop/prestobase/pom.xml b/docker/hoodie/hadoop/prestobase/pom.xml index 0cf1501..fd3f004 100644 --- a/docker/hoodie/hadoop/prestobase/pom.xml +++ b/docker/hoodie/hadoop/prestobase/pom.xml @@ -22,7 +22,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.0-incubating-rc3 +0.5.0-incubating-rc4 4.0.0 pom diff --git a/docker/hoodie/hadoop/spark_base/pom.xml b/docker/hoodie/hadoop/spark_base/pom.xml index e88d0a5..3a805aa 100644 --- a/docker/hoodie/hadoop/spark_base/pom.xml +++ b/docker/hoodie/hadoop/spark_base/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.0-incubating-rc3 +0.5.0-incubating-rc4 4.0.0 pom diff --git 
a/docker/hoodie/hadoop/sparkadhoc/pom.xml b/docker/hoodie/hadoop/sparkadhoc/pom.xml index bc01720..c521e13 100644 --- a/docker/hoodie/hadoop/sparkadhoc/pom.xml +++ b/docker/hoodie/hadoop/sparkadhoc/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.0-incubating-rc3 +0.5.0-incubating-rc4
[incubator-hudi] branch release-0.5.0 updated (789f91e -> fb053bf)
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a change to branch release-0.5.0 in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git. omit 789f91e [HUDI-121] Fix issues found during release candidate builds omit dfe4784 [HUDI-121] Fix issues found during release candidate builds omit c3a4dab [HUDI-121] Preparing for Release 0.5.0-incubating-rc3 omit 37e9dc2 [HUDI-121] Preparing for Release 0.5.0-incubating-rc2 add e41835f [HUDI-121] Fix bugs in Release Scripts found during RC creation add 6da2f9a [HUDI-287] Address comments during review of release candidate 1. Remove LICENSE and NOTICE files in hoodie child modules. 2. Remove developers and contributor section from pom 3. Also ensure any failures in validation script is reported appropriately 4. Make hoodie parent pom consistent with that of its parent apache-21 (https://github.com/apache/maven-apache-parent/blob/apache-21/pom.xml) add e78ba59 Update Release notes new e571e14 [HUDI-121] Preparing for Release 0.5.0-incubating-rc2 new e3db2d8 [HUDI-121] Preparing for Release 0.5.0-incubating-rc3 new fb053bf [HUDI-121] Preparing for Release 0.5.0-incubating-rc4 This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this: * -- * -- B -- O -- O -- O (789f91e) \ N -- N -- N refs/heads/release-0.5.0 (fb053bf) You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B. Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever. The 3 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. 
The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: RELEASE_NOTES.md | 4 +- docker/hoodie/hadoop/NOTICE| 5 - docker/hoodie/hadoop/base/NOTICE | 5 - docker/hoodie/hadoop/base/pom.xml | 2 +- docker/hoodie/hadoop/datanode/NOTICE | 5 - docker/hoodie/hadoop/datanode/pom.xml | 2 +- docker/hoodie/hadoop/historyserver/NOTICE | 5 - docker/hoodie/hadoop/historyserver/pom.xml | 2 +- docker/hoodie/hadoop/hive_base/NOTICE | 5 - docker/hoodie/hadoop/hive_base/pom.xml | 2 +- docker/hoodie/hadoop/namenode/NOTICE | 5 - docker/hoodie/hadoop/namenode/pom.xml | 2 +- docker/hoodie/hadoop/pom.xml | 2 +- docker/hoodie/hadoop/prestobase/NOTICE | 5 - docker/hoodie/hadoop/prestobase/pom.xml| 2 +- docker/hoodie/hadoop/spark_base/NOTICE | 5 - docker/hoodie/hadoop/spark_base/pom.xml| 2 +- docker/hoodie/hadoop/sparkadhoc/NOTICE | 5 - docker/hoodie/hadoop/sparkadhoc/pom.xml| 2 +- docker/hoodie/hadoop/sparkmaster/NOTICE| 5 - docker/hoodie/hadoop/sparkmaster/pom.xml | 2 +- docker/hoodie/hadoop/sparkworker/pom.xml | 2 +- hudi-cli/pom.xml | 2 +- hudi-cli/src/main/resources/META-INF/LICENSE | 177 - hudi-cli/src/main/resources/META-INF/NOTICE| 5 - hudi-client/pom.xml| 2 +- hudi-client/src/main/resources/META-INF/LICENSE| 177 - hudi-client/src/main/resources/META-INF/NOTICE | 5 - hudi-common/pom.xml| 2 +- hudi-common/src/main/resources/META-INF/LICENSE| 177 - hudi-common/src/main/resources/META-INF/NOTICE | 5 - hudi-hadoop-mr/pom.xml | 2 +- hudi-hadoop-mr/src/main/resources/META-INF/LICENSE | 177 - hudi-hadoop-mr/src/main/resources/META-INF/NOTICE | 5 - hudi-hive/pom.xml | 2 +- hudi-hive/src/main/resources/META-INF/LICENSE | 177 - hudi-hive/src/main/resources/META-INF/NOTICE | 5 - hudi-integ-test/pom.xml| 2 +- .../src/main/resources/META-INF/LICENSE| 177 - hudi-integ-test/src/main/resources/META-INF/NOTICE | 5 - hudi-spark/pom.xml | 2 +- hudi-spark/src/main/resources/META-INF/LICENSE | 177 - 
hudi-spark/src/main/resources/META-INF/NOTICE | 5 - hudi-timeline-service/pom.xml |
[incubator-hudi] 01/03: [HUDI-121] Preparing for Release 0.5.0-incubating-rc2
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a commit to branch release-0.5.0 in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git commit e571e143d5420bc6de3601c89e66fd15cca94de0 Author: Balaji Varadarajan AuthorDate: Tue Sep 17 10:35:16 2019 -0700 [HUDI-121] Preparing for Release 0.5.0-incubating-rc2 --- docker/hoodie/hadoop/base/pom.xml | 2 +- docker/hoodie/hadoop/datanode/pom.xml | 2 +- docker/hoodie/hadoop/historyserver/pom.xml| 2 +- docker/hoodie/hadoop/hive_base/pom.xml| 2 +- docker/hoodie/hadoop/namenode/pom.xml | 2 +- docker/hoodie/hadoop/pom.xml | 2 +- docker/hoodie/hadoop/prestobase/pom.xml | 2 +- docker/hoodie/hadoop/spark_base/pom.xml | 2 +- docker/hoodie/hadoop/sparkadhoc/pom.xml | 2 +- docker/hoodie/hadoop/sparkmaster/pom.xml | 2 +- docker/hoodie/hadoop/sparkworker/pom.xml | 2 +- hudi-cli/pom.xml | 2 +- hudi-client/pom.xml | 2 +- hudi-common/pom.xml | 2 +- hudi-hadoop-mr/pom.xml| 2 +- hudi-hive/pom.xml | 2 +- hudi-integ-test/pom.xml | 2 +- hudi-spark/pom.xml| 2 +- hudi-timeline-service/pom.xml | 2 +- hudi-utilities/pom.xml| 2 +- packaging/hudi-hadoop-mr-bundle/pom.xml | 2 +- packaging/hudi-hive-bundle/pom.xml| 2 +- packaging/hudi-presto-bundle/pom.xml | 2 +- packaging/hudi-spark-bundle/pom.xml | 2 +- packaging/hudi-timeline-server-bundle/pom.xml | 2 +- packaging/hudi-utilities-bundle/pom.xml | 2 +- pom.xml | 2 +- 27 files changed, 27 insertions(+), 27 deletions(-) diff --git a/docker/hoodie/hadoop/base/pom.xml b/docker/hoodie/hadoop/base/pom.xml index 52dd2a8..8cb0ab2 100644 --- a/docker/hoodie/hadoop/base/pom.xml +++ b/docker/hoodie/hadoop/base/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.1-SNAPSHOT +0.5.0-incubating-rc2 4.0.0 pom diff --git a/docker/hoodie/hadoop/datanode/pom.xml b/docker/hoodie/hadoop/datanode/pom.xml index 23cb64d..ed4533f 100644 --- a/docker/hoodie/hadoop/datanode/pom.xml +++ b/docker/hoodie/hadoop/datanode/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker 
org.apache.hudi -0.5.1-SNAPSHOT +0.5.0-incubating-rc2 4.0.0 pom diff --git a/docker/hoodie/hadoop/historyserver/pom.xml b/docker/hoodie/hadoop/historyserver/pom.xml index d35e940..b3455c6 100644 --- a/docker/hoodie/hadoop/historyserver/pom.xml +++ b/docker/hoodie/hadoop/historyserver/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.1-SNAPSHOT +0.5.0-incubating-rc2 4.0.0 pom diff --git a/docker/hoodie/hadoop/hive_base/pom.xml b/docker/hoodie/hadoop/hive_base/pom.xml index 2f7c2b5..0afaa0e 100644 --- a/docker/hoodie/hadoop/hive_base/pom.xml +++ b/docker/hoodie/hadoop/hive_base/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.1-SNAPSHOT +0.5.0-incubating-rc2 4.0.0 pom diff --git a/docker/hoodie/hadoop/namenode/pom.xml b/docker/hoodie/hadoop/namenode/pom.xml index a996f57..257781a 100644 --- a/docker/hoodie/hadoop/namenode/pom.xml +++ b/docker/hoodie/hadoop/namenode/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.1-SNAPSHOT +0.5.0-incubating-rc2 4.0.0 pom diff --git a/docker/hoodie/hadoop/pom.xml b/docker/hoodie/hadoop/pom.xml index fff962f..a339226 100644 --- a/docker/hoodie/hadoop/pom.xml +++ b/docker/hoodie/hadoop/pom.xml @@ -19,7 +19,7 @@ hudi org.apache.hudi -0.5.1-SNAPSHOT +0.5.0-incubating-rc2 ../../../pom.xml 4.0.0 diff --git a/docker/hoodie/hadoop/prestobase/pom.xml b/docker/hoodie/hadoop/prestobase/pom.xml index d3c1d0f..1459268 100644 --- a/docker/hoodie/hadoop/prestobase/pom.xml +++ b/docker/hoodie/hadoop/prestobase/pom.xml @@ -22,7 +22,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.1-SNAPSHOT +0.5.0-incubating-rc2 4.0.0 pom diff --git a/docker/hoodie/hadoop/spark_base/pom.xml b/docker/hoodie/hadoop/spark_base/pom.xml index 32b33e0..28a3b78 100644 --- a/docker/hoodie/hadoop/spark_base/pom.xml +++ b/docker/hoodie/hadoop/spark_base/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.1-SNAPSHOT +0.5.0-incubating-rc2 4.0.0 pom diff --git a/docker/hoodie/hadoop/sparkadhoc/pom.xml 
b/docker/hoodie/hadoop/sparkadhoc/pom.xml index 80a811c..0c6e1b4 100644 --- a/docker/hoodie/hadoop/sparkadhoc/pom.xml +++ b/docker/hoodie/hadoop/sparkadhoc/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.1-SNAPSHOT +0.5.0-incubating-rc2 4.0.0 pom diff --git a/docker/hoodie/hadoop/s
[incubator-hudi] 02/03: [HUDI-121] Preparing for Release 0.5.0-incubating-rc3
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a commit to branch release-0.5.0 in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git commit e3db2d8d5d5b04705ef1bd845a8bc847748b4d0e Author: Balaji Varadarajan AuthorDate: Mon Sep 30 16:34:45 2019 -0700 [HUDI-121] Preparing for Release 0.5.0-incubating-rc3 --- docker/hoodie/hadoop/base/pom.xml | 2 +- docker/hoodie/hadoop/datanode/pom.xml | 2 +- docker/hoodie/hadoop/historyserver/pom.xml| 2 +- docker/hoodie/hadoop/hive_base/pom.xml| 2 +- docker/hoodie/hadoop/namenode/pom.xml | 2 +- docker/hoodie/hadoop/pom.xml | 2 +- docker/hoodie/hadoop/prestobase/pom.xml | 2 +- docker/hoodie/hadoop/spark_base/pom.xml | 2 +- docker/hoodie/hadoop/sparkadhoc/pom.xml | 2 +- docker/hoodie/hadoop/sparkmaster/pom.xml | 2 +- docker/hoodie/hadoop/sparkworker/pom.xml | 2 +- hudi-cli/pom.xml | 2 +- hudi-client/pom.xml | 2 +- hudi-common/pom.xml | 2 +- hudi-hadoop-mr/pom.xml| 2 +- hudi-hive/pom.xml | 2 +- hudi-integ-test/pom.xml | 2 +- hudi-spark/pom.xml| 2 +- hudi-timeline-service/pom.xml | 2 +- hudi-utilities/pom.xml| 2 +- packaging/hudi-hadoop-mr-bundle/pom.xml | 2 +- packaging/hudi-hive-bundle/pom.xml| 2 +- packaging/hudi-presto-bundle/pom.xml | 2 +- packaging/hudi-spark-bundle/pom.xml | 2 +- packaging/hudi-timeline-server-bundle/pom.xml | 2 +- packaging/hudi-utilities-bundle/pom.xml | 2 +- pom.xml | 2 +- 27 files changed, 27 insertions(+), 27 deletions(-) diff --git a/docker/hoodie/hadoop/base/pom.xml b/docker/hoodie/hadoop/base/pom.xml index 8cb0ab2..7c0a7e8 100644 --- a/docker/hoodie/hadoop/base/pom.xml +++ b/docker/hoodie/hadoop/base/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.0-incubating-rc2 +0.5.0-incubating-rc3 4.0.0 pom diff --git a/docker/hoodie/hadoop/datanode/pom.xml b/docker/hoodie/hadoop/datanode/pom.xml index ed4533f..e41e80a 100644 --- a/docker/hoodie/hadoop/datanode/pom.xml +++ b/docker/hoodie/hadoop/datanode/pom.xml @@ -19,7 +19,7 @@ 
hudi-hadoop-docker org.apache.hudi -0.5.0-incubating-rc2 +0.5.0-incubating-rc3 4.0.0 pom diff --git a/docker/hoodie/hadoop/historyserver/pom.xml b/docker/hoodie/hadoop/historyserver/pom.xml index b3455c6..f4cc5f9 100644 --- a/docker/hoodie/hadoop/historyserver/pom.xml +++ b/docker/hoodie/hadoop/historyserver/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.0-incubating-rc2 +0.5.0-incubating-rc3 4.0.0 pom diff --git a/docker/hoodie/hadoop/hive_base/pom.xml b/docker/hoodie/hadoop/hive_base/pom.xml index 0afaa0e..7dbc0fa 100644 --- a/docker/hoodie/hadoop/hive_base/pom.xml +++ b/docker/hoodie/hadoop/hive_base/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.0-incubating-rc2 +0.5.0-incubating-rc3 4.0.0 pom diff --git a/docker/hoodie/hadoop/namenode/pom.xml b/docker/hoodie/hadoop/namenode/pom.xml index 257781a..fca0ed1 100644 --- a/docker/hoodie/hadoop/namenode/pom.xml +++ b/docker/hoodie/hadoop/namenode/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.0-incubating-rc2 +0.5.0-incubating-rc3 4.0.0 pom diff --git a/docker/hoodie/hadoop/pom.xml b/docker/hoodie/hadoop/pom.xml index a339226..f93e4a3 100644 --- a/docker/hoodie/hadoop/pom.xml +++ b/docker/hoodie/hadoop/pom.xml @@ -19,7 +19,7 @@ hudi org.apache.hudi -0.5.0-incubating-rc2 +0.5.0-incubating-rc3 ../../../pom.xml 4.0.0 diff --git a/docker/hoodie/hadoop/prestobase/pom.xml b/docker/hoodie/hadoop/prestobase/pom.xml index 1459268..0cf1501 100644 --- a/docker/hoodie/hadoop/prestobase/pom.xml +++ b/docker/hoodie/hadoop/prestobase/pom.xml @@ -22,7 +22,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.0-incubating-rc2 +0.5.0-incubating-rc3 4.0.0 pom diff --git a/docker/hoodie/hadoop/spark_base/pom.xml b/docker/hoodie/hadoop/spark_base/pom.xml index 28a3b78..e88d0a5 100644 --- a/docker/hoodie/hadoop/spark_base/pom.xml +++ b/docker/hoodie/hadoop/spark_base/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.0-incubating-rc2 +0.5.0-incubating-rc3 4.0.0 pom diff --git 
a/docker/hoodie/hadoop/sparkadhoc/pom.xml b/docker/hoodie/hadoop/sparkadhoc/pom.xml index 0c6e1b4..bc01720 100644 --- a/docker/hoodie/hadoop/sparkadhoc/pom.xml +++ b/docker/hoodie/hadoop/sparkadhoc/pom.xml @@ -19,7 +19,7 @@ hudi-hadoop-docker org.apache.hudi -0.5.0-incubating-rc2 +0.5.0-incubating-rc3
[incubator-hudi] branch master updated: [HUDI-121] Fix bug in validation in create_source_release.sh
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git The following commit(s) were added to refs/heads/master by this push: new e75fa07 [HUDI-121] Fix bug in validation in create_source_release.sh e75fa07 is described below commit e75fa070f85ed1e0f72f9718a715a08a76679140 Author: Balaji Varadarajan AuthorDate: Thu Oct 3 09:20:12 2019 -0700 [HUDI-121] Fix bug in validation in create_source_release.sh --- scripts/release/create_source_release.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/scripts/release/create_source_release.sh b/scripts/release/create_source_release.sh index b911936..05aa4b9 100755 --- a/scripts/release/create_source_release.sh +++ b/scripts/release/create_source_release.sh @@ -29,8 +29,8 @@ set -o nounset set -o xtrace CURR_DIR=`pwd` -if [[ `basename $CURR_DIR` != "release" ]] ; then - echo "You have to call the script from the release/ dir" +if [[ `basename $CURR_DIR` != "scripts" ]] ; then + echo "You have to call the script from the scripts/ dir" exit 1 fi
[GitHub] [incubator-hudi] vinothchandar commented on issue #933: Support for multiple level partitioning in Hudi
vinothchandar commented on issue #933: Support for multiple level partitioning in Hudi URL: https://github.com/apache/incubator-hudi/issues/933#issuecomment-538021110 @HariprasadAllaka1612 thanks! mind updating the FAQs? :) https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=113709185 cc @bhasudha This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (HUDI-286) Remove or hide tags from Hudi official web site
[ https://issues.apache.org/jira/browse/HUDI-286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943708#comment-16943708 ] Vinoth Chandar commented on HUDI-286: - +1 thanks for catching this > Remove or hide tags from Hudi official web site > --- > > Key: HUDI-286 > URL: https://issues.apache.org/jira/browse/HUDI-286 > Project: Apache Hudi (incubating) > Issue Type: Wish > Components: Docs >Reporter: vinoyang >Assignee: vinoyang >Priority: Major > > Currently, Hudi's doc did not provide a tag HTML page. While we provided a > hyper link to an unknown URL, e.g. > [getting_started|http://hudi.apache.org/tag_getting_started.html] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-290) Normalize Test class name of HoodieWriteConfigTest
[ https://issues.apache.org/jira/browse/HUDI-290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943711#comment-16943711 ] Vinoth Chandar commented on HUDI-290: - +1 :) that was due to me and someone else having two styles. > Normalize Test class name of HoodieWriteConfigTest > -- > > Key: HUDI-290 > URL: https://issues.apache.org/jira/browse/HUDI-290 > Project: Apache Hudi (incubating) > Issue Type: Sub-task > Components: Testing >Reporter: vinoyang >Assignee: vinoyang >Priority: Major > > In general, a test case name starts with {{Test}}. It would be better to > rename {{HoodieWriteConfigTest}} to {{TestHoodieWriteConfig}}.
[jira] [Commented] (HUDI-253) DeltaStreamer should report nicer error messages for misconfigs
[ https://issues.apache.org/jira/browse/HUDI-253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943709#comment-16943709 ] Vinoth Chandar commented on HUDI-253: - No worries! > DeltaStreamer should report nicer error messages for misconfigs > --- > > Key: HUDI-253 > URL: https://issues.apache.org/jira/browse/HUDI-253 > Project: Apache Hudi (incubating) > Issue Type: Improvement > Components: deltastreamer, Usability >Reporter: Vinoth Chandar >Assignee: Pratyaksh Sharma >Priority: Major > > e.g: > https://lists.apache.org/thread.html/4fdcdd7ba77a4f0366ec0e95f54298115fcc9567f6b0c9998f1b92b7@
svn commit: r36186 - in /dev/incubator/hudi/hudi-0.5.0-incubating-rc4: ./ hudi-0.5.0-incubating-rc4.src.tgz hudi-0.5.0-incubating-rc4.src.tgz.asc hudi-0.5.0-incubating-rc4.src.tgz.sha512
Author: vbalaji Date: Thu Oct 3 16:40:15 2019 New Revision: 36186 Log: Adding source release of hudi-0.5.0-incubating-rc4 Added: dev/incubator/hudi/hudi-0.5.0-incubating-rc4/ dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz (with props) dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz.asc dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz.sha512 Added: dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz == Binary file - no diff available. Propchange: dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz -- svn:mime-type = application/octet-stream Added: dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz.asc == --- dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz.asc (added) +++ dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz.asc Thu Oct 3 16:40:15 2019 @@ -0,0 +1,16 @@ +-BEGIN PGP SIGNATURE- + +iQIzBAABCAAdFiEEr5uvedMRo9Mojlg/JKSZA3JiqqQFAl2WH98ACgkQJKSZA3Ji +qqTTLhAA520jepPv4imii1vlHMdV2Yh+4ju48M/58ecV3rGqqUHCFY8A39z3KrfY +pdqDzx+osl1Lh+eIdQjvI3DSAfuQbFgebQ19fg5vTlV9/UFYi4D51lDckuUMTxEm +ZOF5cknhFERa6gbiObXDvaecuvVsnkkTn/6AFIVb23za2YbCKvQh9h+eNi3rPmAr +dRdwcGUNpWByRdv4n7E+82+Hl8Rp+7/lzM1aTWD2Ihlnxu5V8M8bhY8QN46LNNmr +rz2aGJ5a1tsLCOT6PvLvKRr6TPwCJfJafUgfyGCjWWZOg4wUX3W8nykrChVUJAYH +fei71CDlU+jQgyDAPRpv7I3aiRXVhQNt9UrBl4mY8A735ntyGgjip53z+ztsWxLL +XBxmDXr8fedGZLEU4QWvf6P0jcpMvHh1i4fNYP7u1LNyuvQ0NvkvQKlzjtcIf8sP +9cnz23exd8NrrO8AFsua3r8t73C0YAT37HtkRPskifFT0IFowEGSdURlzbqQSYLZ +TdjyNWuShnGN//Yqt+aT1JnpD2cHQcllJHK0t3yjQAOlvxSLafVFhQvVmURMrxjE +JO9dMqh8KUkMTbCj7E2s/aONxVC/c+6RLZ1iufa4MtEgcc5Y8Mld4EqbITiH/qWC +eSgoe5ma/3Hx42i07RGqigftpFP1M4Tma+bsu01Uh0Tnp+CTZKY= +=Ny8w +-END PGP SIGNATURE- Added: dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz.sha512 == --- 
dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz.sha512 (added) +++ dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz.sha512 Thu Oct 3 16:40:15 2019 @@ -0,0 +1 @@ +e57b1ab3dbe3a061bc8c57c32523946a5e8e01bc7aa73ab11915ed986d9fdb43c63e9681bf32636a559ddb285e465ce718e2cd05197316645a7d1bbc547f267d hudi-0.5.0-incubating-rc4.src.tgz
[incubator-hudi] branch master updated: [HUDI-121] Fix bug in validation in deploy_staging_jars.sh
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git The following commit(s) were added to refs/heads/master by this push: new cef06c1 [HUDI-121] Fix bug in validation in deploy_staging_jars.sh cef06c1 is described below commit cef06c1e4849ea93644c17af89b56b3f17d322fc Author: Balaji Varadarajan AuthorDate: Thu Oct 3 09:42:30 2019 -0700 [HUDI-121] Fix bug in validation in deploy_staging_jars.sh --- scripts/release/deploy_staging_jars.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/scripts/release/deploy_staging_jars.sh b/scripts/release/deploy_staging_jars.sh index 7c067e6..5867f9c 100755 --- a/scripts/release/deploy_staging_jars.sh +++ b/scripts/release/deploy_staging_jars.sh @@ -29,8 +29,8 @@ set -o nounset set -o xtrace CURR_DIR=`pwd` -if [[ `basename $CURR_DIR` != "release" ]] ; then - echo "You have to call the script from the release/ dir" +if [[ `basename $CURR_DIR` != "scripts" ]] ; then + echo "You have to call the script from the scripts/ dir" exit 1 fi
[jira] [Commented] (HUDI-288) Add support for ingesting multiple kafka streams in a single DeltaStreamer deployment
[ https://issues.apache.org/jira/browse/HUDI-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943728#comment-16943728 ] Vinoth Chandar commented on HUDI-288: - Great! This will help larger companies with more topics easily adopt delta streamer. > Add support for ingesting multiple kafka streams in a single DeltaStreamer > deployment > - > > Key: HUDI-288 > URL: https://issues.apache.org/jira/browse/HUDI-288 > Project: Apache Hudi (incubating) > Issue Type: Improvement > Components: deltastreamer >Reporter: Vinoth Chandar >Assignee: leesf >Priority: Major > > https://lists.apache.org/thread.html/3a69934657c48b1c0d85cba223d69cb18e18cd8aaa4817c9fd72cef6@ > has all the context
[incubator-hudi] branch master updated: [HUDI-265] Failed to delete tmp dirs created in unit tests (#928)
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git The following commit(s) were added to refs/heads/master by this push: new 3dedc7e [HUDI-265] Failed to delete tmp dirs created in unit tests (#928) 3dedc7e is described below commit 3dedc7e5fdd5f885915e81e47e110b845a905dbf Author: leesf <490081...@qq.com> AuthorDate: Fri Oct 4 00:48:13 2019 +0800 [HUDI-265] Failed to delete tmp dirs created in unit tests (#928) --- .../org/apache/hudi/HoodieClientTestHarness.java | 48 +--- .../org/apache/hudi/TestCompactionAdminClient.java | 5 +- .../java/org/apache/hudi/TestConsistencyGuard.java | 6 +- .../apache/hudi/func/TestUpdateMapFunction.java| 5 +- .../hudi/index/TestHBaseQPSResourceAllocator.java | 3 +- .../java/org/apache/hudi/index/TestHbaseIndex.java | 3 +- .../org/apache/hudi/index/TestHoodieIndex.java | 5 +- .../hudi/index/bloom/TestHoodieBloomIndex.java | 3 +- .../index/bloom/TestHoodieGlobalBloomIndex.java| 5 +- .../apache/hudi/io/TestHoodieCommitArchiveLog.java | 3 +- .../org/apache/hudi/io/TestHoodieCompactor.java| 3 +- .../org/apache/hudi/io/TestHoodieMergeHandle.java | 3 +- .../apache/hudi/table/TestCopyOnWriteTable.java| 3 +- .../apache/hudi/table/TestMergeOnReadTable.java| 4 +- .../hudi/common/HoodieCommonTestHarness.java | 89 ++ .../common/table/HoodieTableMetaClientTest.java| 14 +--- .../hudi/common/table/log/HoodieLogFormatTest.java | 11 ++- .../table/string/HoodieActiveTimelineTest.java | 13 ++-- .../table/view/HoodieTableFileSystemViewTest.java | 38 - .../table/view/IncrementalFSViewSyncTest.java | 68 ++--- .../RocksDBBasedIncrementalFSViewSyncTest.java | 4 +- .../table/view/RocksDbBasedFileSystemViewTest.java | 2 +- ...SpillableMapBasedIncrementalFSViewSyncTest.java | 2 +- .../hudi/common/util/TestCompactionUtils.java | 20 ++--- .../org/apache/hudi/common/util/TestFSUtils.java | 14 ++-- 
.../apache/hudi/common/util/TestFileIOUtils.java | 7 +- .../apache/hudi/common/util/TestParquetUtils.java | 15 +--- .../hudi/common/util/TestRocksDBManager.java | 12 ++- .../common/util/collection/TestDiskBasedMap.java | 24 +++--- .../util/collection/TestExternalSpillableMap.java | 32 .../util/collection/TestRocksDbBasedMap.java | 11 ++- .../hudi/hadoop/TestHoodieROTablePathFilter.java | 13 ++-- .../hudi/utilities/TestHoodieSnapshotCopier.java | 32 33 files changed, 248 insertions(+), 272 deletions(-) diff --git a/hudi-client/src/test/java/org/apache/hudi/HoodieClientTestHarness.java b/hudi-client/src/test/java/org/apache/hudi/HoodieClientTestHarness.java index 10fb0bc..80cb70f 100644 --- a/hudi-client/src/test/java/org/apache/hudi/HoodieClientTestHarness.java +++ b/hudi-client/src/test/java/org/apache/hudi/HoodieClientTestHarness.java @@ -17,7 +17,6 @@ package org.apache.hudi; -import java.io.File; import java.io.IOException; import java.io.Serializable; import java.util.concurrent.ExecutorService; @@ -29,30 +28,27 @@ import org.apache.hadoop.fs.Path; import org.apache.hadoop.hdfs.DistributedFileSystem; import org.apache.hadoop.hdfs.MiniDFSCluster; import org.apache.hudi.common.HoodieClientTestUtils; +import org.apache.hudi.common.HoodieCommonTestHarness; import org.apache.hudi.common.HoodieTestDataGenerator; import org.apache.hudi.common.minicluster.HdfsTestService; -import org.apache.hudi.common.model.HoodieTableType; import org.apache.hudi.common.model.HoodieTestUtils; import org.apache.hudi.common.table.HoodieTableMetaClient; import org.apache.hudi.common.util.FSUtils; import org.apache.spark.api.java.JavaSparkContext; import org.apache.spark.sql.SQLContext; -import org.junit.rules.TemporaryFolder; import org.slf4j.Logger; import org.slf4j.LoggerFactory; /** * The test harness for resource initialization and cleanup. 
*/ -public abstract class HoodieClientTestHarness implements Serializable { +public abstract class HoodieClientTestHarness extends HoodieCommonTestHarness implements Serializable { private static final Logger logger = LoggerFactory.getLogger(HoodieClientTestHarness.class); protected transient JavaSparkContext jsc = null; protected transient SQLContext sqlContext; protected transient FileSystem fs; - protected String basePath = null; - protected TemporaryFolder folder = null; protected transient HoodieTestDataGenerator dataGen = null; protected transient ExecutorService executorService; protected transient HoodieTableMetaClient metaClient; @@ -69,7 +65,7 @@ public abstract class HoodieClientTestHarness implements Serializable { * @throws IOException */ public void initResources() throws IOException { -initTempFolderAndPath(); +initPath(); initSparkContexts(); initTestData
[GitHub] [incubator-hudi] vinothchandar merged pull request #928: [HUDI-265] Failed to delete tmp dirs created in unit tests
vinothchandar merged pull request #928: [HUDI-265] Failed to delete tmp dirs created in unit tests URL: https://github.com/apache/incubator-hudi/pull/928
[jira] [Commented] (HUDI-64) Estimation of compression ratio & other dynamic storage knobs based on historical stats
[ https://issues.apache.org/jira/browse/HUDI-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943733#comment-16943733 ] Vinoth Chandar commented on HUDI-64: [~yanghua] absolutely no problem. take your time :) > Estimation of compression ratio & other dynamic storage knobs based on > historical stats > --- > > Key: HUDI-64 > URL: https://issues.apache.org/jira/browse/HUDI-64 > Project: Apache Hudi (incubating) > Issue Type: New Feature > Components: Storage Management, Write Client >Reporter: Vinoth Chandar >Assignee: Vinoth Chandar >Priority: Major > > Roughly along the likes of. [https://github.com/uber/hudi/issues/270]
[jira] [Commented] (HUDI-289) Implement a long running test for Hudi writing and querying end-end
[ https://issues.apache.org/jira/browse/HUDI-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943731#comment-16943731 ] Vinoth Chandar commented on HUDI-289: - Awesome! https://github.com/apache/incubator-hudi/pull/623 was an initial attempt at this. High-level thinking now is to use the existing DistributedTestDataSource to generate some workload and then compare the state of the datasource (backed by rocksdb and the dataset) after each commit or action. Just raw thoughts; feel free to completely change the approach as well. > Implement a long running test for Hudi writing and querying end-end > --- > > Key: HUDI-289 > URL: https://issues.apache.org/jira/browse/HUDI-289 > Project: Apache Hudi (incubating) > Issue Type: Test > Components: Usability >Reporter: Vinoth Chandar >Assignee: Vinoth Chandar >Priority: Major > > We would need an equivalent of an end-to-end test which runs some workload for > a few hours at least, triggers various actions like commit, deltacommit, > rollback, compaction and ensures correctness of code before every release > P.S: Learn from all the CSS issues managing compaction.
[incubator-hudi] tag 0.5.0-incubating-rc4 created (now fb053bf)
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a change to tag 0.5.0-incubating-rc4 in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git. at fb053bf (commit) No new revisions were added by this update.
[incubator-hudi] tag release-0.5.0-incubating-rc4 created (now fb053bf)
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a change to tag release-0.5.0-incubating-rc4 in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git. at fb053bf (commit) No new revisions were added by this update.
[jira] [Created] (HUDI-291) Simplify quickstart
Bhavani Sudha Saktheeswaran created HUDI-291: Summary: Simplify quickstart Key: HUDI-291 URL: https://issues.apache.org/jira/browse/HUDI-291 Project: Apache Hudi (incubating) Issue Type: Improvement Components: Docs, docs-chinese, Usability Reporter: Bhavani Sudha Saktheeswaran Assignee: Bhavani Sudha Saktheeswaran Make quickstart really simple by only using spark examples and default configs for easier playing around with Hudi APIs. The intent is to introduce what Hudi offers to end users as quickly as possible, without having to deal with setting up Hive or other external systems.
[jira] [Updated] (HUDI-291) Simplify quickstart
[ https://issues.apache.org/jira/browse/HUDI-291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-291: Labels: pull-request-available (was: ) > Simplify quickstart > --- > > Key: HUDI-291 > URL: https://issues.apache.org/jira/browse/HUDI-291 > Project: Apache Hudi (incubating) > Issue Type: Improvement > Components: Docs, docs-chinese, Usability >Reporter: Bhavani Sudha Saktheeswaran >Assignee: Bhavani Sudha Saktheeswaran >Priority: Minor > Labels: pull-request-available > > Make quickstart really simple by only using spark examples and default > configs for easier playing around with Hudi APIs. The intent is to introduce > what Hudi offers to end users as quickly as possible, without having to deal > with setting up Hive or other external systems.
[GitHub] [incubator-hudi] bhasudha opened a new pull request #937: [HUDI-291] Simplify quickstart documentation
bhasudha opened a new pull request #937: [HUDI-291] Simplify quickstart documentation URL: https://github.com/apache/incubator-hudi/pull/937 - Uses spark-shell based examples to showcase Hudi core features - Info related to hive sync, hive, presto, etc. are removed
[GitHub] [incubator-hudi] bhasudha commented on issue #937: [HUDI-291] Simplify quickstart documentation
bhasudha commented on issue #937: [HUDI-291] Simplify quickstart documentation URL: https://github.com/apache/incubator-hudi/pull/937#issuecomment-538117831 Please review the contents of this quickstart and let me know if you want to see any changes. For a rough idea, the quickstart will look like the screenshot below: [quickstart_screenshot.pdf](https://github.com/apache/incubator-hudi/files/3687968/quickstart_screenshot.pdf) I am planning to add more styling changes to this PR.
[GitHub] [incubator-hudi] bhasudha commented on issue #937: [HUDI-291] Simplify quickstart documentation
bhasudha commented on issue #937: [HUDI-291] Simplify quickstart documentation URL: https://github.com/apache/incubator-hudi/pull/937#issuecomment-538118329 @yanghua @leesf I am making these changes in the main quickstart.md. May need your help in changing the corresponding .cn.md files after the reviews
[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation
vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation URL: https://github.com/apache/incubator-hudi/pull/937#discussion_r331275123 ## File path: docs/quickstart.md ## @@ -3,196 +3,186 @@ title: Quickstart keywords: hudi, quickstart tags: [quickstart] sidebar: mydoc_sidebar -toc: false +toc: true permalink: quickstart.html --- -To get a quick peek at Hudi's capabilities, we have put together a [demo video](https://www.youtube.com/watch?v=VhNgUsxdrD0) -that showcases this on a docker based setup with all dependent systems running locally. We recommend you replicate the same setup -and run the demo yourself, by following steps [here](docker_demo.html). Also, if you are looking for ways to migrate your existing data to Hudi, -refer to [migration guide](migration_guide.html). -If you have Hive, Hadoop, Spark installed already & prefer to do it on your own setup, read on. +This guide provides a quick peak at Hudi's capabilities using simple spark-shell. Using Spark datasources, this guide +walks through code snippets that allows you to insert and update a Hudi table of default Storage type: + [Copy on Write](https://hudi.apache.org/concepts.html#copy-on-write-storage). +After each write operation we show how to read the data. We will also be looking at how to query a Hudi table incrementally. -## Download Hudi +We have put together a [demo video](https://www.youtube.com/watch?v=VhNgUsxdrD0) that showcases this on a docker based +setup with all dependent systems running locally. We recommend you replicate the same setup and run the demo yourself, +by following steps [here](docker_demo.html). Also, if you are looking for ways to migrate your existing data to Hudi, +refer to [migration guide](migration_guide.html). 
-Check out [code](https://github.com/apache/incubator-hudi) and normally build the maven project, from command line +For the quickstart, you would need to build Hudi spark bundle jar and provide that to the spark shell as shown below. -``` -$ mvn clean install -DskipTests -DskipITs -``` - -Hudi works with Hive 2.3.x or higher versions. As long as Hive 2.x protocol can talk to Hive 1.x, you can use Hudi to -talk to older hive versions. - -For IDE, you can pull in the code into IntelliJ as a normal maven project. -You might want to add your spark jars folder to project dependencies under 'Module Setttings', to be able to run from IDE. - - -### Version Compatibility +## Build Hudi spark bundle jar -Hudi requires Java 8 to be installed on a *nix system. Hudi works with Spark-2.x versions. -Further, we have verified that Hudi works with the following combination of Hadoop/Hive/Spark. - -| Hadoop | Hive | Spark | Instructions to Build Hudi | -| | - | | | -| Apache hadoop-2.[7-8].x | Apache hive-2.3.[1-3] | spark-2.[1-3].x | Use "mvn clean install -DskipTests" | - -If your environment has other versions of hadoop/hive/spark, please try out Hudi -and let us know if there are any issues. - -## Generate Sample Dataset - -### Environment Variables - -Please set the following environment variables according to your setup. We have given an example setup with CDH version +Hudi requires Java 8 to be installed on a *nix system. 
+Check out [code](https://github.com/apache/incubator-hudi) and normally build the maven project, from command line: ``` -cd incubator-hudi -export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/ -export HIVE_HOME=/var/hadoop/setup/apache-hive-1.1.0-cdh5.7.2-bin -export HADOOP_HOME=/var/hadoop/setup/hadoop-2.6.0-cdh5.7.2 -export HADOOP_INSTALL=/var/hadoop/setup/hadoop-2.6.0-cdh5.7.2 -export HADOOP_CONF_DIR=$HADOOP_INSTALL/etc/hadoop -export SPARK_HOME=/var/hadoop/setup/spark-2.3.1-bin-hadoop2.7 -export SPARK_INSTALL=$SPARK_HOME -export SPARK_CONF_DIR=$SPARK_HOME/conf -export PATH=$JAVA_HOME/bin:$HIVE_HOME/bin:$HADOOP_HOME/bin:$SPARK_INSTALL/bin:$PATH +$ mvn clean install -DskipTests -DskipITs + +$ # Export the location of hudi-spark-bundle for later reference +$ mkdir -p /var/tmp/hudi && cp packaging/hudi-spark-bundle/target/hudi-spark-bundle-*.*.*-SNAPSHOT.jar /var/tmp/hudi/hudi-spark-bundle.jar +$ export HUDI_SPARK_BUNDLE_PATH=/var/tmp/hudi/hudi-spark-bundle.jar ``` -### Run HoodieJavaApp +## Setup spark-shell +Hudi works with Spark-2.x versions. You can follow instructions [here](https://spark.apache.org/downloads.html) for +setting up spark. -Run __hudi-spark/src/test/java/HoodieJavaApp.java__ class, to place a two commits (commit 1 => 100 inserts, commit 2 => 100 updates to previously inserted 100 records) onto your DFS/local filesystem. Use the wrapper script -to run from command-line +From the extracted directory run spark-shell with Hudi as: ``` -cd hudi-spark -./run_hoodie_app.sh --help -Usage: [options] - Options: ---help, -h - Default: false ---table-name, -n - table name for Hudi sample tabl
[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation
vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation URL: https://github.com/apache/incubator-hudi/pull/937#discussion_r331276090 ## File path: docs/quickstart.md ## @@ -3,196 +3,186 @@ title: Quickstart keywords: hudi, quickstart tags: [quickstart] sidebar: mydoc_sidebar -toc: false +toc: true permalink: quickstart.html --- -To get a quick peek at Hudi's capabilities, we have put together a [demo video](https://www.youtube.com/watch?v=VhNgUsxdrD0) -that showcases this on a docker based setup with all dependent systems running locally. We recommend you replicate the same setup -and run the demo yourself, by following steps [here](docker_demo.html). Also, if you are looking for ways to migrate your existing data to Hudi, -refer to [migration guide](migration_guide.html). -If you have Hive, Hadoop, Spark installed already & prefer to do it on your own setup, read on. +This guide provides a quick peak at Hudi's capabilities using simple spark-shell. Using Spark datasources, this guide +walks through code snippets that allows you to insert and update a Hudi table of default Storage type: + [Copy on Write](https://hudi.apache.org/concepts.html#copy-on-write-storage). +After each write operation we show how to read the data. We will also be looking at how to query a Hudi table incrementally. -## Download Hudi +We have put together a [demo video](https://www.youtube.com/watch?v=VhNgUsxdrD0) that showcases this on a docker based +setup with all dependent systems running locally. We recommend you replicate the same setup and run the demo yourself, +by following steps [here](docker_demo.html). Also, if you are looking for ways to migrate your existing data to Hudi, +refer to [migration guide](migration_guide.html). 
-Check out [code](https://github.com/apache/incubator-hudi) and normally build the maven project, from command line +For the quickstart, you would need to build Hudi spark bundle jar and provide that to the spark shell as shown below. -``` -$ mvn clean install -DskipTests -DskipITs -``` - -Hudi works with Hive 2.3.x or higher versions. As long as Hive 2.x protocol can talk to Hive 1.x, you can use Hudi to -talk to older hive versions. - -For IDE, you can pull in the code into IntelliJ as a normal maven project. -You might want to add your spark jars folder to project dependencies under 'Module Setttings', to be able to run from IDE. - - -### Version Compatibility +## Build Hudi spark bundle jar -Hudi requires Java 8 to be installed on a *nix system. Hudi works with Spark-2.x versions. -Further, we have verified that Hudi works with the following combination of Hadoop/Hive/Spark. - -| Hadoop | Hive | Spark | Instructions to Build Hudi | -| | - | | | -| Apache hadoop-2.[7-8].x | Apache hive-2.3.[1-3] | spark-2.[1-3].x | Use "mvn clean install -DskipTests" | - -If your environment has other versions of hadoop/hive/spark, please try out Hudi -and let us know if there are any issues. - -## Generate Sample Dataset - -### Environment Variables - -Please set the following environment variables according to your setup. We have given an example setup with CDH version +Hudi requires Java 8 to be installed on a *nix system. 
+Check out [code](https://github.com/apache/incubator-hudi) and normally build the maven project, from command line: ``` -cd incubator-hudi -export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/ -export HIVE_HOME=/var/hadoop/setup/apache-hive-1.1.0-cdh5.7.2-bin -export HADOOP_HOME=/var/hadoop/setup/hadoop-2.6.0-cdh5.7.2 -export HADOOP_INSTALL=/var/hadoop/setup/hadoop-2.6.0-cdh5.7.2 -export HADOOP_CONF_DIR=$HADOOP_INSTALL/etc/hadoop -export SPARK_HOME=/var/hadoop/setup/spark-2.3.1-bin-hadoop2.7 -export SPARK_INSTALL=$SPARK_HOME -export SPARK_CONF_DIR=$SPARK_HOME/conf -export PATH=$JAVA_HOME/bin:$HIVE_HOME/bin:$HADOOP_HOME/bin:$SPARK_INSTALL/bin:$PATH +$ mvn clean install -DskipTests -DskipITs + +$ # Export the location of hudi-spark-bundle for later reference +$ mkdir -p /var/tmp/hudi && cp packaging/hudi-spark-bundle/target/hudi-spark-bundle-*.*.*-SNAPSHOT.jar /var/tmp/hudi/hudi-spark-bundle.jar +$ export HUDI_SPARK_BUNDLE_PATH=/var/tmp/hudi/hudi-spark-bundle.jar ``` -### Run HoodieJavaApp +## Setup spark-shell +Hudi works with Spark-2.x versions. You can follow instructions [here](https://spark.apache.org/downloads.html) for +setting up spark. -Run __hudi-spark/src/test/java/HoodieJavaApp.java__ class, to place a two commits (commit 1 => 100 inserts, commit 2 => 100 updates to previously inserted 100 records) onto your DFS/local filesystem. Use the wrapper script -to run from command-line +From the extracted directory run spark-shell with Hudi as: ``` -cd hudi-spark -./run_hoodie_app.sh --help -Usage: [options] - Options: ---help, -h - Default: false ---table-name, -n - table name for Hudi sample tabl
[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation
vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#discussion_r331275748

## File path: docs/quickstart.md

## @@ -3,196 +3,186 @@
 title: Quickstart
 keywords: hudi, quickstart
 tags: [quickstart]
 sidebar: mydoc_sidebar
-toc: false
+toc: true
 permalink: quickstart.html
 ---
-To get a quick peek at Hudi's capabilities, we have put together a [demo video](https://www.youtube.com/watch?v=VhNgUsxdrD0)
-that showcases this on a docker based setup with all dependent systems running locally. We recommend you replicate the same setup
-and run the demo yourself, by following steps [here](docker_demo.html). Also, if you are looking for ways to migrate your existing data to Hudi,
-refer to [migration guide](migration_guide.html).
-If you have Hive, Hadoop, Spark installed already & prefer to do it on your own setup, read on.
+This guide provides a quick peak at Hudi's capabilities using simple spark-shell. Using Spark datasources, this guide
+walks through code snippets that allows you to insert and update a Hudi table of default Storage type:
+ [Copy on Write](https://hudi.apache.org/concepts.html#copy-on-write-storage).
+After each write operation we show how to read the data. We will also be looking at how to query a Hudi table incrementally.
-## Download Hudi
+We have put together a [demo video](https://www.youtube.com/watch?v=VhNgUsxdrD0) that showcases this on a docker based
+setup with all dependent systems running locally. We recommend you replicate the same setup and run the demo yourself,
+by following steps [here](docker_demo.html). Also, if you are looking for ways to migrate your existing data to Hudi,
+refer to [migration guide](migration_guide.html).
-Check out [code](https://github.com/apache/incubator-hudi) and normally build the maven project, from command line
+For the quickstart, you would need to build Hudi spark bundle jar and provide that to the spark shell as shown below.
-```
-$ mvn clean install -DskipTests -DskipITs
-```
-
-Hudi works with Hive 2.3.x or higher versions. As long as Hive 2.x protocol can talk to Hive 1.x, you can use Hudi to
-talk to older hive versions.
-
-For IDE, you can pull in the code into IntelliJ as a normal maven project.
-You might want to add your spark jars folder to project dependencies under 'Module Setttings', to be able to run from IDE.
-
-
-### Version Compatibility
+## Build Hudi spark bundle jar
-Hudi requires Java 8 to be installed on a *nix system. Hudi works with Spark-2.x versions.
-Further, we have verified that Hudi works with the following combination of Hadoop/Hive/Spark.
-
-| Hadoop | Hive | Spark | Instructions to Build Hudi |
-| ---- | ---- | ---- | ---- |
-| Apache hadoop-2.[7-8].x | Apache hive-2.3.[1-3] | spark-2.[1-3].x | Use "mvn clean install -DskipTests" |
-
-If your environment has other versions of hadoop/hive/spark, please try out Hudi
-and let us know if there are any issues.
-
-## Generate Sample Dataset
-
-### Environment Variables
-
-Please set the following environment variables according to your setup. We have given an example setup with CDH version
+Hudi requires Java 8 to be installed on a *nix system.
+Check out [code](https://github.com/apache/incubator-hudi) and normally build the maven project, from command line:
 ```
-cd incubator-hudi
-export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/
-export HIVE_HOME=/var/hadoop/setup/apache-hive-1.1.0-cdh5.7.2-bin
-export HADOOP_HOME=/var/hadoop/setup/hadoop-2.6.0-cdh5.7.2
-export HADOOP_INSTALL=/var/hadoop/setup/hadoop-2.6.0-cdh5.7.2
-export HADOOP_CONF_DIR=$HADOOP_INSTALL/etc/hadoop
-export SPARK_HOME=/var/hadoop/setup/spark-2.3.1-bin-hadoop2.7
-export SPARK_INSTALL=$SPARK_HOME
-export SPARK_CONF_DIR=$SPARK_HOME/conf
-export PATH=$JAVA_HOME/bin:$HIVE_HOME/bin:$HADOOP_HOME/bin:$SPARK_INSTALL/bin:$PATH
+$ mvn clean install -DskipTests -DskipITs
+
+$ # Export the location of hudi-spark-bundle for later reference
+$ mkdir -p /var/tmp/hudi && cp packaging/hudi-spark-bundle/target/hudi-spark-bundle-*.*.*-SNAPSHOT.jar /var/tmp/hudi/hudi-spark-bundle.jar
+$ export HUDI_SPARK_BUNDLE_PATH=/var/tmp/hudi/hudi-spark-bundle.jar
 ```
-### Run HoodieJavaApp
+## Setup spark-shell
+Hudi works with Spark-2.x versions. You can follow instructions [here](https://spark.apache.org/downloads.html) for
+setting up spark.
-Run __hudi-spark/src/test/java/HoodieJavaApp.java__ class, to place a two commits (commit 1 => 100 inserts, commit 2 => 100 updates to previously inserted 100 records) onto your DFS/local filesystem. Use the wrapper script
-to run from command-line
+From the extracted directory run spark-shell with Hudi as:
 ```
-cd hudi-spark
-./run_hoodie_app.sh --help
-Usage: [options]
-  Options:
---help, -h
-      Default: false
---table-name, -n
-      table name for Hudi sample tabl
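The key change in the hunk above is copying the built bundle jar to a stable path and exporting it for spark-shell to pick up later. The effect of that glob-and-copy step can be sketched in Python as a stand-in for the quoted shell commands (the directory layout mirrors the diff; the jar version below is a placeholder, not an actual build output):

```python
import glob
import os
import shutil
import tempfile

# Simulate the repo's packaging layout after a `mvn clean install` run.
repo = tempfile.mkdtemp()
target = os.path.join(repo, "packaging", "hudi-spark-bundle", "target")
os.makedirs(target)
open(os.path.join(target, "hudi-spark-bundle-0.5.1-SNAPSHOT.jar"), "w").close()

# Equivalent of: mkdir -p /var/tmp/hudi && cp .../hudi-spark-bundle-*-SNAPSHOT.jar ...
dest_dir = os.path.join(tempfile.mkdtemp(), "hudi")
os.makedirs(dest_dir)
matches = glob.glob(os.path.join(target, "hudi-spark-bundle-*-SNAPSHOT.jar"))
bundle = os.path.join(dest_dir, "hudi-spark-bundle.jar")
shutil.copy(matches[0], bundle)

# Equivalent of: export HUDI_SPARK_BUNDLE_PATH=... -- a version-independent
# name the spark-shell invocation can reference without knowing the build version.
os.environ["HUDI_SPARK_BUNDLE_PATH"] = bundle
```

The point of the rename is that later `spark-shell --jars $HUDI_SPARK_BUNDLE_PATH` invocations do not need to track the SNAPSHOT version baked into the built jar's filename.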
[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation
vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#discussion_r331274747

## File path: docs/quickstart.md

(quoted diff omitted; identical to the hunk quoted in full above)

Review comment:
   include git clone command as well? so someone can just keep copy pasting.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation
vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#discussion_r331276260

## File path: docs/quickstart.md

(quoted diff omitted; identical to the hunk quoted in full above)
[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation
vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#discussion_r331276441

## File path: docs/quickstart.md

(quoted diff omitted; identical to the hunk quoted in full above)
[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation
vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#discussion_r331274347

## File path: docs/quickstart.md

(quoted diff omitted; identical to the hunk quoted in full above)

+This guide provides a quick peak at Hudi's capabilities using simple spark-shell. Using Spark datasources, this guide

Review comment:
   peek
[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation
vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation URL: https://github.com/apache/incubator-hudi/pull/937#discussion_r331275909 ## File path: docs/quickstart.md ## @@ -3,196 +3,186 @@ title: Quickstart keywords: hudi, quickstart tags: [quickstart] sidebar: mydoc_sidebar -toc: false +toc: true permalink: quickstart.html --- -To get a quick peek at Hudi's capabilities, we have put together a [demo video](https://www.youtube.com/watch?v=VhNgUsxdrD0) -that showcases this on a docker based setup with all dependent systems running locally. We recommend you replicate the same setup -and run the demo yourself, by following steps [here](docker_demo.html). Also, if you are looking for ways to migrate your existing data to Hudi, -refer to [migration guide](migration_guide.html). -If you have Hive, Hadoop, Spark installed already & prefer to do it on your own setup, read on. +This guide provides a quick peak at Hudi's capabilities using simple spark-shell. Using Spark datasources, this guide +walks through code snippets that allows you to insert and update a Hudi table of default Storage type: + [Copy on Write](https://hudi.apache.org/concepts.html#copy-on-write-storage). +After each write operation we show how to read the data. We will also be looking at how to query a Hudi table incrementally. -## Download Hudi +We have put together a [demo video](https://www.youtube.com/watch?v=VhNgUsxdrD0) that showcases this on a docker based +setup with all dependent systems running locally. We recommend you replicate the same setup and run the demo yourself, +by following steps [here](docker_demo.html). Also, if you are looking for ways to migrate your existing data to Hudi, +refer to [migration guide](migration_guide.html). 
-Check out [code](https://github.com/apache/incubator-hudi) and normally build the maven project, from command line +For the quickstart, you would need to build Hudi spark bundle jar and provide that to the spark shell as shown below. -``` -$ mvn clean install -DskipTests -DskipITs -``` - -Hudi works with Hive 2.3.x or higher versions. As long as Hive 2.x protocol can talk to Hive 1.x, you can use Hudi to -talk to older hive versions. - -For IDE, you can pull in the code into IntelliJ as a normal maven project. -You might want to add your spark jars folder to project dependencies under 'Module Setttings', to be able to run from IDE. - - -### Version Compatibility +## Build Hudi spark bundle jar -Hudi requires Java 8 to be installed on a *nix system. Hudi works with Spark-2.x versions. -Further, we have verified that Hudi works with the following combination of Hadoop/Hive/Spark. - -| Hadoop | Hive | Spark | Instructions to Build Hudi | -| | - | | | -| Apache hadoop-2.[7-8].x | Apache hive-2.3.[1-3] | spark-2.[1-3].x | Use "mvn clean install -DskipTests" | - -If your environment has other versions of hadoop/hive/spark, please try out Hudi -and let us know if there are any issues. - -## Generate Sample Dataset - -### Environment Variables - -Please set the following environment variables according to your setup. We have given an example setup with CDH version +Hudi requires Java 8 to be installed on a *nix system. 
+Check out [code](https://github.com/apache/incubator-hudi) and normally build the maven project, from command line: ``` -cd incubator-hudi -export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/ -export HIVE_HOME=/var/hadoop/setup/apache-hive-1.1.0-cdh5.7.2-bin -export HADOOP_HOME=/var/hadoop/setup/hadoop-2.6.0-cdh5.7.2 -export HADOOP_INSTALL=/var/hadoop/setup/hadoop-2.6.0-cdh5.7.2 -export HADOOP_CONF_DIR=$HADOOP_INSTALL/etc/hadoop -export SPARK_HOME=/var/hadoop/setup/spark-2.3.1-bin-hadoop2.7 -export SPARK_INSTALL=$SPARK_HOME -export SPARK_CONF_DIR=$SPARK_HOME/conf -export PATH=$JAVA_HOME/bin:$HIVE_HOME/bin:$HADOOP_HOME/bin:$SPARK_INSTALL/bin:$PATH +$ mvn clean install -DskipTests -DskipITs + +$ # Export the location of hudi-spark-bundle for later reference +$ mkdir -p /var/tmp/hudi && cp packaging/hudi-spark-bundle/target/hudi-spark-bundle-*.*.*-SNAPSHOT.jar /var/tmp/hudi/hudi-spark-bundle.jar +$ export HUDI_SPARK_BUNDLE_PATH=/var/tmp/hudi/hudi-spark-bundle.jar ``` -### Run HoodieJavaApp +## Setup spark-shell +Hudi works with Spark-2.x versions. You can follow instructions [here](https://spark.apache.org/downloads.html) for +setting up spark. -Run __hudi-spark/src/test/java/HoodieJavaApp.java__ class, to place a two commits (commit 1 => 100 inserts, commit 2 => 100 updates to previously inserted 100 records) onto your DFS/local filesystem. Use the wrapper script -to run from command-line +From the extracted directory run spark-shell with Hudi as: ``` -cd hudi-spark -./run_hoodie_app.sh --help -Usage: [options] - Options: ---help, -h - Default: false ---table-name, -n - table name for Hudi sample tabl
[GitHub] [incubator-hudi] vinothchandar commented on issue #937: [HUDI-291] Simplify quickstart documentation
vinothchandar commented on issue #937: [HUDI-291] Simplify quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#issuecomment-538150247

My bad .. saw the pdf now.. Can we remove the toc.. Its not very useful if we expect most users just follow the whole page
[GitHub] [incubator-hudi] bhasudha commented on issue #937: [HUDI-291] Simplify quickstart documentation
bhasudha commented on issue #937: [HUDI-291] Simplify quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#issuecomment-538151189

Makes sense. Thanks @vinothchandar for the reviews. I ll take a closer look at it later today.
[incubator-hudi] tag 0.5.0-incubating-rc4 deleted (was fb053bf)
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a change to tag 0.5.0-incubating-rc4 in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git. *** WARNING: tag 0.5.0-incubating-rc4 was deleted! *** was fb053bf [HUDI-121] Preparing for Release 0.5.0-incubating-rc4 The revisions that were on this tag are still contained in other references; therefore, this change does not discard any commits from the repository.
[jira] [Closed] (HUDI-265) Failed to delete tmp dirs created in unit tests
[ https://issues.apache.org/jira/browse/HUDI-265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] leesf closed HUDI-265. -- Resolution: Fixed Fixed via master: 3dedc7e5fdd5f885915e81e47e110b845a905dbf > Failed to delete tmp dirs created in unit tests > --- > > Key: HUDI-265 > URL: https://issues.apache.org/jira/browse/HUDI-265 > Project: Apache Hudi (incubating) > Issue Type: Test > Components: Testing >Reporter: leesf >Assignee: leesf >Priority: Major > Labels: pull-request-available > Fix For: 0.5.1 > > Time Spent: 20m > Remaining Estimate: 0h > > In some unit tests, such as TestHoodieSnapshotCopier, TestUpdateMapFunction. > After run these tests, it fails to delete tmp dir created in _init(with > before annotation)_ after clean(with after annotation), thus will cause too > many folders in /tmp. we need to delete these dirs after finishing ut. > I will go through all the unit tests that did not properly delete the tmp dir > and send a patch. > > cc [~vinoth] [~vbalaji] -- This message was sent by Atlassian Jira (v8.3.4#803005)
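The fix pattern this ticket describes — create the scratch directory in test setup and always remove it in teardown — can be sketched as follows, with Python's unittest standing in for the project's JUnit tests (class, method, and file names here are illustrative, not the actual Hudi test code):

```python
import os
import shutil
import tempfile
import unittest


class TestSnapshotCopier(unittest.TestCase):
    """Illustrative stand-in for JUnit tests like TestHoodieSnapshotCopier."""

    created = []  # record every scratch dir so cleanup can be verified

    def setUp(self):
        # Per-test scratch directory, as the @Before-annotated init() creates.
        self.tmp_dir = tempfile.mkdtemp(prefix="hudi-ut-")
        TestSnapshotCopier.created.append(self.tmp_dir)

    def tearDown(self):
        # The step HUDI-265 adds: without it, every run leaves a dir in /tmp.
        shutil.rmtree(self.tmp_dir, ignore_errors=True)

    def test_writes_into_tmp_dir(self):
        path = os.path.join(self.tmp_dir, "part-0000.parquet")
        open(path, "w").close()
        self.assertTrue(os.path.exists(path))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSnapshotCopier)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

After the run, none of the recorded scratch directories remain on disk, which is exactly the accumulation-in-/tmp problem the ticket fixes.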
[GitHub] [incubator-hudi] Guru107 commented on issue #143: Tracking ticket for folks to be added to slack group
Guru107 commented on issue #143: Tracking ticket for folks to be added to slack group
URL: https://github.com/apache/incubator-hudi/issues/143#issuecomment-538180537

Hi @vinothchandar, please add guruak...@gmail.com.
[GitHub] [incubator-hudi] leesf commented on issue #937: [HUDI-291] Simplify quickstart documentation
leesf commented on issue #937: [HUDI-291] Simplify quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#issuecomment-538180728

> @yanghua @leesf I am making these changes in the main quickstart.md. May need your help in changing the corresponding .cn.md files after the reviews

Happy to see the improvement of the quickstart. And I am glad to apply the changes to the chinese doc. @bhasudha
[GitHub] [incubator-hudi] leesf opened a new pull request #938: [HUDI-232] Implement sealing/unsealing for HoodieRecord class
leesf opened a new pull request #938: [HUDI-232] Implement sealing/unsealing for HoodieRecord class
URL: https://github.com/apache/incubator-hudi/pull/938

see jira: https://jira.apache.org/jira/projects/HUDI/issues/HUDI-232

CC @vinothchandar Please review when you get a chance. Thanks.
[jira] [Updated] (HUDI-232) Implement sealing/unsealing for HoodieRecord class
[ https://issues.apache.org/jira/browse/HUDI-232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-232: Labels: pull-request-available (was: ) > Implement sealing/unsealing for HoodieRecord class > -- > > Key: HUDI-232 > URL: https://issues.apache.org/jira/browse/HUDI-232 > Project: Apache Hudi (incubating) > Issue Type: Bug > Components: Write Client >Affects Versions: 0.5.0 >Reporter: Vinoth Chandar >Assignee: leesf >Priority: Major > Labels: pull-request-available > > HoodieRecord class sometimes is modified to set the record location. We can > get into issues like HUDI-170 if the modification is misplaced. We need a > mechanism to seal the class and unseal for modification explicitly. Trying to > modify in sealed state should throw an error
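The seal/unseal mechanism the ticket describes can be sketched as a guard flag that makes misplaced modifications fail loudly. Below is a Python stand-in for the Java class; the method names follow the ticket's wording and are assumptions, not the merged HoodieRecord API:

```python
class HoodieRecord:
    """Sketch of HUDI-232's seal/unseal guard (Python stand-in for the Java class)."""

    def __init__(self, key, data):
        self.key = key
        self.data = data
        self.current_location = None
        self._sealed = False

    def seal(self):
        """Forbid mutation until unseal() is called."""
        self._sealed = True

    def unseal(self):
        """Explicitly reopen the record for modification."""
        self._sealed = False

    def _check_state(self):
        if self._sealed:
            raise RuntimeError("HoodieRecord is sealed: call unseal() before modifying")

    def set_current_location(self, location):
        # Misplaced modifications (cf. HUDI-170) now fail loudly instead of silently.
        self._check_state()
        self.current_location = location


record = HoodieRecord("key1", {"value": 1})
record.seal()
blocked = False
try:
    record.set_current_location("partition/file-001")  # sealed: raises
except RuntimeError:
    blocked = True
record.unseal()
record.set_current_location("partition/file-001")  # unsealed: succeeds
```

The design choice is the same as in the ticket: callers that legitimately need to set the record location must opt in with an explicit unseal, so an accidental write path shows up as a hard error rather than a corrupted record.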
[jira] [Updated] (HUDI-265) Failed to delete tmp dirs created in unit tests
[ https://issues.apache.org/jira/browse/HUDI-265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-265: Fix Version/s: (was: 0.5.1) 0.5.0 > Failed to delete tmp dirs created in unit tests > --- > > Key: HUDI-265 > URL: https://issues.apache.org/jira/browse/HUDI-265 > Project: Apache Hudi (incubating) > Issue Type: Test > Components: Testing >Reporter: leesf >Assignee: leesf >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h
[GitHub] [incubator-hudi] bvaradar opened a new pull request #939: [HUDI-121] Remove leftover notice file and replace com.uber.hoodie with org.apache.hudi
bvaradar opened a new pull request #939: [HUDI-121] Remove leftover notice file and replace com.uber.hoodie with org.apache.hudi
URL: https://github.com/apache/incubator-hudi/pull/939

1. Checked for other leftover license/notice files
2. Exploded fat jars and checked LICENSE and notice files
3. Removed unused license-mapping file and notice related configurations in pom.xml
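A leftover-reference check like the one described in the PR above could be sketched as a recursive scan of the source tree for the old package name. This is a hypothetical mini-check run on a toy tree, not the PR's actual process:

```python
import os
import tempfile

# Build a toy source tree: one migrated file, one with a leftover reference.
repo = tempfile.mkdtemp()
src = os.path.join(repo, "hudi-client", "src")
os.makedirs(src)
with open(os.path.join(src, "Migrated.java"), "w") as f:
    f.write("package org.apache.hudi;\n")
with open(os.path.join(src, "Leftover.java"), "w") as f:
    f.write("import com.uber.hoodie.common.model.HoodieRecord;\n")

# Scan every file for the pre-rename package, as a release audit might.
leftovers = []
for root, _dirs, files in os.walk(repo):
    for name in files:
        path = os.path.join(root, name)
        with open(path) as f:
            if "com.uber.hoodie" in f.read():
                leftovers.append(path)
```

Any path collected in `leftovers` would still need the com.uber.hoodie → org.apache.hudi rename before release.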
[GitHub] [incubator-hudi] bvaradar commented on issue #939: [HUDI-121] Remove leftover notice file and replace com.uber.hoodie with org.apache.hudi
bvaradar commented on issue #939: [HUDI-121] Remove leftover notice file and replace com.uber.hoodie with org.apache.hudi
URL: https://github.com/apache/incubator-hudi/pull/939#issuecomment-538235145

The unit-test logs look fine. @vinothchandar : Can you review.