[GitHub] [incubator-hudi] leesf commented on issue #928: [HUDI-265] Failed to delete tmp dirs created in unit tests

2019-10-03 Thread GitBox
leesf commented on issue #928: [HUDI-265] Failed to delete tmp dirs created in 
unit tests
URL: https://github.com/apache/incubator-hudi/pull/928#issuecomment-537838407
 
 
   Any other concerns before this gets merged? @vinothchandar 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (HUDI-286) Remove or hide tags from Hudi official web site

2019-10-03 Thread vinoyang (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943415#comment-16943415
 ] 

vinoyang commented on HUDI-286:
---

[~vinoth] Yes, we do not actively use tags in the docs. However, a 
"getting_started" tag is shown on the front page of the web site, and clicking 
it redirects visitors to a dead page. IMO, it would be better to remove or hide 
that element from the page.

> Remove or hide tags from Hudi official web site
> ---
>
> Key: HUDI-286
> URL: https://issues.apache.org/jira/browse/HUDI-286
> Project: Apache Hudi (incubating)
>  Issue Type: Wish
>  Components: Docs
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>
> Currently, Hudi's docs do not provide a tag HTML page, yet the site links to 
> a nonexistent URL, e.g. 
> [getting_started|http://hudi.apache.org/tag_getting_started.html]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HUDI-288) Add support for ingesting multiple kafka streams in a single DeltaStreamer deployment

2019-10-03 Thread leesf (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943424#comment-16943424
 ] 

leesf commented on HUDI-288:


Yes, I will follow up after taking a closer look at the details of the current 
data flow from Kafka to Hudi.

> Add support for ingesting multiple kafka streams in a single DeltaStreamer 
> deployment
> -
>
> Key: HUDI-288
> URL: https://issues.apache.org/jira/browse/HUDI-288
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: deltastreamer
>Reporter: Vinoth Chandar
>Priority: Major
>
> https://lists.apache.org/thread.html/3a69934657c48b1c0d85cba223d69cb18e18cd8aaa4817c9fd72cef6@
>  has all the context





[jira] [Closed] (HUDI-217) Provide a unified resource management class to standardize the resource allocation and release for hudi client test cases

2019-10-03 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-217.
-
Resolution: Done

Done via master: 895d732a1423f3a113d9630aa3c56e4b0837effd

> Provide a unified resource management class to standardize the resource 
> allocation and release for hudi client test cases
> -
>
> Key: HUDI-217
> URL: https://issues.apache.org/jira/browse/HUDI-217
> Project: Apache Hudi (incubating)
>  Issue Type: Sub-task
>  Components: Testing
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, resource allocation and release are handled inconsistently across 
> the test cases of the hudi client module. We should provide a unified class 
> to manage these resources. 
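A unified manager along these lines could be a small AutoCloseable wrapper — a minimal sketch under stated assumptions: the class and method names here are hypothetical illustrations, not the actual Hudi test utilities.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of a unified test-resource manager: allocation happens
// in the constructor, release in close(), so tests can rely on
// try-with-resources instead of ad-hoc setup/teardown in every test class.
class TestResourceManager implements AutoCloseable {

    private final Path tmpDir;

    TestResourceManager() throws IOException {
        // Allocate all shared test resources in one place.
        tmpDir = Files.createTempDirectory("hudi-client-test");
    }

    Path basePath() {
        return tmpDir;
    }

    @Override
    public void close() throws IOException {
        // Release everything in one place so no test leaks temp dirs.
        Files.deleteIfExists(tmpDir);
    }
}
```

A test would then wrap its body in `try (TestResourceManager res = new TestResourceManager()) { ... }`, so release happens even when an assertion fails.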





[jira] [Created] (HUDI-290) Normalize Test class name of HoodieWriteConfigTest

2019-10-03 Thread vinoyang (Jira)
vinoyang created HUDI-290:
-

 Summary: Normalize Test class name of HoodieWriteConfigTest
 Key: HUDI-290
 URL: https://issues.apache.org/jira/browse/HUDI-290
 Project: Apache Hudi (incubating)
  Issue Type: Sub-task
  Components: Testing
Reporter: vinoyang
Assignee: vinoyang


In general, a test class name starts with {{Test}}. It would be better to rename 
{{HoodieWriteConfigTest}} to {{TestHoodieWriteConfig}}.





[jira] [Commented] (HUDI-290) Normalize Test class name of HoodieWriteConfigTest

2019-10-03 Thread vinoyang (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943430#comment-16943430
 ] 

vinoyang commented on HUDI-290:
---

[~vinoth] WDYT?

> Normalize Test class name of HoodieWriteConfigTest
> --
>
> Key: HUDI-290
> URL: https://issues.apache.org/jira/browse/HUDI-290
> Project: Apache Hudi (incubating)
>  Issue Type: Sub-task
>  Components: Testing
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>
> In general, a test class name starts with {{Test}}. It would be better to 
> rename {{HoodieWriteConfigTest}} to {{TestHoodieWriteConfig}}.





[jira] [Commented] (HUDI-64) Estimation of compression ratio & other dynamic storage knobs based on historical stats

2019-10-03 Thread vinoyang (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943434#comment-16943434
 ] 

vinoyang commented on HUDI-64:
--

[~vinoth] I'd like to take this ticket. I am on holiday for China's National 
Day, and may have time after October 8th.

> Estimation of compression ratio & other dynamic storage knobs based on 
> historical stats
> ---
>
> Key: HUDI-64
> URL: https://issues.apache.org/jira/browse/HUDI-64
> Project: Apache Hudi (incubating)
>  Issue Type: New Feature
>  Components: Storage Management, Write Client
>Reporter: Vinoth Chandar
>Assignee: Vinoth Chandar
>Priority: Major
>
> Roughly along the likes of. [https://github.com/uber/hudi/issues/270] 





[jira] [Commented] (HUDI-289) Implement a long running test for Hudi writing and querying end-end

2019-10-03 Thread vinoyang (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943435#comment-16943435
 ] 

vinoyang commented on HUDI-289:
---

[~vinoth] OK, I'd like to take this ticket.

> Implement a long running test for Hudi writing and querying end-end
> ---
>
> Key: HUDI-289
> URL: https://issues.apache.org/jira/browse/HUDI-289
> Project: Apache Hudi (incubating)
>  Issue Type: Test
>  Components: Usability
>Reporter: Vinoth Chandar
>Assignee: Vinoth Chandar
>Priority: Major
>
> We would need an equivalent of an end-to-end test which runs some workload 
> for at least a few hours, triggers various actions like commit, deltacommit, 
> rollback, and compaction, and ensures correctness of the code before every 
> release.
> P.S: Learn from all the CSS issues managing compaction.. 





[GitHub] [incubator-hudi] leesf opened a new pull request #936: [HUDI-285] Implement HoodieStorageWriter based on actual file type

2019-10-03 Thread GitBox
leesf opened a new pull request #936: [HUDI-285] Implement HoodieStorageWriter 
based on actual file type
URL: https://github.com/apache/incubator-hudi/pull/936
 
 
   see jira: https://jira.apache.org/jira/projects/HUDI/issues/HUDI-285
   
   CC @vinothchandar 




[jira] [Updated] (HUDI-285) Implement HoodieStorageWriter based on actual file type

2019-10-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-285:

Labels: pull-request-available  (was: )

> Implement HoodieStorageWriter based on actual file type
> ---
>
> Key: HUDI-285
> URL: https://issues.apache.org/jira/browse/HUDI-285
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: Write Client
>Reporter: leesf
>Assignee: leesf
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.1
>
>
> Currently the _getStorageWriter_ method in HoodieStorageWriterFactory is 
> hard-coded to return HoodieParquetWriter, since Parquet is currently the only 
> file type supported for HoodieStorageWriter. However, for extensibility it 
> would be better to choose the HoodieStorageWriter implementation based on the 
> actual file type.
> cc [~vinoth] [~vbalaji]
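Dispatching on the actual file type could be sketched roughly as follows — all class names here are illustrative stand-ins, not the real Hudi writer APIs or the implementation eventually merged in PR #936.

```java
// Illustrative sketch for the HUDI-285 idea (names are stand-ins, not real
// Hudi classes): the factory inspects the file extension and returns the
// matching writer, failing fast on unsupported types.
interface StorageWriter {
    String format();
}

class ParquetStorageWriter implements StorageWriter {
    public String format() { return "parquet"; }
}

class OrcStorageWriter implements StorageWriter {
    public String format() { return "orc"; }
}

final class StorageWriterFactory {
    // Choose the implementation from the file suffix instead of
    // unconditionally returning the Parquet writer.
    static StorageWriter forPath(String path) {
        if (path.endsWith(".parquet")) {
            return new ParquetStorageWriter();
        }
        if (path.endsWith(".orc")) {
            return new OrcStorageWriter();
        }
        throw new IllegalArgumentException("Unsupported file type: " + path);
    }
}
```

Adding a new storage format then means adding one writer class and one branch in the factory, with no changes to callers.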





[jira] [Assigned] (HUDI-288) Add support for ingesting multiple kafka streams in a single DeltaStreamer deployment

2019-10-03 Thread leesf (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

leesf reassigned HUDI-288:
--

Assignee: leesf

> Add support for ingesting multiple kafka streams in a single DeltaStreamer 
> deployment
> -
>
> Key: HUDI-288
> URL: https://issues.apache.org/jira/browse/HUDI-288
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: deltastreamer
>Reporter: Vinoth Chandar
>Assignee: leesf
>Priority: Major
>
> https://lists.apache.org/thread.html/3a69934657c48b1c0d85cba223d69cb18e18cd8aaa4817c9fd72cef6@
>  has all the context





[jira] [Commented] (HUDI-290) Normalize Test class name of HoodieWriteConfigTest

2019-10-03 Thread leesf (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943558#comment-16943558
 ] 

leesf commented on HUDI-290:


+1 rename to TestHoodieWriteConfig; I see many UT names already start with 
Test. Also, please check for other UT names in the project that do not start 
with Test. Thanks.

> Normalize Test class name of HoodieWriteConfigTest
> --
>
> Key: HUDI-290
> URL: https://issues.apache.org/jira/browse/HUDI-290
> Project: Apache Hudi (incubating)
>  Issue Type: Sub-task
>  Components: Testing
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>
> In general, a test class name starts with {{Test}}. It would be better to 
> rename {{HoodieWriteConfigTest}} to {{TestHoodieWriteConfig}}.





[GitHub] [incubator-hudi] tweise commented on issue #935: [HUDI-287] Remove LICENSE and NOTICE files in hoodie child modules.

2019-10-03 Thread GitBox
tweise commented on issue #935: [HUDI-287] Remove LICENSE and NOTICE files in 
hoodie child modules. 
URL: https://github.com/apache/incubator-hudi/pull/935#issuecomment-537975855
 
 
   Did you verify that the files are automatically included into the jar?




[jira] [Comment Edited] (HUDI-290) Normalize Test class name of HoodieWriteConfigTest

2019-10-03 Thread leesf (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943558#comment-16943558
 ] 

leesf edited comment on HUDI-290 at 10/3/19 2:53 PM:
-

+1 rename to TestHoodieWriteConfig; I see many UT names already start with 
Test. Also, please check for other UT names in the project that do not start 
with Test. Thanks.


was (Author: xleesf):
+1 rename to TestHoodieWriteConfig, and i see many UTs start already start with 
Test... Also please check other UTs name not started with Test in the project.  
Thanks.

> Normalize Test class name of HoodieWriteConfigTest
> --
>
> Key: HUDI-290
> URL: https://issues.apache.org/jira/browse/HUDI-290
> Project: Apache Hudi (incubating)
>  Issue Type: Sub-task
>  Components: Testing
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>
> In general, a test class name starts with {{Test}}. It would be better to 
> rename {{HoodieWriteConfigTest}} to {{TestHoodieWriteConfig}}.





[GitHub] [incubator-hudi] bvaradar commented on issue #935: [HUDI-287] Remove LICENSE and NOTICE files in hoodie child modules.

2019-10-03 Thread GitBox
bvaradar commented on issue #935: [HUDI-287] Remove LICENSE and NOTICE files in 
hoodie child modules. 
URL: https://github.com/apache/incubator-hudi/pull/935#issuecomment-538009259
 
 
   > Did you verify that the files are automatically included into the jar?
   
   Yes, they are present in the jars.
   
   For example:
   
   varadarb-C02SH0P1G8WL:target varadarb$ jar tf hudi-common-0.5.1-SNAPSHOT.jar | grep META-INF
   META-INF/
   META-INF/LICENSE
   META-INF/NOTICE
   .




[GitHub] [incubator-hudi] bvaradar merged pull request #935: [HUDI-287] Remove LICENSE and NOTICE files in hoodie child modules.

2019-10-03 Thread GitBox
bvaradar merged pull request #935: [HUDI-287] Remove LICENSE and NOTICE files 
in hoodie child modules. 
URL: https://github.com/apache/incubator-hudi/pull/935
 
 
   




[jira] [Resolved] (HUDI-287) Remove LICENSE and NOTICE files in hoodie child modules

2019-10-03 Thread Balaji Varadarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Balaji Varadarajan resolved HUDI-287.
-
Resolution: Fixed

> Remove LICENSE and NOTICE files in hoodie child modules
> ---
>
> Key: HUDI-287
> URL: https://issues.apache.org/jira/browse/HUDI-287
> Project: Apache Hudi (incubating)
>  Issue Type: Sub-task
>  Components: asf-migration
>Reporter: Balaji Varadarajan
>Assignee: Balaji Varadarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This was earlier added to ensure LICENSE and NOTICE files are present in 
> generated jars. In the earlier pom setup, the hudi parent pom was not linked 
> to the apache parent pom: there was no "generate-resource-bundle" plugin in 
> the parent hudi pom to automatically add LICENSE and NOTICE files to jars, 
> so we resorted to manually storing the LICENSE/NOTICE files in each 
> submodule.
>  
> With the Apache parent pom, the NOTICE and LICENSE files are automatically 
> set up for each jar. This means we can safely remove all LICENSE and NOTICE 
> files from each submodule.
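For context, the mechanism inherited from the Apache parent pom is Maven's remote-resources plugin, which injects the standard ASF LICENSE/NOTICE bundle into every module's jar. A configuration along these lines (the exact plugin and bundle versions are assumptions and vary with the apache parent version) is what makes per-module copies unnecessary:

```xml
<!-- Sketch of the remote-resources setup inherited from the Apache parent
     pom (versions illustrative): the plugin pulls the standard ASF
     LICENSE/NOTICE resource bundle into each module's generated jar. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-remote-resources-plugin</artifactId>
  <executions>
    <execution>
      <id>process-resource-bundles</id>
      <goals>
        <goal>process</goal>
      </goals>
      <configuration>
        <resourceBundles>
          <resourceBundle>org.apache:apache-jar-resource-bundle:1.4</resourceBundle>
        </resourceBundles>
      </configuration>
    </execution>
  </executions>
</plugin>
```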





[jira] [Closed] (HUDI-287) Remove LICENSE and NOTICE files in hoodie child modules

2019-10-03 Thread Balaji Varadarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Balaji Varadarajan closed HUDI-287.
---

> Remove LICENSE and NOTICE files in hoodie child modules
> ---
>
> Key: HUDI-287
> URL: https://issues.apache.org/jira/browse/HUDI-287
> Project: Apache Hudi (incubating)
>  Issue Type: Sub-task
>  Components: asf-migration
>Reporter: Balaji Varadarajan
>Assignee: Balaji Varadarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This was earlier added to ensure LICENSE and NOTICE files are present in 
> generated jars. In the earlier pom setup, the hudi parent pom was not linked 
> to the apache parent pom: there was no "generate-resource-bundle" plugin in 
> the parent hudi pom to automatically add LICENSE and NOTICE files to jars, 
> so we resorted to manually storing the LICENSE/NOTICE files in each 
> submodule.
>  
> With the Apache parent pom, the NOTICE and LICENSE files are automatically 
> set up for each jar. This means we can safely remove all LICENSE and NOTICE 
> files from each submodule.





[incubator-hudi] branch master updated: Update Release notes

2019-10-03 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository.

vbalaji pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new e78ba59  Update Release notes
e78ba59 is described below

commit e78ba598c549d51a3e7ce4231acebe46f9828001
Author: Balaji Varadarajan 
AuthorDate: Thu Oct 3 09:11:37 2019 -0700

Update Release notes
---
 RELEASE_NOTES.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/RELEASE_NOTES.md b/RELEASE_NOTES.md
index 89beac2..3e356af 100644
--- a/RELEASE_NOTES.md
+++ b/RELEASE_NOTES.md
@@ -26,7 +26,9 @@ Release 0.5.0-incubating
  * Bug fixes in query side integration, hive-sync, deltaStreamer, compaction, 
rollbacks, restore
 
 ### Full PR List
-  * **Balaji Varadarajan** HUDI-121 : Address comments during RC2 voting
+  * **Balaji Varadarajan** [HUDI-287] Address comments during review of 
release candidate. Remove LICENSE and NOTICE files in hoodie child modules. 
+  * **Balaji Varadarajan** [HUDI-121] Fix bugs in Release Scripts found during 
RC creation
+  * **Balaji Varadarajan** [HUDI-121] : Address comments during RC2 voting
   * **Bhavani Sudha Saktheeswaran** [HUDI-271] Create QuickstartUtils for 
simplifying quickstart guide
   * **vinoyang** [HUDI-247] Unify the re-initialization of 
HoodieTableMetaClient in test for hoodie-client module (#930)
   * **Balaji Varadarajan** [HUDI-279] Fix regression in Schema Evolution due 
to PR-755



[incubator-hudi] 03/03: [HUDI-121] Preparing for Release 0.5.0-incubating-rc4

2019-10-03 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository.

vbalaji pushed a commit to branch release-0.5.0
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git

commit fb053bf44dc7ed862e86e63fa35de0e7e43aa234
Author: Balaji Varadarajan 
AuthorDate: Thu Oct 3 09:15:39 2019 -0700

[HUDI-121] Preparing for Release 0.5.0-incubating-rc4
---
 docker/hoodie/hadoop/base/pom.xml | 2 +-
 docker/hoodie/hadoop/datanode/pom.xml | 2 +-
 docker/hoodie/hadoop/historyserver/pom.xml| 2 +-
 docker/hoodie/hadoop/hive_base/pom.xml| 2 +-
 docker/hoodie/hadoop/namenode/pom.xml | 2 +-
 docker/hoodie/hadoop/pom.xml  | 2 +-
 docker/hoodie/hadoop/prestobase/pom.xml   | 2 +-
 docker/hoodie/hadoop/spark_base/pom.xml   | 2 +-
 docker/hoodie/hadoop/sparkadhoc/pom.xml   | 2 +-
 docker/hoodie/hadoop/sparkmaster/pom.xml  | 2 +-
 docker/hoodie/hadoop/sparkworker/pom.xml  | 2 +-
 hudi-cli/pom.xml  | 2 +-
 hudi-client/pom.xml   | 2 +-
 hudi-common/pom.xml   | 2 +-
 hudi-hadoop-mr/pom.xml| 2 +-
 hudi-hive/pom.xml | 2 +-
 hudi-integ-test/pom.xml   | 2 +-
 hudi-spark/pom.xml| 2 +-
 hudi-timeline-service/pom.xml | 2 +-
 hudi-utilities/pom.xml| 2 +-
 packaging/hudi-hadoop-mr-bundle/pom.xml   | 2 +-
 packaging/hudi-hive-bundle/pom.xml| 2 +-
 packaging/hudi-presto-bundle/pom.xml  | 2 +-
 packaging/hudi-spark-bundle/pom.xml   | 2 +-
 packaging/hudi-timeline-server-bundle/pom.xml | 2 +-
 packaging/hudi-utilities-bundle/pom.xml   | 2 +-
 pom.xml   | 2 +-
 27 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/docker/hoodie/hadoop/base/pom.xml b/docker/hoodie/hadoop/base/pom.xml
index 7c0a7e8..5921337 100644
--- a/docker/hoodie/hadoop/base/pom.xml
+++ b/docker/hoodie/hadoop/base/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc3</version>
+    <version>0.5.0-incubating-rc4</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/datanode/pom.xml b/docker/hoodie/hadoop/datanode/pom.xml
index e41e80a..50e8de6 100644
--- a/docker/hoodie/hadoop/datanode/pom.xml
+++ b/docker/hoodie/hadoop/datanode/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc3</version>
+    <version>0.5.0-incubating-rc4</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/historyserver/pom.xml b/docker/hoodie/hadoop/historyserver/pom.xml
index f4cc5f9..42f0fa7 100644
--- a/docker/hoodie/hadoop/historyserver/pom.xml
+++ b/docker/hoodie/hadoop/historyserver/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc3</version>
+    <version>0.5.0-incubating-rc4</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/hive_base/pom.xml b/docker/hoodie/hadoop/hive_base/pom.xml
index 7dbc0fa..f565444 100644
--- a/docker/hoodie/hadoop/hive_base/pom.xml
+++ b/docker/hoodie/hadoop/hive_base/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc3</version>
+    <version>0.5.0-incubating-rc4</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/namenode/pom.xml b/docker/hoodie/hadoop/namenode/pom.xml
index fca0ed1..0f356bb 100644
--- a/docker/hoodie/hadoop/namenode/pom.xml
+++ b/docker/hoodie/hadoop/namenode/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc3</version>
+    <version>0.5.0-incubating-rc4</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/pom.xml b/docker/hoodie/hadoop/pom.xml
index f93e4a3..453bdda 100644
--- a/docker/hoodie/hadoop/pom.xml
+++ b/docker/hoodie/hadoop/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc3</version>
+    <version>0.5.0-incubating-rc4</version>
     <relativePath>../../../pom.xml</relativePath>
   </parent>
   <modelVersion>4.0.0</modelVersion>
diff --git a/docker/hoodie/hadoop/prestobase/pom.xml b/docker/hoodie/hadoop/prestobase/pom.xml
index 0cf1501..fd3f004 100644
--- a/docker/hoodie/hadoop/prestobase/pom.xml
+++ b/docker/hoodie/hadoop/prestobase/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc3</version>
+    <version>0.5.0-incubating-rc4</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/spark_base/pom.xml b/docker/hoodie/hadoop/spark_base/pom.xml
index e88d0a5..3a805aa 100644
--- a/docker/hoodie/hadoop/spark_base/pom.xml
+++ b/docker/hoodie/hadoop/spark_base/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc3</version>
+    <version>0.5.0-incubating-rc4</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/sparkadhoc/pom.xml b/docker/hoodie/hadoop/sparkadhoc/pom.xml
index bc01720..c521e13 100644
--- a/docker/hoodie/hadoop/sparkadhoc/pom.xml
+++ b/docker/hoodie/hadoop/sparkadhoc/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc3</version>
+    <version>0.5.0-incubating-rc4</version>
   </parent>

[incubator-hudi] branch release-0.5.0 updated (789f91e -> fb053bf)

2019-10-03 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository.

vbalaji pushed a change to branch release-0.5.0
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git.


omit 789f91e  [HUDI-121] Fix issues found during release candidate builds
omit dfe4784  [HUDI-121] Fix issues found during release candidate builds
omit c3a4dab  [HUDI-121] Preparing for Release 0.5.0-incubating-rc3
omit 37e9dc2  [HUDI-121] Preparing for Release 0.5.0-incubating-rc2
 add e41835f  [HUDI-121] Fix bugs in Release Scripts found during RC 
creation
 add 6da2f9a  [HUDI-287] Address comments during review of release 
candidate   1. Remove LICENSE and NOTICE files in hoodie child modules.   2. 
Remove developers and contributor section from pom   3. Also ensure any 
failures in validation script is reported appropriately   4. Make hoodie parent 
pom consistent with that of its parent apache-21 
(https://github.com/apache/maven-apache-parent/blob/apache-21/pom.xml)
 add e78ba59  Update Release notes
 new e571e14  [HUDI-121] Preparing for Release 0.5.0-incubating-rc2
 new e3db2d8  [HUDI-121] Preparing for Release 0.5.0-incubating-rc3
 new fb053bf  [HUDI-121] Preparing for Release 0.5.0-incubating-rc4

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (789f91e)
\
 N -- N -- N   refs/heads/release-0.5.0 (fb053bf)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 RELEASE_NOTES.md   |   4 +-
 docker/hoodie/hadoop/NOTICE|   5 -
 docker/hoodie/hadoop/base/NOTICE   |   5 -
 docker/hoodie/hadoop/base/pom.xml  |   2 +-
 docker/hoodie/hadoop/datanode/NOTICE   |   5 -
 docker/hoodie/hadoop/datanode/pom.xml  |   2 +-
 docker/hoodie/hadoop/historyserver/NOTICE  |   5 -
 docker/hoodie/hadoop/historyserver/pom.xml |   2 +-
 docker/hoodie/hadoop/hive_base/NOTICE  |   5 -
 docker/hoodie/hadoop/hive_base/pom.xml |   2 +-
 docker/hoodie/hadoop/namenode/NOTICE   |   5 -
 docker/hoodie/hadoop/namenode/pom.xml  |   2 +-
 docker/hoodie/hadoop/pom.xml   |   2 +-
 docker/hoodie/hadoop/prestobase/NOTICE |   5 -
 docker/hoodie/hadoop/prestobase/pom.xml|   2 +-
 docker/hoodie/hadoop/spark_base/NOTICE |   5 -
 docker/hoodie/hadoop/spark_base/pom.xml|   2 +-
 docker/hoodie/hadoop/sparkadhoc/NOTICE |   5 -
 docker/hoodie/hadoop/sparkadhoc/pom.xml|   2 +-
 docker/hoodie/hadoop/sparkmaster/NOTICE|   5 -
 docker/hoodie/hadoop/sparkmaster/pom.xml   |   2 +-
 docker/hoodie/hadoop/sparkworker/pom.xml   |   2 +-
 hudi-cli/pom.xml   |   2 +-
 hudi-cli/src/main/resources/META-INF/LICENSE   | 177 -
 hudi-cli/src/main/resources/META-INF/NOTICE|   5 -
 hudi-client/pom.xml|   2 +-
 hudi-client/src/main/resources/META-INF/LICENSE| 177 -
 hudi-client/src/main/resources/META-INF/NOTICE |   5 -
 hudi-common/pom.xml|   2 +-
 hudi-common/src/main/resources/META-INF/LICENSE| 177 -
 hudi-common/src/main/resources/META-INF/NOTICE |   5 -
 hudi-hadoop-mr/pom.xml |   2 +-
 hudi-hadoop-mr/src/main/resources/META-INF/LICENSE | 177 -
 hudi-hadoop-mr/src/main/resources/META-INF/NOTICE  |   5 -
 hudi-hive/pom.xml  |   2 +-
 hudi-hive/src/main/resources/META-INF/LICENSE  | 177 -
 hudi-hive/src/main/resources/META-INF/NOTICE   |   5 -
 hudi-integ-test/pom.xml|   2 +-
 .../src/main/resources/META-INF/LICENSE| 177 -
 hudi-integ-test/src/main/resources/META-INF/NOTICE |   5 -
 hudi-spark/pom.xml |   2 +-
 hudi-spark/src/main/resources/META-INF/LICENSE | 177 -
 hudi-spark/src/main/resources/META-INF/NOTICE  |   5 -
 hudi-timeline-service/pom.xml  |  

[incubator-hudi] 01/03: [HUDI-121] Preparing for Release 0.5.0-incubating-rc2

2019-10-03 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository.

vbalaji pushed a commit to branch release-0.5.0
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git

commit e571e143d5420bc6de3601c89e66fd15cca94de0
Author: Balaji Varadarajan 
AuthorDate: Tue Sep 17 10:35:16 2019 -0700

[HUDI-121] Preparing for Release 0.5.0-incubating-rc2
---
 docker/hoodie/hadoop/base/pom.xml | 2 +-
 docker/hoodie/hadoop/datanode/pom.xml | 2 +-
 docker/hoodie/hadoop/historyserver/pom.xml| 2 +-
 docker/hoodie/hadoop/hive_base/pom.xml| 2 +-
 docker/hoodie/hadoop/namenode/pom.xml | 2 +-
 docker/hoodie/hadoop/pom.xml  | 2 +-
 docker/hoodie/hadoop/prestobase/pom.xml   | 2 +-
 docker/hoodie/hadoop/spark_base/pom.xml   | 2 +-
 docker/hoodie/hadoop/sparkadhoc/pom.xml   | 2 +-
 docker/hoodie/hadoop/sparkmaster/pom.xml  | 2 +-
 docker/hoodie/hadoop/sparkworker/pom.xml  | 2 +-
 hudi-cli/pom.xml  | 2 +-
 hudi-client/pom.xml   | 2 +-
 hudi-common/pom.xml   | 2 +-
 hudi-hadoop-mr/pom.xml| 2 +-
 hudi-hive/pom.xml | 2 +-
 hudi-integ-test/pom.xml   | 2 +-
 hudi-spark/pom.xml| 2 +-
 hudi-timeline-service/pom.xml | 2 +-
 hudi-utilities/pom.xml| 2 +-
 packaging/hudi-hadoop-mr-bundle/pom.xml   | 2 +-
 packaging/hudi-hive-bundle/pom.xml| 2 +-
 packaging/hudi-presto-bundle/pom.xml  | 2 +-
 packaging/hudi-spark-bundle/pom.xml   | 2 +-
 packaging/hudi-timeline-server-bundle/pom.xml | 2 +-
 packaging/hudi-utilities-bundle/pom.xml   | 2 +-
 pom.xml   | 2 +-
 27 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/docker/hoodie/hadoop/base/pom.xml b/docker/hoodie/hadoop/base/pom.xml
index 52dd2a8..8cb0ab2 100644
--- a/docker/hoodie/hadoop/base/pom.xml
+++ b/docker/hoodie/hadoop/base/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.1-SNAPSHOT</version>
+    <version>0.5.0-incubating-rc2</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/datanode/pom.xml b/docker/hoodie/hadoop/datanode/pom.xml
index 23cb64d..ed4533f 100644
--- a/docker/hoodie/hadoop/datanode/pom.xml
+++ b/docker/hoodie/hadoop/datanode/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.1-SNAPSHOT</version>
+    <version>0.5.0-incubating-rc2</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/historyserver/pom.xml b/docker/hoodie/hadoop/historyserver/pom.xml
index d35e940..b3455c6 100644
--- a/docker/hoodie/hadoop/historyserver/pom.xml
+++ b/docker/hoodie/hadoop/historyserver/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.1-SNAPSHOT</version>
+    <version>0.5.0-incubating-rc2</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/hive_base/pom.xml b/docker/hoodie/hadoop/hive_base/pom.xml
index 2f7c2b5..0afaa0e 100644
--- a/docker/hoodie/hadoop/hive_base/pom.xml
+++ b/docker/hoodie/hadoop/hive_base/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.1-SNAPSHOT</version>
+    <version>0.5.0-incubating-rc2</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/namenode/pom.xml b/docker/hoodie/hadoop/namenode/pom.xml
index a996f57..257781a 100644
--- a/docker/hoodie/hadoop/namenode/pom.xml
+++ b/docker/hoodie/hadoop/namenode/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.1-SNAPSHOT</version>
+    <version>0.5.0-incubating-rc2</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/pom.xml b/docker/hoodie/hadoop/pom.xml
index fff962f..a339226 100644
--- a/docker/hoodie/hadoop/pom.xml
+++ b/docker/hoodie/hadoop/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.1-SNAPSHOT</version>
+    <version>0.5.0-incubating-rc2</version>
     <relativePath>../../../pom.xml</relativePath>
   </parent>
   <modelVersion>4.0.0</modelVersion>
diff --git a/docker/hoodie/hadoop/prestobase/pom.xml b/docker/hoodie/hadoop/prestobase/pom.xml
index d3c1d0f..1459268 100644
--- a/docker/hoodie/hadoop/prestobase/pom.xml
+++ b/docker/hoodie/hadoop/prestobase/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.1-SNAPSHOT</version>
+    <version>0.5.0-incubating-rc2</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/spark_base/pom.xml b/docker/hoodie/hadoop/spark_base/pom.xml
index 32b33e0..28a3b78 100644
--- a/docker/hoodie/hadoop/spark_base/pom.xml
+++ b/docker/hoodie/hadoop/spark_base/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.1-SNAPSHOT</version>
+    <version>0.5.0-incubating-rc2</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/sparkadhoc/pom.xml b/docker/hoodie/hadoop/sparkadhoc/pom.xml
index 80a811c..0c6e1b4 100644
--- a/docker/hoodie/hadoop/sparkadhoc/pom.xml
+++ b/docker/hoodie/hadoop/sparkadhoc/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.1-SNAPSHOT</version>
+    <version>0.5.0-incubating-rc2</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/s

[incubator-hudi] 02/03: [HUDI-121] Preparing for Release 0.5.0-incubating-rc3

2019-10-03 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository.

vbalaji pushed a commit to branch release-0.5.0
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git

commit e3db2d8d5d5b04705ef1bd845a8bc847748b4d0e
Author: Balaji Varadarajan 
AuthorDate: Mon Sep 30 16:34:45 2019 -0700

[HUDI-121] Preparing for Release 0.5.0-incubating-rc3
---
 docker/hoodie/hadoop/base/pom.xml | 2 +-
 docker/hoodie/hadoop/datanode/pom.xml | 2 +-
 docker/hoodie/hadoop/historyserver/pom.xml| 2 +-
 docker/hoodie/hadoop/hive_base/pom.xml| 2 +-
 docker/hoodie/hadoop/namenode/pom.xml | 2 +-
 docker/hoodie/hadoop/pom.xml  | 2 +-
 docker/hoodie/hadoop/prestobase/pom.xml   | 2 +-
 docker/hoodie/hadoop/spark_base/pom.xml   | 2 +-
 docker/hoodie/hadoop/sparkadhoc/pom.xml   | 2 +-
 docker/hoodie/hadoop/sparkmaster/pom.xml  | 2 +-
 docker/hoodie/hadoop/sparkworker/pom.xml  | 2 +-
 hudi-cli/pom.xml  | 2 +-
 hudi-client/pom.xml   | 2 +-
 hudi-common/pom.xml   | 2 +-
 hudi-hadoop-mr/pom.xml| 2 +-
 hudi-hive/pom.xml | 2 +-
 hudi-integ-test/pom.xml   | 2 +-
 hudi-spark/pom.xml| 2 +-
 hudi-timeline-service/pom.xml | 2 +-
 hudi-utilities/pom.xml| 2 +-
 packaging/hudi-hadoop-mr-bundle/pom.xml   | 2 +-
 packaging/hudi-hive-bundle/pom.xml| 2 +-
 packaging/hudi-presto-bundle/pom.xml  | 2 +-
 packaging/hudi-spark-bundle/pom.xml   | 2 +-
 packaging/hudi-timeline-server-bundle/pom.xml | 2 +-
 packaging/hudi-utilities-bundle/pom.xml   | 2 +-
 pom.xml   | 2 +-
 27 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/docker/hoodie/hadoop/base/pom.xml b/docker/hoodie/hadoop/base/pom.xml
index 8cb0ab2..7c0a7e8 100644
--- a/docker/hoodie/hadoop/base/pom.xml
+++ b/docker/hoodie/hadoop/base/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc2</version>
+    <version>0.5.0-incubating-rc3</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/datanode/pom.xml b/docker/hoodie/hadoop/datanode/pom.xml
index ed4533f..e41e80a 100644
--- a/docker/hoodie/hadoop/datanode/pom.xml
+++ b/docker/hoodie/hadoop/datanode/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc2</version>
+    <version>0.5.0-incubating-rc3</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/historyserver/pom.xml b/docker/hoodie/hadoop/historyserver/pom.xml
index b3455c6..f4cc5f9 100644
--- a/docker/hoodie/hadoop/historyserver/pom.xml
+++ b/docker/hoodie/hadoop/historyserver/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc2</version>
+    <version>0.5.0-incubating-rc3</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/hive_base/pom.xml b/docker/hoodie/hadoop/hive_base/pom.xml
index 0afaa0e..7dbc0fa 100644
--- a/docker/hoodie/hadoop/hive_base/pom.xml
+++ b/docker/hoodie/hadoop/hive_base/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc2</version>
+    <version>0.5.0-incubating-rc3</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/namenode/pom.xml b/docker/hoodie/hadoop/namenode/pom.xml
index 257781a..fca0ed1 100644
--- a/docker/hoodie/hadoop/namenode/pom.xml
+++ b/docker/hoodie/hadoop/namenode/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc2</version>
+    <version>0.5.0-incubating-rc3</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/pom.xml b/docker/hoodie/hadoop/pom.xml
index a339226..f93e4a3 100644
--- a/docker/hoodie/hadoop/pom.xml
+++ b/docker/hoodie/hadoop/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc2</version>
+    <version>0.5.0-incubating-rc3</version>
     <relativePath>../../../pom.xml</relativePath>
   </parent>
   <modelVersion>4.0.0</modelVersion>
diff --git a/docker/hoodie/hadoop/prestobase/pom.xml b/docker/hoodie/hadoop/prestobase/pom.xml
index 1459268..0cf1501 100644
--- a/docker/hoodie/hadoop/prestobase/pom.xml
+++ b/docker/hoodie/hadoop/prestobase/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc2</version>
+    <version>0.5.0-incubating-rc3</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/spark_base/pom.xml b/docker/hoodie/hadoop/spark_base/pom.xml
index 28a3b78..e88d0a5 100644
--- a/docker/hoodie/hadoop/spark_base/pom.xml
+++ b/docker/hoodie/hadoop/spark_base/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc2</version>
+    <version>0.5.0-incubating-rc3</version>
   </parent>
   <modelVersion>4.0.0</modelVersion>
   <packaging>pom</packaging>
diff --git a/docker/hoodie/hadoop/sparkadhoc/pom.xml b/docker/hoodie/hadoop/sparkadhoc/pom.xml
index 0c6e1b4..bc01720 100644
--- a/docker/hoodie/hadoop/sparkadhoc/pom.xml
+++ b/docker/hoodie/hadoop/sparkadhoc/pom.xml
@@ -19,7 +19,7 @@
   <parent>
     <artifactId>hudi-hadoop-docker</artifactId>
     <groupId>org.apache.hudi</groupId>
-    <version>0.5.0-incubating-rc2</version>
+    <version>0.5.0-incubating-rc3</version>
 

[incubator-hudi] branch master updated: [HUDI-121] Fix bug in validation in create_source_release.sh

2019-10-03 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository.

vbalaji pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new e75fa07  [HUDI-121] Fix bug in validation in create_source_release.sh
e75fa07 is described below

commit e75fa070f85ed1e0f72f9718a715a08a76679140
Author: Balaji Varadarajan 
AuthorDate: Thu Oct 3 09:20:12 2019 -0700

[HUDI-121] Fix bug in validation in create_source_release.sh
---
 scripts/release/create_source_release.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/release/create_source_release.sh 
b/scripts/release/create_source_release.sh
index b911936..05aa4b9 100755
--- a/scripts/release/create_source_release.sh
+++ b/scripts/release/create_source_release.sh
@@ -29,8 +29,8 @@ set -o nounset
 set -o xtrace
 
 CURR_DIR=`pwd`
-if [[ `basename $CURR_DIR` != "release" ]] ; then
-  echo "You have to call the script from the release/ dir"
+if [[ `basename $CURR_DIR` != "scripts" ]] ; then
+  echo "You have to call the script from the scripts/ dir"
   exit 1
 fi
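
The directory guard fixed above can be sketched as a standalone snippet. This is a hypothetical illustration of the pattern, not the actual release tooling; the `check_dir` function name and the two demo calls are assumptions for the sketch:

```shell
#!/usr/bin/env bash
# Sketch of the directory guard used by the release scripts: only proceed
# when the script is invoked from the expected directory (here: scripts/).

check_dir() {
  local expected="$1" current="$2"
  if [[ "$current" != "$expected" ]]; then
    echo "You have to call the script from the ${expected}/ dir"
    return 1
  fi
  echo "ok"
}

# In the real script, the current directory name comes from: $(basename "$(pwd)")
check_dir scripts scripts            # guard passes, prints "ok"
check_dir scripts release || true    # guard fails, prints the error message
```

The commit's change is simply that the expected directory name moved from `release` to `scripts`, matching where the script is actually meant to be invoked from.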
 



[GitHub] [incubator-hudi] vinothchandar commented on issue #933: Support for multiple level partitioning in Hudi

2019-10-03 Thread GitBox
vinothchandar commented on issue #933: Support for multiple level partitioning 
in Hudi
URL: https://github.com/apache/incubator-hudi/issues/933#issuecomment-538021110
 
 
   @HariprasadAllaka1612 thanks! mind updating the FAQs? :) 
   https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=113709185 
   
   cc @bhasudha 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (HUDI-286) Remove or hide tags from Hudi official web site

2019-10-03 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943708#comment-16943708
 ] 

Vinoth Chandar commented on HUDI-286:
-

+1 thanks for catching this 

> Remove or hide tags from Hudi official web site
> ---
>
> Key: HUDI-286
> URL: https://issues.apache.org/jira/browse/HUDI-286
> Project: Apache Hudi (incubating)
>  Issue Type: Wish
>  Components: Docs
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>
> Currently, Hudi's doc did not provide a tag HTML page. While we provided a 
> hyper link to an unknown URL, e.g. 
> [getting_started|http://hudi.apache.org/tag_getting_started.html]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HUDI-290) Normalize Test class name of HoodieWriteConfigTest

2019-10-03 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943711#comment-16943711
 ] 

Vinoth Chandar commented on HUDI-290:
-

+1 +1 :) that was due to me and someone else having two styles.. 

> Normalize Test class name of HoodieWriteConfigTest
> --
>
> Key: HUDI-290
> URL: https://issues.apache.org/jira/browse/HUDI-290
> Project: Apache Hudi (incubating)
>  Issue Type: Sub-task
>  Components: Testing
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>
> In general, a test case name start with {{Test}}. It would be better to 
> rename {{HoodieWriteConfigTest}} to {{TestHoodieWriteConfig}}.





[jira] [Commented] (HUDI-253) DeltaStreamer should report nicer error messages for misconfigs

2019-10-03 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943709#comment-16943709
 ] 

Vinoth Chandar commented on HUDI-253:
-

No worries! 

> DeltaStreamer should report nicer error messages for misconfigs
> ---
>
> Key: HUDI-253
> URL: https://issues.apache.org/jira/browse/HUDI-253
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: deltastreamer, Usability
>Reporter: Vinoth Chandar
>Assignee: Pratyaksh Sharma
>Priority: Major
>
> e.g: 
> https://lists.apache.org/thread.html/4fdcdd7ba77a4f0366ec0e95f54298115fcc9567f6b0c9998f1b92b7@
>  





svn commit: r36186 - in /dev/incubator/hudi/hudi-0.5.0-incubating-rc4: ./ hudi-0.5.0-incubating-rc4.src.tgz hudi-0.5.0-incubating-rc4.src.tgz.asc hudi-0.5.0-incubating-rc4.src.tgz.sha512

2019-10-03 Thread vbalaji
Author: vbalaji
Date: Thu Oct  3 16:40:15 2019
New Revision: 36186

Log:
Adding source release of hudi-0.5.0-incubating-rc4


Added:
dev/incubator/hudi/hudi-0.5.0-incubating-rc4/

dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz  
 (with props)

dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz.asc

dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz.sha512

Added: 
dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz
==
Binary file - no diff available.

Propchange: 
dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz
--
svn:mime-type = application/octet-stream

Added: 
dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz.asc
==
--- 
dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz.asc
 (added)
+++ 
dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz.asc
 Thu Oct  3 16:40:15 2019
@@ -0,0 +1,16 @@
+-BEGIN PGP SIGNATURE-
+
+iQIzBAABCAAdFiEEr5uvedMRo9Mojlg/JKSZA3JiqqQFAl2WH98ACgkQJKSZA3Ji
+qqTTLhAA520jepPv4imii1vlHMdV2Yh+4ju48M/58ecV3rGqqUHCFY8A39z3KrfY
+pdqDzx+osl1Lh+eIdQjvI3DSAfuQbFgebQ19fg5vTlV9/UFYi4D51lDckuUMTxEm
+ZOF5cknhFERa6gbiObXDvaecuvVsnkkTn/6AFIVb23za2YbCKvQh9h+eNi3rPmAr
+dRdwcGUNpWByRdv4n7E+82+Hl8Rp+7/lzM1aTWD2Ihlnxu5V8M8bhY8QN46LNNmr
+rz2aGJ5a1tsLCOT6PvLvKRr6TPwCJfJafUgfyGCjWWZOg4wUX3W8nykrChVUJAYH
+fei71CDlU+jQgyDAPRpv7I3aiRXVhQNt9UrBl4mY8A735ntyGgjip53z+ztsWxLL
+XBxmDXr8fedGZLEU4QWvf6P0jcpMvHh1i4fNYP7u1LNyuvQ0NvkvQKlzjtcIf8sP
+9cnz23exd8NrrO8AFsua3r8t73C0YAT37HtkRPskifFT0IFowEGSdURlzbqQSYLZ
+TdjyNWuShnGN//Yqt+aT1JnpD2cHQcllJHK0t3yjQAOlvxSLafVFhQvVmURMrxjE
+JO9dMqh8KUkMTbCj7E2s/aONxVC/c+6RLZ1iufa4MtEgcc5Y8Mld4EqbITiH/qWC
+eSgoe5ma/3Hx42i07RGqigftpFP1M4Tma+bsu01Uh0Tnp+CTZKY=
+=Ny8w
+-END PGP SIGNATURE-

Added: 
dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz.sha512
==
--- 
dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz.sha512
 (added)
+++ 
dev/incubator/hudi/hudi-0.5.0-incubating-rc4/hudi-0.5.0-incubating-rc4.src.tgz.sha512
 Thu Oct  3 16:40:15 2019
@@ -0,0 +1 @@
+e57b1ab3dbe3a061bc8c57c32523946a5e8e01bc7aa73ab11915ed986d9fdb43c63e9681bf32636a559ddb285e465ce718e2cd05197316645a7d1bbc547f267d
  hudi-0.5.0-incubating-rc4.src.tgz
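
The `.sha512` file added above is what downstream verification checks against. A minimal sketch of that check, assuming GNU coreutils `sha512sum` and using a generated stand-in file rather than the real `hudi-0.5.0-incubating-rc4.src.tgz`:

```shell
# Sketch of verifying a source release checksum with a stand-in artifact.
workdir=$(mktemp -d)
cd "$workdir"

printf 'release payload\n' > artifact.src.tgz

# The release manager publishes this digest file alongside the tarball:
sha512sum artifact.src.tgz > artifact.src.tgz.sha512

# Downloaders re-run the check; "artifact.src.tgz: OK" means the bytes match.
sha512sum -c artifact.src.tgz.sha512
```

In practice the `.asc` PGP signature published above is also checked (typically with `gpg --verify` against the project's KEYS file), which the checksum alone does not replace.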




[incubator-hudi] branch master updated: [HUDI-121] Fix bug in validation in deploy_staging_jars.sh

2019-10-03 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository.

vbalaji pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new cef06c1  [HUDI-121] Fix bug in validation in deploy_staging_jars.sh
cef06c1 is described below

commit cef06c1e4849ea93644c17af89b56b3f17d322fc
Author: Balaji Varadarajan 
AuthorDate: Thu Oct 3 09:42:30 2019 -0700

[HUDI-121] Fix bug in validation in deploy_staging_jars.sh
---
 scripts/release/deploy_staging_jars.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/release/deploy_staging_jars.sh 
b/scripts/release/deploy_staging_jars.sh
index 7c067e6..5867f9c 100755
--- a/scripts/release/deploy_staging_jars.sh
+++ b/scripts/release/deploy_staging_jars.sh
@@ -29,8 +29,8 @@ set -o nounset
 set -o xtrace
 
 CURR_DIR=`pwd`
-if [[ `basename $CURR_DIR` != "release" ]] ; then
-  echo "You have to call the script from the release/ dir"
+if [[ `basename $CURR_DIR` != "scripts" ]] ; then
+  echo "You have to call the script from the scripts/ dir"
   exit 1
 fi
 



[jira] [Commented] (HUDI-288) Add support for ingesting multiple kafka streams in a single DeltaStreamer deployment

2019-10-03 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943728#comment-16943728
 ] 

Vinoth Chandar commented on HUDI-288:
-

Great! This will help larger companies with more topics easily adopt delta 
streamer. 

> Add support for ingesting multiple kafka streams in a single DeltaStreamer 
> deployment
> -
>
> Key: HUDI-288
> URL: https://issues.apache.org/jira/browse/HUDI-288
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: deltastreamer
>Reporter: Vinoth Chandar
>Assignee: leesf
>Priority: Major
>
> https://lists.apache.org/thread.html/3a69934657c48b1c0d85cba223d69cb18e18cd8aaa4817c9fd72cef6@
>  has all the context



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[incubator-hudi] branch master updated: [HUDI-265] Failed to delete tmp dirs created in unit tests (#928)

2019-10-03 Thread vinoth
This is an automated email from the ASF dual-hosted git repository.

vinoth pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 3dedc7e  [HUDI-265] Failed to delete tmp dirs created in unit tests 
(#928)
3dedc7e is described below

commit 3dedc7e5fdd5f885915e81e47e110b845a905dbf
Author: leesf <490081...@qq.com>
AuthorDate: Fri Oct 4 00:48:13 2019 +0800

[HUDI-265] Failed to delete tmp dirs created in unit tests (#928)
---
 .../org/apache/hudi/HoodieClientTestHarness.java   | 48 +---
 .../org/apache/hudi/TestCompactionAdminClient.java |  5 +-
 .../java/org/apache/hudi/TestConsistencyGuard.java |  6 +-
 .../apache/hudi/func/TestUpdateMapFunction.java|  5 +-
 .../hudi/index/TestHBaseQPSResourceAllocator.java  |  3 +-
 .../java/org/apache/hudi/index/TestHbaseIndex.java |  3 +-
 .../org/apache/hudi/index/TestHoodieIndex.java |  5 +-
 .../hudi/index/bloom/TestHoodieBloomIndex.java |  3 +-
 .../index/bloom/TestHoodieGlobalBloomIndex.java|  5 +-
 .../apache/hudi/io/TestHoodieCommitArchiveLog.java |  3 +-
 .../org/apache/hudi/io/TestHoodieCompactor.java|  3 +-
 .../org/apache/hudi/io/TestHoodieMergeHandle.java  |  3 +-
 .../apache/hudi/table/TestCopyOnWriteTable.java|  3 +-
 .../apache/hudi/table/TestMergeOnReadTable.java|  4 +-
 .../hudi/common/HoodieCommonTestHarness.java   | 89 ++
 .../common/table/HoodieTableMetaClientTest.java| 14 +---
 .../hudi/common/table/log/HoodieLogFormatTest.java | 11 ++-
 .../table/string/HoodieActiveTimelineTest.java | 13 ++--
 .../table/view/HoodieTableFileSystemViewTest.java  | 38 -
 .../table/view/IncrementalFSViewSyncTest.java  | 68 ++---
 .../RocksDBBasedIncrementalFSViewSyncTest.java |  4 +-
 .../table/view/RocksDbBasedFileSystemViewTest.java |  2 +-
 ...SpillableMapBasedIncrementalFSViewSyncTest.java |  2 +-
 .../hudi/common/util/TestCompactionUtils.java  | 20 ++---
 .../org/apache/hudi/common/util/TestFSUtils.java   | 14 ++--
 .../apache/hudi/common/util/TestFileIOUtils.java   |  7 +-
 .../apache/hudi/common/util/TestParquetUtils.java  | 15 +---
 .../hudi/common/util/TestRocksDBManager.java   | 12 ++-
 .../common/util/collection/TestDiskBasedMap.java   | 24 +++---
 .../util/collection/TestExternalSpillableMap.java  | 32 
 .../util/collection/TestRocksDbBasedMap.java   | 11 ++-
 .../hudi/hadoop/TestHoodieROTablePathFilter.java   | 13 ++--
 .../hudi/utilities/TestHoodieSnapshotCopier.java   | 32 
 33 files changed, 248 insertions(+), 272 deletions(-)

diff --git 
a/hudi-client/src/test/java/org/apache/hudi/HoodieClientTestHarness.java 
b/hudi-client/src/test/java/org/apache/hudi/HoodieClientTestHarness.java
index 10fb0bc..80cb70f 100644
--- a/hudi-client/src/test/java/org/apache/hudi/HoodieClientTestHarness.java
+++ b/hudi-client/src/test/java/org/apache/hudi/HoodieClientTestHarness.java
@@ -17,7 +17,6 @@
 
 package org.apache.hudi;
 
-import java.io.File;
 import java.io.IOException;
 import java.io.Serializable;
 import java.util.concurrent.ExecutorService;
@@ -29,30 +28,27 @@ import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.hdfs.DistributedFileSystem;
 import org.apache.hadoop.hdfs.MiniDFSCluster;
 import org.apache.hudi.common.HoodieClientTestUtils;
+import org.apache.hudi.common.HoodieCommonTestHarness;
 import org.apache.hudi.common.HoodieTestDataGenerator;
 import org.apache.hudi.common.minicluster.HdfsTestService;
-import org.apache.hudi.common.model.HoodieTableType;
 import org.apache.hudi.common.model.HoodieTestUtils;
 import org.apache.hudi.common.table.HoodieTableMetaClient;
 import org.apache.hudi.common.util.FSUtils;
 import org.apache.spark.api.java.JavaSparkContext;
 import org.apache.spark.sql.SQLContext;
-import org.junit.rules.TemporaryFolder;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 /**
  * The test harness for resource initialization and cleanup.
  */
-public abstract class HoodieClientTestHarness implements Serializable {
+public abstract class HoodieClientTestHarness extends HoodieCommonTestHarness 
implements Serializable {
 
   private static final Logger logger = 
LoggerFactory.getLogger(HoodieClientTestHarness.class);
 
   protected transient JavaSparkContext jsc = null;
   protected transient SQLContext sqlContext;
   protected transient FileSystem fs;
-  protected String basePath = null;
-  protected TemporaryFolder folder = null;
   protected transient HoodieTestDataGenerator dataGen = null;
   protected transient ExecutorService executorService;
   protected transient HoodieTableMetaClient metaClient;
@@ -69,7 +65,7 @@ public abstract class HoodieClientTestHarness implements 
Serializable {
* @throws IOException
*/
   public void initResources() throws IOException {
-initTempFolderAndPath();
+initPath();
 initSparkContexts();
 initTestData

[GitHub] [incubator-hudi] vinothchandar merged pull request #928: [HUDI-265] Failed to delete tmp dirs created in unit tests

2019-10-03 Thread GitBox
vinothchandar merged pull request #928: [HUDI-265] Failed to delete tmp dirs 
created in unit tests
URL: https://github.com/apache/incubator-hudi/pull/928
 
 
   




[jira] [Commented] (HUDI-64) Estimation of compression ratio & other dynamic storage knobs based on historical stats

2019-10-03 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943733#comment-16943733
 ] 

Vinoth Chandar commented on HUDI-64:


[~yanghua] absolutely no problem. take your time :) 

> Estimation of compression ratio & other dynamic storage knobs based on 
> historical stats
> ---
>
> Key: HUDI-64
> URL: https://issues.apache.org/jira/browse/HUDI-64
> Project: Apache Hudi (incubating)
>  Issue Type: New Feature
>  Components: Storage Management, Write Client
>Reporter: Vinoth Chandar
>Assignee: Vinoth Chandar
>Priority: Major
>
> Roughly along the likes of. [https://github.com/uber/hudi/issues/270] 





[jira] [Commented] (HUDI-289) Implement a long running test for Hudi writing and querying end-end

2019-10-03 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943731#comment-16943731
 ] 

Vinoth Chandar commented on HUDI-289:
-

Awesome! 

https://github.com/apache/incubator-hudi/pull/623 was an initial attempt at 
this. 

High-level thinking now is to use the existing DistributedTestDataSource to 
generate some workload and then compare the state of the datasource (backed by 
rocksdb and the dataset) after each commit or action. Just raw thoughts; feel 
free to completely change the approach as well. 

> Implement a long running test for Hudi writing and querying end-end
> ---
>
> Key: HUDI-289
> URL: https://issues.apache.org/jira/browse/HUDI-289
> Project: Apache Hudi (incubating)
>  Issue Type: Test
>  Components: Usability
>Reporter: Vinoth Chandar
>Assignee: Vinoth Chandar
>Priority: Major
>
> We would need an equivalent of an end-end test which runs some workload for 
> few hours atleast, triggers various actions like commit, deltacopmmit, 
> rollback, compaction and ensures correctness of code before every release
> P.S: Learn from all the CSS issues managing compaction.. 





[incubator-hudi] tag 0.5.0-incubating-rc4 created (now fb053bf)

2019-10-03 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository.

vbalaji pushed a change to tag 0.5.0-incubating-rc4
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git.


  at fb053bf  (commit)
No new revisions were added by this update.



[incubator-hudi] tag release-0.5.0-incubating-rc4 created (now fb053bf)

2019-10-03 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository.

vbalaji pushed a change to tag release-0.5.0-incubating-rc4
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git.


  at fb053bf  (commit)
No new revisions were added by this update.



[jira] [Created] (HUDI-291) Simplify quickstart

2019-10-03 Thread Bhavani Sudha Saktheeswaran (Jira)
Bhavani Sudha Saktheeswaran created HUDI-291:


 Summary: Simplify quickstart
 Key: HUDI-291
 URL: https://issues.apache.org/jira/browse/HUDI-291
 Project: Apache Hudi (incubating)
  Issue Type: Improvement
  Components: Docs, docs-chinese, Usability
Reporter: Bhavani Sudha Saktheeswaran
Assignee: Bhavani Sudha Saktheeswaran


Make quickstart really simple by only using spark examples and default configs 
for easier playing around with Hudi APIs. The intent is to introduce what Hudi 
offers to end users as quickly as possible, without having to deal with setting 
up Hive or other external systems. 

 





[jira] [Updated] (HUDI-291) Simplify quickstart

2019-10-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-291:

Labels: pull-request-available  (was: )

> Simplify quickstart
> ---
>
> Key: HUDI-291
> URL: https://issues.apache.org/jira/browse/HUDI-291
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: Docs, docs-chinese, Usability
>Reporter: Bhavani Sudha Saktheeswaran
>Assignee: Bhavani Sudha Saktheeswaran
>Priority: Minor
>  Labels: pull-request-available
>
> Make quickstart really simple by only using spark examples and default 
> configs for easier playing around with Hudi APIs. The intent is to introduce 
> what Hudi offers to end users as quickly as possible, without having to deal 
> with setting up Hive or other external systems. 
>  





[GitHub] [incubator-hudi] bhasudha opened a new pull request #937: [HUDI-291] Simplify quickstart documentation

2019-10-03 Thread GitBox
bhasudha opened a new pull request #937: [HUDI-291] Simplify quickstart 
documentation
URL: https://github.com/apache/incubator-hudi/pull/937
 
 
   - Uses spark-shell based examples to showcase Hudi core features
   - Info related to hive sync, hive, presto, etc are removed
   




[GitHub] [incubator-hudi] bhasudha commented on issue #937: [HUDI-291] Simplify quickstart documentation

2019-10-03 Thread GitBox
bhasudha commented on issue #937: [HUDI-291] Simplify quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#issuecomment-538117831
 
 
   Please review the contents of this quickstart and let me know if you want to 
see any changes.
   
   For a rough idea, the quickstart will look like the screenshot below: 
   
[quickstart_screenshot.pdf](https://github.com/apache/incubator-hudi/files/3687968/quickstart_screenshot.pdf)
   
   I am planning to add more styling changes to this PR. 
   
   




[GitHub] [incubator-hudi] bhasudha commented on issue #937: [HUDI-291] Simplify quickstart documentation

2019-10-03 Thread GitBox
bhasudha commented on issue #937: [HUDI-291] Simplify quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#issuecomment-538118329
 
 
   @yanghua @leesf  I am making these changes in the main quickstart.md. May 
need your help in changing the corresponding .cn.md files after the reviews




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation

2019-10-03 Thread GitBox
vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify 
quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#discussion_r331275123
 
 

 ##
 File path: docs/quickstart.md
 ##
 @@ -3,196 +3,186 @@ title: Quickstart
 keywords: hudi, quickstart
 tags: [quickstart]
 sidebar: mydoc_sidebar
-toc: false
+toc: true
 permalink: quickstart.html
 ---
 
-To get a quick peek at Hudi's capabilities, we have put together a [demo 
video](https://www.youtube.com/watch?v=VhNgUsxdrD0) 
-that showcases this on a docker based setup with all dependent systems running 
locally. We recommend you replicate the same setup 
-and run the demo yourself, by following steps [here](docker_demo.html). Also, 
if you are looking for ways to migrate your existing data to Hudi, 
-refer to [migration guide](migration_guide.html).
 
-If you have Hive, Hadoop, Spark installed already & prefer to do it on your 
own setup, read on.
+This guide provides a quick peek at Hudi's capabilities using simple 
spark-shell. Using Spark datasources, this guide 
+walks through code snippets that allow you to insert and update a Hudi table 
of default Storage type: 
+  [Copy on 
Write](https://hudi.apache.org/concepts.html#copy-on-write-storage). 
+After each write operation we show how to read the data. We will also be 
looking at how to query a Hudi table incrementally. 
 
-## Download Hudi
+We have put together a [demo 
video](https://www.youtube.com/watch?v=VhNgUsxdrD0) that showcases this on a 
docker based 
+setup with all dependent systems running locally. We recommend you replicate 
the same setup and run the demo yourself, 
+by following steps [here](docker_demo.html). Also, if you are looking for ways 
to migrate your existing data to Hudi, 
+refer to [migration guide](migration_guide.html). 
 
-Check out [code](https://github.com/apache/incubator-hudi) and normally build 
the maven project, from command line
+For the quickstart, you would need to build Hudi spark bundle jar and provide 
that to the spark shell as shown below.
 
-```
-$ mvn clean install -DskipTests -DskipITs
-```
-
-Hudi works with Hive 2.3.x or higher versions. As long as Hive 2.x protocol 
can talk to Hive 1.x, you can use Hudi to 
-talk to older hive versions.
-
-For IDE, you can pull in the code into IntelliJ as a normal maven project. 
-You might want to add your spark jars folder to project dependencies under 
'Module Setttings', to be able to run from IDE.
-
-
-### Version Compatibility
+## Build Hudi spark bundle jar
 
-Hudi requires Java 8 to be installed on a *nix system. Hudi works with 
Spark-2.x versions. 
-Further, we have verified that Hudi works with the following combination of 
Hadoop/Hive/Spark.
-
-| Hadoop | Hive  | Spark | Instructions to Build Hudi |
-|  | - |  |  |
-| Apache hadoop-2.[7-8].x | Apache hive-2.3.[1-3] | spark-2.[1-3].x | Use "mvn 
clean install -DskipTests" |
-
-If your environment has other versions of hadoop/hive/spark, please try out 
Hudi 
-and let us know if there are any issues. 
-
-## Generate Sample Dataset
-
-### Environment Variables
-
-Please set the following environment variables according to your setup. We 
have given an example setup with CDH version
+Hudi requires Java 8 to be installed on a *nix system.
+Check out [code](https://github.com/apache/incubator-hudi) and normally build 
the maven project, from command line:
 
 ```
-cd incubator-hudi 
-export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/
-export HIVE_HOME=/var/hadoop/setup/apache-hive-1.1.0-cdh5.7.2-bin
-export HADOOP_HOME=/var/hadoop/setup/hadoop-2.6.0-cdh5.7.2
-export HADOOP_INSTALL=/var/hadoop/setup/hadoop-2.6.0-cdh5.7.2
-export HADOOP_CONF_DIR=$HADOOP_INSTALL/etc/hadoop
-export SPARK_HOME=/var/hadoop/setup/spark-2.3.1-bin-hadoop2.7
-export SPARK_INSTALL=$SPARK_HOME
-export SPARK_CONF_DIR=$SPARK_HOME/conf
-export 
PATH=$JAVA_HOME/bin:$HIVE_HOME/bin:$HADOOP_HOME/bin:$SPARK_INSTALL/bin:$PATH
+$ mvn clean install -DskipTests -DskipITs
+
+$ # Export the location of hudi-spark-bundle for later reference 
+$ mkdir -p /var/tmp/hudi && cp 
packaging/hudi-spark-bundle/target/hudi-spark-bundle-*.*.*-SNAPSHOT.jar  
/var/tmp/hudi/hudi-spark-bundle.jar 
+$ export HUDI_SPARK_BUNDLE_PATH=/var/tmp/hudi/hudi-spark-bundle.jar
 ```
 
-### Run HoodieJavaApp
+## Setup spark-shell
+Hudi works with Spark-2.x versions. You can follow instructions 
[here](https://spark.apache.org/downloads.html) for 
+setting up spark. 
 
-Run __hudi-spark/src/test/java/HoodieJavaApp.java__ class, to place a two 
commits (commit 1 => 100 inserts, commit 2 => 100 updates to previously 
inserted 100 records) onto your DFS/local filesystem. Use the wrapper script
-to run from command-line
+From the extracted directory run spark-shell with Hudi as:
 
 ```
-cd hudi-spark
-./run_hoodie_app.sh --help
-Usage:  [options]
-  Options:
---help, -h
-   Default: false
---table-name, -n
-   table name for Hudi sample tabl

[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation

2019-10-03 Thread GitBox
vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify 
quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#discussion_r331276090
 
 

 ##
 File path: docs/quickstart.md
 ##
 @@ -3,196 +3,186 @@ title: Quickstart
 keywords: hudi, quickstart
 tags: [quickstart]
 sidebar: mydoc_sidebar
-toc: false
+toc: true
 permalink: quickstart.html
 ---
 
-To get a quick peek at Hudi's capabilities, we have put together a [demo 
video](https://www.youtube.com/watch?v=VhNgUsxdrD0) 
-that showcases this on a docker based setup with all dependent systems running 
locally. We recommend you replicate the same setup 
-and run the demo yourself, by following steps [here](docker_demo.html). Also, 
if you are looking for ways to migrate your existing data to Hudi, 
-refer to [migration guide](migration_guide.html).
 
-If you have Hive, Hadoop, Spark installed already & prefer to do it on your 
own setup, read on.
+This guide provides a quick peak at Hudi's capabilities using simple 
spark-shell. Using Spark datasources, this guide 
+walks through code snippets that allows you to insert and update a Hudi table 
of default Storage type: 
+  [Copy on 
Write](https://hudi.apache.org/concepts.html#copy-on-write-storage). 
+After each write operation we show how to read the data. We will also be 
looking at how to query a Hudi table incrementally. 
 
-## Download Hudi
+We have put together a [demo 
video](https://www.youtube.com/watch?v=VhNgUsxdrD0) that showcases this on a 
docker based 
+setup with all dependent systems running locally. We recommend you replicate 
the same setup and run the demo yourself, 
+by following steps [here](docker_demo.html). Also, if you are looking for ways 
to migrate your existing data to Hudi, 
+refer to [migration guide](migration_guide.html). 
 
-Check out [code](https://github.com/apache/incubator-hudi) and normally build 
the maven project, from command line
+For the quickstart, you would need to build Hudi spark bundle jar and provide 
that to the spark shell as shown below.
 
-```
-$ mvn clean install -DskipTests -DskipITs
-```
-
-Hudi works with Hive 2.3.x or higher versions. As long as Hive 2.x protocol 
can talk to Hive 1.x, you can use Hudi to 
-talk to older hive versions.
-
-For IDE, you can pull in the code into IntelliJ as a normal maven project. 
-You might want to add your spark jars folder to project dependencies under 
'Module Setttings', to be able to run from IDE.
-
-
-### Version Compatibility
+## Build Hudi spark bundle jar
 
-Hudi requires Java 8 to be installed on a *nix system. Hudi works with 
Spark-2.x versions. 
-Further, we have verified that Hudi works with the following combination of 
Hadoop/Hive/Spark.
-
-| Hadoop | Hive  | Spark | Instructions to Build Hudi |
-|  | - |  |  |
-| Apache hadoop-2.[7-8].x | Apache hive-2.3.[1-3] | spark-2.[1-3].x | Use "mvn 
clean install -DskipTests" |
-
-If your environment has other versions of hadoop/hive/spark, please try out 
Hudi 
-and let us know if there are any issues. 
-
-## Generate Sample Dataset
-
-### Environment Variables
-
-Please set the following environment variables according to your setup. We 
have given an example setup with CDH version
+Hudi requires Java 8 to be installed on a *nix system.
+Check out [code](https://github.com/apache/incubator-hudi) and normally build 
the maven project, from command line:
 
 ```
-cd incubator-hudi 
-export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/
-export HIVE_HOME=/var/hadoop/setup/apache-hive-1.1.0-cdh5.7.2-bin
-export HADOOP_HOME=/var/hadoop/setup/hadoop-2.6.0-cdh5.7.2
-export HADOOP_INSTALL=/var/hadoop/setup/hadoop-2.6.0-cdh5.7.2
-export HADOOP_CONF_DIR=$HADOOP_INSTALL/etc/hadoop
-export SPARK_HOME=/var/hadoop/setup/spark-2.3.1-bin-hadoop2.7
-export SPARK_INSTALL=$SPARK_HOME
-export SPARK_CONF_DIR=$SPARK_HOME/conf
-export 
PATH=$JAVA_HOME/bin:$HIVE_HOME/bin:$HADOOP_HOME/bin:$SPARK_INSTALL/bin:$PATH
+$ mvn clean install -DskipTests -DskipITs
+
+$ # Export the location of hudi-spark-bundle for later reference 
+$ mkdir -p /var/tmp/hudi && cp 
packaging/hudi-spark-bundle/target/hudi-spark-bundle-*.*.*-SNAPSHOT.jar  
/var/tmp/hudi/hudi-spark-bundle.jar 
+$ export HUDI_SPARK_BUNDLE_PATH=/var/tmp/hudi/hudi-spark-bundle.jar
 ```
 
-### Run HoodieJavaApp
+## Setup spark-shell
+Hudi works with Spark-2.x versions. You can follow instructions 
[here](https://spark.apache.org/downloads.html) for 
+setting up spark. 
 
-Run __hudi-spark/src/test/java/HoodieJavaApp.java__ class, to place a two 
commits (commit 1 => 100 inserts, commit 2 => 100 updates to previously 
inserted 100 records) onto your DFS/local filesystem. Use the wrapper script
-to run from command-line
+From the extracted directory run spark-shell with Hudi as:
 
 ```
-cd hudi-spark
-./run_hoodie_app.sh --help
-Usage:  [options]
-  Options:
---help, -h
-   Default: false
---table-name, -n
-   table name for Hudi sample tabl

[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation

2019-10-03 Thread GitBox
vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify 
quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#discussion_r331275748
 
 


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation

2019-10-03 Thread GitBox
vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify 
quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#discussion_r331274747
 
 

 ##
 File path: docs/quickstart.md
 ##
 @@ -3,196 +3,186 @@ title: Quickstart
+Hudi requires Java 8 to be installed on a *nix system.
+Check out [code](https://github.com/apache/incubator-hudi) and normally build 
the maven project, from command line:
 
 Review comment:
   Include the `git clone` command as well? So someone can just keep copy-pasting.
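
A hedged sketch of the clone-and-build sequence the reviewer suggests (the repository URL is the one linked in the docs; the checkout directory name is assumed to match the repo). The commands are composed as strings here rather than executed:

```shell
# Print the copy-pasteable sequence; nothing is cloned or built in this sketch.
REPO_URL="https://github.com/apache/incubator-hudi.git"
printf '%s\n' \
  "git clone ${REPO_URL}" \
  "cd incubator-hudi" \
  "mvn clean install -DskipTests -DskipITs"
```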


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation

2019-10-03 Thread GitBox
vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify 
quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#discussion_r331276260
 
 


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation

2019-10-03 Thread GitBox
vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify 
quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#discussion_r331276441
 
 


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation

2019-10-03 Thread GitBox
vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify 
quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#discussion_r331274347
 
 

 ##
 File path: docs/quickstart.md
 ##
 @@ -3,196 +3,186 @@ title: Quickstart
 keywords: hudi, quickstart
 tags: [quickstart]
 sidebar: mydoc_sidebar
-toc: false
+toc: true
 permalink: quickstart.html
 ---
 
-To get a quick peek at Hudi's capabilities, we have put together a [demo 
video](https://www.youtube.com/watch?v=VhNgUsxdrD0) 
-that showcases this on a docker based setup with all dependent systems running 
locally. We recommend you replicate the same setup 
-and run the demo yourself, by following steps [here](docker_demo.html). Also, 
if you are looking for ways to migrate your existing data to Hudi, 
-refer to [migration guide](migration_guide.html).
 
-If you have Hive, Hadoop, Spark installed already & prefer to do it on your 
own setup, read on.
+This guide provides a quick peak at Hudi's capabilities using simple 
spark-shell. Using Spark datasources, this guide 
 
 Review comment:
   peek




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation

2019-10-03 Thread GitBox
vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify 
quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#discussion_r331275909
 
 


[GitHub] [incubator-hudi] vinothchandar commented on issue #937: [HUDI-291] Simplify quickstart documentation

2019-10-03 Thread GitBox
vinothchandar commented on issue #937: [HUDI-291] Simplify quickstart 
documentation
URL: https://github.com/apache/incubator-hudi/pull/937#issuecomment-538150247
 
 
   My bad .. saw the pdf now.. Can we remove the toc? It's not very useful if we expect most users to just follow the whole page.




[GitHub] [incubator-hudi] bhasudha commented on issue #937: [HUDI-291] Simplify quickstart documentation

2019-10-03 Thread GitBox
bhasudha commented on issue #937: [HUDI-291] Simplify quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#issuecomment-538151189
 
 
   Makes sense. Thanks @vinothchandar for the reviews. I'll take a closer look at it later today.




[incubator-hudi] tag 0.5.0-incubating-rc4 deleted (was fb053bf)

2019-10-03 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository.

vbalaji pushed a change to tag 0.5.0-incubating-rc4
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git.


*** WARNING: tag 0.5.0-incubating-rc4 was deleted! ***

 was fb053bf  [HUDI-121] Preparing for Release 0.5.0-incubating-rc4

The revisions that were on this tag are still contained in
other references; therefore, this change does not discard any commits
from the repository.



[jira] [Closed] (HUDI-265) Failed to delete tmp dirs created in unit tests

2019-10-03 Thread leesf (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

leesf closed HUDI-265.
--
Resolution: Fixed

Fixed via master: 3dedc7e5fdd5f885915e81e47e110b845a905dbf

> Failed to delete tmp dirs created in unit tests
> ---
>
> Key: HUDI-265
> URL: https://issues.apache.org/jira/browse/HUDI-265
> Project: Apache Hudi (incubating)
>  Issue Type: Test
>  Components: Testing
>Reporter: leesf
>Assignee: leesf
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.1
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In some unit tests, such as TestHoodieSnapshotCopier and TestUpdateMapFunction, 
> the tmp dir created in _init (with the before annotation)_ fails to be deleted 
> by _clean (with the after annotation)_ after the tests run, which leaves too 
> many folders in /tmp. We need to delete these dirs after the unit tests finish.
> I will go through all the unit tests that did not properly delete the tmp dir 
> and send a patch.
>  
> cc [~vinoth] [~vbalaji]
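The fix described above amounts to recursively deleting each test's tmp dir in the teardown step. A minimal, self-contained sketch of that cleanup (this is not Hudi's actual test code; the class and file names here are illustrative):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class TmpDirCleanup {

    // Mirrors an @Before-style init: create a scratch dir under the system tmp dir.
    static Path init() throws IOException {
        return Files.createTempDirectory("hoodie-test-");
    }

    // Mirrors an @After-style clean: delete the dir and everything inside it.
    // Reverse order sorts children before their parents, so files go first.
    static void clean(Path dir) throws IOException {
        if (dir == null || !Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            paths.sorted(Comparator.reverseOrder())
                 .forEach(p -> p.toFile().delete());
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = init();
        Files.createFile(dir.resolve("part-0001.parquet")); // simulated test output
        clean(dir);
        if (Files.exists(dir)) {
            throw new AssertionError("tmp dir should be gone after clean()");
        }
        System.out.println("tmp dir cleaned up");
    }
}
```

JUnit 4's TemporaryFolder rule achieves the same effect declaratively, deleting the folder automatically after each test.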



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-hudi] Guru107 commented on issue #143: Tracking ticket for folks to be added to slack group

2019-10-03 Thread GitBox
Guru107 commented on issue #143: Tracking ticket for folks to be added to slack 
group
URL: https://github.com/apache/incubator-hudi/issues/143#issuecomment-538180537
 
 
   Hi @vinothchandar, please add guruak...@gmail.com. 




[GitHub] [incubator-hudi] leesf commented on issue #937: [HUDI-291] Simplify quickstart documentation

2019-10-03 Thread GitBox
leesf commented on issue #937: [HUDI-291] Simplify quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#issuecomment-538180728
 
 
   > @yanghua @leesf I am making these changes in the main quickstart.md. May 
need your help in changing the corresponding .cn.md files after the reviews
   
   Happy to see the improvement of the quickstart, and I am glad to apply the 
changes to the Chinese doc. @bhasudha 




[GitHub] [incubator-hudi] leesf opened a new pull request #938: [HUDI-232] Implement sealing/unsealing for HoodieRecord class

2019-10-03 Thread GitBox
leesf opened a new pull request #938: [HUDI-232] Implement sealing/unsealing 
for HoodieRecord class
URL: https://github.com/apache/incubator-hudi/pull/938
 
 
   see jira: https://jira.apache.org/jira/projects/HUDI/issues/HUDI-232
   
   CC @vinothchandar Please review when you get a chance. Thanks.




[jira] [Updated] (HUDI-232) Implement sealing/unsealing for HoodieRecord class

2019-10-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-232:

Labels: pull-request-available  (was: )

> Implement sealing/unsealing for HoodieRecord class
> --
>
> Key: HUDI-232
> URL: https://issues.apache.org/jira/browse/HUDI-232
> Project: Apache Hudi (incubating)
>  Issue Type: Bug
>  Components: Write Client
>Affects Versions: 0.5.0
>Reporter: Vinoth Chandar
>Assignee: leesf
>Priority: Major
>  Labels: pull-request-available
>
> The HoodieRecord class is sometimes modified to set the record location. We can 
> get into issues like HUDI-170 if the modification is misplaced. We need a 
> mechanism to seal the class and explicitly unseal it for modification. Trying to 
> modify it in the sealed state should throw an error
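As a rough illustration of that guard (an assumed sketch, not the actual HoodieRecord API; the class and field names are made up):

```java
// Illustrative seal/unseal guard; not Hudi's actual HoodieRecord code.
public class SealableRecord {
    private boolean sealed = false;
    private String currentLocation;

    public void seal() {
        sealed = true;     // freeze: mutations now throw
    }

    public void unseal() {
        sealed = false;    // caller explicitly opts back into mutation
    }

    public void setCurrentLocation(String location) {
        if (sealed) {
            throw new UnsupportedOperationException(
                "Record is sealed; call unseal() before modifying it");
        }
        this.currentLocation = location;
    }

    public String getCurrentLocation() {
        return currentLocation;
    }

    public static void main(String[] args) {
        SealableRecord record = new SealableRecord();
        record.setCurrentLocation("fileId-0001");
        record.seal();
        try {
            record.setCurrentLocation("fileId-0002"); // a misplaced modification
            throw new AssertionError("expected the sealed record to reject this");
        } catch (UnsupportedOperationException expected) {
            System.out.println("sealed record rejected the update");
        }
    }
}
```

A write path would then seal records once their locations are assigned, and code that legitimately needs to update one would bracket the change with unseal()/seal().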





[jira] [Updated] (HUDI-265) Failed to delete tmp dirs created in unit tests

2019-10-03 Thread Balaji Varadarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Balaji Varadarajan updated HUDI-265:

Fix Version/s: (was: 0.5.1)
   0.5.0

> Failed to delete tmp dirs created in unit tests
> ---
>
> Key: HUDI-265
> URL: https://issues.apache.org/jira/browse/HUDI-265
> Project: Apache Hudi (incubating)
>  Issue Type: Test
>  Components: Testing
>Reporter: leesf
>Assignee: leesf
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In some unit tests, such as TestHoodieSnapshotCopier and TestUpdateMapFunction, 
> the tmp dir created in _init (with the before annotation)_ fails to be deleted 
> by _clean (with the after annotation)_ after the tests run, which leaves too 
> many folders in /tmp. We need to delete these dirs after the unit tests finish.
> I will go through all the unit tests that did not properly delete the tmp dir 
> and send a patch.
>  
> cc [~vinoth] [~vbalaji]





[GitHub] [incubator-hudi] bvaradar opened a new pull request #939: [HUDI-121] Remove leftover notice file and replace com.uber.hoodie with org.apache.hudi

2019-10-03 Thread GitBox
bvaradar opened a new pull request #939: [HUDI-121] Remove leftover notice file 
and replace com.uber.hoodie with org.apache.hudi 
URL: https://github.com/apache/incubator-hudi/pull/939
 
 
   
   1. Checked  for other leftover license/notice files
   2. Exploded fat jars and checked LICENSE and notice files
   3. Removed unused license-mapping file and notice related configurations in 
pom.xml 




[GitHub] [incubator-hudi] bvaradar commented on issue #939: [HUDI-121] Remove leftover notice file and replace com.uber.hoodie with org.apache.hudi

2019-10-03 Thread GitBox
bvaradar commented on issue #939: [HUDI-121] Remove leftover notice file and 
replace com.uber.hoodie with org.apache.hudi 
URL: https://github.com/apache/incubator-hudi/pull/939#issuecomment-538235145
 
 
   The unit-test logs look fine.
   
   @vinothchandar : Can you review?
   

