[GitHub] [incubator-hudi] XuQianJin-Stars opened a new pull request #1064: [HUDI-209] Implement JMX metrics reporter

2019-11-30 Thread GitBox
XuQianJin-Stars opened a new pull request #1064: [HUDI-209] Implement JMX 
metrics reporter
URL: https://github.com/apache/incubator-hudi/pull/1064
 
 
   Currently, there are only two reporters: MetricsGraphiteReporter and 
InMemoryMetricsReporter. Because InMemoryMetricsReporter is used only for 
testing, we effectively have a single metrics reporter. Since JMX is the 
standard monitoring mechanism on the JVM platform, I propose adding a JMX 
metrics reporter.
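
   For readers unfamiliar with JMX, here is a minimal sketch of what exposing a 
metric over JMX involves, using only the JDK's `javax.management` API. It is not 
the implementation in this pull request; the class names `CommitCounter` and 
`CommitCounterMBean` and the object name used below are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class JmxCounterSketch {

  /** Standard MBean interface: JMX exposes getCount() as a read-only attribute named "Count". */
  public interface CommitCounterMBean {
    long getCount();
  }

  /** Hypothetical metric holder; its class name must equal the interface name minus "MBean". */
  public static class CommitCounter implements CommitCounterMBean {
    private final AtomicLong count = new AtomicLong();

    public void inc() {
      count.incrementAndGet();
    }

    @Override
    public long getCount() {
      return count.get();
    }
  }

  /** Registers a counter, bumps it, reads it back through the MBean server, then unregisters it. */
  static long registerAndRead(String name, int increments) {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    CommitCounter counter = new CommitCounter();
    try {
      ObjectName objectName = new ObjectName(name);
      server.registerMBean(counter, objectName);
      try {
        for (int i = 0; i < increments; i++) {
          counter.inc();
        }
        // The attribute value comes back boxed as a Long.
        return (Long) server.getAttribute(objectName, "Count");
      } finally {
        server.unregisterMBean(objectName);
      }
    } catch (JMException e) {
      throw new IllegalStateException("JMX operation failed for " + name, e);
    }
  }
}
```

   While the bean is registered, a tool such as JConsole attached to the same 
JVM would show the `Count` attribute under the chosen object name.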


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] yanghua commented on issue #991: Hudi Test Suite (Refactor)

2019-11-30 Thread GitBox
yanghua commented on issue #991: Hudi Test Suite (Refactor) 
URL: https://github.com/apache/incubator-hudi/pull/991#issuecomment-560053982
 
 
   Yes, we should rebase it onto the master branch. That work is reflected in 
#1057. However, I cannot merge it into this PR or the feature branch, so I 
created a copy.




Build failed in Jenkins: hudi-snapshot-deployment-0.5 #115

2019-11-30 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 2.20 KB...]
/home/jenkins/tools/maven/apache-maven-3.5.4/bin:
m2.conf
mvn
mvn.cmd
mvnDebug
mvnDebug.cmd
mvnyjp

/home/jenkins/tools/maven/apache-maven-3.5.4/boot:
plexus-classworlds-2.5.2.jar

/home/jenkins/tools/maven/apache-maven-3.5.4/conf:
logging
settings.xml
toolchains.xml

/home/jenkins/tools/maven/apache-maven-3.5.4/conf/logging:
simplelogger.properties

/home/jenkins/tools/maven/apache-maven-3.5.4/lib:
aopalliance-1.0.jar
cdi-api-1.0.jar
cdi-api.license
commons-cli-1.4.jar
commons-cli.license
commons-io-2.5.jar
commons-io.license
commons-lang3-3.5.jar
commons-lang3.license
ext
guava-20.0.jar
guice-4.2.0-no_aop.jar
jansi-1.17.1.jar
jansi-native
javax.inject-1.jar
jcl-over-slf4j-1.7.25.jar
jcl-over-slf4j.license
jsr250-api-1.0.jar
jsr250-api.license
maven-artifact-3.5.4.jar
maven-artifact.license
maven-builder-support-3.5.4.jar
maven-builder-support.license
maven-compat-3.5.4.jar
maven-compat.license
maven-core-3.5.4.jar
maven-core.license
maven-embedder-3.5.4.jar
maven-embedder.license
maven-model-3.5.4.jar
maven-model-builder-3.5.4.jar
maven-model-builder.license
maven-model.license
maven-plugin-api-3.5.4.jar
maven-plugin-api.license
maven-repository-metadata-3.5.4.jar
maven-repository-metadata.license
maven-resolver-api-1.1.1.jar
maven-resolver-api.license
maven-resolver-connector-basic-1.1.1.jar
maven-resolver-connector-basic.license
maven-resolver-impl-1.1.1.jar
maven-resolver-impl.license
maven-resolver-provider-3.5.4.jar
maven-resolver-provider.license
maven-resolver-spi-1.1.1.jar
maven-resolver-spi.license
maven-resolver-transport-wagon-1.1.1.jar
maven-resolver-transport-wagon.license
maven-resolver-util-1.1.1.jar
maven-resolver-util.license
maven-settings-3.5.4.jar
maven-settings-builder-3.5.4.jar
maven-settings-builder.license
maven-settings.license
maven-shared-utils-3.2.1.jar
maven-shared-utils.license
maven-slf4j-provider-3.5.4.jar
maven-slf4j-provider.license
org.eclipse.sisu.inject-0.3.3.jar
org.eclipse.sisu.inject.license
org.eclipse.sisu.plexus-0.3.3.jar
org.eclipse.sisu.plexus.license
plexus-cipher-1.7.jar
plexus-cipher.license
plexus-component-annotations-1.7.1.jar
plexus-component-annotations.license
plexus-interpolation-1.24.jar
plexus-interpolation.license
plexus-sec-dispatcher-1.4.jar
plexus-sec-dispatcher.license
plexus-utils-3.1.0.jar
plexus-utils.license
slf4j-api-1.7.25.jar
slf4j-api.license
wagon-file-3.1.0.jar
wagon-file.license
wagon-http-3.1.0-shaded.jar
wagon-http.license
wagon-provider-api-3.1.0.jar
wagon-provider-api.license

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/ext:
README.txt

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native:
freebsd32
freebsd64
linux32
linux64
osx
README.txt
windows32
windows64

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/freebsd32:
libjansi.so

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/freebsd64:
libjansi.so

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/linux32:
libjansi.so

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/linux64:
libjansi.so

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/osx:
libjansi.jnilib

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/windows32:
jansi.dll

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/windows64:
jansi.dll
Finished /home/jenkins/tools/maven/apache-maven-3.5.4 Directory Listing :
Detected current version as: 
'HUDI_home=
0.5.1-SNAPSHOT'
[INFO] Scanning for projects...
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Hudi   [pom]
[INFO] hudi-common[jar]
[INFO] hudi-timeline-service  [jar]
[INFO] hudi-hadoop-mr [jar]
[INFO] hudi-client[jar]
[INFO] hudi-hive  [jar]
[INFO] hudi-spark [jar]
[INFO] hudi-utilities [jar]
[INFO] hudi-cli   [jar]
[INFO] hudi-hadoop-mr-bundle  [jar]
[INFO] hudi-hive-bundle   [jar]
[INFO] hudi-spark-bundle  [jar]
[INFO] hudi-presto-bundle [jar]
[INFO] hudi-utilities-bundle  [jar]
[INFO] hudi-timeline-server-bundle

[jira] [Updated] (HUDI-370) Refactor hudi-common based on new ImportOrder code style rule

2019-11-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-370:

Labels: pull-request-available  (was: )

> Refactor hudi-common based on new ImportOrder code style rule
> -
>
> Key: HUDI-370
> URL: https://issues.apache.org/jira/browse/HUDI-370
> Project: Apache Hudi (incubating)
>  Issue Type: Sub-task
>Reporter: lamber-ken
>Assignee: lamber-ken
>Priority: Major
>  Labels: pull-request-available
>
> Refactor hudi-common based on new ImportOrder code style rule



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-hudi] lamber-ken opened a new pull request #1063: [HUDI-370] Refactor hudi-common based on new ImportOrder code style rule

2019-11-30 Thread GitBox
lamber-ken opened a new pull request #1063: [HUDI-370] Refactor hudi-common 
based on new ImportOrder code style rule
URL: https://github.com/apache/incubator-hudi/pull/1063
 
 
   ## What is the purpose of the pull request
   
   Refactor hudi-common based on new ImportOrder code style rule
   
   ## Brief change log
   
 - Refactor hudi-common based on new ImportOrder code style rule.
   
   ## Verify this pull request
   
   This pull request is a code cleanup without any test coverage.
   
   ## Committer checklist
   
- [x] Has a corresponding JIRA in PR title & commit

- [x] Commit message is descriptive of the change

- [x] CI is green
   
- [x] Necessary doc changes done or have another open PR
  
- [x] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.




[jira] [Updated] (HUDI-294) Delete Paths written in Cleaner plan needs to be relative to partition-path

2019-11-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-294:

Labels: pull-request-available  (was: )

> Delete Paths written in Cleaner plan needs to be relative to partition-path
> ---
>
> Key: HUDI-294
> URL: https://issues.apache.org/jira/browse/HUDI-294
> Project: Apache Hudi (incubating)
>  Issue Type: Task
>  Components: Cleaner
>Reporter: Balaji Varadarajan
>Assignee: leesf
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.1
>
>
> The deleted file paths stored in the clean metadata are all absolute. They 
> need to be changed to relative paths.
> The challenge is handling cases where both versions of the cleaner metadata 
> are present and need to be processed (backwards compatibility).
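
As an illustration of the conversion described above, here is a minimal sketch 
that turns an absolute delete path into one relative to its partition path, 
while passing already-relative paths through unchanged so that old and new 
metadata can be handled by the same call. It uses only `java.net.URI` rather 
than Hudi or Hadoop classes; the class and method names are hypothetical, and 
it assumes paths contain no characters that are illegal in a URI.

```java
import java.net.URI;

class RelativeDeletePathSketch {

  /**
   * Returns deletePath relative to partitionPath. If deletePath is not located
   * under partitionPath (for example, because it is already relative), it is
   * returned unchanged.
   */
  static String toRelative(String partitionPath, String deletePath) {
    // A trailing slash makes the base an explicit directory prefix.
    String basePath = partitionPath.endsWith("/") ? partitionPath : partitionPath + "/";
    URI base = URI.create(basePath);
    URI target = URI.create(deletePath);
    // URI.relativize returns the target itself when it is not under the base.
    return base.relativize(target).toString();
  }
}
```

For example, with a partition path of `/tables/trips/2019/11/30`, the absolute 
path `/tables/trips/2019/11/30/f1.parquet` becomes `f1.parquet`, and an input of 
`f1.parquet` is returned as-is.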





[GitHub] [incubator-hudi] leesf opened a new pull request #1062: [HUDI-294] Delete Paths written in Cleaner plan needs to be relative to partition-path

2019-11-30 Thread GitBox
leesf opened a new pull request #1062: [HUDI-294] Delete Paths written in 
Cleaner plan needs to be relative to partition-path
URL: https://github.com/apache/incubator-hudi/pull/1062
 
 
   ## What is the purpose of the pull request
   
   Make the delete paths written in the cleaner plan relative to the partition path.
   
   ## Brief change log
   
   - Adding _CleanMetadataMigrator_  as the entrypoint to handle clean metadata.
   - Adding _CleanV1MigrationHandler_ to handle old version.
   - Adding _CleanV2MigrationHandler_ to handle new version.
   
   ## Verify this pull request
   
   This pull request is already covered by existing tests, such as *(please 
describe tests)*.
   
   - Adding _TestCleaner#testUpgradeDowngrade_.
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.
   




[GitHub] [incubator-hudi] lamber-ken commented on issue #1058: [Docs] Update Hudi Readme

2019-11-30 Thread GitBox
lamber-ken commented on issue #1058: [Docs] Update Hudi Readme
URL: https://github.com/apache/incubator-hudi/pull/1058#issuecomment-560033896
 
 
   Hi @vinothchandar, here is a quick overview of the latest version: 
https://github.com/BigDataArtisans/incubator-hudi/tree/hudi-readme#apache-hudi-incubating




[GitHub] [incubator-hudi] lamber-ken commented on a change in pull request #1058: [Docs] Update Hudi Readme

2019-11-30 Thread GitBox
lamber-ken commented on a change in pull request #1058: [Docs] Update Hudi 
Readme
URL: https://github.com/apache/incubator-hudi/pull/1058#discussion_r352312078
 
 

 ##
 File path: README.md
 ##
 @@ -31,21 +37,25 @@ Hudi manages the storage of large analytical datasets on 
DFS (Cloud stores, HDFS
 Hudi provides the ability to query via three types of views:
  * **Read Optimized View** - Provides excellent snapshot query performance via 
purely columnar storage (e.g. [Parquet](https://parquet.apache.org/))
  * **Incremental View** - Provides a change stream with records inserted or 
updated after a point in time.
- * **Real-time View** - Provides snapshot queries on real-time data, using a 
combination of columnar & row-based storage (e.g Parquet + 
[Avro](http://avro.apache.org/docs/current/mr.html))
+ * **Real-time View** - Provides snapshot queries on real-time data, using a 
combination of columnar & row-based storage (e.g 
[Parquet](https://parquet.apache.org/) + 
[Avro](http://avro.apache.org/docs/current/mr.html))
 
 Learn more about Hudi at [https://hudi.apache.org](https://hudi.apache.org)
 
-### Building Apache Hudi from source {#building-hudi}
+## Building Apache Hudi from source
 
 Review comment:
   > I prefer to leave this alone. Can you please revert? IMO `building-hudi` 
is simpler/shorter url
   
   no problem, done. 




[GitHub] [incubator-hudi] vinothchandar merged pull request #1051: [HUDI-357] Refactor hudi-cli based on new comment and code style rules

2019-11-30 Thread GitBox
vinothchandar merged pull request #1051: [HUDI-357] Refactor hudi-cli based on 
new comment and code style rules
URL: https://github.com/apache/incubator-hudi/pull/1051
 
 
   




[incubator-hudi] branch master updated: [HUDI-357] Refactor hudi-cli based on new comment and code style rules (#1051)

2019-11-30 Thread vinoth
This is an automated email from the ASF dual-hosted git repository.

vinoth pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 75132c1  [HUDI-357] Refactor hudi-cli based on new comment and code 
style rules (#1051)
75132c1 is described below

commit 75132c139f0faf9ef68bb461b3a551238a377455
Author: Gurudatt Kulkarni 
AuthorDate: Sun Dec 1 00:42:41 2019 +0530

[HUDI-357] Refactor hudi-cli based on new comment and code style rules 
(#1051)
---
 hudi-cli/src/main/java/org/apache/hudi/cli/HoodieCLI.java|  6 ++
 .../org/apache/hudi/cli/HoodieHistoryFileNameProvider.java   |  3 +++
 .../src/main/java/org/apache/hudi/cli/HoodiePrintHelper.java | 12 ++--
 hudi-cli/src/main/java/org/apache/hudi/cli/HoodiePrompt.java |  3 +++
 .../main/java/org/apache/hudi/cli/HoodieSplashScreen.java|  3 +++
 hudi-cli/src/main/java/org/apache/hudi/cli/Main.java |  6 +++---
 hudi-cli/src/main/java/org/apache/hudi/cli/Table.java| 10 +-
 hudi-cli/src/main/java/org/apache/hudi/cli/TableHeader.java  | 12 ++--
 .../org/apache/hudi/cli/commands/ArchivedCommitsCommand.java |  3 +++
 .../java/org/apache/hudi/cli/commands/CleansCommand.java |  3 +++
 .../java/org/apache/hudi/cli/commands/CommitsCommand.java|  3 +++
 .../java/org/apache/hudi/cli/commands/CompactionCommand.java |  3 +++
 .../java/org/apache/hudi/cli/commands/DatasetsCommand.java   |  7 +--
 .../org/apache/hudi/cli/commands/FileSystemViewCommand.java  |  5 -
 .../apache/hudi/cli/commands/HDFSParquetImportCommand.java   |  3 +++
 .../org/apache/hudi/cli/commands/HoodieLogFileCommand.java   |  3 +++
 .../java/org/apache/hudi/cli/commands/HoodieSyncCommand.java |  3 +++
 .../java/org/apache/hudi/cli/commands/RepairsCommand.java|  3 +++
 .../java/org/apache/hudi/cli/commands/RollbacksCommand.java  |  5 -
 .../java/org/apache/hudi/cli/commands/SavepointsCommand.java |  3 +++
 .../main/java/org/apache/hudi/cli/commands/SparkMain.java|  5 -
 .../main/java/org/apache/hudi/cli/commands/StatsCommand.java |  3 +++
 .../main/java/org/apache/hudi/cli/commands/UtilsCommand.java |  3 +++
 .../src/main/java/org/apache/hudi/cli/utils/CommitUtil.java  |  3 +++
 .../src/main/java/org/apache/hudi/cli/utils/HiveUtil.java|  3 +++
 .../java/org/apache/hudi/cli/utils/InputStreamConsumer.java  |  3 +++
 .../src/main/java/org/apache/hudi/cli/utils/SparkUtil.java   |  5 -
 27 files changed, 98 insertions(+), 26 deletions(-)

diff --git a/hudi-cli/src/main/java/org/apache/hudi/cli/HoodieCLI.java 
b/hudi-cli/src/main/java/org/apache/hudi/cli/HoodieCLI.java
index d2e6f99..0dafdc4 100644
--- a/hudi-cli/src/main/java/org/apache/hudi/cli/HoodieCLI.java
+++ b/hudi-cli/src/main/java/org/apache/hudi/cli/HoodieCLI.java
@@ -25,6 +25,9 @@ import org.apache.hudi.common.table.HoodieTableMetaClient;
 import org.apache.hudi.common.util.ConsistencyGuardConfig;
 import org.apache.hudi.common.util.FSUtils;
 
+/**
+ * This class is responsible to load table metadata and hoodie related configs.
+ */
 public class HoodieCLI {
 
   public static Configuration conf;
@@ -35,6 +38,9 @@ public class HoodieCLI {
   public static HoodieTableMetaClient tableMetadata;
   public static HoodieTableMetaClient syncTableMetadata;
 
+  /**
+   * Enum for CLI state.
+   */
   public enum CLIState {
 INIT, DATASET, SYNC
   }
diff --git 
a/hudi-cli/src/main/java/org/apache/hudi/cli/HoodieHistoryFileNameProvider.java 
b/hudi-cli/src/main/java/org/apache/hudi/cli/HoodieHistoryFileNameProvider.java
index eeb2ff2..af01f66 100644
--- 
a/hudi-cli/src/main/java/org/apache/hudi/cli/HoodieHistoryFileNameProvider.java
+++ 
b/hudi-cli/src/main/java/org/apache/hudi/cli/HoodieHistoryFileNameProvider.java
@@ -23,6 +23,9 @@ import org.springframework.core.annotation.Order;
 import org.springframework.shell.plugin.support.DefaultHistoryFileNameProvider;
 import org.springframework.stereotype.Component;
 
+/**
+ * CLI history file provider.
+ */
 @Component
 @Order(Ordered.HIGHEST_PRECEDENCE)
 public class HoodieHistoryFileNameProvider extends 
DefaultHistoryFileNameProvider {
diff --git a/hudi-cli/src/main/java/org/apache/hudi/cli/HoodiePrintHelper.java 
b/hudi-cli/src/main/java/org/apache/hudi/cli/HoodiePrintHelper.java
index 3cce301..0e48911 100644
--- a/hudi-cli/src/main/java/org/apache/hudi/cli/HoodiePrintHelper.java
+++ b/hudi-cli/src/main/java/org/apache/hudi/cli/HoodiePrintHelper.java
@@ -25,12 +25,12 @@ import java.util.function.Function;
 import org.apache.hudi.common.util.Option;
 
 /**
- * Helper class to render table for hoodie-cli
+ * Helper class to render table for hoodie-cli.
  */
 public class HoodiePrintHelper {
 
   /**
-   * Print header and raw rows
+   * Print header and raw rows.
*
* @param header Header
* @param rows Raw Rows
@@ -41,7 +41,7 @@ public class HoodiePrintHelper {
   }
 
   /**
-   * 

[GitHub] [incubator-hudi] vinothchandar merged pull request #1059: [HUDI-374] Unable to generateUpdates in QuickstartUtils

2019-11-30 Thread GitBox
vinothchandar merged pull request #1059: [HUDI-374] Unable to generateUpdates 
in QuickstartUtils
URL: https://github.com/apache/incubator-hudi/pull/1059
 
 
   




[incubator-hudi] branch master updated: [HUDI-374] Unable to generateUpdates in QuickstartUtils (#1059)

2019-11-30 Thread vinoth
This is an automated email from the ASF dual-hosted git repository.

vinoth pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new b65a897  [HUDI-374] Unable to generateUpdates in QuickstartUtils 
(#1059)
b65a897 is described below

commit b65a897856259e7872e39a9e3e68661926592d7b
Author: hongdd 
AuthorDate: Sun Dec 1 03:11:00 2019 +0800

[HUDI-374] Unable to generateUpdates in QuickstartUtils (#1059)
---
 hudi-spark/src/main/java/org/apache/hudi/QuickstartUtils.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hudi-spark/src/main/java/org/apache/hudi/QuickstartUtils.java 
b/hudi-spark/src/main/java/org/apache/hudi/QuickstartUtils.java
index d09716d..45e922f 100644
--- a/hudi-spark/src/main/java/org/apache/hudi/QuickstartUtils.java
+++ b/hudi-spark/src/main/java/org/apache/hudi/QuickstartUtils.java
@@ -162,7 +162,7 @@ public class QuickstartUtils {
   String randomString = generateRandomString();
   List updates = new ArrayList<>();
   for (int i = 0; i < n; i++) {
-HoodieKey key = existingKeys.get(rand.nextInt(numExistingKeys - 1));
+HoodieKey key = existingKeys.get(rand.nextInt(numExistingKeys));
 HoodieRecord record = generateUpdateRecord(key, randomString);
 updates.add(record);
   }
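
The one-line change above relies on `java.util.Random.nextInt(bound)` returning 
a value in `[0, bound)`, with `bound` excluded. A small sketch of the difference 
follows; the class and method names are hypothetical and not part of 
QuickstartUtils.

```java
import java.util.Random;

class RandomIndexSketch {

  /** Fixed form: every index in [0, size) can be returned, including the last one. */
  static int pickIndex(Random rand, int size) {
    return rand.nextInt(size);
  }

  /**
   * Old form: a bound of size - 1 never returns the last index, and for
   * size == 1 it calls nextInt(0), which throws IllegalArgumentException.
   */
  static int pickIndexOld(Random rand, int size) {
    return rand.nextInt(size - 1);
  }
}
```

With a single existing key, `pickIndexOld` fails because the bound passed to 
`nextInt` must be positive, whereas `pickIndex` returns index 0.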



[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1058: [Docs] Update Hudi Readme

2019-11-30 Thread GitBox
vinothchandar commented on a change in pull request #1058: [Docs] Update Hudi 
Readme
URL: https://github.com/apache/incubator-hudi/pull/1058#discussion_r352299506
 
 

 ##
 File path: README.md
 ##
 @@ -31,21 +37,25 @@ Hudi manages the storage of large analytical datasets on 
DFS (Cloud stores, HDFS
 Hudi provides the ability to query via three types of views:
  * **Read Optimized View** - Provides excellent snapshot query performance via 
purely columnar storage (e.g. [Parquet](https://parquet.apache.org/))
  * **Incremental View** - Provides a change stream with records inserted or 
updated after a point in time.
- * **Real-time View** - Provides snapshot queries on real-time data, using a 
combination of columnar & row-based storage (e.g Parquet + 
[Avro](http://avro.apache.org/docs/current/mr.html))
+ * **Real-time View** - Provides snapshot queries on real-time data, using a 
combination of columnar & row-based storage (e.g 
[Parquet](https://parquet.apache.org/) + 
[Avro](http://avro.apache.org/docs/current/mr.html))
 
 Learn more about Hudi at [https://hudi.apache.org](https://hudi.apache.org)
 
-### Building Apache Hudi from source {#building-hudi}
+## Building Apache Hudi from source
 
 Review comment:
   I prefer to leave this alone. Can you please revert? IMO `building-hudi` is 
simpler/shorter url




[GitHub] [incubator-hudi] lamber-ken commented on a change in pull request #1058: [Docs] Update Hudi Readme

2019-11-30 Thread GitBox
lamber-ken commented on a change in pull request #1058: [Docs] Update Hudi 
Readme
URL: https://github.com/apache/incubator-hudi/pull/1058#discussion_r352299187
 
 

 ##
 File path: README.md
 ##
 @@ -15,11 +15,17 @@
   limitations under the License.
 -->
 
-# Hudi
+# Apache Hudi (Incubating)
 Apache Hudi (Incubating) (pronounced Hoodie) stands for `Hadoop Upserts 
Deletes and Incrementals`. 
 Hudi manages the storage of large analytical datasets on DFS (Cloud stores, 
HDFS or any Hadoop FileSystem compatible storage).
 
-### Features
+
+
+[![Build 
Status](https://travis-ci.org/apache/incubator-hudi.svg?branch=master)](https://travis-ci.org/apache/incubator-hudi)
+[![GitHub 
release](https://img.shields.io/github/release/apache/incubator-hudi.svg)](https://github.com/apache/incubator-hudi/releases)
 
 Review comment:
   update




[GitHub] [incubator-hudi] lamber-ken commented on a change in pull request #1058: [Docs] Update Hudi Readme

2019-11-30 Thread GitBox
lamber-ken commented on a change in pull request #1058: [Docs] Update Hudi 
Readme
URL: https://github.com/apache/incubator-hudi/pull/1058#discussion_r352297615
 
 

 ##
 File path: README.md
 ##
 @@ -31,21 +37,25 @@ Hudi manages the storage of large analytical datasets on 
DFS (Cloud stores, HDFS
 Hudi provides the ability to query via three types of views:
  * **Read Optimized View** - Provides excellent snapshot query performance via 
purely columnar storage (e.g. [Parquet](https://parquet.apache.org/))
  * **Incremental View** - Provides a change stream with records inserted or 
updated after a point in time.
- * **Real-time View** - Provides snapshot queries on real-time data, using a 
combination of columnar & row-based storage (e.g Parquet + 
[Avro](http://avro.apache.org/docs/current/mr.html))
+ * **Real-time View** - Provides snapshot queries on real-time data, using a 
combination of columnar & row-based storage (e.g 
[Parquet](https://parquet.apache.org/) + 
[Avro](http://avro.apache.org/docs/current/mr.html))
 
 Learn more about Hudi at [https://hudi.apache.org](https://hudi.apache.org)
 
-### Building Apache Hudi from source {#building-hudi}
+## Building Apache Hudi from source
 
 Review comment:
   It looks strange if we keep it here. I'll update the link on the 
quickstart page, changing 
   
`https://github.com/apache/incubator-hudi#building-apache-hudi-from-source-building-hudi`
   to 
`https://github.com/apache/incubator-hudi#building-apache-hudi-from-source`. 
WDYT?




[GitHub] [incubator-hudi] lamber-ken commented on a change in pull request #1058: [Docs] Update Hudi Readme

2019-11-30 Thread GitBox
lamber-ken commented on a change in pull request #1058: [Docs] Update Hudi 
Readme
URL: https://github.com/apache/incubator-hudi/pull/1058#discussion_r352297249
 
 

 ##
 File path: README.md
 ##
 @@ -31,21 +37,25 @@ Hudi manages the storage of large analytical datasets on 
DFS (Cloud stores, HDFS
 Hudi provides the ability to query via three types of views:
  * **Read Optimized View** - Provides excellent snapshot query performance via 
purely columnar storage (e.g. [Parquet](https://parquet.apache.org/))
  * **Incremental View** - Provides a change stream with records inserted or 
updated after a point in time.
- * **Real-time View** - Provides snapshot queries on real-time data, using a 
combination of columnar & row-based storage (e.g Parquet + 
[Avro](http://avro.apache.org/docs/current/mr.html))
+ * **Real-time View** - Provides snapshot queries on real-time data, using a 
combination of columnar & row-based storage (e.g 
[Parquet](https://parquet.apache.org/) + 
[Avro](http://avro.apache.org/docs/current/mr.html))
 
 Learn more about Hudi at [https://hudi.apache.org](https://hudi.apache.org)
 
-### Building Apache Hudi from source {#building-hudi}
+## Building Apache Hudi from source
+
+Prerequisites for building Hudi:
 
 Review comment:
   right, done




[GitHub] [incubator-hudi] lamber-ken commented on a change in pull request #1058: [Docs] Update Hudi Readme

2019-11-30 Thread GitBox
lamber-ken commented on a change in pull request #1058: [Docs] Update Hudi 
Readme
URL: https://github.com/apache/incubator-hudi/pull/1058#discussion_r352297249
 
 

 ##
 File path: README.md
 ##
 @@ -31,21 +37,25 @@ Hudi manages the storage of large analytical datasets on 
DFS (Cloud stores, HDFS
 Hudi provides the ability to query via three types of views:
  * **Read Optimized View** - Provides excellent snapshot query performance via 
purely columnar storage (e.g. [Parquet](https://parquet.apache.org/))
  * **Incremental View** - Provides a change stream with records inserted or 
updated after a point in time.
- * **Real-time View** - Provides snapshot queries on real-time data, using a 
combination of columnar & row-based storage (e.g Parquet + 
[Avro](http://avro.apache.org/docs/current/mr.html))
+ * **Real-time View** - Provides snapshot queries on real-time data, using a 
combination of columnar & row-based storage (e.g 
[Parquet](https://parquet.apache.org/) + 
[Avro](http://avro.apache.org/docs/current/mr.html))
 
 Learn more about Hudi at [https://hudi.apache.org](https://hudi.apache.org)
 
-### Building Apache Hudi from source {#building-hudi}
+## Building Apache Hudi from source
+
+Prerequisites for building Hudi:
 
 Review comment:
   right




[GitHub] [incubator-hudi] lamber-ken commented on a change in pull request #1058: [Docs] Update Hudi Readme

2019-11-30 Thread GitBox
lamber-ken commented on a change in pull request #1058: [Docs] Update Hudi 
Readme
URL: https://github.com/apache/incubator-hudi/pull/1058#discussion_r352297207
 
 

 ##
 File path: README.md
 ##
 @@ -15,11 +15,17 @@
   limitations under the License.
 -->
 
-# Hudi
+# Apache Hudi (Incubating)
 Apache Hudi (Incubating) (pronounced Hoodie) stands for `Hadoop Upserts 
Deletes and Incrementals`. 
 Hudi manages the storage of large analytical datasets on DFS (Cloud stores, 
HDFS or any Hadoop FileSystem compatible storage).
 
-### Features
+
+
+[![Build 
Status](https://travis-ci.org/apache/incubator-hudi.svg?branch=master)](https://travis-ci.org/apache/incubator-hudi)
+[![GitHub 
release](https://img.shields.io/github/release/apache/incubator-hudi.svg)](https://github.com/apache/incubator-hudi/releases)
 
 Review comment:
   I based this on the 
[incubator-iotdb](https://github.com/apache/incubator-iotdb) project. If you 
are concerned about it, I will remove it.




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1058: [Docs] Update Hudi Readme

2019-11-30 Thread GitBox
vinothchandar commented on a change in pull request #1058: [Docs] Update Hudi 
Readme
URL: https://github.com/apache/incubator-hudi/pull/1058#discussion_r352296128
 
 

 ##
 File path: README.md
 ##
 @@ -15,11 +15,17 @@
   limitations under the License.
 -->
 
-# Hudi
+# Apache Hudi (Incubating)
 Apache Hudi (Incubating) (pronounced Hoodie) stands for `Hadoop Upserts 
Deletes and Incrementals`. 
 Hudi manages the storage of large analytical datasets on DFS (Cloud stores, 
HDFS or any Hadoop FileSystem compatible storage).
 
-### Features
+
+
+[![Build 
Status](https://travis-ci.org/apache/incubator-hudi.svg?branch=master)](https://travis-ci.org/apache/incubator-hudi)
+[![GitHub 
release](https://img.shields.io/github/release/apache/incubator-hudi.svg)](https://github.com/apache/incubator-hudi/releases)
 
 Review comment:
   This points to non-ASF releases. Can we point to the Apache release 
instead, or remove this?




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1058: [Docs] Update Hudi Readme

2019-11-30 Thread GitBox
vinothchandar commented on a change in pull request #1058: [Docs] Update Hudi 
Readme
URL: https://github.com/apache/incubator-hudi/pull/1058#discussion_r352296191
 
 

 ##
 File path: README.md
 ##
 @@ -31,21 +37,25 @@ Hudi manages the storage of large analytical datasets on 
DFS (Cloud stores, HDFS
 Hudi provides the ability to query via three types of views:
  * **Read Optimized View** - Provides excellent snapshot query performance via 
purely columnar storage (e.g. [Parquet](https://parquet.apache.org/))
  * **Incremental View** - Provides a change stream with records inserted or 
updated after a point in time.
- * **Real-time View** - Provides snapshot queries on real-time data, using a 
combination of columnar & row-based storage (e.g Parquet + 
[Avro](http://avro.apache.org/docs/current/mr.html))
+ * **Real-time View** - Provides snapshot queries on real-time data, using a 
combination of columnar & row-based storage (e.g 
[Parquet](https://parquet.apache.org/) + 
[Avro](http://avro.apache.org/docs/current/mr.html))
 
 Learn more about Hudi at [https://hudi.apache.org](https://hudi.apache.org)
 
-### Building Apache Hudi from source {#building-hudi}
+## Building Apache Hudi from source
+
+Prerequisites for building Hudi:
 
 Review comment:
   Why `Apache Hudi` above and just `Hudi` here? :) I think we should 
standardize on one.




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1058: [Docs] Update Hudi Readme

2019-11-30 Thread GitBox
vinothchandar commented on a change in pull request #1058: [Docs] Update Hudi 
Readme
URL: https://github.com/apache/incubator-hudi/pull/1058#discussion_r352296171
 
 

 ##
 File path: README.md
 ##
 @@ -31,21 +37,25 @@ Hudi manages the storage of large analytical datasets on 
DFS (Cloud stores, HDFS
 Hudi provides the ability to query via three types of views:
  * **Read Optimized View** - Provides excellent snapshot query performance via 
purely columnar storage (e.g. [Parquet](https://parquet.apache.org/))
  * **Incremental View** - Provides a change stream with records inserted or 
updated after a point in time.
- * **Real-time View** - Provides snapshot queries on real-time data, using a 
combination of columnar & row-based storage (e.g Parquet + 
[Avro](http://avro.apache.org/docs/current/mr.html))
+ * **Real-time View** - Provides snapshot queries on real-time data, using a 
combination of columnar & row-based storage (e.g 
[Parquet](https://parquet.apache.org/) + 
[Avro](http://avro.apache.org/docs/current/mr.html))
 
 Learn more about Hudi at [https://hudi.apache.org](https://hudi.apache.org)
 
-### Building Apache Hudi from source {#building-hudi}
+## Building Apache Hudi from source
 
 Review comment:
   This heading is linked from the quickstart page. Can we keep the 
`{#building-hudi}` anchor?




[jira] [Resolved] (HUDI-372) Support the shortName for Hudi DataSource

2019-11-30 Thread lamber-ken (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lamber-ken resolved HUDI-372.
-
Resolution: Resolved

> Support the shortName for Hudi DataSource
> -
>
> Key: HUDI-372
> URL: https://issues.apache.org/jira/browse/HUDI-372
> Project: Apache Hudi (incubating)
>  Issue Type: Bug
>Reporter: lamber-ken
>Assignee: lamber-ken
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Fix the shortName of the DataSource. After this issue is resolved, the 
> short name can be used like this:
> {code:java}
> spark.read.format("hudi")
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-hudi] lamber-ken commented on issue #1054: [HUDI-372] Support the shortName for Hudi DataSource

2019-11-30 Thread GitBox
lamber-ken commented on issue #1054: [HUDI-372] Support the shortName for Hudi 
DataSource
URL: https://github.com/apache/incubator-hudi/pull/1054#issuecomment-56912
 
 
   > Travis is flaky at times. We should try another service like CircleCI 
or Azure and separate flakiness from real failures. Anyway, separate 
topic :)
   
   I see.




[GitHub] [incubator-hudi] vinothchandar merged pull request #1054: [HUDI-372] Support the shortName for Hudi DataSource

2019-11-30 Thread GitBox
vinothchandar merged pull request #1054: [HUDI-372] Support the shortName for 
Hudi DataSource
URL: https://github.com/apache/incubator-hudi/pull/1054
 
 
   




[incubator-hudi] branch master updated: [HUDI-372] Support the shortName for Hudi DataSource (#1054)

2019-11-30 Thread vinoth
This is an automated email from the ASF dual-hosted git repository.

vinoth pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 024230f  [HUDI-372] Support the shortName for Hudi DataSource (#1054)
024230f is described below

commit 024230fbd23173db38e2c0f66808606481223003
Author: lamber-ken 
AuthorDate: Sun Dec 1 00:02:33 2019 +0800

[HUDI-372] Support the shortName for Hudi DataSource (#1054)

- Ability to do `spark.write.format("hudi")...`
---
 .../org.apache.spark.sql.sources.DataSourceRegister  | 20 
 .../main/scala/org/apache/hudi/DefaultSource.scala   |  2 +-
 hudi-spark/src/test/scala/TestDataSource.scala   | 13 +
 3 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/hudi-spark/src/main/resources/META-INF/services/org.apache.spark.sql.sources.DataSourceRegister b/hudi-spark/src/main/resources/META-INF/services/org.apache.spark.sql.sources.DataSourceRegister
new file mode 100644
index 000..ea82b80
--- /dev/null
+++ b/hudi-spark/src/main/resources/META-INF/services/org.apache.spark.sql.sources.DataSourceRegister
@@ -0,0 +1,20 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+
+org.apache.hudi.DefaultSource
\ No newline at end of file
diff --git a/hudi-spark/src/main/scala/org/apache/hudi/DefaultSource.scala b/hudi-spark/src/main/scala/org/apache/hudi/DefaultSource.scala
index 18f3dba..f50d90f 100644
--- a/hudi-spark/src/main/scala/org/apache/hudi/DefaultSource.scala
+++ b/hudi-spark/src/main/scala/org/apache/hudi/DefaultSource.scala
@@ -104,5 +104,5 @@ class DefaultSource extends RelationProvider
   outputMode)
   }
 
-  override def shortName(): String = "hoodie"
+  override def shortName(): String = "hudi"
 }
diff --git a/hudi-spark/src/test/scala/TestDataSource.scala b/hudi-spark/src/test/scala/TestDataSource.scala
index d7ea714..587c55a 100644
--- a/hudi-spark/src/test/scala/TestDataSource.scala
+++ b/hudi-spark/src/test/scala/TestDataSource.scala
@@ -63,6 +63,19 @@ class TestDataSource extends AssertionsForJUnit {
 fs = FSUtils.getFs(basePath, spark.sparkContext.hadoopConfiguration)
   }
 
+  @Test def testShortNameStorage() {
+    // Insert Operation
+    val records = DataSourceTestUtils.convertToStringList(dataGen.generateInserts("000", 100)).toList
+    val inputDF: Dataset[Row] = spark.read.json(spark.sparkContext.parallelize(records, 2))
+    inputDF.write.format("hudi")
+      .options(commonOpts)
+      .option(DataSourceWriteOptions.OPERATION_OPT_KEY, DataSourceWriteOptions.INSERT_OPERATION_OPT_VAL)
+      .mode(SaveMode.Overwrite)
+      .save(basePath)
+
+    assertTrue(HoodieDataSourceHelpers.hasNewCommits(fs, basePath, "000"))
+  }
+
   @Test def testCopyOnWriteStorage() {
     // Insert Operation
     val records1 = DataSourceTestUtils.convertToStringList(dataGen.generateInserts("000", 100)).toList
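
For readers wondering why the commit touches both `shortName()` and a `META-INF/services` file: Spark resolves a format string by scanning the registered `DataSourceRegister` implementations for a matching short name, and otherwise treats the string as a class name. The sketch below imitates that lookup in plain Scala without Spark on the classpath. `Register`, `HudiSource`, `ParquetSource`, and `resolve` are hypothetical stand-ins for illustration, not Spark or Hudi APIs, and the hard-coded provider list stands in for what `java.util.ServiceLoader` would discover from the services file.

```scala
// Hypothetical stand-in for Spark's DataSourceRegister trait.
trait Register { def shortName(): String }

// Hypothetical providers. In Spark these would be discovered through
// java.util.ServiceLoader from META-INF/services entries.
class HudiSource extends Register { override def shortName(): String = "hudi" }
class ParquetSource extends Register { override def shortName(): String = "parquet" }

object ShortNameLookup {
  // Assumption: a fixed list replaces ServiceLoader discovery in this sketch.
  val registered: Seq[Register] = Seq(new HudiSource, new ParquetSource)

  // Return the class name of the provider whose short name matches the
  // requested format (case-insensitively); if none matches, assume the
  // string is already a fully qualified class or package name.
  def resolve(format: String): String =
    registered
      .find(_.shortName().equalsIgnoreCase(format))
      .map(_.getClass.getName)
      .getOrElse(format)

  def main(args: Array[String]): Unit = {
    println(resolve("hudi"))            // matched by short name
    println(resolve("org.apache.hudi")) // no short-name match, passed through
  }
}
```

Under this model, a provider that is not listed in the services file is never in the scanned list, so its short name cannot match regardless of what `shortName()` returns.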



[GitHub] [incubator-hudi] vinothchandar commented on issue #1054: [HUDI-372] Support the shortName for Hudi DataSource

2019-11-30 Thread GitBox
vinothchandar commented on issue #1054: [HUDI-372] Support the shortName for 
Hudi DataSource
URL: https://github.com/apache/incubator-hudi/pull/1054#issuecomment-559987724
 
 
   Travis is flaky at times. We should try another service like CircleCI or 
Azure and separate flakiness from real failures. Anyway, separate topic :)




[GitHub] [incubator-hudi] vinothchandar commented on issue #991: Hudi Test Suite (Refactor)

2019-11-30 Thread GitBox
vinothchandar commented on issue #991: Hudi Test Suite (Refactor) 
URL: https://github.com/apache/incubator-hudi/pull/991#issuecomment-559986350
 
 
   I think @n3nash has pushed a new feature branch. Is this PR still 
relevant? I'll let @n3nash chime in, but if this PR is what we need, then 
resolving the outstanding issues and getting it into good shape is top 
priority IMO. Do you agree, @n3nash?

