[jira] [Created] (HUDI-356) Sync translation and code in quickstart.cn and admin_guide.cn pages

2019-11-20 Thread hong dongdong (Jira)
hong dongdong created HUDI-356:
--

 Summary: Sync translation and code in quickstart.cn and 
admin_guide.cn pages
 Key: HUDI-356
 URL: https://issues.apache.org/jira/browse/HUDI-356
 Project: Apache Hudi (incubating)
  Issue Type: Improvement
  Components: Docs
Reporter: hong dongdong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HUDI-355) Refactor hudi-common based on new comment and code style rules

2019-11-20 Thread vinoyang (Jira)
vinoyang created HUDI-355:
-

 Summary: Refactor hudi-common based on new comment and code style 
rules
 Key: HUDI-355
 URL: https://issues.apache.org/jira/browse/HUDI-355
 Project: Apache Hudi (incubating)
  Issue Type: Sub-task
Reporter: vinoyang


This issue is used to refactor the hudi-common module based on the new comment 
and code style rules.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HUDI-354) Introduce stricter comment and code style validation rules

2019-11-20 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-354:
--
Description: 
This is an umbrella issue to track applying stricter comment and code style 
validation rules across the whole project. The rules are listed below:
 # All public classes must have class-level comments;
 # All comments must end with a period ".";
 # In a class's import statements, clearly separate (by blank lines) Java SE 
imports from non-Java SE imports. At least two projects (Spark and Flink) 
already implement this rule, and Flink enforces stricter rules than Spark: 
imports are divided into several blocks from top to bottom (owner imports -> 
non-owner, non-Java SE imports -> Java SE imports -> static imports), and each 
block is sorted alphabetically;
 # Reconfirm that each method and its comment are consistent;

Each project sub-module maps to one subtask.

How to find all the violations?
 * Add the XML code snippet into {{PROJECT_ROOT/style/checkstyle.xml}} : 

{code:java}
<!-- Reconstructed: the original XML was stripped by the mail archive; this
     sketch is rebuilt from the fragments surviving in the quoted copies
     below, modeled on Flink's checkstyle import rules cited above. -->
<module name="UnusedImports"/>
<module name="RedundantImport">
  <message key="import.redundancy" value="Redundant import {0}."/>
</module>
<module name="ImportOrder">
  <property name="groups" value="org.apache.hudi,,javax,java"/>
  <property name="separated" value="true"/>
  <property name="sortStaticImportsAlphabetically" value="true"/>
  <property name="option" value="bottom"/>
  <message key="import.ordering"
           value="Import {0} appears after other imports that it should precede"/>
</module>
{code}
 * Make sure you have installed the CheckStyle-IDEA plugin and activated it 
for the project.
 * Scan the project module you want to refactor and fix all the issues one by 
one.
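
As a concrete illustration of rule 3, a conforming import section might look 
like this (a sketch with arbitrary example classes, not taken from this issue):
{code:java}
// Owner (project) imports first.
import org.apache.hudi.common.model.HoodieRecord;

// Non-owner, non-Java SE imports next.
import org.apache.hadoop.fs.Path;

// Java SE imports.
import java.io.IOException;
import java.util.List;

// Static imports last.
import static org.junit.Assert.assertEquals;

/** Example class; present only to make the snippet a complete compilation unit. */
public class ImportOrderExample {}
{code}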

 

  was:
This is an umbrella issue to track applying stricter comment and code style 
validation rules across the whole project. The rules are listed below:
 # All public classes must have class-level comments;
 # All comments must end with a period ".";
 # In a class's import statements, clearly separate (by blank lines) Java SE 
imports from non-Java SE imports. At least two projects (Spark and Flink) 
already implement this rule, and Flink enforces stricter rules than Spark: 
imports are divided into several blocks from top to bottom (owner imports -> 
non-owner, non-Java SE imports -> Java SE imports -> static imports), and each 
block is sorted alphabetically;
 # Reconfirm that each method and its comment are consistent;

Each project sub-module maps to one subtask.

How to find all the violations?

Add the XML code snippet into {{PROJECT_ROOT/style/checkstyle.xml}} :

 

 
{code:java}
<!-- Reconstructed: the original XML was stripped by the mail archive; this
     sketch is rebuilt from the fragments surviving in the quoted copies
     below, modeled on Flink's checkstyle import rules cited above. -->
<module name="UnusedImports"/>
<module name="RedundantImport">
  <message key="import.redundancy" value="Redundant import {0}."/>
</module>
<module name="ImportOrder">
  <property name="groups" value="org.apache.hudi,,javax,java"/>
  <property name="separated" value="true"/>
  <property name="sortStaticImportsAlphabetically" value="true"/>
  <property name="option" value="bottom"/>
  <message key="import.ordering"
           value="Import {0} appears after other imports that it should precede"/>
</module>
{code}
 

Make sure you have installed the CheckStyle-IDEA plugin and activated it for 
the project.

 

Scan the project module you want to refactor and fix all the issues one by one.

 


> Introduce stricter comment and code style validation rules
> --
>
> Key: HUDI-354
> URL: https://issues.apache.org/jira/browse/HUDI-354
> Project: Apache Hudi (incubating)
>  Issue Type: Task
>Reporter: vinoyang
>Priority: Major
>
> This is an umbrella issue to track applying stricter comment and code 
> style validation rules across the whole project. The rules are listed below:
>  # All public classes must have class-level comments;
>  # All comments must end with a period ".";
>  # In a class's import statements, clearly separate (by blank lines) Java SE 
> imports from non-Java SE imports. At least two projects (Spark and Flink) 
> already implement this rule, and Flink enforces stricter rules than Spark: 
> imports are divided into several blocks from top to bottom (owner imports -> 
> non-owner, non-Java SE imports -> Java SE imports -> static imports), and 
> each block is sorted alphabetically;
>  # Reconfirm that each method and its comment are consistent;
> Each project sub-module maps to one subtask.
> How to find all the violations?
>  * Add the XML code snippet into {{PROJECT_ROOT/style/checkstyle.xml}} : 
> {code:java}
> <!-- Reconstructed: the original XML was stripped by the mail archive;
>      this sketch is modeled on Flink's checkstyle import rules cited above. -->
> <module name="UnusedImports"/>
> <module name="RedundantImport">
>   <message key="import.redundancy" value="Redundant import {0}."/>
> </module>
> <module name="ImportOrder">
>   <property name="groups" value="org.apache.hudi,,javax,java"/>
>   <property name="separated" value="true"/>
>   <property name="sortStaticImportsAlphabetically" value="true"/>
>   <property name="option" value="bottom"/>
>   <message key="import.ordering"
>            value="Import {0} appears after other imports that it should precede"/>
> </module>
> {code}
>  * Make sure you have installed the CheckStyle-IDEA plugin and activated it 
> for the project.
>  * Scan the project module you want to refactor and fix all the issues one 
> by one.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HUDI-354) Introduce stricter comment and code style validation rules

2019-11-20 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-354:
--
Description: 
This is an umbrella issue to track applying stricter comment and code style 
validation rules across the whole project. The rules are listed below:
 # All public classes must have class-level comments;
 # All comments must end with a period ".";
 # In a class's import statements, clearly separate (by blank lines) Java SE 
imports from non-Java SE imports. At least two projects (Spark and Flink) 
already implement this rule, and Flink enforces stricter rules than Spark: 
imports are divided into several blocks from top to bottom (owner imports -> 
non-owner, non-Java SE imports -> Java SE imports -> static imports), and each 
block is sorted alphabetically;
 # Reconfirm that each method and its comment are consistent;

Each project sub-module maps to one subtask.

How to find all the violations?

Add the XML code snippet into {{PROJECT_ROOT/style/checkstyle.xml}} :

 

 
{code:java}
<!-- Reconstructed: the original XML was stripped by the mail archive; this
     sketch is rebuilt from the fragments surviving in the quoted copies
     below, modeled on Flink's checkstyle import rules cited above. -->
<module name="UnusedImports"/>
<module name="RedundantImport">
  <message key="import.redundancy" value="Redundant import {0}."/>
</module>
<module name="ImportOrder">
  <property name="groups" value="org.apache.hudi,,javax,java"/>
  <property name="separated" value="true"/>
  <property name="sortStaticImportsAlphabetically" value="true"/>
  <property name="option" value="bottom"/>
  <message key="import.ordering"
           value="Import {0} appears after other imports that it should precede"/>
</module>
{code}
 

Make sure you have installed the CheckStyle-IDEA plugin and activated it for 
the project.

 

Scan the project module you want to refactor and fix all the issues one by one.

 

  was:
This is an umbrella issue to track applying stricter comment and code style 
validation rules across the whole project. The rules are listed below:

# All public classes must have class-level comments;
# All comments must end with a period ".";
# In a class's import statements, clearly separate (by blank lines) Java SE 
imports from non-Java SE imports. At least two projects (Spark and Flink) 
already implement this rule, and Flink enforces stricter rules than Spark: 
imports are divided into several blocks from top to bottom (owner imports -> 
non-owner, non-Java SE imports -> Java SE imports -> static imports), and each 
block is sorted alphabetically;
# Reconfirm that each method and its comment are consistent;

Each project sub-module maps to one subtask.

How to find all the violations?

TBD.


> Introduce stricter comment and code style validation rules
> --
>
> Key: HUDI-354
> URL: https://issues.apache.org/jira/browse/HUDI-354
> Project: Apache Hudi (incubating)
>  Issue Type: Task
>Reporter: vinoyang
>Priority: Major
>
> This is an umbrella issue to track applying stricter comment and code 
> style validation rules across the whole project. The rules are listed below:
>  # All public classes must have class-level comments;
>  # All comments must end with a period ".";
>  # In a class's import statements, clearly separate (by blank lines) Java SE 
> imports from non-Java SE imports. At least two projects (Spark and Flink) 
> already implement this rule, and Flink enforces stricter rules than Spark: 
> imports are divided into several blocks from top to bottom (owner imports -> 
> non-owner, non-Java SE imports -> Java SE imports -> static imports), and 
> each block is sorted alphabetically;
>  # Reconfirm that each method and its comment are consistent;
> Each project sub-module maps to one subtask.
> How to find all the violations?
> Add the XML code snippet into {{PROJECT_ROOT/style/checkstyle.xml}} :
>  
>  
> {code:java}
> <!-- Reconstructed: the original XML was stripped by the mail archive;
>      this sketch is modeled on Flink's checkstyle import rules cited above. -->
> <module name="UnusedImports"/>
> <module name="RedundantImport">
>   <message key="import.redundancy" value="Redundant import {0}."/>
> </module>
> <module name="ImportOrder">
>   <property name="groups" value="org.apache.hudi,,javax,java"/>
>   <property name="separated" value="true"/>
>   <property name="sortStaticImportsAlphabetically" value="true"/>
>   <property name="option" value="bottom"/>
>   <message key="import.ordering"
>            value="Import {0} appears after other imports that it should precede"/>
> </module>
> {code}
>  
> Make sure you have installed the CheckStyle-IDEA plugin and activated it for 
> the project.
>  
> Scan the project module you want to refactor and fix all the issues one by 
> one.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-hudi] hddong commented on a change in pull request #1024: [HUDI-345] Fix used deprecated function

2019-11-20 Thread GitBox
hddong commented on a change in pull request #1024: [HUDI-345] Fix used 
deprecated function
URL: https://github.com/apache/incubator-hudi/pull/1024#discussion_r348926966
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/io/storage/SizeAwareFSDataOutputStream.java
 ##
 @@ -43,7 +43,7 @@
 
   public SizeAwareFSDataOutputStream(Path path, FSDataOutputStream out, 
ConsistencyGuard consistencyGuard,
   Runnable closeCallback) throws IOException {
-super(out);
+super(out, null);
 
 Review comment:
   > is it okay to pass null here? this is the stats object right?
   
   Actually, `super(out)` is equivalent to `super(out, null)`. Source code 
below:
   ```
 @Deprecated
 public FSDataOutputStream(OutputStream out) throws IOException {
   this(out, null);
 }
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] hddong commented on issue #1024: [HUDI-345] Fix used deprecated function

2019-11-20 Thread GitBox
hddong commented on issue #1024: [HUDI-345] Fix used deprecated function
URL: https://github.com/apache/incubator-hudi/pull/1024#issuecomment-556953051
 
 
   > > Yes, we created a wechat group below.
   
   Thanks.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1034: Datasource Writer throws error on resolving struct fields

2019-11-20 Thread GitBox
lamber-ken edited a comment on issue #1034: Datasource Writer throws error on 
resolving struct fields
URL: https://github.com/apache/incubator-hudi/issues/1034#issuecomment-556906589
 
 
   hi @alphairys, according to your description, I cannot reproduce the error. 
My steps are below.
   ```
   export SPARK_HOME=/work/BigData/install/spark/spark-2.3.3-bin-hadoop2.6
   ${SPARK_HOME}/bin/spark-shell --packages 
org.apache.hudi:hudi-spark-bundle:0.5.0-incubating --conf 
'spark.serializer=org.apache.spark.serializer.KryoSerializer'
   
   import org.apache.hudi.QuickstartUtils._
   import scala.collection.JavaConversions._
   import org.apache.spark.sql.SaveMode._
   import org.apache.hudi.DataSourceReadOptions._
   import org.apache.hudi.DataSourceWriteOptions._
   import org.apache.hudi.config.HoodieWriteConfig._
   
   val tableName = "tmp"
   val basePath = "file:///tmp/tmp"
   
   var datas = List("{ \"deviceId\": \"a\", \"eventTimeMilli\": 
1574297893836, \"location\": { \"latitude\": 2.5, \"longitude\": 3.5 }}");
   val df = spark.read.json(spark.sparkContext.parallelize(datas, 2))
   df.write.format("org.apache.hudi").
   option(RECORDKEY_FIELD_OPT_KEY, "deviceId").
   option(PRECOMBINE_FIELD_OPT_KEY, "eventTimeMilli").
   option(TABLE_NAME, tableName).
   mode(Append).
   save(basePath);
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1034: Datasource Writer throws error on resolving struct fields

2019-11-20 Thread GitBox
lamber-ken edited a comment on issue #1034: Datasource Writer throws error on 
resolving struct fields
URL: https://github.com/apache/incubator-hudi/issues/1034#issuecomment-556906589
 
 
   hi @alphairys, according to your description, I cannot reproduce the error. 
My steps are below.
   ```
   export SPARK_HOME=/work/BigData/install/spark/spark-2.3.3-bin-hadoop2.6
   ${SPARK_HOME}/bin/spark-shell --packages 
org.apache.hudi:hudi-spark-bundle:0.5.0-incubating --conf 
'spark.serializer=org.apache.spark.serializer.KryoSerializer'
   
   import org.apache.hudi.QuickstartUtils._
   import scala.collection.JavaConversions._
   import org.apache.spark.sql.SaveMode._
   import org.apache.hudi.DataSourceReadOptions._
   import org.apache.hudi.DataSourceWriteOptions._
   import org.apache.hudi.config.HoodieWriteConfig._
   
   val tableName = "tmp"
   val basePath = "file:///tmp/tmp"
   
   var datas = List("{ \"deviceId\": \"a\", \"eventTimeMilli\": 
1574297893836, \"location\": { \"latitude\": 2.5, \"longitude\": 3.5 }}");
   val df = spark.read.json(spark.sparkContext.parallelize(datas, 2))
   df.write.format("org.apache.hudi").
   option(RECORDKEY_FIELD_OPT_KEY, "deviceId").
   option(PRECOMBINE_FIELD_OPT_KEY, "eventTimeMilli").
   option(TABLE_NAME, tableName).
   option(PARTITIONPATH_FIELD_OPT_KEY, "deviceId").
   mode(Append).
   save(basePath);
   ```
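
   To double-check that the nested `location` struct survives the round trip, 
you could read the table back in the same session (a sketch, assuming the 
single-level `deviceId` partitioning used above, so the path glob is two 
levels deep):
   ```
   // Read the Hudi table back and inspect the nested struct fields.
   val readDF = spark.read.format("org.apache.hudi").load(basePath + "/*/*")
   readDF.printSchema()
   readDF.select("deviceId", "location.latitude", "location.longitude").show()
   ```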


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] lamber-ken commented on issue #1034: Datasource Writer throws error on resolving struct fields

2019-11-20 Thread GitBox
lamber-ken commented on issue #1034: Datasource Writer throws error on 
resolving struct fields
URL: https://github.com/apache/incubator-hudi/issues/1034#issuecomment-556906589
 
 
   hi @alphairys, according to your description, I cannot reproduce the error. 
My steps are below.
   ```
   
   export SPARK_HOME=/work/BigData/install/spark/spark-2.3.3-bin-hadoop2.6
   ${SPARK_HOME}/bin/spark-shell --packages 
org.apache.hudi:hudi-spark-bundle:0.5.0-incubating --conf 
'spark.serializer=org.apache.spark.serializer.KryoSerializer'
   
   import org.apache.hudi.QuickstartUtils._
   import scala.collection.JavaConversions._
   import org.apache.spark.sql.SaveMode._
   import org.apache.hudi.DataSourceReadOptions._
   import org.apache.hudi.DataSourceWriteOptions._
   import org.apache.hudi.config.HoodieWriteConfig._
   
   val tableName = "tmp"
   val basePath = "file:///tmp/tmp"
   
   
   var datas = List("{ \"deviceId\": \"a\", \"eventTimeMilli\": 
1574297893836, \"location\": { \"latitude\": 2.5, \"longitude\": 3.5 }}");
   val df = spark.read.json(spark.sparkContext.parallelize(datas, 2))
   df.write.format("org.apache.hudi").
   option(RECORDKEY_FIELD_OPT_KEY, "deviceId").
   option(PRECOMBINE_FIELD_OPT_KEY, "eventTimeMilli").
   option(TABLE_NAME, tableName).
   option(PARTITIONPATH_FIELD_OPT_KEY, "deviceId").
   mode(Append).
   save(basePath);
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


Build failed in Jenkins: hudi-snapshot-deployment-0.5 #105

2019-11-20 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 2.21 KB...]
/home/jenkins/tools/maven/apache-maven-3.5.4/bin:
m2.conf
mvn
mvn.cmd
mvnDebug
mvnDebug.cmd
mvnyjp

/home/jenkins/tools/maven/apache-maven-3.5.4/boot:
plexus-classworlds-2.5.2.jar

/home/jenkins/tools/maven/apache-maven-3.5.4/conf:
logging
settings.xml
toolchains.xml

/home/jenkins/tools/maven/apache-maven-3.5.4/conf/logging:
simplelogger.properties

/home/jenkins/tools/maven/apache-maven-3.5.4/lib:
aopalliance-1.0.jar
cdi-api-1.0.jar
cdi-api.license
commons-cli-1.4.jar
commons-cli.license
commons-io-2.5.jar
commons-io.license
commons-lang3-3.5.jar
commons-lang3.license
ext
guava-20.0.jar
guice-4.2.0-no_aop.jar
jansi-1.17.1.jar
jansi-native
javax.inject-1.jar
jcl-over-slf4j-1.7.25.jar
jcl-over-slf4j.license
jsr250-api-1.0.jar
jsr250-api.license
maven-artifact-3.5.4.jar
maven-artifact.license
maven-builder-support-3.5.4.jar
maven-builder-support.license
maven-compat-3.5.4.jar
maven-compat.license
maven-core-3.5.4.jar
maven-core.license
maven-embedder-3.5.4.jar
maven-embedder.license
maven-model-3.5.4.jar
maven-model-builder-3.5.4.jar
maven-model-builder.license
maven-model.license
maven-plugin-api-3.5.4.jar
maven-plugin-api.license
maven-repository-metadata-3.5.4.jar
maven-repository-metadata.license
maven-resolver-api-1.1.1.jar
maven-resolver-api.license
maven-resolver-connector-basic-1.1.1.jar
maven-resolver-connector-basic.license
maven-resolver-impl-1.1.1.jar
maven-resolver-impl.license
maven-resolver-provider-3.5.4.jar
maven-resolver-provider.license
maven-resolver-spi-1.1.1.jar
maven-resolver-spi.license
maven-resolver-transport-wagon-1.1.1.jar
maven-resolver-transport-wagon.license
maven-resolver-util-1.1.1.jar
maven-resolver-util.license
maven-settings-3.5.4.jar
maven-settings-builder-3.5.4.jar
maven-settings-builder.license
maven-settings.license
maven-shared-utils-3.2.1.jar
maven-shared-utils.license
maven-slf4j-provider-3.5.4.jar
maven-slf4j-provider.license
org.eclipse.sisu.inject-0.3.3.jar
org.eclipse.sisu.inject.license
org.eclipse.sisu.plexus-0.3.3.jar
org.eclipse.sisu.plexus.license
plexus-cipher-1.7.jar
plexus-cipher.license
plexus-component-annotations-1.7.1.jar
plexus-component-annotations.license
plexus-interpolation-1.24.jar
plexus-interpolation.license
plexus-sec-dispatcher-1.4.jar
plexus-sec-dispatcher.license
plexus-utils-3.1.0.jar
plexus-utils.license
slf4j-api-1.7.25.jar
slf4j-api.license
wagon-file-3.1.0.jar
wagon-file.license
wagon-http-3.1.0-shaded.jar
wagon-http.license
wagon-provider-api-3.1.0.jar
wagon-provider-api.license

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/ext:
README.txt

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native:
freebsd32
freebsd64
linux32
linux64
osx
README.txt
windows32
windows64

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/freebsd32:
libjansi.so

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/freebsd64:
libjansi.so

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/linux32:
libjansi.so

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/linux64:
libjansi.so

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/osx:
libjansi.jnilib

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/windows32:
jansi.dll

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/windows64:
jansi.dll
Finished /home/jenkins/tools/maven/apache-maven-3.5.4 Directory Listing :
Detected current version as: 
'HUDI_home=
0.5.1-SNAPSHOT'
[INFO] Scanning for projects...
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Hudi   [pom]
[INFO] hudi-common[jar]
[INFO] hudi-timeline-service  [jar]
[INFO] hudi-hadoop-mr [jar]
[INFO] hudi-client[jar]
[INFO] hudi-hive  [jar]
[INFO] hudi-spark [jar]
[INFO] hudi-utilities [jar]
[INFO] hudi-cli   [jar]
[INFO] hudi-hadoop-mr-bundle  [jar]
[INFO] hudi-hive-bundle   [jar]
[INFO] hudi-spark-bundle  [jar]
[INFO] hudi-presto-bundle [jar]
[INFO] hudi-utilities-bundle  [jar]
[INFO] hudi-timeline-server-bundle

[jira] [Updated] (HUDI-354) Introduce stricter comment and code style validation rules

2019-11-20 Thread vinoyang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated HUDI-354:
--
Description: 
This is an umbrella issue to track applying stricter comment and code style 
validation rules across the whole project. The rules are listed below:

# All public classes must have class-level comments;
# All comments must end with a period ".";
# In a class's import statements, clearly separate (by blank lines) Java SE 
imports from non-Java SE imports. At least two projects (Spark and Flink) 
already implement this rule, and Flink enforces stricter rules than Spark: 
imports are divided into several blocks from top to bottom (owner imports -> 
non-owner, non-Java SE imports -> Java SE imports -> static imports), and each 
block is sorted alphabetically;
# Reconfirm that each method and its comment are consistent;

Each project sub-module maps to one subtask.

How to find all the violations?

TBD.

  was:
This is an umbrella issue to track applying stricter comment and code style 
validation rules across the whole project. The rules are listed below:

1) All public classes must have class-level comments;
2) All comments must end with a period ".";
3) In a class's import statements, clearly separate (by blank lines) Java SE 
imports from non-Java SE imports. At least two projects (Spark and Flink) 
already implement this rule, and Flink enforces stricter rules than Spark: 
imports are divided into several blocks from top to bottom (owner imports -> 
non-owner, non-Java SE imports -> Java SE imports -> static imports), and each 
block is sorted alphabetically;
4) Reconfirm that each method and its comment are consistent;

Each project sub-module maps to one subtask.

How to find all the violations?

TBD.


> Introduce stricter comment and code style validation rules
> --
>
> Key: HUDI-354
> URL: https://issues.apache.org/jira/browse/HUDI-354
> Project: Apache Hudi (incubating)
>  Issue Type: Task
>Reporter: vinoyang
>Priority: Major
>
> This is an umbrella issue to track applying stricter comment and code 
> style validation rules across the whole project. The rules are listed below:
> # All public classes must have class-level comments;
> # All comments must end with a period ".";
> # In a class's import statements, clearly separate (by blank lines) Java SE 
> imports from non-Java SE imports. At least two projects (Spark and Flink) 
> already implement this rule, and Flink enforces stricter rules than Spark: 
> imports are divided into several blocks from top to bottom (owner imports -> 
> non-owner, non-Java SE imports -> Java SE imports -> static imports), and 
> each block is sorted alphabetically;
> # Reconfirm that each method and its comment are consistent;
> Each project sub-module maps to one subtask.
> How to find all the violations?
> TBD.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HUDI-354) Introduce stricter comment and code style validation rules

2019-11-20 Thread vinoyang (Jira)
vinoyang created HUDI-354:
-

 Summary: Introduce stricter comment and code style validation rules
 Key: HUDI-354
 URL: https://issues.apache.org/jira/browse/HUDI-354
 Project: Apache Hudi (incubating)
  Issue Type: Task
Reporter: vinoyang


This is an umbrella issue to track applying stricter comment and code style 
validation rules across the whole project. The rules are listed below:

1) All public classes must have class-level comments;
2) All comments must end with a period ".";
3) In a class's import statements, clearly separate (by blank lines) Java SE 
imports from non-Java SE imports. At least two projects (Spark and Flink) 
already implement this rule, and Flink enforces stricter rules than Spark: 
imports are divided into several blocks from top to bottom (owner imports -> 
non-owner, non-Java SE imports -> Java SE imports -> static imports), and each 
block is sorted alphabetically;
4) Reconfirm that each method and its comment are consistent;

Each project sub-module maps to one subtask.

How to find all the violations?

TBD.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-hudi] leesf merged pull request #1033: [doc][chinese] Translate the release page into chinese documentation

2019-11-20 Thread GitBox
leesf merged pull request #1033: [doc][chinese] Translate the release page into 
chinese documentation
URL: https://github.com/apache/incubator-hudi/pull/1033
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[incubator-hudi] branch asf-site updated: [doc][chinese] Translate the release page into chinese documentation (#1033)

2019-11-20 Thread leesf
This is an automated email from the ASF dual-hosted git repository.

leesf pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


View the commit online:
https://github.com/apache/incubator-hudi/commit/55b990d935ea36932797997b2e74bf1fd152cf0e

The following commit(s) were added to refs/heads/asf-site by this push:
 new 55b990d  [doc][chinese] Translate the release page into chinese 
documentation (#1033)
55b990d is described below

commit 55b990d935ea36932797997b2e74bf1fd152cf0e
Author: 谢磊 
AuthorDate: Thu Nov 21 09:19:00 2019 +0800

[doc][chinese] Translate the release page into chinese documentation (#1033)
---
 docs/_data/strings_cn.yml  |  5 +
 docs/_data/topnav_cn.yml   |  4 ++--
 docs/_includes/topnav.html |  6 ++
 docs/quickstart.md |  2 +-
 docs/releases.cn.md| 32 
 5 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/docs/_data/strings_cn.yml b/docs/_data/strings_cn.yml
new file mode 100644
index 000..73c119e
--- /dev/null
+++ b/docs/_data/strings_cn.yml
@@ -0,0 +1,5 @@
+
+
+# placed here for translation purposes
+search_placeholder_text: 搜索...
+search_no_results_text: 未搜索到结果
diff --git a/docs/_data/topnav_cn.yml b/docs/_data/topnav_cn.yml
index 0e3c72e..ed8b0cd 100644
--- a/docs/_data/topnav_cn.yml
+++ b/docs/_data/topnav_cn.yml
@@ -3,8 +3,8 @@
 topnav:
 - title: Topnav
   items:
-- title: 新闻
-  url: /news
+- title: 发布
+  url: /releases.html
 - title: 社区
   url: /community.html
 - title: 代码
diff --git a/docs/_includes/topnav.html b/docs/_includes/topnav.html
index 6109b36..79c12cd 100644
--- a/docs/_includes/topnav.html
+++ b/docs/_includes/topnav.html
@@ -71,7 +71,13 @@
 
 
 
+
+{% if page.is_default_language %}
 
+{% else %}
+
+{% endif %}
+
 
 
 
diff --git a/docs/quickstart.md b/docs/quickstart.md
index fa04f21..9cac934 100644
--- a/docs/quickstart.md
+++ b/docs/quickstart.md
@@ -159,7 +159,7 @@ instead of `--packages 
org.apache.hudi:hudi-spark-bundle:0.5.0-incubating`
 
 Also, we used Spark here to show case the capabilities of Hudi. However, Hudi 
can support multiple storage types/views and 
 Hudi datasets can be queried from query engines like Hive, Spark, Presto and 
much more. We have put together a 
-[demo video](https://www.youtube.com/watch?v=VhNgUsxdrD0) that showcases all 
of this on a docker based setup with all 
+[demo video](https://www.youtube.com/watch?v=VhNgUsxdrD0) that show cases all 
of this on a docker based setup with all 
 dependent systems running locally. We recommend you replicate the same setup 
and run the demo yourself, by following 
 steps [here](docker_demo.html) to get a taste for it. Also, if you are looking 
for ways to migrate your existing data 
 to Hudi, refer to [migration guide](migration_guide.html). 
diff --git a/docs/releases.cn.md b/docs/releases.cn.md
new file mode 100644
index 000..05d0fbc
--- /dev/null
+++ b/docs/releases.cn.md
@@ -0,0 +1,32 @@
+---
+title: 发布
+keywords: apache, hudi, release, data lake, upsert
+sidebar: home_sidebar
+permalink: releases.html
+toc: true
+---
+
+
+## 最新稳定版本
+  * 稳定版本 : `0.5.0-incubating`
+
+## 0.5.0-incubating 版本
+
+### 下载
+  * 源码包 : [Apache Hudi(incubating) 0.5.0-incubating Source 
Release](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz)
 
([asc](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.asc),
 
[sha512](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.sha512))
+  * [这里](https://repository.apache.org/#nexus-search;quick~hudi)提供与此版本对应的 
Apache Hudi (incubating) JAR 包。
+
+### 发布要点
+  * Package and format renaming from com.uber.hoodie to org.apache.hudi (See 
migration guide section below)
+  * Major redo of Hudi bundles to address class and jar version mismatches in 
different environments
+  * Upgrade from Hive 1.x to Hive 2.x for compile time dependencies - Hive 1.x 
runtime integration still works with a patch : See [the discussion 
thread](https://lists.apache.org/thread.html/48b3f0553f47c576fd7072f56bb0d8a24fb47d4003880d179c7f88a3@%3Cdev.hudi.apache.org%3E)
+  * DeltaStreamer now supports continuous running mode with managed concurrent 
compaction
+  * Support for Composite Keys as record key
+  * HoodieCombinedInputFormat to scale huge hive queries running on Hoodie 
tables
+
+### 此版本的迁移指南
+  这是 Apache Hudi (incubating) 的第一次发布。 在此版本之前,Hudi Jars 使用 "com.uber.hoodie" 
maven co-ordinates 
来发布。这里有[迁移指南](https://cwiki.apache.org/confluence/display/HUDI/Migration+Guide+From+com.uber.hoodie+to+org.apache.hudi)。
+
+### 原始发布记录
+  

[GitHub] [incubator-hudi] lamber-ken commented on a change in pull request #1033: [doc][chinese] Translate the release page into chinese documentation

2019-11-20 Thread GitBox
lamber-ken commented on a change in pull request #1033: [doc][chinese] 
Translate the release page into chinese documentation
URL: https://github.com/apache/incubator-hudi/pull/1033#discussion_r348831085
 
 

 ##
 File path: docs/releases.cn.md
 ##
 @@ -0,0 +1,32 @@
+---
+title: 发布
+keywords: apache, hudi, release, data lake, upsert
+sidebar: home_sidebar
+permalink: releases.html
+toc: true
+---
+
+
+## 最新稳定版本
+  * 稳定版本 : `0.5.0-incubating`
+
+## 0.5.0-incubating 版本
+
+### 下载
+  * 源码包 : [Apache Hudi(incubating) 0.5.0-incubating Source 
Release](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz)
 
([asc](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.asc),
 
[sha512](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.sha512))
+  * 这里提供与此版本对应的 Apache Hudi (incubating) JAR 
包,[here](https://repository.apache.org/#nexus-search;quick~hudi)
+
+### 发布要点
+  * Package and format renaming from com.uber.hoodie to org.apache.hudi (See 
migration guide section below)
+  * Major redo of Hudi bundles to address class and jar version mismatches in 
different environments
+  * Upgrade from Hive 1.x to Hive 2.x for compile time dependencies - Hive 1.x 
runtime integration still works with a patch : See [the discussion 
thread](https://lists.apache.org/thread.html/48b3f0553f47c576fd7072f56bb0d8a24fb47d4003880d179c7f88a3@%3Cdev.hudi.apache.org%3E)
+  * DeltaStreamer now supports continuous running mode with managed concurrent 
compaction
+  * Support for Composite Keys as record key
+  * HoodieCombinedInputFormat to scale huge hive queries running on Hoodie 
tables
+
+### 此版本的迁移指南
+  这是 Apache Hudi (incubating) 的第一次发布。 在此版本之前,Hudi Jars 使用 "com.uber.hoodie" 
maven co-ordinates 来发布。这里有迁移指南 [migration 
guide](https://cwiki.apache.org/confluence/display/HUDI/Migration+Guide+From+com.uber.hoodie+to+org.apache.hudi)
+
+### 原始发布说明书
 
 Review comment:
   done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] lamber-ken commented on a change in pull request #1033: [doc][chinese] Translate the release page into chinese documentation

2019-11-20 Thread GitBox
lamber-ken commented on a change in pull request #1033: [doc][chinese] 
Translate the release page into chinese documentation
URL: https://github.com/apache/incubator-hudi/pull/1033#discussion_r348831189
 
 

 ##
 File path: docs/releases.cn.md
 ##
 @@ -0,0 +1,32 @@
+---
+title: 发布
+keywords: apache, hudi, release, data lake, upsert
+sidebar: home_sidebar
+permalink: releases.html
+toc: true
+---
+
+
+## 最新稳定版本
+  * 稳定版本 : `0.5.0-incubating`
+
+## 0.5.0-incubating 版本
+
+### 下载
+  * 源码包 : [Apache Hudi(incubating) 0.5.0-incubating Source 
Release](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz)
 
([asc](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.asc),
 
[sha512](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.sha512))
+  * 这里提供与此版本对应的 Apache Hudi (incubating) JAR 
包,[here](https://repository.apache.org/#nexus-search;quick~hudi)
+
+### 发布要点
+  * Package and format renaming from com.uber.hoodie to org.apache.hudi (See 
migration guide section below)
+  * Major redo of Hudi bundles to address class and jar version mismatches in 
different environments
+  * Upgrade from Hive 1.x to Hive 2.x for compile time dependencies - Hive 1.x 
runtime integration still works with a patch : See [the discussion 
thread](https://lists.apache.org/thread.html/48b3f0553f47c576fd7072f56bb0d8a24fb47d4003880d179c7f88a3@%3Cdev.hudi.apache.org%3E)
+  * DeltaStreamer now supports continuous running mode with managed concurrent 
compaction
+  * Support for Composite Keys as record key
+  * HoodieCombinedInputFormat to scale huge hive queries running on Hoodie 
tables
+
+### 此版本的迁移指南
+  这是 Apache Hudi (incubating) 的第一次发布。 在此版本之前,Hudi Jars 使用 "com.uber.hoodie" 
maven co-ordinates 来发布。这里有迁移指南 [migration 
guide](https://cwiki.apache.org/confluence/display/HUDI/Migration+Guide+From+com.uber.hoodie+to+org.apache.hudi)
 
 Review comment:
   done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] lamber-ken commented on a change in pull request #1033: [doc][chinese] Translate the release page into chinese documentation

2019-11-20 Thread GitBox
lamber-ken commented on a change in pull request #1033: [doc][chinese] 
Translate the release page into chinese documentation
URL: https://github.com/apache/incubator-hudi/pull/1033#discussion_r348830974
 
 

 ##
 File path: docs/releases.cn.md
 ##
 @@ -0,0 +1,32 @@
+---
+title: 发布
+keywords: apache, hudi, release, data lake, upsert
+sidebar: home_sidebar
+permalink: releases.html
+toc: true
+---
+
+
+## 最新稳定版本
+  * 稳定版本 : `0.5.0-incubating`
+
+## 0.5.0-incubating 版本
+
+### 下载
+  * 源码包 : [Apache Hudi(incubating) 0.5.0-incubating Source 
Release](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz)
 
([asc](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.asc),
 
[sha512](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.sha512))
+  * 这里提供与此版本对应的 Apache Hudi (incubating) JAR 
包,[here](https://repository.apache.org/#nexus-search;quick~hudi)
+
+### 发布要点
+  * Package and format renaming from com.uber.hoodie to org.apache.hudi (See 
migration guide section below)
+  * Major redo of Hudi bundles to address class and jar version mismatches in 
different environments
+  * Upgrade from Hive 1.x to Hive 2.x for compile time dependencies - Hive 1.x 
runtime integration still works with a patch : See [the discussion 
thread](https://lists.apache.org/thread.html/48b3f0553f47c576fd7072f56bb0d8a24fb47d4003880d179c7f88a3@%3Cdev.hudi.apache.org%3E)
+  * DeltaStreamer now supports continuous running mode with managed concurrent 
compaction
+  * Support for Composite Keys as record key
+  * HoodieCombinedInputFormat to scale huge hive queries running on Hoodie 
tables
+
+### 此版本的迁移指南
+  这是 Apache Hudi (incubating) 的第一次发布。 在此版本之前,Hudi Jars 使用 "com.uber.hoodie" 
maven co-ordinates 来发布。这里有迁移指南 [migration 
guide](https://cwiki.apache.org/confluence/display/HUDI/Migration+Guide+From+com.uber.hoodie+to+org.apache.hudi)
+
+### 原始发布说明书
+  获取原始发布说明书 
[here](https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12322822=12346087)
 
 Review comment:
   done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] lamber-ken commented on a change in pull request #1033: [doc][chinese] Translate the release page into chinese documentation

2019-11-20 Thread GitBox
lamber-ken commented on a change in pull request #1033: [doc][chinese] 
Translate the release page into chinese documentation
URL: https://github.com/apache/incubator-hudi/pull/1033#discussion_r348831247
 
 

 ##
 File path: docs/releases.cn.md
 ##
 @@ -0,0 +1,32 @@
+---
+title: 发布
+keywords: apache, hudi, release, data lake, upsert
+sidebar: home_sidebar
+permalink: releases.html
+toc: true
+---
+
+
+## 最新稳定版本
+  * 稳定版本 : `0.5.0-incubating`
+
+## 0.5.0-incubating 版本
+
+### 下载
+  * 源码包 : [Apache Hudi(incubating) 0.5.0-incubating Source 
Release](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz)
 
([asc](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.asc),
 
[sha512](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.sha512))
+  * 这里提供与此版本对应的 Apache Hudi (incubating) JAR 
包,[here](https://repository.apache.org/#nexus-search;quick~hudi)
 
 Review comment:
   done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] lamber-ken commented on issue #1033: [doc][chinese] Translate the release page into chinese documentation

2019-11-20 Thread GitBox
lamber-ken commented on issue #1033: [doc][chinese] Translate the release page 
into chinese documentation
URL: https://github.com/apache/incubator-hudi/pull/1033#issuecomment-556556906
 
 
   @leesf, thanks for the review and suggestions; here is the snapshot:
   
![image](https://user-images.githubusercontent.com/20113411/69288719-5067c780-0c35-11ea-9a16-d57a1755ee41.png)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (HUDI-353) Add support for Hive style partitioning path

2019-11-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-353:

Labels: pull-request-available  (was: )

> Add support for Hive style partitioning path
> 
>
> Key: HUDI-353
> URL: https://issues.apache.org/jira/browse/HUDI-353
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>Reporter: Wenning Ding
>Priority: Major
>  Labels: pull-request-available
>
> In Hive, the partition folder name follows this format: 
> <partition_key>=<partition_value>.
> But in Hudi, the name of its partition folder is just <partition_value>.
> e.g. A dataset is partitioned by three columns: year, month and day.
> In Hive, the data is saved in: 
> {{.../<table_path>/year=2019/month=05/day=01/xxx.parquet}}
> In Hudi, the data is saved in: {{.../<table_path>/2019/05/01/xxx.parquet}}
> Basically I add a new option in Spark datasource named 
> {{HIVE_STYLE_PARTITIONING_FILED_OPT_KEY}} which indicates whether to use hive 
> style partitioning or not. By default this option is false (not used).
> Also, if using hive style partitioning, instead of scanning the dataset and 
> manually adding/updating all partitions, we can use "MSCK REPAIR TABLE 
> <table_name>" to automatically sync all the partition info with the Hive 
> MetaStore.
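
For illustration, enabling the option from the Spark datasource might look like 
this (a sketch modeled on the quickstart; the option constant is the one named 
in this issue, and the table and column names are made-up examples):
{code:java}
// Write a dataset whose partition paths use the Hive style
// (year=2019/month=05/day=01 instead of 2019/05/01).
df.write.format("org.apache.hudi").
  option(RECORDKEY_FIELD_OPT_KEY, "id").
  option(PARTITIONPATH_FIELD_OPT_KEY, "year,month,day").
  option(HIVE_STYLE_PARTITIONING_FILED_OPT_KEY, "true"). // option added here
  option(TABLE_NAME, "my_table").
  mode(Append).
  save(basePath)
{code}
After such a write, running "MSCK REPAIR TABLE my_table" in Hive discovers all 
partitions at once, with no manual ALTER TABLE ... ADD PARTITION statements.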



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-hudi] zhedoubushishi opened a new pull request #1036: [HUDI-353] Add hive style partitioning path

2019-11-20 Thread GitBox
zhedoubushishi opened a new pull request #1036: [HUDI-353] Add hive style 
partitioning path
URL: https://github.com/apache/incubator-hudi/pull/1036
 
 
   ##  What is the purpose of the pull request
   
   Add support for hive style partitioning
   Jira: https://jira.apache.org/jira/browse/HUDI-353
   
   ## Brief change log
 - *Add a new option in Spark datasource named 
HIVE_STYLE_PARTITIONING_FILED_OPT_KEY which indicates whether to use hive style 
partitioning or not. By default this option is false (not used)*
 - *Modify key generators to support this option when generating partition 
path*
 - *Also, if using hive style partitioning, instead of scanning the dataset 
and manually adding/updating all partitions, we can use "MSCK REPAIR TABLE 
<table_name>" to automatically sync all the partition info with Hive MetaStore*
   
   ## Verify this pull request
   
   This change added tests and can be verified as follows:
 - *Added additional test cases in TestDataSourceDefaults to verify the 
change.*
 - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
- [x] Has a corresponding JIRA in PR title & commit

- [x] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] leesf commented on issue #1033: [doc][chinese] Translate the release page into chinese documentation

2019-11-20 Thread GitBox
leesf commented on issue #1033: [doc][chinese] Translate the release page into 
chinese documentation
URL: https://github.com/apache/incubator-hudi/pull/1033#issuecomment-556504291
 
 
   Thanks for opening the PR @lamber-ken! Looks good overall; I only left some 
minor comments. Also, would you please paste the page snapshot after the 
modification?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] leesf commented on a change in pull request #1033: [doc][chinese] Translate the release page into chinese documentation

2019-11-20 Thread GitBox
leesf commented on a change in pull request #1033: [doc][chinese] Translate the 
release page into chinese documentation
URL: https://github.com/apache/incubator-hudi/pull/1033#discussion_r348784056
 
 

 ##
 File path: docs/releases.cn.md
 ##
 @@ -0,0 +1,32 @@
+---
+title: 发布
+keywords: apache, hudi, release, data lake, upsert
+sidebar: home_sidebar
+permalink: releases.html
+toc: true
+---
+
+
+## 最新稳定版本
+  * 稳定版本 : `0.5.0-incubating`
+
+## 0.5.0-incubating 版本
+
+### 下载
+  * 源码包 : [Apache Hudi(incubating) 0.5.0-incubating Source 
Release](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz)
 
([asc](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.asc),
 
[sha512](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.sha512))
+  * 这里提供与此版本对应的 Apache Hudi (incubating) JAR 
包,[here](https://repository.apache.org/#nexus-search;quick~hudi)
+
+### 发布要点
+  * Package and format renaming from com.uber.hoodie to org.apache.hudi (See 
migration guide section below)
+  * Major redo of Hudi bundles to address class and jar version mismatches in 
different environments
+  * Upgrade from Hive 1.x to Hive 2.x for compile time dependencies - Hive 1.x 
runtime integration still works with a patch : See [the discussion 
thread](https://lists.apache.org/thread.html/48b3f0553f47c576fd7072f56bb0d8a24fb47d4003880d179c7f88a3@%3Cdev.hudi.apache.org%3E)
+  * DeltaStreamer now supports continuous running mode with managed concurrent 
compaction
+  * Support for Composite Keys as record key
+  * HoodieCombinedInputFormat to scale huge hive queries running on Hoodie 
tables
+
+### 此版本的迁移指南
+  这是 Apache Hudi (incubating) 的第一次发布。 在此版本之前,Hudi Jars 使用 "com.uber.hoodie" 
maven co-ordinates 来发布。这里有迁移指南 [migration 
guide](https://cwiki.apache.org/confluence/display/HUDI/Migration+Guide+From+com.uber.hoodie+to+org.apache.hudi)
+
+### 原始发布说明书
+  获取原始发布说明书 
[here](https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12322822=12346087)
 
 Review comment:
   ditto.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] leesf commented on a change in pull request #1033: [doc][chinese] Translate the release page into chinese documentation

2019-11-20 Thread GitBox
leesf commented on a change in pull request #1033: [doc][chinese] Translate the 
release page into chinese documentation
URL: https://github.com/apache/incubator-hudi/pull/1033#discussion_r348783957
 
 

 ##
 File path: docs/releases.cn.md
 ##
 @@ -0,0 +1,32 @@
+---
+title: 发布
+keywords: apache, hudi, release, data lake, upsert
+sidebar: home_sidebar
+permalink: releases.html
+toc: true
+---
+
+
+## 最新稳定版本
+  * 稳定版本 : `0.5.0-incubating`
+
+## 0.5.0-incubating 版本
+
+### 下载
+  * 源码包 : [Apache Hudi(incubating) 0.5.0-incubating Source 
Release](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz)
 
([asc](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.asc),
 
[sha512](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.sha512))
+  * 这里提供与此版本对应的 Apache Hudi (incubating) JAR 
包,[here](https://repository.apache.org/#nexus-search;quick~hudi)
+
+### 发布要点
+  * Package and format renaming from com.uber.hoodie to org.apache.hudi (See 
migration guide section below)
+  * Major redo of Hudi bundles to address class and jar version mismatches in 
different environments
+  * Upgrade from Hive 1.x to Hive 2.x for compile time dependencies - Hive 1.x 
runtime integration still works with a patch : See [the discussion 
thread](https://lists.apache.org/thread.html/48b3f0553f47c576fd7072f56bb0d8a24fb47d4003880d179c7f88a3@%3Cdev.hudi.apache.org%3E)
+  * DeltaStreamer now supports continuous running mode with managed concurrent 
compaction
+  * Support for Composite Keys as record key
+  * HoodieCombinedInputFormat to scale huge hive queries running on Hoodie 
tables
+
+### 此版本的迁移指南
+  这是 Apache Hudi (incubating) 的第一次发布。 在此版本之前,Hudi Jars 使用 "com.uber.hoodie" 
maven co-ordinates 来发布。这里有迁移指南 [migration 
guide](https://cwiki.apache.org/confluence/display/HUDI/Migration+Guide+From+com.uber.hoodie+to+org.apache.hudi)
+
+### 原始发布说明书
 
 Review comment:
   Would "记录" be a better fit here than "说明书"?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] leesf commented on a change in pull request #1033: [doc][chinese] Translate the release page into chinese documentation

2019-11-20 Thread GitBox
leesf commented on a change in pull request #1033: [doc][chinese] Translate the 
release page into chinese documentation
URL: https://github.com/apache/incubator-hudi/pull/1033#discussion_r348783673
 
 

 ##
 File path: docs/releases.cn.md
 ##
 @@ -0,0 +1,32 @@
+---
+title: 发布
+keywords: apache, hudi, release, data lake, upsert
+sidebar: home_sidebar
+permalink: releases.html
+toc: true
+---
+
+
+## 最新稳定版本
+  * 稳定版本 : `0.5.0-incubating`
+
+## 0.5.0-incubating 版本
+
+### 下载
+  * 源码包 : [Apache Hudi(incubating) 0.5.0-incubating Source 
Release](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz)
 
([asc](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.asc),
 
[sha512](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.sha512))
+  * 这里提供与此版本对应的 Apache Hudi (incubating) JAR 
包,[here](https://repository.apache.org/#nexus-search;quick~hudi)
+
+### 发布要点
+  * Package and format renaming from com.uber.hoodie to org.apache.hudi (See 
migration guide section below)
+  * Major redo of Hudi bundles to address class and jar version mismatches in 
different environments
+  * Upgrade from Hive 1.x to Hive 2.x for compile time dependencies - Hive 1.x 
runtime integration still works with a patch : See [the discussion 
thread](https://lists.apache.org/thread.html/48b3f0553f47c576fd7072f56bb0d8a24fb47d4003880d179c7f88a3@%3Cdev.hudi.apache.org%3E)
+  * DeltaStreamer now supports continuous running mode with managed concurrent 
compaction
+  * Support for Composite Keys as record key
+  * HoodieCombinedInputFormat to scale huge hive queries running on Hoodie 
tables
+
+### 此版本的迁移指南
+  这是 Apache Hudi (incubating) 的第一次发布。 在此版本之前,Hudi Jars 使用 "com.uber.hoodie" 
maven co-ordinates 来发布。这里有迁移指南 [migration 
guide](https://cwiki.apache.org/confluence/display/HUDI/Migration+Guide+From+com.uber.hoodie+to+org.apache.hudi)
 
 Review comment:
   
[迁移指南](https://cwiki.apache.org/confluence/display/HUDI/Migration+Guide+From+com.uber.hoodie+to+org.apache.hudi)
 ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] leesf commented on a change in pull request #1033: [doc][chinese] Translate the release page into chinese documentation

2019-11-20 Thread GitBox
leesf commented on a change in pull request #1033: [doc][chinese] Translate the 
release page into chinese documentation
URL: https://github.com/apache/incubator-hudi/pull/1033#discussion_r348783064
 
 

 ##
 File path: docs/releases.cn.md
 ##
 @@ -0,0 +1,32 @@
+---
+title: 发布
+keywords: apache, hudi, release, data lake, upsert
+sidebar: home_sidebar
+permalink: releases.html
+toc: true
+---
+
+
+## 最新稳定版本
+  * 稳定版本 : `0.5.0-incubating`
+
+## 0.5.0-incubating 版本
+
+### 下载
+  * 源码包 : [Apache Hudi(incubating) 0.5.0-incubating Source 
Release](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz)
 
([asc](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.asc),
 
[sha512](https://www.apache.org/dist/incubator/hudi/0.5.0-incubating/hudi-0.5.0-incubating.src.tgz.sha512))
+  * 这里提供与此版本对应的 Apache Hudi (incubating) JAR 
包,[here](https://repository.apache.org/#nexus-search;quick~hudi)
 
 Review comment:
   [这里](https://repository.apache.org/#nexus-search;quick~hudi) 提供与此版本对应的 
Apache Hudi (incubating) JAR 包?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[incubator-hudi] branch asf-site updated: [DOC] Add reload step

2019-11-20 Thread bhavanisudha
This is an automated email from the ASF dual-hosted git repository.

bhavanisudha pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


View the commit online:
https://github.com/apache/incubator-hudi/commit/2f421df0fe0f6d6d4982be55aa7c4f1b67fc8410

The following commit(s) were added to refs/heads/asf-site by this push:
 new 2f421df  [DOC] Add reload step
2f421df is described below

commit 2f421df0fe0f6d6d4982be55aa7c4f1b67fc8410
Author: hongdongdong 
AuthorDate: Wed Nov 20 15:28:02 2019 +0800

[DOC] Add reload step
---
 docs/quickstart.md | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/docs/quickstart.md b/docs/quickstart.md
index 8c2d178..fa04f21 100644
--- a/docs/quickstart.md
+++ b/docs/quickstart.md
@@ -75,7 +75,7 @@ val roViewDF = spark.
 read.
 format("org.apache.hudi").
 load(basePath + "/*/*/*/*")
-roViewDF.registerTempTable("hudi_ro_table")
+roViewDF.createOrReplaceTempView("hudi_ro_table")
 spark.sql("select fare, begin_lon, begin_lat, ts from  hudi_ro_table where 
fare > 20.0").show()
 spark.sql("select _hoodie_commit_time, _hoodie_record_key, 
_hoodie_partition_path, rider, driver, fare from  hudi_ro_table").show()
 ```
@@ -111,6 +111,13 @@ This can be achieved using Hudi's incremental view and 
providing a begin time fr
 We do not need to specify endTime, if we want all changes after the given 
commit (as is the common case). 
 
 ```Scala
+// reload data
+spark.
+read.
+format("org.apache.hudi").
+load(basePath + "/*/*/*/*").
+createOrReplaceTempView("hudi_ro_table")
+
 val commits = spark.sql("select distinct(_hoodie_commit_time) as commitTime 
from  hudi_ro_table order by commitTime").map(k => k.getString(0)).take(50)
 val beginTime = commits(commits.length - 2) // commit time we are interested in
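
// For context (a sketch based on the released 0.5 quickstart, not part of
// this commit): the reloaded view then feeds the incremental query that
// follows in the quickstart, roughly:
val incViewDF = spark.read.format("org.apache.hudi").
    option(VIEW_TYPE_OPT_KEY, VIEW_TYPE_INCREMENTAL_OPT_VAL).
    option(BEGIN_INSTANTTIME_OPT_KEY, beginTime).
    load(basePath)
incViewDF.createOrReplaceTempView("hudi_incr_table")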
 



[GitHub] [incubator-hudi] bhasudha commented on issue #1032: [DOC] Add reload step

2019-11-20 Thread GitBox
bhasudha commented on issue #1032: [DOC] Add reload step
URL: https://github.com/apache/incubator-hudi/pull/1032#issuecomment-556498226
 
 
   @hddong I am merging this. As @yanghua and @yihua suggested, it might be a 
good idea to create Jira issues for future PRs. Thanks for sending in this fix.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] bhasudha merged pull request #1032: [DOC] Add reload step

2019-11-20 Thread GitBox
bhasudha merged pull request #1032: [DOC] Add reload step
URL: https://github.com/apache/incubator-hudi/pull/1032
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] leesf commented on a change in pull request #1033: [doc][chinese] Translate the release page into chinese documentation

2019-11-20 Thread GitBox
leesf commented on a change in pull request #1033: [doc][chinese] Translate the 
release page into chinese documentation
URL: https://github.com/apache/incubator-hudi/pull/1033#discussion_r348781973
 
 

 ##
 File path: docs/quickstart.md
 ##
 @@ -152,7 +152,7 @@ instead of `--packages 
org.apache.hudi:hudi-spark-bundle:0.5.0-incubating`
 
 Also, we used Spark here to show case the capabilities of Hudi. However, Hudi 
can support multiple storage types/views and 
 Hudi datasets can be queried from query engines like Hive, Spark, Presto and 
much more. We have put together a 
-[demo video](https://www.youtube.com/watch?v=VhNgUsxdrD0) that showcases all 
of this on a docker based setup with all 
+[demo video](https://www.youtube.com/watch?v=VhNgUsxdrD0) that show cases all 
of this on a docker based setup with all 
 
 Review comment:
   good catch




[GitHub] [incubator-hudi] leesf commented on a change in pull request #1033: [doc][chinese] Translate the release page into chinese documentation

2019-11-20 Thread GitBox
leesf commented on a change in pull request #1033: [doc][chinese] Translate the 
release page into chinese documentation
URL: https://github.com/apache/incubator-hudi/pull/1033#discussion_r348781736
 
 

 ##
 File path: docs/_data/topnav_cn.yml
 ##
 @@ -3,8 +3,8 @@
 topnav:
 - title: Topnav
   items:
-- title: 新闻
-  url: /news
 
 Review comment:
   why get dropped?




[GitHub] [incubator-hudi] vinothchandar merged pull request #1035: [DOCS] site update with recent doc changes

2019-11-20 Thread GitBox
vinothchandar merged pull request #1035: [DOCS] site update with recent doc 
changes
URL: https://github.com/apache/incubator-hudi/pull/1035
 
 
   




[incubator-hudi] branch asf-site updated: [DOCS] site update with recent doc changes (#1035)

2019-11-20 Thread vinoth
This is an automated email from the ASF dual-hosted git repository.

vinoth pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


View the commit online:
https://github.com/apache/incubator-hudi/commit/ea16d5bc6c2daab8ca2cd792cefddac0506a3fec

The following commit(s) were added to refs/heads/asf-site by this push:
 new ea16d5b  [DOCS] site update with recent doc changes (#1035)
ea16d5b is described below

commit ea16d5bc6c2daab8ca2cd792cefddac0506a3fec
Author: Bhavani Sudha Saktheeswaran 
AuthorDate: Wed Nov 20 13:46:20 2019 -0800

[DOCS] site update with recent doc changes (#1035)
---
 content/404.html   | 69 +-
 content/admin_guide.html   | 50 +
 content/cn/404.html| 81 +++---
 content/cn/admin_guide.html| 52 ++
 content/cn/community.html  | 25 ---
 content/cn/comparison.html | 48 +
 content/cn/concepts.html   | 50 +
 content/cn/configurations.html | 71 ++-
 content/cn/contributing.html   | 48 +
 content/cn/docker_demo.html| 48 +
 content/cn/events/2016-12-30-strata-talk-2017.html | 71 ++-
 content/cn/events/2019-01-18-asf-incubation.html   | 71 ++-
 content/cn/gcs_hoodie.html | 48 +
 content/cn/index.html  | 71 ++-
 content/cn/migration_guide.html| 25 ---
 content/cn/news.html   | 23 +++---
 content/cn/news_archive.html   | 25 ---
 content/cn/performance.html| 48 +
 content/cn/powered_by.html | 25 ---
 content/cn/privacy.html| 71 ++-
 content/cn/querying_data.html  | 48 +
 content/cn/quickstart.html | 48 +
 content/cn/s3_hoodie.html  | 48 +
 content/cn/use_cases.html  | 48 +
 content/cn/writing_data.html   | 48 +
 content/community.html | 23 +++---
 content/comparison.html| 46 
 content/concepts.html  | 48 +
 content/configurations.html| 69 +-
 content/contributing.html  | 46 
 content/css/customstyles.css   | 72 ++-
 content/css/lavish-bootstrap.css   |  8 ++-
 content/docker_demo.html   | 46 
 content/events/2016-12-30-strata-talk-2017.html| 69 +-
 content/events/2019-01-18-asf-incubation.html  | 69 +-
 content/feed.xml   |  4 +-
 content/gcs_hoodie.html| 46 
 content/index.html | 69 +-
 content/js/mydoc_scroll.html   | 69 +-
 content/js/toc.js  |  6 +-
 content/migration_guide.html   | 23 +++---
 content/news.html  | 23 +++---
 content/news_archive.html  | 23 +++---
 content/performance.html   | 46 
 content/powered_by.html| 23 +++---
 content/privacy.html   | 69 +-
 content/querying_data.html | 46 
 content/quickstart.html| 46 
 content/releases.html  | 69 +-
 content/s3_hoodie.html | 46 
 content/use_cases.html | 46 
 content/writing_data.html  | 46 
 docs/_data/sidebars/mydoc_sidebar_cn.yml   |  4 +-
 docs/concepts.cn.md|  2 +-
 docs/concepts.md   |  2 +-
 55 files changed, 1504 insertions(+), 960 deletions(-)


[GitHub] [incubator-hudi] yihua commented on issue #1032: [DOC] Add reload step

2019-11-20 Thread GitBox
yihua commented on issue #1032: [DOC] Add reload step
URL: https://github.com/apache/incubator-hudi/pull/1032#issuecomment-556436379
 
 
   @yanghua I think it would be good to have JIRA tickets for docs changes, to avoid duplication of work.




[GitHub] [incubator-hudi] bhasudha opened a new pull request #1035: Asf site update with recent doc changes

2019-11-20 Thread GitBox
bhasudha opened a new pull request #1035: Asf site update with recent doc 
changes
URL: https://github.com/apache/incubator-hudi/pull/1035
 
 
   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a 
pull request.*
   
   ## What is the purpose of the pull request
   
   *(For example: This pull request adds quick-start document.)*
   
   ## Brief change log
   
   *(for example:)*
 - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test 
coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please 
describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
 - *Added integration tests for end-to-end.*
 - *Added HoodieClientWriteTest to verify the change.*
 - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.




[jira] [Updated] (HUDI-350) Update javadocs in HoodieCleanHelper to reflect correct defaults for retained commits

2019-11-20 Thread Pratyaksh Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pratyaksh Sharma updated HUDI-350:
--
Status: Closed  (was: Patch Available)

> Update javadocs in HoodieCleanHelper to reflect correct defaults for retained 
> commits
> -
>
> Key: HUDI-350
> URL: https://issues.apache.org/jira/browse/HUDI-350
> Project: Apache Hudi (incubating)
>  Issue Type: Task
>  Components: Cleaner, newbie
>Reporter: Balaji Varadarajan
>Assignee: Pratyaksh Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Context in the thread : 
> [https://lists.apache.org/thread.html/e834d1f5df4341596884b476b6433bf609fe70aca19dcd3ac2242845@%3Cdev.hudi.apache.org%3E]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HUDI-350) Update javadocs in HoodieCleanHelper to reflect correct defaults for retained commits

2019-11-20 Thread Pratyaksh Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pratyaksh Sharma updated HUDI-350:
--
Status: Patch Available  (was: In Progress)

> Update javadocs in HoodieCleanHelper to reflect correct defaults for retained 
> commits
> -
>
> Key: HUDI-350
> URL: https://issues.apache.org/jira/browse/HUDI-350
> Project: Apache Hudi (incubating)
>  Issue Type: Task
>  Components: Cleaner, newbie
>Reporter: Balaji Varadarajan
>Assignee: Pratyaksh Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Context in the thread : 
> [https://lists.apache.org/thread.html/e834d1f5df4341596884b476b6433bf609fe70aca19dcd3ac2242845@%3Cdev.hudi.apache.org%3E]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HUDI-353) Add support for Hive style partitioning path

2019-11-20 Thread Wenning Ding (Jira)
Wenning Ding created HUDI-353:
-

 Summary: Add support for Hive style partitioning path
 Key: HUDI-353
 URL: https://issues.apache.org/jira/browse/HUDI-353
 Project: Apache Hudi (incubating)
  Issue Type: Improvement
Reporter: Wenning Ding


In Hive, the partition folder name follows this format:
<partition_key>=<partition_value>.
But in Hudi, the name of its partition folder is just <partition_value>.

e.g. A dataset is partitioned by three columns: year, month and day.
In Hive, the data is saved in:
{{.../<base_path>/year=2019/month=05/day=01/xxx.parquet}}
In Hudi, the data is saved in: {{.../<base_path>/2019/05/01/xxx.parquet}}

Basically I add a new option in the Spark datasource named
{{HIVE_STYLE_PARTITIONING_FILED_OPT_KEY}} which indicates whether to use Hive
style partitioning or not. By default this option is false (not used).

Also, if using Hive style partitioning, instead of scanning the dataset and
manually adding/updating all partitions, we can use "MSCK REPAIR TABLE
<table_name>" to automatically sync all the partition info with the Hive MetaStore.
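To make the proposal concrete, a sketch of setting the option through the Spark datasource. Note that {{HIVE_STYLE_PARTITIONING_FILED_OPT_KEY}} is only proposed in this issue; the literal key string below is an assumption for illustration, not a released API, and the table name, path and partition column are placeholders.

{code:scala}
import org.apache.hudi.DataSourceWriteOptions
import org.apache.hudi.config.HoodieWriteConfig
import org.apache.spark.sql.SaveMode

df.write.format("org.apache.hudi").
  option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY, "id").
  option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY, "year").
  // proposed option from this issue; key string assumed for illustration only
  option("hoodie.datasource.write.hive_style_partitioning", "true").
  option(HoodieWriteConfig.TABLE_NAME, "hudi_trips").
  mode(SaveMode.Append).
  save("/tmp/hudi_trips")
{code}

With Hive style partitioning enabled, "MSCK REPAIR TABLE hudi_trips" could then sync the partitions as described above.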



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-hudi] zhedoubushishi commented on issue #1001: [HUDI-325] Fix Hive partition error for updated HDFS Hudi table

2019-11-20 Thread GitBox
zhedoubushishi commented on issue #1001: [HUDI-325] Fix Hive partition error 
for updated HDFS Hudi table
URL: https://github.com/apache/incubator-hudi/pull/1001#issuecomment-556232591
 
 
   > This is only a partial solution. The reason for Hudi even trying to change 
partition location is because `UPDATE Partition Event` is generated because the 
HDFS partition path provided by Hudi and what it gets from Hive do not match 
because of hdfs uri difference. We should fix that, as hudi should not try to 
alter partition location on simple row level update.
   
   This URI difference is also fixed in the new commit.




[GitHub] [incubator-hudi] alphairys opened a new issue #1034: Datasource writer throws Working with nested fields

2019-11-20 Thread GitBox
alphairys opened a new issue #1034: Datasource writer throws Working with 
nested fields
URL: https://github.com/apache/incubator-hudi/issues/1034
 
 
   **Issue**
   
   I have a dataframe with the following schema that I would like to write out 
as a hudi table.
   
   **Schema**
   ```
   root
|-- deviceId: string (nullable = true)
|-- eventTimeMilli: long (nullable = true)
|-- location: struct (nullable = true)
||-- latitude: double (nullable = true)
||-- longitude: double (nullable = true)
   ```
   
   **Write out to hudi table:**
   ```
   import org.apache.hudi.DataSourceWriteOptions
   import org.apache.hudi.config.HoodieWriteConfig
   import org.apache.spark.sql.SaveMode

   simple.write.format("org.apache.hudi")
  .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY, "deviceId")
  .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY, "eventTimeMilli")
  .option(HoodieWriteConfig.TABLE_NAME, hudiTableName)
  .mode(SaveMode.Append)
  .save(hudiTablePath)
   
   ```
   Running this, I get the following error:
   ```
   Caused by: org.apache.avro.UnresolvedUnionException: Not in union 
[{"type":"record","name":"location","namespace":"hoodie.hudi_test.hudi_test_record","fields":[{"name":"latitude","type":["double","null"]},{"name":"longitude","type":["double","null"]}]},"null"]:
 {"latitude": 34.7027, "longitude": 137.54862}
   at 
org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:740)
   at ..
   ```
   
   
   
   **Full Error Log**
   ```
   TaskSetManager: Task 0 in stage 8.0 failed 4 times; aborting job
   org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 8.0 failed 4 times, most recent failure: Lost task 0.3 in stage 8.0 (TID 
243, ip-172-31-23-108.ec2.internal, executor 39): java.io.IOException: Could 
not create payload for class: org.apache.hudi.OverwriteWithLatestAvroPayload
   at 
org.apache.hudi.DataSourceUtils.createPayload(DataSourceUtils.java:129)
   at 
org.apache.hudi.DataSourceUtils.createHoodieRecord(DataSourceUtils.java:178)
   at 
org.apache.hudi.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:91)
   at 
org.apache.hudi.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:88)
   at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
   at scala.collection.Iterator$$anon$10.next(Iterator.scala:394)
   at scala.collection.Iterator$class.foreach(Iterator.scala:891)
   at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
   at 
scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
   at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
   at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
   at 
scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
   at scala.collection.AbstractIterator.to(Iterator.scala:1334)
   at 
scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
   at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1334)
   at 
scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
   at scala.collection.AbstractIterator.toArray(Iterator.scala:1334)
   at 
org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$29.apply(RDD.scala:1364)
   at 
org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$29.apply(RDD.scala:1364)
   at 
org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
   at 
org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
   at org.apache.spark.scheduler.Task.run(Task.scala:123)
   at 
org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
   at 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
   at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   at java.lang.Thread.run(Thread.java:748)
   Caused by: org.apache.hudi.exception.HoodieException: Unable to instantiate 
class
   at 
org.apache.hudi.common.util.ReflectionUtils.loadClass(ReflectionUtils.java:75)
   at 
org.apache.hudi.DataSourceUtils.createPayload(DataSourceUtils.java:126)
   ... 28 more
   Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
Method)
   at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
   at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
   at 

[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #1004: [HUDI-328] Adding delete api to HoodieWriteClient

2019-11-20 Thread GitBox
nsivabalan commented on a change in pull request #1004: [HUDI-328] Adding 
delete api to HoodieWriteClient
URL: https://github.com/apache/incubator-hudi/pull/1004#discussion_r348646876
 
 

 ##
 File path: 
hudi-client/src/test/java/org/apache/hudi/TestHoodieClientOnCopyOnWriteStorage.java
 ##
 @@ -274,6 +275,15 @@ private void testUpsertsInternal(HoodieWriteConfig 
hoodieWriteConfig,
 updateBatch(hoodieWriteConfig, client, newCommitTime, prevCommitTime,
 Option.of(Arrays.asList(commitTimeBetweenPrevAndNew)), initCommitTime, 
numRecords, writeFn, isPrepped, true,
 numRecords, 200, 2);
+
+// Delete 1
+prevCommitTime = newCommitTime;
+newCommitTime = "005";
+numRecords = 50;
+
+deleteBatch(hoodieWriteConfig, client, newCommitTime, prevCommitTime,
 
 Review comment:
   I wrote a test (TestHoodieClientOnCopyOnWriteStorage#testDeletesWithoutInserts), but the 
config had a record schema. Fixed it to have a null schema. 
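
For anyone trying out the new API, a minimal sketch of a client-level delete as discussed in this PR (assumptions: an existing JavaSparkContext `jsc`, a HoodieWriteConfig `writeConfig` built with a null schema as described above, and a placeholder key; the delete signature is taken from this PR's discussion):

```Scala
import scala.collection.JavaConverters._
import org.apache.hudi.{HoodieWriteClient, OverwriteWithLatestAvroPayload}
import org.apache.hudi.common.model.HoodieKey

val writeClient = new HoodieWriteClient[OverwriteWithLatestAvroPayload](jsc, writeConfig)

// A HoodieKey identifies a record by (recordKey, partitionPath); no payload or schema is needed to delete.
val keys = jsc.parallelize(Seq(new HoodieKey("deviceId-001", "2019/11/20")).asJava)

val commitTime = writeClient.startCommit()
val statuses = writeClient.delete(keys, commitTime)
```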




[GitHub] [incubator-hudi] bvaradar commented on issue #1004: [HUDI-328] Adding delete api to HoodieWriteClient

2019-11-20 Thread GitBox
bvaradar commented on issue #1004: [HUDI-328] Adding delete api to 
HoodieWriteClient
URL: https://github.com/apache/incubator-hudi/pull/1004#issuecomment-556110523
 
 
   > @bvaradar : "Add a comment in code to explain this". Sorry, can you explain what you want me to comment on?
   
   If anyone reading HoodieSparkSqlWriter.scala has similar thoughts about refactoring the if-else block, it would be easier for them to read your comment in the code. 




[GitHub] [incubator-hudi] nsivabalan commented on issue #1004: [HUDI-328] Adding delete api to HoodieWriteClient

2019-11-20 Thread GitBox
nsivabalan commented on issue #1004: [HUDI-328] Adding delete api to 
HoodieWriteClient
URL: https://github.com/apache/incubator-hudi/pull/1004#issuecomment-556081540
 
 
   @bvaradar : "Add a comment in code to explain this". Sorry, can you explain what you want me to comment on? 




[GitHub] [incubator-hudi] yanghua commented on issue #1032: [DOC] Add reload step

2019-11-20 Thread GitBox
yanghua commented on issue #1032: [DOC] Add reload step
URL: https://github.com/apache/incubator-hudi/pull/1032#issuecomment-556076883
 
 
   One question: do documentation-related changes need to be mapped to a JIRA issue?




[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #1004: [HUDI-328] Adding delete api to HoodieWriteClient

2019-11-20 Thread GitBox
nsivabalan commented on a change in pull request #1004: [HUDI-328] Adding 
delete api to HoodieWriteClient
URL: https://github.com/apache/incubator-hudi/pull/1004#discussion_r348569104
 
 

 ##
 File path: hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
 ##
 @@ -72,131 +73,212 @@ private[hudi] object HoodieSparkSqlWriter {
 parameters(OPERATION_OPT_KEY)
   }
 
-// register classes & schemas
-val structName = s"${tblName.get}_record"
-val nameSpace = s"hoodie.${tblName.get}"
-sparkContext.getConf.registerKryoClasses(
-  Array(classOf[org.apache.avro.generic.GenericData],
-classOf[org.apache.avro.Schema]))
-val schema = AvroConversionUtils.convertStructTypeToAvroSchema(df.schema, 
structName, nameSpace)
-sparkContext.getConf.registerAvroSchemas(schema)
-log.info(s"Registered avro schema : ${schema.toString(true)}")
-
-// Convert to RDD[HoodieRecord]
-val keyGenerator = 
DataSourceUtils.createKeyGenerator(toProperties(parameters))
-val genericRecords: RDD[GenericRecord] = AvroConversionUtils.createRdd(df, 
structName, nameSpace)
-val hoodieAllIncomingRecords = genericRecords.map(gr => {
-  val orderingVal = DataSourceUtils.getNestedFieldValAsString(
-gr, parameters(PRECOMBINE_FIELD_OPT_KEY)).asInstanceOf[Comparable[_]]
-  DataSourceUtils.createHoodieRecord(gr,
-orderingVal, keyGenerator.getKey(gr), 
parameters(PAYLOAD_CLASS_OPT_KEY))
-}).toJavaRDD()
+var writeSuccessful: Boolean = false
+var commitTime: String = null
+var writeStatuses: JavaRDD[WriteStatus] = null
 
 val jsc = new JavaSparkContext(sparkContext)
-
 val basePath = new Path(parameters("path"))
 val fs = basePath.getFileSystem(sparkContext.hadoopConfiguration)
 var exists = fs.exists(new Path(basePath, 
HoodieTableMetaClient.METAFOLDER_NAME))
 
-// Handle various save modes
-if (mode == SaveMode.ErrorIfExists && exists) {
-  throw new HoodieException(s"hoodie dataset at $basePath already exists.")
-}
-if (mode == SaveMode.Ignore && exists) {
-  log.warn(s"hoodie dataset at $basePath already exists. Ignoring & not 
performing actual writes.")
-  return (true, common.util.Option.empty())
-}
-if (mode == SaveMode.Overwrite && exists) {
-  log.warn(s"hoodie dataset at $basePath already exists. Deleting existing 
data & overwriting with new data.")
-  fs.delete(basePath, true)
-  exists = false
-}
+if (!operation.equalsIgnoreCase(DELETE_OPERATION_OPT_VAL)) {
+  // register classes & schemas
+  val structName = s"${tblName.get}_record"
+  val nameSpace = s"hoodie.${tblName.get}"
+  sparkContext.getConf.registerKryoClasses(
+Array(classOf[org.apache.avro.generic.GenericData],
+  classOf[org.apache.avro.Schema]))
+  val schema = 
AvroConversionUtils.convertStructTypeToAvroSchema(df.schema, structName, 
nameSpace)
+  sparkContext.getConf.registerAvroSchemas(schema)
+  log.info(s"Registered avro schema : ${schema.toString(true)}")
 
-// Create the dataset if not present
-if (!exists) {
-  HoodieTableMetaClient.initTableType(sparkContext.hadoopConfiguration, 
path.get, storageType,
-tblName.get, "archived")
-}
+  // Convert to RDD[HoodieRecord]
+  val keyGenerator = 
DataSourceUtils.createKeyGenerator(toProperties(parameters))
+  val genericRecords: RDD[GenericRecord] = 
AvroConversionUtils.createRdd(df, structName, nameSpace)
+  val hoodieAllIncomingRecords = genericRecords.map(gr => {
+val orderingVal = DataSourceUtils.getNestedFieldValAsString(
+  gr, parameters(PRECOMBINE_FIELD_OPT_KEY)).asInstanceOf[Comparable[_]]
+DataSourceUtils.createHoodieRecord(gr,
+  orderingVal, keyGenerator.getKey(gr), 
parameters(PAYLOAD_CLASS_OPT_KEY))
+  }).toJavaRDD()
 
-// Create a HoodieWriteClient & issue the write.
-val client = DataSourceUtils.createHoodieClient(jsc, schema.toString, 
path.get, tblName.get,
-  mapAsJavaMap(parameters)
-)
-
-val hoodieRecords =
-  if (parameters(INSERT_DROP_DUPS_OPT_KEY).toBoolean) {
-DataSourceUtils.dropDuplicates(
-  jsc,
-  hoodieAllIncomingRecords,
-  mapAsJavaMap(parameters), client.getTimelineServer)
-  } else {
-hoodieAllIncomingRecords
+  // Handle various save modes
+  if (mode == SaveMode.ErrorIfExists && exists) {
 
 Review comment:
   You mean immediately after write handling begins? This was the pre-existing code flow; I didn't want to change any ordering without knowing the details, and only worked on the delete path (i.e. the else block, lines 194 - 271). 
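
For completeness, the datasource-level counterpart of that delete path, sketched from the diff above (DELETE_OPERATION_OPT_VAL is the value this PR introduces; the DataFrame `df`, the field names, the table name and basePath are placeholders):

```Scala
import org.apache.hudi.DataSourceWriteOptions
import org.apache.hudi.config.HoodieWriteConfig
import org.apache.spark.sql.SaveMode

// Issue deletes for the records carried by df, using the new delete operation.
df.write.format("org.apache.hudi").
  option(DataSourceWriteOptions.OPERATION_OPT_KEY, DataSourceWriteOptions.DELETE_OPERATION_OPT_VAL).
  option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY, "deviceId").
  option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY, "eventTimeMilli").
  option(HoodieWriteConfig.TABLE_NAME, "hudi_test").
  mode(SaveMode.Append).
  save(basePath)
```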



[GitHub] [incubator-hudi] yanghua commented on a change in pull request #1029: [HUDI-348] Add Issue template for the project

2019-11-20 Thread GitBox
yanghua commented on a change in pull request #1029: [HUDI-348] Add Issue 
template for the project
URL: https://github.com/apache/incubator-hudi/pull/1029#discussion_r348562199
 
 

 ##
 File path: .github/ISSUE_TEMPLATE/bug_report.md
 ##
 @@ -0,0 +1,30 @@
+---
+name: Bug report
 
 Review comment:
   +1 to limit the category of issues. Another question: does "suggestion" belong under "support" here?




[GitHub] [incubator-hudi] yanghua commented on a change in pull request #1029: [HUDI-348] Add Issue template for the project

2019-11-20 Thread GitBox
yanghua commented on a change in pull request #1029: [HUDI-348] Add Issue 
template for the project
URL: https://github.com/apache/incubator-hudi/pull/1029#discussion_r348563108
 
 

 ##
 File path: .github/ISSUE_TEMPLATE/bug_report.md
 ##
 @@ -0,0 +1,30 @@
+---
+name: Bug report
+about: Create a report to help us improve
+title: ''
+labels: bug
+assignees: ''
+
+---
+
+**Describe the bug**
+A clear and concise description of what the bug is.
+
 
 Review comment:
   And whether it is running on Docker or not?




[GitHub] [incubator-hudi] yanghua commented on a change in pull request #1029: [HUDI-348] Add Issue template for the project

2019-11-20 Thread GitBox
yanghua commented on a change in pull request #1029: [HUDI-348] Add Issue 
template for the project
URL: https://github.com/apache/incubator-hudi/pull/1029#discussion_r348562199
 
 

 ##
 File path: .github/ISSUE_TEMPLATE/bug_report.md
 ##
 @@ -0,0 +1,30 @@
+---
+name: Bug report
 
 Review comment:
   +1 to limit the category of issues. Shall we add a reminder that users who want to report a bug should go to JIRA? Another question: does "suggestion" belong under "support" here?




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1029: [HUDI-348] Add Issue template for the project

2019-11-20 Thread GitBox
vinothchandar commented on a change in pull request #1029: [HUDI-348] Add Issue 
template for the project
URL: https://github.com/apache/incubator-hudi/pull/1029#discussion_r348550354
 
 

 ##
 File path: .github/ISSUE_TEMPLATE/bug_report.md
 ##
 @@ -0,0 +1,30 @@
+---
+name: Bug report
+about: Create a report to help us improve
+title: ''
+labels: bug
+assignees: ''
+
+---
 
 Review comment:
   Add a section here with guidelines on the following? 
   
   - Refer to the FAQ first, to see if this is covered (link to faq)
   - Join the mailing list to engage in conversations and get faster support (link to signing up)
   - If you have triaged this as a real issue, then file a JIRA directly instead (link to instructions)
   
   
   




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1029: [HUDI-348] Add Issue template for the project

2019-11-20 Thread GitBox
vinothchandar commented on a change in pull request #1029: [HUDI-348] Add Issue 
template for the project
URL: https://github.com/apache/incubator-hudi/pull/1029#discussion_r348548797
 
 

 ##
 File path: .github/ISSUE_TEMPLATE/bug_report.md
 ##
 @@ -0,0 +1,30 @@
+---
+name: Bug report
+about: Create a report to help us improve
 
 Review comment:
   Replace with `about: Report an issue faced while using Hudi` 




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1029: [HUDI-348] Add Issue template for the project

2019-11-20 Thread GitBox
vinothchandar commented on a change in pull request #1029: [HUDI-348] Add Issue 
template for the project
URL: https://github.com/apache/incubator-hudi/pull/1029#discussion_r348548953
 
 

 ##
 File path: .github/ISSUE_TEMPLATE/bug_report.md
 ##
 @@ -0,0 +1,30 @@
+---
+name: Bug report
+about: Create a report to help us improve
+title: ''
+labels: bug
 
 Review comment:
   leave blank?




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1029: [HUDI-348] Add Issue template for the project

2019-11-20 Thread GitBox
vinothchandar commented on a change in pull request #1029: [HUDI-348] Add Issue 
template for the project
URL: https://github.com/apache/incubator-hudi/pull/1029#discussion_r348549310
 
 

 ##
 File path: .github/ISSUE_TEMPLATE/bug_report.md
 ##
 @@ -0,0 +1,30 @@
+---
+name: Bug report
+about: Create a report to help us improve
+title: ''
+labels: bug
+assignees: ''
+
+---
+
+**Describe the bug**
 
 Review comment:
   Replace with "Describe the problem you faced" 




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1029: [HUDI-348] Add Issue template for the project

2019-11-20 Thread GitBox
vinothchandar commented on a change in pull request #1029: [HUDI-348] Add Issue 
template for the project
URL: https://github.com/apache/incubator-hudi/pull/1029#discussion_r348548493
 
 

 ##
 File path: .github/ISSUE_TEMPLATE/bug_report.md
 ##
 @@ -0,0 +1,30 @@
+---
+name: Bug report
 
 Review comment:
   Replace with `Support Request` (also please change md file name). 
   




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1029: [HUDI-348] Add Issue template for the project

2019-11-20 Thread GitBox
vinothchandar commented on a change in pull request #1029: [HUDI-348] Add Issue 
template for the project
URL: https://github.com/apache/incubator-hudi/pull/1029#discussion_r348551822
 
 

 ##
 File path: .github/ISSUE_TEMPLATE/config.yml
 ##
 @@ -0,0 +1,9 @@
+blank_issues_enabled: false
+contact_links:
+  - name: FAQ
+url: https://cwiki.apache.org/confluence/display/HUDI/FAQ
+about: Do check out the frequently asked questions before filing an issue.
+  - name: Hudi Issue Tracker
+url: https://issues.apache.org/jira/projects/HUDI
+about: Reporting bugs on jira helps us better track the issue, but it is not mandatory as of now.
 
 Review comment:
   actually if the user has triaged the bug already, then it can be created on 
JIRA instead. 




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1029: [HUDI-348] Add Issue template for the project

2019-11-20 Thread GitBox
vinothchandar commented on a change in pull request #1029: [HUDI-348] Add Issue 
template for the project
URL: https://github.com/apache/incubator-hudi/pull/1029#discussion_r348551229
 
 

 ##
 File path: .github/ISSUE_TEMPLATE/bug_report.md
 ##
 @@ -0,0 +1,30 @@
+---
+name: Bug report
+about: Create a report to help us improve
+title: ''
+labels: bug
+assignees: ''
+
+---
+
+**Describe the bug**
+A clear and concise description of what the bug is.
+
 
 Review comment:
   Add a section on environment and collect following data? 
   
   - Hudi version :
   - Spark version :
   - Hive version :
   - Hadoop version :
   - Storage(HDFS/S3/GCS..) :
   
   
   




[GitHub] [incubator-hudi] vinothchandar closed issue #1007: Azure Support

2019-11-20 Thread GitBox
vinothchandar closed issue #1007: Azure Support 
URL: https://github.com/apache/incubator-hudi/issues/1007
 
 
   




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1024: [HUDI-345] Fix used deprecated function

2019-11-20 Thread GitBox
vinothchandar commented on a change in pull request #1024: [HUDI-345] Fix used 
deprecated function
URL: https://github.com/apache/incubator-hudi/pull/1024#discussion_r348545277
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/io/storage/SizeAwareFSDataOutputStream.java
 ##
 @@ -43,7 +43,7 @@
 
   public SizeAwareFSDataOutputStream(Path path, FSDataOutputStream out, 
ConsistencyGuard consistencyGuard,
   Runnable closeCallback) throws IOException {
-super(out);
+super(out, null);
 
 Review comment:
   Is it okay to pass null here? This is the stats object, right?
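
A small self-contained check of that question, reflecting one reading of the Hadoop API (assumption: the two-argument constructor FSDataOutputStream(OutputStream, FileSystem.Statistics) accepts a null Statistics and simply skips byte-count tracking; worth confirming against the Hadoop source):

```Scala
import java.io.ByteArrayOutputStream
import org.apache.hadoop.fs.FSDataOutputStream

// With null Statistics, writes still succeed; no statistics are recorded.
val out = new FSDataOutputStream(new ByteArrayOutputStream(), null)
out.write(42)
out.close()
```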




[GitHub] [incubator-hudi] vinothchandar commented on issue #1032: [DOC] Add reload step

2019-11-20 Thread GitBox
vinothchandar commented on issue #1032: [DOC] Add reload step
URL: https://github.com/apache/incubator-hudi/pull/1032#issuecomment-556051604
 
 
   @bhasudha please review




[incubator-hudi] branch master updated (d9fbe33 -> 1e14390)

2019-11-20 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository.

vbalaji pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git.


from d9fbe33  [HOTFIX] Fix error configuration item of 
dockerfile-maven-plugin
 add 1e14390  [HUDI-350]: updated default value of 
config.getCleanerCommitsRetained() in javadocs

No new revisions were added by this update.

Summary of changes:
 .../src/main/java/org/apache/hudi/io/HoodieCleanHelper.java| 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)



[GitHub] [incubator-hudi] bvaradar merged pull request #1031: [HUDI-350]: updated default value of config.getCleanerCommitsRetained() in javadocs

2019-11-20 Thread GitBox
bvaradar merged pull request #1031: [HUDI-350]: updated default value of 
config.getCleanerCommitsRetained() in javadocs
URL: https://github.com/apache/incubator-hudi/pull/1031
 
 
   




[GitHub] [incubator-hudi] pratyakshsharma commented on a change in pull request #1031: [HUDI-350]: updated default value of config.getCleanerCommitsRetained() in javadocs

2019-11-20 Thread GitBox
pratyakshsharma commented on a change in pull request #1031: [HUDI-350]: 
updated default value of config.getCleanerCommitsRetained() in javadocs
URL: https://github.com/apache/incubator-hudi/pull/1031#discussion_r348503606
 
 

 ##
 File path: hudi-client/src/main/java/org/apache/hudi/io/HoodieCleanHelper.java
 ##
 @@ -176,10 +176,10 @@ public HoodieCleanHelper(HoodieTable hoodieTable, 
HoodieWriteConfig config) {
* - Leaves the latest version of the file untouched - For older versions, - 
It leaves all the commits untouched which
 * has occurred in last config.getCleanerCommitsRetained() 
commits - It leaves ONE commit before this
* window. We assume that the max(query execution time) == commit_batch_time 
* config.getCleanerCommitsRetained().
-   * This is 12 hours by default. This is essential to leave the file used by 
the query that's running for the max time.
+   * This is 5 hours by default. This is essential to leave the file used by 
the query that's running for the max time.
 
 Review comment:
   added




[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1004: [HUDI-328] Adding delete api to HoodieWriteClient

2019-11-20 Thread GitBox
bvaradar commented on a change in pull request #1004: [HUDI-328] Adding delete 
api to HoodieWriteClient
URL: https://github.com/apache/incubator-hudi/pull/1004#discussion_r348502684
 
 

 ##
 File path: hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
 ##
 @@ -72,131 +73,200 @@ private[hudi] object HoodieSparkSqlWriter {
 parameters(OPERATION_OPT_KEY)
   }
 
-// register classes & schemas
-val structName = s"${tblName.get}_record"
-val nameSpace = s"hoodie.${tblName.get}"
-sparkContext.getConf.registerKryoClasses(
-  Array(classOf[org.apache.avro.generic.GenericData],
-classOf[org.apache.avro.Schema]))
-val schema = AvroConversionUtils.convertStructTypeToAvroSchema(df.schema, 
structName, nameSpace)
-sparkContext.getConf.registerAvroSchemas(schema)
-log.info(s"Registered avro schema : ${schema.toString(true)}")
-
-// Convert to RDD[HoodieRecord]
-val keyGenerator = 
DataSourceUtils.createKeyGenerator(toProperties(parameters))
-val genericRecords: RDD[GenericRecord] = AvroConversionUtils.createRdd(df, 
structName, nameSpace)
-val hoodieAllIncomingRecords = genericRecords.map(gr => {
-  val orderingVal = DataSourceUtils.getNestedFieldValAsString(
-gr, parameters(PRECOMBINE_FIELD_OPT_KEY)).asInstanceOf[Comparable[_]]
-  DataSourceUtils.createHoodieRecord(gr,
-orderingVal, keyGenerator.getKey(gr), 
parameters(PAYLOAD_CLASS_OPT_KEY))
-}).toJavaRDD()
+var writeSuccessful: Boolean = false
+var commitTime: String = null
+var writeStatuses: JavaRDD[WriteStatus] = null
 
 val jsc = new JavaSparkContext(sparkContext)
-
 val basePath = new Path(parameters("path"))
 val fs = basePath.getFileSystem(sparkContext.hadoopConfiguration)
 var exists = fs.exists(new Path(basePath, 
HoodieTableMetaClient.METAFOLDER_NAME))
 
-// Handle various save modes
-if (mode == SaveMode.ErrorIfExists && exists) {
-  throw new HoodieException(s"hoodie dataset at $basePath already exists.")
-}
-if (mode == SaveMode.Ignore && exists) {
-  log.warn(s"hoodie dataset at $basePath already exists. Ignoring & not 
performing actual writes.")
-  return (true, common.util.Option.empty())
-}
-if (mode == SaveMode.Overwrite && exists) {
-  log.warn(s"hoodie dataset at $basePath already exists. Deleting existing 
data & overwriting with new data.")
-  fs.delete(basePath, true)
-  exists = false
-}
+if (!operation.equalsIgnoreCase(DELETE_OPERATION_OPT_VAL)) {
 
 Review comment:
   @nsivabalan : Add a comment in code to explain this




[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1004: [HUDI-328] Adding delete api to HoodieWriteClient

2019-11-20 Thread GitBox
bvaradar commented on a change in pull request #1004: [HUDI-328] Adding delete 
api to HoodieWriteClient
URL: https://github.com/apache/incubator-hudi/pull/1004#discussion_r348501608
 
 

 ##
 File path: hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
 ##
 @@ -72,131 +73,212 @@ private[hudi] object HoodieSparkSqlWriter {
 parameters(OPERATION_OPT_KEY)
   }
 
-// register classes & schemas
-val structName = s"${tblName.get}_record"
-val nameSpace = s"hoodie.${tblName.get}"
-sparkContext.getConf.registerKryoClasses(
-  Array(classOf[org.apache.avro.generic.GenericData],
-classOf[org.apache.avro.Schema]))
-val schema = AvroConversionUtils.convertStructTypeToAvroSchema(df.schema, 
structName, nameSpace)
-sparkContext.getConf.registerAvroSchemas(schema)
-log.info(s"Registered avro schema : ${schema.toString(true)}")
-
-// Convert to RDD[HoodieRecord]
-val keyGenerator = 
DataSourceUtils.createKeyGenerator(toProperties(parameters))
-val genericRecords: RDD[GenericRecord] = AvroConversionUtils.createRdd(df, 
structName, nameSpace)
-val hoodieAllIncomingRecords = genericRecords.map(gr => {
-  val orderingVal = DataSourceUtils.getNestedFieldValAsString(
-gr, parameters(PRECOMBINE_FIELD_OPT_KEY)).asInstanceOf[Comparable[_]]
-  DataSourceUtils.createHoodieRecord(gr,
-orderingVal, keyGenerator.getKey(gr), 
parameters(PAYLOAD_CLASS_OPT_KEY))
-}).toJavaRDD()
+var writeSuccessful: Boolean = false
+var commitTime: String = null
+var writeStatuses: JavaRDD[WriteStatus] = null
 
 val jsc = new JavaSparkContext(sparkContext)
-
 val basePath = new Path(parameters("path"))
 val fs = basePath.getFileSystem(sparkContext.hadoopConfiguration)
 var exists = fs.exists(new Path(basePath, 
HoodieTableMetaClient.METAFOLDER_NAME))
 
-// Handle various save modes
-if (mode == SaveMode.ErrorIfExists && exists) {
-  throw new HoodieException(s"hoodie dataset at $basePath already exists.")
-}
-if (mode == SaveMode.Ignore && exists) {
-  log.warn(s"hoodie dataset at $basePath already exists. Ignoring & not 
performing actual writes.")
-  return (true, common.util.Option.empty())
-}
-if (mode == SaveMode.Overwrite && exists) {
-  log.warn(s"hoodie dataset at $basePath already exists. Deleting existing 
data & overwriting with new data.")
-  fs.delete(basePath, true)
-  exists = false
-}
+if (!operation.equalsIgnoreCase(DELETE_OPERATION_OPT_VAL)) {
+  // register classes & schemas
+  val structName = s"${tblName.get}_record"
+  val nameSpace = s"hoodie.${tblName.get}"
+  sparkContext.getConf.registerKryoClasses(
+Array(classOf[org.apache.avro.generic.GenericData],
+  classOf[org.apache.avro.Schema]))
+  val schema = 
AvroConversionUtils.convertStructTypeToAvroSchema(df.schema, structName, 
nameSpace)
+  sparkContext.getConf.registerAvroSchemas(schema)
+  log.info(s"Registered avro schema : ${schema.toString(true)}")
 
-// Create the dataset if not present
-if (!exists) {
-  HoodieTableMetaClient.initTableType(sparkContext.hadoopConfiguration, 
path.get, storageType,
-tblName.get, "archived")
-}
+  // Convert to RDD[HoodieRecord]
+  val keyGenerator = 
DataSourceUtils.createKeyGenerator(toProperties(parameters))
+  val genericRecords: RDD[GenericRecord] = 
AvroConversionUtils.createRdd(df, structName, nameSpace)
+  val hoodieAllIncomingRecords = genericRecords.map(gr => {
+val orderingVal = DataSourceUtils.getNestedFieldValAsString(
+  gr, parameters(PRECOMBINE_FIELD_OPT_KEY)).asInstanceOf[Comparable[_]]
+DataSourceUtils.createHoodieRecord(gr,
+  orderingVal, keyGenerator.getKey(gr), 
parameters(PAYLOAD_CLASS_OPT_KEY))
+  }).toJavaRDD()
 
-// Create a HoodieWriteClient & issue the write.
-val client = DataSourceUtils.createHoodieClient(jsc, schema.toString, 
path.get, tblName.get,
-  mapAsJavaMap(parameters)
-)
-
-val hoodieRecords =
-  if (parameters(INSERT_DROP_DUPS_OPT_KEY).toBoolean) {
-DataSourceUtils.dropDuplicates(
-  jsc,
-  hoodieAllIncomingRecords,
-  mapAsJavaMap(parameters), client.getTimelineServer)
-  } else {
-hoodieAllIncomingRecords
+  // Handle various save modes
+  if (mode == SaveMode.ErrorIfExists && exists) {
 
 Review comment:
   Why not check these immediately after delete handling begins?




[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1004: [HUDI-328] Adding delete api to HoodieWriteClient

2019-11-20 Thread GitBox
bvaradar commented on a change in pull request #1004: [HUDI-328] Adding delete 
api to HoodieWriteClient
URL: https://github.com/apache/incubator-hudi/pull/1004#discussion_r348494772
 
 

 ##
 File path: hudi-client/src/main/java/org/apache/hudi/HoodieWriteClient.java
 ##
 @@ -1345,4 +1423,13 @@ private void updateMetadataAndRollingStats(String 
actionType, HoodieCommitMetada
 }
   }
 
+  /**
+   * Refers to different operation types
+   */
+  enum OperationType {
 
 Review comment:
   Can you also add the prepped variants of insert/upsert/bulk-insert to 
uniquely identify each public API that caused the commit?




[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1004: [HUDI-328] Adding delete api to HoodieWriteClient

2019-11-20 Thread GitBox
bvaradar commented on a change in pull request #1004: [HUDI-328] Adding delete 
api to HoodieWriteClient
URL: https://github.com/apache/incubator-hudi/pull/1004#discussion_r348500830
 
 

 ##
 File path: 
hudi-client/src/test/java/org/apache/hudi/TestHoodieClientOnCopyOnWriteStorage.java
 ##
 @@ -274,6 +275,15 @@ private void testUpsertsInternal(HoodieWriteConfig 
hoodieWriteConfig,
 updateBatch(hoodieWriteConfig, client, newCommitTime, prevCommitTime,
 Option.of(Arrays.asList(commitTimeBetweenPrevAndNew)), initCommitTime, 
numRecords, writeFn, isPrepped, true,
 numRecords, 200, 2);
+
+// Delete 1
+prevCommitTime = newCommitTime;
+newCommitTime = "005";
+numRecords = 50;
+
+deleteBatch(hoodieWriteConfig, client, newCommitTime, prevCommitTime,
 
 Review comment:
   I couldn't pinpoint a test-case where we create a brand new hoodie write 
client and write config (which does not have previous avro schema). If there is 
one already, we are good. 




[GitHub] [incubator-hudi] lamber-ken commented on issue #1033: [doc][chinese] Translate the release page into chinese documentation

2019-11-20 Thread GitBox
lamber-ken commented on issue #1033: [doc][chinese] Translate the release page 
into chinese documentation
URL: https://github.com/apache/incubator-hudi/pull/1033#issuecomment-556015273
 
 
   Hi @leesf, could you help review? Thanks.




[GitHub] [incubator-hudi] lamber-ken opened a new pull request #1033: [doc][chinese] Translate the release page into chinese documentation

2019-11-20 Thread GitBox
lamber-ken opened a new pull request #1033: [doc][chinese] Translate the 
release page into chinese documentation
URL: https://github.com/apache/incubator-hudi/pull/1033
 
 
   ## What is the purpose of the pull request
   
   Translate the release page into chinese documentation
   
   ## Brief change log
   
 - Translate the release page into chinese documentation
 - Modify mydoc_sidebar_cn
 - Add strings_cn
 - Fix typo in quickstart
   
   ## Verify this pull request
   
   This pull request is a doc work without any test coverage.
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.




[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1031: [HUDI-350]: updated default value of config.getCleanerCommitsRetained() in javadocs

2019-11-20 Thread GitBox
bvaradar commented on a change in pull request #1031: [HUDI-350]: updated 
default value of config.getCleanerCommitsRetained() in javadocs
URL: https://github.com/apache/incubator-hudi/pull/1031#discussion_r348490034
 
 

 ##
 File path: hudi-client/src/main/java/org/apache/hudi/io/HoodieCleanHelper.java
 ##
 @@ -176,10 +176,10 @@ public HoodieCleanHelper(HoodieTable hoodieTable, 
HoodieWriteConfig config) {
* - Leaves the latest version of the file untouched - For older versions, - 
It leaves all the commits untouched which
 * has occurred in last config.getCleanerCommitsRetained() 
commits - It leaves ONE commit before this
* window. We assume that the max(query execution time) == commit_batch_time 
* config.getCleanerCommitsRetained().
-   * This is 12 hours by default. This is essential to leave the file used by 
the query that's running for the max time.
+   * This is 5 hours by default. This is essential to leave the file used by 
the query that's running for the max time.
 
 Review comment:
   Can you also explicitly add the assumption (that ingestion runs every 30 minutes) here?
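
For the record, the arithmetic behind the corrected figure: with the 30-minute ingestion cadence assumed above and the default of 10 retained cleaner commits (my reading of hoodie.cleaner.commits.retained, worth double-checking), max(query execution time) == commit_batch_time * cleanerCommitsRetained == 30 minutes * 10 == 5 hours.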




[jira] [Updated] (HUDI-348) Add Issue template for the project

2019-11-20 Thread Gurudatt Kulkarni (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gurudatt Kulkarni updated HUDI-348:
---
Summary: Add Issue template for the project  (was: Create a issue template 
for github)

> Add Issue template for the project
> --
>
> Key: HUDI-348
> URL: https://issues.apache.org/jira/browse/HUDI-348
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: Developer Productivity
>Reporter: Gurudatt Kulkarni
>Assignee: Gurudatt Kulkarni
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add an issue template for a convenient way to file issues. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-hudi] Guru107 commented on issue #1029: [HUDI-348] Add Issue template for the project

2019-11-20 Thread GitBox
Guru107 commented on issue #1029: [HUDI-348] Add Issue template for the project
URL: https://github.com/apache/incubator-hudi/pull/1029#issuecomment-555999002
 
 
   @yanghua Done




[GitHub] [incubator-hudi] leesf edited a comment on issue #1024: [HUDI-345] Fix used deprecated function

2019-11-20 Thread GitBox
leesf edited a comment on issue #1024: [HUDI-345] Fix used deprecated function
URL: https://github.com/apache/incubator-hudi/pull/1024#issuecomment-555993173
 
 
   > @leesf thanks. So, this is just a fix for the deprecated method. By the way, is there any WeChat group or DingTalk?
   
   Yes, we created a WeChat group, shown below.
   
   
![meitu_1](https://user-images.githubusercontent.com/1012/69241025-c2f68a00-0bd8-11ea-80d9-2e5af163816f.jpg)
   




[GitHub] [incubator-hudi] leesf commented on issue #1024: [HUDI-345] Fix used deprecated function

2019-11-20 Thread GitBox
leesf commented on issue #1024: [HUDI-345] Fix used deprecated function
URL: https://github.com/apache/incubator-hudi/pull/1024#issuecomment-555993173
 
 
   > @leesf thanks. So, this is just a fix for the deprecated method. By the way, is there any WeChat group or DingTalk?
   
   Yes, we created a WeChat group; see:
   https://github.com/apache/incubator-hudi/pull/1024#issuecomment-555895904




[jira] [Commented] (HUDI-288) Add support for ingesting multiple kafka streams in a single DeltaStreamer deployment

2019-11-20 Thread leesf (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16978324#comment-16978324
 ] 

leesf commented on HUDI-288:


> I think we can stick to the same whitelist/blacklist that Kafka itself uses? 

It makes sense.

> IIUC, even now, we can specify multiple topics as source but they get 
>written as a single Hudi dataset.

Taking a look at the current code, the config 
_hoodie.deltastreamer.source.kafka.topic_ identifies the topic to ingest, and I 
think it does not support multiple topics, so currently only a single topic can 
be configured for ingestion. Please correct me if I have missed anything.
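To make that concrete, a minimal sketch of the existing single-topic 
configuration, together with a purely hypothetical whitelist-style key mirroring 
Kafka's convention (the second key does not exist in Hudi today and is shown 
only to sketch the proposal):
{code}
# Existing single-topic config; key taken from the discussion above
hoodie.deltastreamer.source.kafka.topic=impressions

# Hypothetical whitelist-style key mirroring Kafka's regex whitelists;
# NOT an existing Hudi option, illustrative only
hoodie.deltastreamer.source.kafka.topic.whitelist=impressions.*,clicks
{code}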

> we want to ingest kafka topics as separate Hudi datasets.  1-1 mapping 
>between a kafka topic and a hudi dataset.. I think the tool can take a 
>`--base-path-prefix` and place each hudi dataset under 
>`/`

It makes sense.
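A minimal sketch, in Java, of how that 1-1 topic-to-dataset mapping might be 
derived from a {{--base-path-prefix}}; the directory layout and all names below 
are illustrative assumptions, not taken from an actual implementation:
{code:java}
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only: derive one Hudi base path per Kafka topic
// from a single --base-path-prefix argument (1-1 topic-to-dataset mapping).
public class BasePathSketch {

  static String basePathFor(String basePathPrefix, String topic) {
    // Each topic becomes its own Hudi dataset under the shared prefix.
    return basePathPrefix + "/" + topic;
  }

  public static void main(String[] args) {
    String prefix = "hdfs://namenode/data/hudi"; // value of --base-path-prefix
    List<String> topics = Arrays.asList("impressions", "clicks");
    topics.forEach(t -> System.out.println(basePathFor(prefix, t)));
  }
}
{code}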

> Also we could allow topic level overrides as needed.. for delta streamer/hudi 
>properties.. Our DFSPropertiesConfiguration class already supports includes as 
>well. 

Sorry, I did not understand this correctly. Could you please share more details?
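In the meantime, here is how I read the includes support, as a sketch only; I am 
assuming an {{include=}} line in a per-topic properties file, and all file names 
and keys below are illustrative:
{code}
# topic_clicks.properties (illustrative per-topic override file)
include=common.properties

# topic-level overrides applied on top of the shared settings
hoodie.datasource.write.recordkey.field=click_id
hoodie.datasource.write.partitionpath.field=event_date
{code}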

> Are you targeting this for the 0.5.1 release? Or do you think we can pick 
>up some things already labelled for that release?

I would like to get it ready for 0.5.1.

> Add support for ingesting multiple kafka streams in a single DeltaStreamer 
> deployment
> -
>
> Key: HUDI-288
> URL: https://issues.apache.org/jira/browse/HUDI-288
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: deltastreamer
>Reporter: Vinoth Chandar
>Assignee: leesf
>Priority: Major
>
> https://lists.apache.org/thread.html/3a69934657c48b1c0d85cba223d69cb18e18cd8aaa4817c9fd72cef6@
>  has all the context



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-hudi] yanghua commented on issue #991: Hudi Test Suite (Refactor)

2019-11-20 Thread GitBox
yanghua commented on issue #991: Hudi Test Suite (Refactor) 
URL: https://github.com/apache/incubator-hudi/pull/991#issuecomment-555921351
 
 
   @n3nash The Travis build is still red. Can we move this PR into a new branch 
now so that I can start working on it?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (HUDI-352) The official documentation about project structure missed hudi-timeline-service module

2019-11-20 Thread vinoyang (Jira)
vinoyang created HUDI-352:
-

 Summary: The official documentation about project structure missed 
hudi-timeline-service module
 Key: HUDI-352
 URL: https://issues.apache.org/jira/browse/HUDI-352
 Project: Apache Hudi (incubating)
  Issue Type: Improvement
  Components: Docs
Reporter: vinoyang


The official documentation about the project structure [1] is missing the 
hudi-timeline-service module; we should add it.

[1]: http://hudi.apache.org/contributing.html#code--project-structure



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-hudi] hddong commented on issue #1024: [HUDI-345] Fix used deprecated function

2019-11-20 Thread GitBox
hddong commented on issue #1024: [HUDI-345] Fix used deprecated function
URL: https://github.com/apache/incubator-hudi/pull/1024#issuecomment-555895904
 
 
   @leesf thanks. So, this is just a fix for the deprecated method. By the way, 
is there any WeChat or DingTalk group?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services