[jira] [Updated] (HUDI-385) Refactor hudi scala checkstyle

2019-12-05 Thread lamber-ken (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lamber-ken updated HUDI-385:

Description: 
Currently, the Scala code style rules are set to the warning level. It would be 
better to check these rules one by one and refactor the Scala code accordingly.

 

Furthermore, to stay in sync with the Java code style, two rules need to be 
added. One is BlockImportChecker, which ensures that only single imports are 
used in order to minimize merge errors in import declarations.

The other is ImportOrderChecker, which checks that imports are grouped and 
ordered according to the style configuration.

 

Summary

1. Check the Scala checkstyle rules one by one and raise some of them from 
warning to error.

2. Add ImportOrderChecker and BlockImportChecker.

 

  was:Currently, the level of scala codestyle rule is warning, it's better 
check these rules one by one and refactor scala codes then now. Furthermore, in 
order to sync to java codestyle, needs to add two rules. One is 
BlockImportChecker which allows to ensure that only single imports are used in 
order to minimize merge errors in import declarations, another is 
ImportOrderChecker which checks that imports are grouped and ordered according 
to the style configuration. Summary 1, check scala checkstyle rules one by one, 
change some warning level to error. 2, add ImportOrderChecker and 
BlockImportChecker. Any comments and feedback are welcome, WDYT?


> Refactor hudi scala checkstyle
> --
>
> Key: HUDI-385
> URL: https://issues.apache.org/jira/browse/HUDI-385
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>Reporter: lamber-ken
>Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HUDI-385) Refactor hudi scala checkstyle

2019-12-05 Thread lamber-ken (Jira)
lamber-ken created HUDI-385:
---

 Summary: Refactor hudi scala checkstyle
 Key: HUDI-385
 URL: https://issues.apache.org/jira/browse/HUDI-385
 Project: Apache Hudi (incubating)
  Issue Type: Improvement
Reporter: lamber-ken


Currently, the Scala code style rules are set to the warning level. It would be 
better to check these rules one by one and refactor the Scala code accordingly.

Furthermore, to stay in sync with the Java code style, two rules need to be 
added. One is BlockImportChecker, which ensures that only single imports are 
used in order to minimize merge errors in import declarations. The other is 
ImportOrderChecker, which checks that imports are grouped and ordered according 
to the style configuration.

Summary

1. Check the Scala checkstyle rules one by one and raise some of them from 
warning to error.

2. Add ImportOrderChecker and BlockImportChecker.

Any comments and feedback are welcome, WDYT?
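
To make the BlockImportChecker idea concrete, here is a minimal sketch, 
assuming the rule is meant to reject Scala block imports with several 
comma-separated selectors, such as `import a.b.{C, D}`. The class 
`BlockImportSketch` and its method are hypothetical illustrations. They are not 
part of scalastyle or Hudi, and the real checker may treat more cases than this 
regular expression does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of scalastyle or Hudi.
// It flags Scala import lines of the form `import some.pkg.{A, B}`.
class BlockImportSketch {
    // Assumption: a "block import" is a braced selector list containing a comma.
    private static final Pattern BLOCK_IMPORT =
        Pattern.compile("^\\s*import\\s+[\\w.]+\\.\\{[^}]*,[^}]*\\}\\s*$");

    // Returns the 1-based numbers of the lines that look like block imports.
    static List<Integer> findBlockImports(List<String> lines) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (BLOCK_IMPORT.matcher(lines.get(i)).matches()) {
                hits.add(i + 1);
            }
        }
        return hits;
    }
}
```

With single imports, two branches that each add one import touch different 
lines, which is why the rule is expected to reduce merge errors.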





[jira] [Resolved] (HUDI-379) Refactor the codes based on new JavadocStyle code style rule

2019-12-05 Thread lamber-ken (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lamber-ken resolved HUDI-379.
-
Resolution: Resolved

> Refactor the codes based on new JavadocStyle code style rule
> 
>
> Key: HUDI-379
> URL: https://issues.apache.org/jira/browse/HUDI-379
> Project: Apache Hudi (incubating)
>  Issue Type: Sub-task
>Reporter: lamber-ken
>Assignee: lamber-ken
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Refactor the codes based on new JavadocStyle code style rule





[GitHub] [incubator-hudi] leesf commented on issue #1081: [HUDI-381] Refactor hudi-client based on new comment and code style r…

2019-12-05 Thread GitBox
leesf commented on issue #1081: [HUDI-381] Refactor hudi-client based on new 
comment and code style r…
URL: https://github.com/apache/incubator-hudi/pull/1081#issuecomment-562432991
 
 
   Hi @XuQianJin-Stars, thanks for opening the PR, but the work has already 
been done by PR #1079. Would you please close this PR?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] leesf merged pull request #1079: [HUDI-379] Refactor the codes based on new JavadocStyle code style rule

2019-12-05 Thread GitBox
leesf merged pull request #1079: [HUDI-379] Refactor the codes based on new 
JavadocStyle code style rule
URL: https://github.com/apache/incubator-hudi/pull/1079
 
 
   




Build failed in Jenkins: hudi-snapshot-deployment-0.5 #120

2019-12-05 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 2.21 KB...]
Detected current version as: 
'HUDI_home=
0.5.1-SNAPSHOT'
[INFO] Scanning for projects...
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Hudi                          [pom]
[INFO] hudi-common                   [jar]
[INFO] hudi-timeline-service         [jar]
[INFO] hudi-hadoop-mr                [jar]
[INFO] hudi-client                   [jar]
[INFO] hudi-hive                     [jar]
[INFO] hudi-spark                    [jar]
[INFO] hudi-utilities                [jar]
[INFO] hudi-cli                      [jar]
[INFO] hudi-hadoop-mr-bundle         [jar]
[INFO] hudi-hive-bundle              [jar]
[INFO] hudi-spark-bundle             [jar]
[INFO] hudi-presto-bundle            [jar]
[INFO] hudi-utilities-bundle         [jar]
[INFO] hudi-timeline-server-bundle

[GitHub] [incubator-hudi] zhedoubushishi commented on issue #1036: [HUDI-353] Add hive style partitioning path

2019-12-05 Thread GitBox
zhedoubushishi commented on issue #1036: [HUDI-353] Add hive style partitioning 
path
URL: https://github.com/apache/incubator-hudi/pull/1036#issuecomment-562414420
 
 
   Code changes are done. Ready for review.




[GitHub] [incubator-hudi] yanghua commented on a change in pull request #1057: Hudi Test Suite

2019-12-05 Thread GitBox
yanghua commented on a change in pull request #1057: Hudi Test Suite
URL: https://github.com/apache/incubator-hudi/pull/1057#discussion_r354638462
 
 

 ##
 File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/converter/Converter.java
 ##
 @@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.converter;
+
+import java.io.Serializable;
+import org.apache.spark.api.java.JavaRDD;
+
+/**
+ * Implementations of {@link Converter} will convert data from one format to 
another
+ *
+ * @param  Input Data Type
+ * @param  Output Data Type
+ */
+public interface Converter extends Serializable {
 
 Review comment:
   While reviewing #991 I surveyed the git history. It seems Hudi had a 
`Converter` before, but we removed it later. I am not sure it is worth 
reintroducing it just for the test suite.
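
For readers without the diff at hand, the following is a minimal sketch of 
what a converter abstraction of this kind could look like. The type parameters 
in the quoted diff were lost in this rendering, so the names `I` and `O`, the 
method name `convert`, and the example implementation are all assumptions. 
`java.util.List` stands in for Spark's `JavaRDD` only to keep the sketch 
self-contained.

```java
import java.io.Serializable;
import java.util.List;
import java.util.stream.Collectors;

// Assumed shape: I is the input data type, O is the output data type.
// The method name `convert` is hypothetical.
interface ConverterSketch<I, O> extends Serializable {
    List<O> convert(List<I> input);
}

// Illustrative implementation: converts integers to their decimal strings.
class IntToStringConverter implements ConverterSketch<Integer, String> {
    @Override
    public List<String> convert(List<Integer> input) {
        return input.stream()
            .map(i -> Integer.toString(i))
            .collect(Collectors.toList());
    }
}
```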




[GitHub] [incubator-hudi] yanghua commented on issue #1057: Hudi Test Suite

2019-12-05 Thread GitBox
yanghua commented on issue #1057: Hudi Test Suite
URL: https://github.com/apache/incubator-hudi/pull/1057#issuecomment-562401410
 
 
   > @yanghua Can we continue to use the same branch `hudi_test_suite_refactor` 
please ? You can continue to PR against that branch. I've also fixed the issue 
with POM that was causing the test case failure, you may want to rebase this PR 
against that branch.
   
   I am comfortable with either branch. The reason I pushed a new 
`hudi_test_suite` branch is explained 
[here](https://github.com/apache/incubator-hudi/pull/991#issuecomment-559362785).
 I pinged you and waited a while for a suggestion, and then got one from 
@vinothchandar. If you can fix the conflicts against the master branch, I can 
go back to the `hudi_test_suite_refactor` branch. Alternatively, you can 
cherry-pick your commits for the POM issue into my new branch and I can rename 
the branch.
   
   > Secondly, I think we should name the package - `hudi-test-suite` instead 
of `hudi-end-to-end-tests`, it's more concise, WDYT ?
   
   OK, agreed. Once we agree on the branch, we can fix the module naming.
   
   > 
   > Also, can we execute on the following plan :
   > 
   > 1. Please go over all the changes in the existing PR - I've addressed most 
of the comments.
   
   There were too many comments on the old PR #991, and reviewing it would 
take too much time. Let's review the new PR instead, so that outdated comments 
do not get in the way. That is more effective. WDYT?
   
   > 2. Let's rename the package - like you've already done
   
   OK, agree.
   
   > 3. Both of us can start to use the test suite and run them locally for 2-3 
days. This will give us confidence whether the test suite is ready to test 
large features.
   
   OK, will try it.
   
   > 4. Let's merge this first version of the test suite to master so folks can 
start using it
   
   I will let @vinothchandar decide whether it is the right time to merge it 
into the master branch.
   
   > 5. Both of us can continue to enhance the test suite - a) You can add some 
of the things that you mentioned earlier like DIstributedTestDataSource etc b) 
I want to add more use-cases
   > 
   
   OK
   
   > Let me know if this sounds fair to you.
   
   First of all, as you can see, I kept your code and your git commits. All I 
did was squash them into one to fix the conflicts.
   
   Some of this may be caused by insufficient communication between us. If we 
can respond to each other's messages more quickly, our work will probably be 
more efficient.
   
   Personally, I try to respond to my GitHub messages every day, except on 
weekends or long holidays. @n3nash Do you pay more attention to Slack? If so, 
I can communicate with you via Slack. Or can we create a test suite Slack 
group? However, IMO, communicating via GitHub is better because we can 
discuss code.
   




[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #976: [HUDI-106] Adding support for DynamicBloomFilter

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #976: [HUDI-106] Adding support 
for DynamicBloomFilter
URL: https://github.com/apache/incubator-hudi/pull/976#discussion_r354601781
 
 

 ##
 File path: LICENSE
 ##
 @@ -293,3 +293,29 @@ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR 
OTHER DEALINGS IN THE
 SOFTWARE.
 
 ---
+
+This product includes code from org.apache.hadoop.
+
+* org.apache.hudi.common.bloom.filter.InternalDynamicBloomFilter.java adapted 
from org.apache.hadoop.util.bloom.DynamicBloomFilter.java
+
+* org.apache.hudi.common.bloom.filter.InternalFilter copied from classes in 
org.apache.hadoop.util.bloom package
+
 
 Review comment:
   nit: Add the words "with the following license".
   
   That would make the wording consistent with the other entries.
   
   




[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #976: [HUDI-106] Adding support for DynamicBloomFilter

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #976: [HUDI-106] Adding support 
for DynamicBloomFilter
URL: https://github.com/apache/incubator-hudi/pull/976#discussion_r354601802
 
 

 ##
 File path: LICENSE
 ##
 @@ -293,3 +293,29 @@ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR 
OTHER DEALINGS IN THE
 SOFTWARE.
 
 ---
+
+This product includes code from org.apache.hadoop.
 
 Review comment:
   Looks good otherwise




[jira] [Assigned] (HUDI-383) Introduce TransactionHandle abstraction to manage state transitions in hudi clients

2019-12-05 Thread leesf (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

leesf reassigned HUDI-383:
--

Assignee: leesf

> Introduce TransactionHandle abstraction to manage state transitions in hudi 
> clients
> ---
>
> Key: HUDI-383
> URL: https://issues.apache.org/jira/browse/HUDI-383
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: Cleaner, Compaction, Write Client
>Reporter: Balaji Varadarajan
>Assignee: leesf
>Priority: Minor
>
> Came up in review comment. 
> https://github.com/apache/incubator-hudi/pull/1009/files#r347705820





[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1075: [HUDI-114]: added option to overwrite payload implementation in hoodie.properties file

2019-12-05 Thread GitBox
n3nash commented on a change in pull request #1075: [HUDI-114]: added option to 
overwrite payload implementation in hoodie.properties file
URL: https://github.com/apache/incubator-hudi/pull/1075#discussion_r354503414
 
 

 ##
 File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
 ##
 @@ -228,6 +228,10 @@ public Operation convert(String value) throws 
ParameterException {
 + " source-fetch -> Transform -> Hudi Write in loop")
 public Boolean continuousMode = false;
 
+@Parameter(names = {"--update-payload-class"}, description = "Update 
payload class in hoodie.properties file if needed, "
 
 Review comment:
   @pratyakshsharma I actually agree with @vinothchandar: we should just check 
`hoodie.properties` and overwrite the value if it is different. We don't 
necessarily want to create guards for the case where someone provides a 
different class by mistake.
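
The "check and overwrite if different" behaviour suggested above can be 
sketched as follows. This is only an illustration under stated assumptions: 
the class `PayloadClassSync`, its method, and the property key used here are 
hypothetical stand-ins, and a plain `java.util.Properties` object stands in 
for the contents of `hoodie.properties`.

```java
import java.util.Properties;

// Hypothetical helper, not taken from the PR under review.
class PayloadClassSync {
    // Assumed key name, used here for illustration only.
    static final String PAYLOAD_CLASS_KEY = "hoodie.compaction.payload.class";

    // Overwrites the stored payload class when it differs from the configured
    // one. Returns true if the properties were changed.
    static boolean syncPayloadClass(Properties tableProps, String configuredClass) {
        String stored = tableProps.getProperty(PAYLOAD_CLASS_KEY);
        if (configuredClass.equals(stored)) {
            return false; // already up to date, nothing to write
        }
        tableProps.setProperty(PAYLOAD_CLASS_KEY, configuredClass);
        return true;
    }
}
```

A caller would persist the properties back to storage only when the method 
returns true, so no extra command-line flag is needed to trigger the update.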




[GitHub] [incubator-hudi] n3nash commented on issue #1080: [HUDI-118]: Options provided for passing properties to Cleaner, compactor and importer commands

2019-12-05 Thread GitBox
n3nash commented on issue #1080: [HUDI-118]: Options provided for passing 
properties to Cleaner, compactor and importer commands
URL: https://github.com/apache/incubator-hudi/pull/1080#issuecomment-562274261
 
 
   @vinothchandar ack




[GitHub] [incubator-hudi] n3nash edited a comment on issue #1057: Hudi Test Suite

2019-12-05 Thread GitBox
n3nash edited a comment on issue #1057: Hudi Test Suite
URL: https://github.com/apache/incubator-hudi/pull/1057#issuecomment-562256132
 
 
   @yanghua Can we continue to use the same branch `hudi_test_suite_refactor`, 
please? You can continue to open PRs against that branch. I've also fixed the 
POM issue that was causing the test case failure, so you may want to rebase 
this PR against that branch.
   Secondly, I think we should name the package `hudi-test-suite` instead of 
`hudi-end-to-end-tests`; it's more concise. WDYT?
   
   Also, can we execute on the following plan:
   
   1) Please go over all the changes in the existing PR - I've addressed most 
of the comments. 
   2) Let's rename the package - like you've already done
   3) Both of us can start to use the test suite and run them locally for 2-3 
days. This will give us confidence whether the test suite is ready to test 
large features.
   4) Let's merge this first version of the test suite to master so folks can 
start using it
   5) Both of us can continue to enhance the test suite - a) You can add some 
of the things that you mentioned earlier like DIstributedTestDataSource etc b) 
I want to add more use-cases
   
   Let me know if this sounds fair to you.




[GitHub] [incubator-hudi] vinothchandar commented on issue #1080: [HUDI-118]: Options provided for passing properties to Cleaner, compactor and importer commands

2019-12-05 Thread GitBox
vinothchandar commented on issue #1080: [HUDI-118]: Options provided for 
passing properties to Cleaner, compactor and importer commands
URL: https://github.com/apache/incubator-hudi/pull/1080#issuecomment-562264403
 
 
   @n3nash could you please review this? (Just spreading out the reviews.)




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1082: [HUDI-380]: updated ide setup on contributing.html page

2019-12-05 Thread GitBox
vinothchandar commented on a change in pull request #1082: [HUDI-380]: updated 
ide setup on contributing.html page
URL: https://github.com/apache/incubator-hudi/pull/1082#discussion_r354486585
 
 

 ##
 File path: docs/contributing.md
 ##
 @@ -25,6 +25,7 @@ To contribute, you would need to fork the Hudi code on 
Github & then clone your
 
 We have embraced the code style largely based on [google 
format](https://google.github.io/styleguide/javaguide.html). Please setup your 
IDE with style files from 
[here](https://github.com/apache/incubator-hudi/tree/master/style).
 These instructions have been tested on IntelliJ. We also recommend setting up 
the [Save Action 
Plugin](https://plugins.jetbrains.com/plugin/7642-save-actions) to auto format 
& organize imports on save. The Maven Compilation life-cycle will fail if there 
are checkstyle violations.
+If you face jetty version related issues while running test cases, we 
recommend you to add spark jars to the classpath of your module in Intellij. 
 
 Review comment:
   Can we keep this generic and avoid mentioning specific issues? Ideally, we 
can redo this as bullets, give specific instructions on how to add the Spark 
jars to the classpath, and leave it there.




[GitHub] [incubator-hudi] n3nash commented on issue #1057: Hudi Test Suite

2019-12-05 Thread GitBox
n3nash commented on issue #1057: Hudi Test Suite
URL: https://github.com/apache/incubator-hudi/pull/1057#issuecomment-562256132
 
 
   @yanghua Can we continue to use the same branch `hudi_test_suite_refactor`, 
please? You can continue to open PRs against that branch. 
   Secondly, I think we should name the package `hudi-test-suite` instead of 
`hudi-end-to-end-tests`; it's more concise. WDYT?
   
   Also, can we execute on the following plan:
   
   1) Please go over all the changes in the existing PR - I've addressed most 
of the comments. 
   2) Let's rename the package - like you've already done
   3) Both of us can start to use the test suite and run them locally for 2-3 
days. This will give us confidence whether the test suite is ready to test 
large features.
   4) Let's merge this first version of the test suite to master so folks can 
start using it
   5) Both of us can continue to enhance the test suite - a) You can add some 
of the things that you mentioned earlier like DIstributedTestDataSource etc b) 
I want to add more use-cases
   
   Let me know if this sounds fair to you.




[GitHub] [incubator-hudi] bvaradar commented on issue #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on issue #1009:  [HUDI-308] Avoid Renames for tracking state 
transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#issuecomment-562254591
 
 
   @vinothchandar @n3nash : Redid the migration handling and addressed your 
comments
   
   High-Level Changes since the previous review.
   1. Hudi WriteConfig can be used to control timeline layout upgrade. Default 
will be to start avoiding renames in writer as soon as we deploy the new hudi 
jar on existing datasets. (unit-tests added)
   2. Readers are assumed to upgrade before Writers
   3. Added missing actions to archiving. Now, Archiving logs all states of 
each action
   4. Changed some code in Cleaner Migration (cc @leesf  - please take a look). 
 If this is too painful to review, I will try to create a separate diff.
   





[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1057: Hudi Test Suite

2019-12-05 Thread GitBox
n3nash commented on a change in pull request #1057: Hudi Test Suite
URL: https://github.com/apache/incubator-hudi/pull/1057#discussion_r354473294
 
 

 ##
 File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/converter/Converter.java
 ##
 @@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.converter;
+
+import java.io.Serializable;
+import org.apache.spark.api.java.JavaRDD;
+
+/**
+ * Implementations of {@link Converter} will convert data from one format to 
another
+ *
+ * @param  Input Data Type
+ * @param  Output Data Type
+ */
+public interface Converter extends Serializable {
 
 Review comment:
   @yanghua This is used for test-suite use-cases. 




[GitHub] [incubator-hudi] saheb-kanodia removed a comment on issue #894: Getting java.lang.NoSuchMethodError while doing Hive sync

2019-12-05 Thread GitBox
saheb-kanodia removed a comment on issue #894: Getting 
java.lang.NoSuchMethodError while doing Hive sync
URL: https://github.com/apache/incubator-hudi/issues/894#issuecomment-562096395
 
 
   I am getting the same error with the latest EMR release (5.28.0), Hive 
2.3.6, and Spark 2.4.4. I am deploying the code (spark-submit) in yarn-cluster 
mode. Do we know why this is happening? Or is there a workaround?




[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #976: [HUDI-106] Adding support for DynamicBloomFilter

2019-12-05 Thread GitBox
nsivabalan commented on a change in pull request #976: [HUDI-106] Adding 
support for DynamicBloomFilter
URL: https://github.com/apache/incubator-hudi/pull/976#discussion_r354351151
 
 

 ##
 File path: LICENSE
 ##
 @@ -293,3 +293,29 @@ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR 
OTHER DEALINGS IN THE
 SOFTWARE.
 
 ---
+
+This product includes code from org.apache.hadoop.
 
 Review comment:
   @bvaradar: can you check if this is good with respect to the license?




[GitHub] [incubator-hudi] lamber-ken commented on a change in pull request #1079: [HUDI-379] Refactor the codes based on new JavadocStyle code style rule

2019-12-05 Thread GitBox
lamber-ken commented on a change in pull request #1079: [HUDI-379] Refactor the 
codes based on new JavadocStyle code style rule
URL: https://github.com/apache/incubator-hudi/pull/1079#discussion_r354337130
 
 

 ##
 File path: style/checkstyle.xml
 ##
 @@ -271,9 +271,8 @@
 
 
 
-
-
-
+
+
 
 Review comment:
   > The default severity is `error`; is it time to change it?
   
   Yes, this PR fixes all the code to comply with JavadocStyle.




[GitHub] [incubator-hudi] lamber-ken commented on a change in pull request #1079: [HUDI-379] Refactor the codes based on new JavadocStyle code style rule

2019-12-05 Thread GitBox
lamber-ken commented on a change in pull request #1079: [HUDI-379] Refactor the 
codes based on new JavadocStyle code style rule
URL: https://github.com/apache/incubator-hudi/pull/1079#discussion_r354337249
 
 

 ##
 File path: style/checkstyle.xml
 ##
 @@ -271,9 +271,8 @@
 
 
 
-
-
-
+
+
 
 Review comment:
   > Since PR #1081 does a similar thing.
   
   Right




[GitHub] [incubator-hudi] lamber-ken commented on a change in pull request #1079: [HUDI-379] Refactor the codes based on new JavadocStyle code style rule

2019-12-05 Thread GitBox
lamber-ken commented on a change in pull request #1079: [HUDI-379] Refactor the 
codes based on new JavadocStyle code style rule
URL: https://github.com/apache/incubator-hudi/pull/1079#discussion_r354337130
 
 

 ##
 File path: style/checkstyle.xml
 ##
 @@ -271,9 +271,8 @@
 
 
 
-
-
-
+
+
 
 Review comment:
   > The default severity is `error`; is it time to change it?
   
   Yes, I fixed all the code to comply with JavadocStyle.




[GitHub] [incubator-hudi] leesf commented on a change in pull request #1079: [HUDI-379] Refactor the codes based on new JavadocStyle code style rule

2019-12-05 Thread GitBox
leesf commented on a change in pull request #1079: [HUDI-379] Refactor the 
codes based on new JavadocStyle code style rule
URL: https://github.com/apache/incubator-hudi/pull/1079#discussion_r354330514
 
 

 ##
 File path: style/checkstyle.xml
 ##
 @@ -271,9 +271,8 @@
 
 
 
-
-
-
+
+
 
 Review comment:
   Since PR #1081 does a similar thing.




[GitHub] [incubator-hudi] leesf commented on a change in pull request #1079: [HUDI-379] Refactor the codes based on new JavadocStyle code style rule

2019-12-05 Thread GitBox
leesf commented on a change in pull request #1079: [HUDI-379] Refactor the 
codes based on new JavadocStyle code style rule
URL: https://github.com/apache/incubator-hudi/pull/1079#discussion_r354329825
 
 

 ##
 File path: style/checkstyle.xml
 ##
 @@ -271,9 +271,8 @@
 
 
 
-
-
-
+
+
 
 Review comment:
   The default severity is `error`; is it time to change it?




[GitHub] [incubator-hudi] saheb-kanodia edited a comment on issue #894: Getting java.lang.NoSuchMethodError while doing Hive sync

2019-12-05 Thread GitBox
saheb-kanodia edited a comment on issue #894: Getting 
java.lang.NoSuchMethodError while doing Hive sync
URL: https://github.com/apache/incubator-hudi/issues/894#issuecomment-562096395
 
 
   I am getting the same error with the latest EMR release (5.28.0), Hive 2.3.6, 
and Spark 2.4.4. I am deploying the code (spark-submit) in yarn-cluster mode. 
Do we know why this is happening, or is there a workaround?




[GitHub] [incubator-hudi] lamber-ken commented on issue #1083: [SUPPORT]

2019-12-05 Thread GitBox
lamber-ken commented on issue #1083: [SUPPORT]
URL: https://github.com/apache/incubator-hudi/issues/1083#issuecomment-562101623
 
 
   > Sorry, I am using it through confluent 3.3.2
   > Apache Kafka Version is 0.11.0.3
   
   No problem, got it. I will try to reproduce this issue.
   




[GitHub] [incubator-hudi] saheb-kanodia commented on issue #894: Getting java.lang.NoSuchMethodError while doing Hive sync

2019-12-05 Thread GitBox
saheb-kanodia commented on issue #894: Getting java.lang.NoSuchMethodError 
while doing Hive sync
URL: https://github.com/apache/incubator-hudi/issues/894#issuecomment-562096395
 
 
   I am getting the same error with the latest EMR release (5.28.0), Hive 2.3.6, 
and Spark 2.4.4. I am deploying the code in yarn-cluster mode. Do we know 
why this is happening, or is there a workaround?




[GitHub] [incubator-hudi] Neo2007 commented on issue #1083: [SUPPORT]

2019-12-05 Thread GitBox
Neo2007 commented on issue #1083: [SUPPORT]
URL: https://github.com/apache/incubator-hudi/issues/1083#issuecomment-562088368
 
 
   Sorry, I am using it through Confluent 3.3.2.
   The Apache Kafka version is 0.11.0.3.




[jira] [Created] (HUDI-384) Treat compaction as commit action internally in Hudi to avoid special handling during state transitions

2019-12-05 Thread Balaji Varadarajan (Jira)
Balaji Varadarajan created HUDI-384:
---

 Summary: Treat compaction as commit action internally in Hudi to 
avoid special handling during state transitions 
 Key: HUDI-384
 URL: https://issues.apache.org/jira/browse/HUDI-384
 Project: Apache Hudi (incubating)
  Issue Type: Improvement
  Components: Common Core
Reporter: Balaji Varadarajan


Link : 
[https://github.com/apache/incubator-hudi/pull/1009#discussion_r348089546]

Came up during code-review. 

```

Seems most of the issue stems from this switching of compaction => commit? Just 
throwing out an idea: could we talk about compaction as an 
implementation, but have the action be just commit? I.e., remove the Compaction 
action and replace it with Commit, given we have requested and inflight there? 
Would that simplify the design? Does it open new migration pains?

```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354245375
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstant.java
 ##
 @@ -30,7 +34,23 @@
  *
  * @see HoodieTimeline
  */
-public class HoodieInstant implements Serializable {
+public class HoodieInstant implements Serializable, Comparable {
+
+  /**
+   * A COMPACTION action eventually becomes COMMIT when completed. So, when 
grouping instants
+   * for state transitions, this needs to be taken into account
+   */
+  private static final Set comparableActions = Arrays.stream(new 
String[] { HoodieTimeline.COMMIT_ACTION,
 
 Review comment:
   Added a Jira to track this: https://jira.apache.org/jira/browse/HUDI-384
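
   For readers following the thread, here is a minimal sketch of the idea in the Javadoc quoted above: a COMPACTION action is treated as a COMMIT when instants are compared or grouped. This is an illustration under stated assumptions, not the code in the PR. The class `ComparableActionSketch`, its nested `Instant` type, the helper `comparableAction`, and the action strings are all hypothetical stand-ins, and the comparison deliberately ignores instant state.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in; not Hudi's HoodieInstant or HoodieTimeline.
class ComparableActionSketch {
  // Assumed action names, used only for this illustration.
  static final String COMMIT_ACTION = "commit";
  static final String COMPACTION_ACTION = "compaction";

  // Actions that should be considered the same action when grouping instants.
  static final Set<String> COMPARABLE_ACTIONS = Collections.unmodifiableSet(
      new HashSet<>(Arrays.asList(COMMIT_ACTION, COMPACTION_ACTION)));

  // Map any comparable action to COMMIT; leave all other actions unchanged.
  static String comparableAction(String action) {
    return COMPARABLE_ACTIONS.contains(action) ? COMMIT_ACTION : action;
  }

  // Simplified instant: only a timestamp and an action.
  static final class Instant implements Comparable<Instant> {
    final String timestamp;
    final String action;

    Instant(String timestamp, String action) {
      this.timestamp = timestamp;
      this.action = action;
    }

    // Order by timestamp first, then by the normalized action.
    @Override
    public int compareTo(Instant other) {
      int byTime = timestamp.compareTo(other.timestamp);
      if (byTime != 0) {
        return byTime;
      }
      return comparableAction(action).compareTo(comparableAction(other.action));
    }
  }

  public static void main(String[] args) {
    Instant compaction = new Instant("001", COMPACTION_ACTION);
    Instant commit = new Instant("001", COMMIT_ACTION);
    // A compaction and a commit at the same timestamp compare as equal here.
    System.out.println(compaction.compareTo(commit));
  }
}
```

   With this normalization, a compaction instant and the commit it eventually becomes fall into the same group, which is the property the Javadoc describes.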




[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354243394
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java
 ##
 @@ -312,8 +332,11 @@ public HoodieInstant 
transitionCompactionInflightToComplete(HoodieInstant inflig
   }
 
   private void createFileInAuxiliaryFolder(HoodieInstant instant, 
Option data) {
-Path fullPath = new Path(metaClient.getMetaAuxiliaryPath(), 
instant.getFileName());
-createFileInPath(fullPath, data);
+if (metaClient.getMetadataVersion().isNullVersion()) {
+  // No need for auxiliary folder for newer versions
 
 Review comment:
   Done.




[incubator-hudi] 01/01: Fixing some unit tests

2019-12-05 Thread nagarwal
This is an automated email from the ASF dual-hosted git repository.

nagarwal pushed a commit to branch hudi_test_suite_refactor
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git

commit 504c2cce6e0b37bd541e15bdb79a14eea1eed156
Author: Nishith Agarwal 
AuthorDate: Tue Nov 26 20:39:19 2019 -0800

Fixing some unit tests
---
 hudi-bench/pom.xml | 91 --
 .../hudi/bench/writer/AvroDeltaInputWriter.java|  7 +-
 .../hudi/bench/TestFileDeltaInputWriter.java   |  7 +-
 .../hudi/bench/job/TestHoodieTestSuiteJob.java | 21 +++--
 hudi-spark/pom.xml | 12 +++
 .../hudi/utilities/TestHoodieDeltaStreamer.java| 10 +--
 .../apache/hudi/utilities/UtilitiesTestBase.java   |  8 +-
 7 files changed, 88 insertions(+), 68 deletions(-)

diff --git a/hudi-bench/pom.xml b/hudi-bench/pom.xml
index 69cc78b..84b60b0 100644
--- a/hudi-bench/pom.xml
+++ b/hudi-bench/pom.xml
@@ -95,11 +95,9 @@
 
 
 
-
 
   org.apache.spark
-  spark-hive_2.11
-  ${spark.version}
+  spark-sql_2.11
   
 
   org.mortbay.jetty
@@ -118,9 +116,9 @@
   *
 
   
-  provided
 
 
+
 
   org.apache.hudi
   hudi-common
@@ -179,19 +177,6 @@
 
 
 
-  ${hive.groupid}
-  hive-exec
-  ${hive.version}
-  
-
-  javax.servlet
-  servlet-api
-
-  
-  test
-
-
-
   org.apache.hadoop
   hadoop-hdfs
   tests
@@ -224,15 +209,6 @@
 
 
 
-  org.apache.hudi
-  hudi-client
-  ${project.version}
-  tests
-  test-jar
-  test
-
-
-
   com.fasterxml.jackson.dataformat
   jackson-dataformat-yaml
   2.7.4
@@ -270,22 +246,6 @@
   
 
 
-
-  ${hive.groupid}
-  hive-jdbc
-  ${hive.version}
-  
-
-  org.slf4j
-  slf4j-api
-
-
-  javax.servlet
-  servlet-api
-
-  
-
-
 
 
   org.antlr
@@ -344,6 +304,53 @@
   
 
 
+
+  org.apache.hudi
+  hudi-client
+  ${project.version}
+  tests
+  test-jar
+  test
+
+
+
+  ${hive.groupid}
+  hive-exec
+  ${hive.version}
+  
+
+  javax.servlet
+  servlet-api
+
+  
+  test
+
+
+
+  ${hive.groupid}
+  hive-jdbc
+  ${hive.version}
+  
+
+  org.slf4j
+  slf4j-api
+
+
+  javax.servlet.jsp
+  *
+
+
+  javax.servlet
+  *
+
+
+  org.eclipse.jetty
+  *
+
+  
+  test
+
+
   
 
 
diff --git 
a/hudi-bench/src/main/java/org/apache/hudi/bench/writer/AvroDeltaInputWriter.java
 
b/hudi-bench/src/main/java/org/apache/hudi/bench/writer/AvroDeltaInputWriter.java
index 234530e..d53c39c 100644
--- 
a/hudi-bench/src/main/java/org/apache/hudi/bench/writer/AvroDeltaInputWriter.java
+++ 
b/hudi-bench/src/main/java/org/apache/hudi/bench/writer/AvroDeltaInputWriter.java
@@ -100,7 +100,10 @@ public class AvroDeltaInputWriter implements 
FileDeltaInputWriter
 
   @Override
   public FileDeltaInputWriter getNewWriter() throws IOException {
-return new AvroDeltaInputWriter(this.configuration, this.basePath, 
this.schema.toString(), this.maxFileSize);
+AvroDeltaInputWriter avroDeltaInputWriter = new 
AvroDeltaInputWriter(this.configuration, this.basePath, this
+.schema.toString(), this.maxFileSize);
+avroDeltaInputWriter.open();
+return avroDeltaInputWriter;
   }
 
   public FileSystem getFs() {
@@ -113,6 +116,6 @@ public class AvroDeltaInputWriter implements 
FileDeltaInputWriter
 
   @Override
   public WriteStats getWriteStats() {
-return writeStats;
+return this.writeStats;
   }
 }
diff --git 
a/hudi-bench/src/test/java/org/apache/hudi/bench/TestFileDeltaInputWriter.java 
b/hudi-bench/src/test/java/org/apache/hudi/bench/TestFileDeltaInputWriter.java
index 08358cf..30e0190 100644
--- 
a/hudi-bench/src/test/java/org/apache/hudi/bench/TestFileDeltaInputWriter.java
+++ 
b/hudi-bench/src/test/java/org/apache/hudi/bench/TestFileDeltaInputWriter.java
@@ -82,6 +82,7 @@ public class TestFileDeltaInputWriter extends 
UtilitiesTestBase {
 .toString(), 1024 * 1024L);
 GenericRecordFullPayloadGenerator payloadGenerator =
 new 
GenericRecordFullPayloadGenerator(schemaProvider.getSourceSchema());
+fileSinkWriter.open();
 // 2. Generate 100 avro payloads and write them to an avro file
 IntStream.range(0, 100).forEach(a -> {
   try {
@@ -119,6 +120,7 @@ public class TestFileDeltaInputWriter extends 
UtilitiesTestBase {
 1024 * 1024L);
 GenericRecordFullPayloadGenerator payloadGenerator =
 new 
GenericRecordFullPayloadGenerator(schemaProvider.getSourceSchema());
+fileSinkWriter.open();
 // 2. 

[incubator-hudi] branch hudi_test_suite_refactor updated (4e95f60 -> 504c2cc)

2019-12-05 Thread nagarwal
This is an automated email from the ASF dual-hosted git repository.

nagarwal pushed a change to branch hudi_test_suite_refactor
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git.


omit 4e95f60  Fixing some unit tests
 new 504c2cc  Fixing some unit tests

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (4e95f60)
\
 N -- N -- N   refs/heads/hudi_test_suite_refactor (504c2cc)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 hudi-bench/pom.xml | 91 --
 .../hudi/bench/job/TestHoodieTestSuiteJob.java | 21 +++--
 hudi-spark/pom.xml | 12 +++
 .../hudi/utilities/TestHoodieDeltaStreamer.java| 10 +--
 .../apache/hudi/utilities/UtilitiesTestBase.java   |  8 +-
 5 files changed, 78 insertions(+), 64 deletions(-)



[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354241852
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java
 ##
 @@ -212,25 +224,24 @@ public void saveAsComplete(HoodieInstant instant, 
Option data) {
 log.info("Completed " + instant);
   }
 
-  public void revertToInflight(HoodieInstant instant) {
-log.info("Reverting " + instant + " to inflight ");
-revertStateTransition(instant, HoodieTimeline.getInflightInstant(instant));
-log.info("Reverted " + instant + " to inflight");
-  }
-
-  public HoodieInstant revertToRequested(HoodieInstant instant) {
-log.warn("Reverting " + instant + " to requested ");
-HoodieInstant requestedInstant = 
HoodieTimeline.getRequestedInstant(instant);
-revertStateTransition(instant, 
HoodieTimeline.getRequestedInstant(instant));
-log.warn("Reverted " + instant + " to requested");
-return requestedInstant;
+  public HoodieInstant revertToInflight(HoodieInstant instant) {
+log.info("Reverting instant to inflight " + instant);
+HoodieInstant inflight = HoodieTimeline.getInflightInstant(instant, 
metaClient.getTableType());
+revertCompleteToInflight(instant, inflight);
 
 Review comment:
   Deletes are atomic in GCS but not so in S3. Thought about this. It shouldn't 
bring in new failure scenarios here.




[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354242118
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java
 ##
 @@ -260,8 +271,13 @@ private void deleteInstantFile(HoodieInstant instant) {
 
   /** BEGIN - COMPACTION RELATED META-DATA MANAGEMENT **/
 
-  public Option getInstantAuxiliaryDetails(HoodieInstant instant) {
-Path detailPath = new Path(metaClient.getMetaAuxiliaryPath(), 
instant.getFileName());
+  public Option getPlanDetailsInBytes(HoodieInstant instant) {
 
 Review comment:
   Done.




[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354240466
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTimeline.java
 ##
 @@ -250,7 +253,10 @@ static HoodieInstant getCompactionInflightInstant(final 
String timestamp) {
 return new HoodieInstant(State.INFLIGHT, COMPACTION_ACTION, timestamp);
   }
 
-  static HoodieInstant getInflightInstant(final HoodieInstant instant) {
+  static HoodieInstant getInflightInstant(final HoodieInstant instant, final 
HoodieTableType tableType) {
+if ((tableType == HoodieTableType.MERGE_ON_READ) && 
instant.getAction().equals(COMMIT_ACTION)) {
 
 Review comment:
   Added




[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354240802
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java
 ##
 @@ -151,17 +174,6 @@ public HoodieTimeline getDeltaCommitTimeline() {
 (Function> & Serializable) 
this::getInstantDetails);
   }
 
-  /**
 
 Review comment:
   Reverted.




[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354236756
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
 ##
 @@ -420,17 +427,50 @@ public String getCommitActionType() {
* @return List of Hoodie Instants generated
* @throws IOException in case of failure
*/
-  public static List 
scanHoodieInstantsFromFileSystem(FileSystem fs, Path metaPath,
-  Set includedExtensions) throws IOException {
-return Arrays.stream(HoodieTableMetaClient.scanFiles(fs, metaPath, path -> 
{
-  // Include only the meta files with extensions that needs to be included
-  String extension = FSUtils.getFileExtension(path.getName());
-  return includedExtensions.contains(extension);
-})).sorted(Comparator.comparing(
-// Sort the meta-data by the instant time (first part of the file name)
-fileStatus -> FSUtils.getInstantTime(fileStatus.getPath().getName(
-// create HoodieInstantMarkers from FileStatus, which extracts 
properties
-.map(HoodieInstant::new).collect(Collectors.toList());
+  public static List scanHoodieInstantsFromFileSystem(
+  FileSystem fs, Path metaPath, Set includedExtensions) throws 
IOException {
+return scanHoodieInstantsFromFileSystem(fs, metaPath, includedExtensions, 
true);
+  }
+
+  /**
+   * Helper method to scan all hoodie-instant metafiles and construct 
HoodieInstant objects
+   *
+   * @param fs FileSystem
+   * @param metaPath   Meta Path where hoodie instants are present
+   * @param includedExtensions Included hoodie extensions
+   * @param excludeIntermediateStates If there are multiple states for the 
same action instant,
+   *  only include the highest state
+   * @return List of Hoodie Instants generated
+   * @throws IOException in case of failure
+   */
+  public static List scanHoodieInstantsFromFileSystem(
+  FileSystem fs, Path metaPath, Set includedExtensions, boolean 
excludeIntermediateStates)
+  throws IOException {
+Stream instantStream = Arrays.stream(
+HoodieTableMetaClient
+.scanFiles(fs, metaPath, path -> {
+  // Include only the meta files with extensions that needs to be 
included
+  String extension = FSUtils.getFileExtension(path.getName());
+  return includedExtensions.contains(extension);
+})).map(HoodieInstant::new);
+
+if (excludeIntermediateStates) {
+  // Remove intermediate states for each (ts,action) pair
+  instantStream = dedupeInstants(instantStream);
+}
+return instantStream.sorted().collect(Collectors.toList());
+  }
+
+  public static Stream dedupeInstants(Stream 
instantStream) {
+return instantStream.collect(Collectors.groupingBy(x -> 
Pair.of(x.getTimestamp(),
+x.getAction().equals(HoodieTimeline.COMPACTION_ACTION) ? 
HoodieTimeline.COMMIT_ACTION : x.getAction(
+.entrySet().stream().map(e -> e.getValue().stream().reduce((x, y) -> {
 
 Review comment:
   Fixed.
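
   As a rough illustration of the dedupe step in the diff above (keep only the highest state for each (timestamp, action) pair, with compaction counted as commit), here is a self-contained sketch. It is not the PR's implementation: `DedupeInstantsSketch`, its `Instant` and `State` types, and the action strings are hypothetical, the state order REQUESTED < INFLIGHT < COMPLETED is an assumption, and it uses `Collectors.toMap` with a merge function in place of the `groupingBy` plus `reduce` shown in the diff.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-in; not Hudi's HoodieTableMetaClient or HoodieInstant.
class DedupeInstantsSketch {
  // Assumed progression of states; enum order doubles as "how far along".
  enum State { REQUESTED, INFLIGHT, COMPLETED }

  static final class Instant {
    final State state;
    final String action;
    final String timestamp;

    Instant(State state, String action, String timestamp) {
      this.state = state;
      this.action = action;
      this.timestamp = timestamp;
    }

    // Grouping key: timestamp plus action, with compaction mapped to commit.
    String groupKey() {
      String normalized = "compaction".equals(action) ? "commit" : action;
      return timestamp + "|" + normalized;
    }

    @Override
    public String toString() {
      return "[" + state + "__" + action + "__" + timestamp + "]";
    }
  }

  // Keep one instant per group: the one in the most advanced state.
  static List<Instant> dedupe(List<Instant> instants) {
    Map<String, Instant> highest = instants.stream().collect(
        Collectors.toMap(Instant::groupKey, i -> i,
            (x, y) -> x.state.compareTo(y.state) >= 0 ? x : y));
    return highest.values().stream()
        .sorted(Comparator.comparing((Instant i) -> i.timestamp))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Instant> instants = Arrays.asList(
        new Instant(State.REQUESTED, "compaction", "001"),
        new Instant(State.INFLIGHT, "compaction", "001"),
        new Instant(State.COMPLETED, "commit", "001"),
        new Instant(State.INFLIGHT, "commit", "002"));
    System.out.println(dedupe(instants));
  }
}
```

   In this example the three entries at timestamp 001 collapse into the completed commit, while the inflight commit at 002 is kept because it is the only state present for its group.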




[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354235768
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
 ##
 @@ -420,17 +427,50 @@ public String getCommitActionType() {
* @return List of Hoodie Instants generated
* @throws IOException in case of failure
*/
-  public static List 
scanHoodieInstantsFromFileSystem(FileSystem fs, Path metaPath,
-  Set includedExtensions) throws IOException {
-return Arrays.stream(HoodieTableMetaClient.scanFiles(fs, metaPath, path -> 
{
-  // Include only the meta files with extensions that needs to be included
-  String extension = FSUtils.getFileExtension(path.getName());
-  return includedExtensions.contains(extension);
-})).sorted(Comparator.comparing(
-// Sort the meta-data by the instant time (first part of the file name)
-fileStatus -> FSUtils.getInstantTime(fileStatus.getPath().getName(
-// create HoodieInstantMarkers from FileStatus, which extracts 
properties
-.map(HoodieInstant::new).collect(Collectors.toList());
+  public static List scanHoodieInstantsFromFileSystem(
+  FileSystem fs, Path metaPath, Set includedExtensions) throws 
IOException {
+return scanHoodieInstantsFromFileSystem(fs, metaPath, includedExtensions, 
true);
+  }
+
+  /**
+   * Helper method to scan all hoodie-instant metafiles and construct 
HoodieInstant objects
+   *
+   * @param fs FileSystem
+   * @param metaPath   Meta Path where hoodie instants are present
+   * @param includedExtensions Included hoodie extensions
+   * @param excludeIntermediateStates If there are multiple states for the 
same action instant,
+   *  only include the highest state
+   * @return List of Hoodie Instants generated
+   * @throws IOException in case of failure
+   */
+  public static List scanHoodieInstantsFromFileSystem(
+  FileSystem fs, Path metaPath, Set includedExtensions, boolean 
excludeIntermediateStates)
+  throws IOException {
+Stream instantStream = Arrays.stream(
+HoodieTableMetaClient
+.scanFiles(fs, metaPath, path -> {
+  // Include only the meta files with extensions that needs to be 
included
+  String extension = FSUtils.getFileExtension(path.getName());
+  return includedExtensions.contains(extension);
+})).map(HoodieInstant::new);
+
+if (excludeIntermediateStates) {
+  // Remove intermediate states for each (ts,action) pair
+  instantStream = dedupeInstants(instantStream);
+}
+return instantStream.sorted().collect(Collectors.toList());
+  }
+
+  public static Stream dedupeInstants(Stream 
instantStream) {
 
 Review comment:
   I moved this logic to TableLayout. Let me know if interface needs more 
changes.




[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354232466
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieMetadataVersion.java
 ##
 @@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.model;
+
+import com.google.common.base.Preconditions;
+import java.io.Serializable;
+import java.util.Objects;
+
+/**
+ * Metadata Layout Version. Add new version when timeline format changes
 
 Review comment:
   Done




[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354232343
 
 

 ##
 File path: 
hudi-client/src/main/java/org/apache/hudi/io/HoodieCommitArchiveLog.java
 ##
 @@ -108,16 +110,25 @@ private void close() {
*/
   public boolean archiveIfRequired(final JavaSparkContext jsc) throws 
IOException {
 try {
-  List instantsToArchive = 
getInstantsToArchive(jsc).collect(Collectors.toList());
+  List instantsToDelete = 
getInstantsToArchive(jsc).collect(Collectors.toList());
+  List instantsToArchive = instantsToDelete.stream()
+  .filter(HoodieInstant::isCompleted).collect(Collectors.toList());
+
+  Preconditions.checkArgument(instantsToArchive.isEmpty() == 
instantsToDelete.isEmpty(),
 
 Review comment:
   Changed the code so that pending instants can also be archived.


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354231831
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java
 ##
 @@ -342,60 +365,94 @@ public HoodieInstant 
transitionCleanInflightToComplete(HoodieInstant inflightIns
* Transition Clean State from requested to inflight
*
* @param requestedInstant requested instant
+   * @param data Optional data to be stored
* @return commit instant
*/
-  public HoodieInstant transitionCleanRequestedToInflight(HoodieInstant 
requestedInstant) {
+  public HoodieInstant transitionCleanRequestedToInflight(HoodieInstant 
requestedInstant, Option data) {
 
Preconditions.checkArgument(requestedInstant.getAction().equals(HoodieTimeline.CLEAN_ACTION));
 Preconditions.checkArgument(requestedInstant.isRequested());
 HoodieInstant inflight = new HoodieInstant(State.INFLIGHT, CLEAN_ACTION, 
requestedInstant.getTimestamp());
-transitionState(requestedInstant, inflight, Option.empty());
+transitionState(requestedInstant, inflight, data);
 return inflight;
   }
 
 
   private void transitionState(HoodieInstant fromInstant, HoodieInstant 
toInstant, Option data) {
 
Preconditions.checkArgument(fromInstant.getTimestamp().equals(toInstant.getTimestamp()));
-Path commitFilePath = new Path(metaClient.getMetaPath(), 
toInstant.getFileName());
 try {
-  // Re-create the .inflight file by opening a new file and write the 
commit metadata in
-  Path inflightCommitFile = new Path(metaClient.getMetaPath(), 
fromInstant.getFileName());
-  createFileInMetaPath(fromInstant.getFileName(), data);
-  boolean success = metaClient.getFs().rename(inflightCommitFile, 
commitFilePath);
-  if (!success) {
-throw new HoodieIOException("Could not rename " + inflightCommitFile + 
" to " + commitFilePath);
+  if (metaClient.getMetadataVersion().isNullVersion()) {
+// Re-create the .inflight file by opening a new file and write the 
commit metadata in
+createFileInMetaPath(fromInstant.getFileName(), data, false);
+Path fromInstantPath = new Path(metaClient.getMetaPath(), 
fromInstant.getFileName());
+Path toInstantPath = new Path(metaClient.getMetaPath(), 
toInstant.getFileName());
+boolean success = metaClient.getFs().rename(fromInstantPath, 
toInstantPath);
+if (!success) {
+  throw new HoodieIOException("Could not rename " + fromInstantPath + 
" to " + toInstantPath);
+}
+  } else {
+// Ensures old state exists in timeline
+System.out.println("Checking for file exists ?" + new 
Path(metaClient.getMetaPath(),
+fromInstant.getFileName()));
+Preconditions.checkArgument(metaClient.getFs().exists(new 
Path(metaClient.getMetaPath(),
+fromInstant.getFileName())));
+// Use Write Once to create Target File
+writeFileOnceInPath(new Path(metaClient.getMetaPath(), 
toInstant.getFileName()), data);
+System.out.println("Create new file for toInstant ?" + new 
Path(metaClient.getMetaPath(), toInstant.getFileName()));
   }
 } catch (IOException e) {
   throw new HoodieIOException("Could not complete " + fromInstant, e);
 }
   }
 
-  private void revertStateTransition(HoodieInstant curr, HoodieInstant revert) 
{
-
Preconditions.checkArgument(curr.getTimestamp().equals(revert.getTimestamp()));
-Path revertFilePath = new Path(metaClient.getMetaPath(), 
revert.getFileName());
+  private void revertCompleteToInflight(HoodieInstant completed, HoodieInstant 
inflight) {
+
Preconditions.checkArgument(completed.getTimestamp().equals(inflight.getTimestamp()));
+Path inFlightCommitFilePath = new Path(metaClient.getMetaPath(), 
inflight.getFileName());
+Path commitFilePath = new Path(metaClient.getMetaPath(), 
completed.getFileName());
 try {
-  if (!metaClient.getFs().exists(revertFilePath)) {
-Path currFilePath = new Path(metaClient.getMetaPath(), 
curr.getFileName());
-boolean success = metaClient.getFs().rename(currFilePath, 
revertFilePath);
-if (!success) {
-  throw new HoodieIOException("Could not rename " + currFilePath + " 
to " + revertFilePath);
+  if (metaClient.getMetadataVersion().isNullVersion()) {
+if (!metaClient.getFs().exists(inFlightCommitFilePath)) {
+  boolean success = metaClient.getFs().rename(commitFilePath, 
inFlightCommitFilePath);
+  if (!success) {
+throw new HoodieIOException(
+"Could not rename " + commitFilePath + " to " + 
inFlightCommitFilePath);
+  }
 }
-log.info("Renamed " + currFilePath + " to " + revertFilePath);
+  } else {
+Path requestedInstantFilePath = new 

[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354230170
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java
 ##
 @@ -433,6 +494,32 @@ private void createFileInPath(Path fullPath, 
Option content) {
 }
   }
 
+  /**
+   * Creates a new file in timeline with overwrite set to false. This ensures
+   * files are created only once and never rewritten
+   * @param fullPath File Path
+   * @param content Content to be stored
+   */
+  private void writeFileOnceInPath(Path fullPath, Option content) {
 
 Review comment:
   @vinothchandar : Currently, we employ overwrite semantics for inflight 
commit/delta-commit files. We first create an empty inflight file and then 
overwrite it later with workload profile data. I can try cleaning that up in 
the hoodie write client. Let me know your thoughts.
   
   @n3nash : Renamed the method.
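
   For illustration only, here is a minimal sketch of the write-once idea discussed above. It uses `java.nio` as a stand-in for the Hadoop `FileSystem` call with overwrite set to false; the class name `WriteOnceSketch`, the helper `writeOnce` and the file name are assumptions made for this example, not code from the PR.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class WriteOnceSketch {

  /**
   * Creates the file only if it does not exist yet. Returns false, without
   * touching the existing content, when the file is already there.
   */
  static boolean writeOnce(Path fullPath, byte[] content) throws IOException {
    try {
      // CREATE_NEW makes the existence check and the creation a single atomic step.
      Files.write(fullPath, content, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
      return true;
    } catch (FileAlreadyExistsException e) {
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("timeline");
    // Hypothetical file name, used only for this demo.
    Path inflight = dir.resolve("001.commit.inflight");
    System.out.println("first write created file: " + writeOnce(inflight, "profile".getBytes()));
    System.out.println("second write created file: " + writeOnce(inflight, "other".getBytes()));
  }
}
```

   With write-once semantics, the "create an empty inflight file, then overwrite it" pattern described above would not work as-is, because the second write is rejected. That is why the inflight handling in the write client would need the cleanup mentioned in the comment.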


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354216985
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java
 ##
 @@ -342,60 +365,94 @@ public HoodieInstant 
transitionCleanInflightToComplete(HoodieInstant inflightIns
* Transition Clean State from requested to inflight
*
* @param requestedInstant requested instant
+   * @param data Optional data to be stored
* @return commit instant
*/
-  public HoodieInstant transitionCleanRequestedToInflight(HoodieInstant 
requestedInstant) {
+  public HoodieInstant transitionCleanRequestedToInflight(HoodieInstant 
requestedInstant, Option data) {
 
Preconditions.checkArgument(requestedInstant.getAction().equals(HoodieTimeline.CLEAN_ACTION));
 Preconditions.checkArgument(requestedInstant.isRequested());
 HoodieInstant inflight = new HoodieInstant(State.INFLIGHT, CLEAN_ACTION, 
requestedInstant.getTimestamp());
-transitionState(requestedInstant, inflight, Option.empty());
+transitionState(requestedInstant, inflight, data);
 return inflight;
   }
 
 
   private void transitionState(HoodieInstant fromInstant, HoodieInstant 
toInstant, Option data) {
 
Preconditions.checkArgument(fromInstant.getTimestamp().equals(toInstant.getTimestamp()));
-Path commitFilePath = new Path(metaClient.getMetaPath(), 
toInstant.getFileName());
 try {
-  // Re-create the .inflight file by opening a new file and write the 
commit metadata in
-  Path inflightCommitFile = new Path(metaClient.getMetaPath(), 
fromInstant.getFileName());
-  createFileInMetaPath(fromInstant.getFileName(), data);
-  boolean success = metaClient.getFs().rename(inflightCommitFile, 
commitFilePath);
-  if (!success) {
-throw new HoodieIOException("Could not rename " + inflightCommitFile + 
" to " + commitFilePath);
+  if (metaClient.getMetadataVersion().isNullVersion()) {
+// Re-create the .inflight file by opening a new file and write the 
commit metadata in
+createFileInMetaPath(fromInstant.getFileName(), data, false);
+Path fromInstantPath = new Path(metaClient.getMetaPath(), 
fromInstant.getFileName());
+Path toInstantPath = new Path(metaClient.getMetaPath(), 
toInstant.getFileName());
+boolean success = metaClient.getFs().rename(fromInstantPath, 
toInstantPath);
+if (!success) {
+  throw new HoodieIOException("Could not rename " + fromInstantPath + 
" to " + toInstantPath);
+}
+  } else {
+// Ensures old state exists in timeline
+System.out.println("Checking for file exists ?" + new 
Path(metaClient.getMetaPath(),
 
 Review comment:
   Thanks. Fixed.


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354216781
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
 ##
 @@ -420,17 +427,50 @@ public String getCommitActionType() {
* @return List of Hoodie Instants generated
* @throws IOException in case of failure
*/
-  public static List 
scanHoodieInstantsFromFileSystem(FileSystem fs, Path metaPath,
-  Set includedExtensions) throws IOException {
-return Arrays.stream(HoodieTableMetaClient.scanFiles(fs, metaPath, path -> 
{
-  // Include only the meta files with extensions that needs to be included
-  String extension = FSUtils.getFileExtension(path.getName());
-  return includedExtensions.contains(extension);
-})).sorted(Comparator.comparing(
-// Sort the meta-data by the instant time (first part of the file name)
-fileStatus -> FSUtils.getInstantTime(fileStatus.getPath().getName())))
-// create HoodieInstantMarkers from FileStatus, which extracts 
properties
-.map(HoodieInstant::new).collect(Collectors.toList());
+  public static List scanHoodieInstantsFromFileSystem(
+  FileSystem fs, Path metaPath, Set includedExtensions) throws 
IOException {
+return scanHoodieInstantsFromFileSystem(fs, metaPath, includedExtensions, 
true);
+  }
+
+  /**
+   * Helper method to scan all hoodie-instant metafiles and construct 
HoodieInstant objects
+   *
+   * @param fs FileSystem
+   * @param metaPath   Meta Path where hoodie instants are present
+   * @param includedExtensions Included hoodie extensions
+   * @param excludeIntermediateStates If there are multiple states for the 
same action instant,
+   *  only include the highest state
+   * @return List of Hoodie Instants generated
+   * @throws IOException in case of failure
+   */
+  public static List scanHoodieInstantsFromFileSystem(
+  FileSystem fs, Path metaPath, Set includedExtensions, boolean 
excludeIntermediateStates)
+  throws IOException {
+Stream instantStream = Arrays.stream(
+HoodieTableMetaClient
+.scanFiles(fs, metaPath, path -> {
+  // Include only the meta files with extensions that needs to be 
included
+  String extension = FSUtils.getFileExtension(path.getName());
+  return includedExtensions.contains(extension);
+})).map(HoodieInstant::new);
+
+if (excludeIntermediateStates) {
+  // Remove intermediate states for each (ts,action) pair
+  instantStream = dedupeInstants(instantStream);
+}
+return instantStream.sorted().collect(Collectors.toList());
+  }
+
+  public static Stream dedupeInstants(Stream 
instantStream) {
+return instantStream.collect(Collectors.groupingBy(x -> 
Pair.of(x.getTimestamp(),
+x.getAction().equals(HoodieTimeline.COMPACTION_ACTION) ? 
HoodieTimeline.COMMIT_ACTION : x.getAction(
 
 Review comment:
   Refactored to reuse this logic.
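
   As a rough illustration of the dedupe step in the hunk above (keep only the highest state for each (timestamp, action) pair, with compaction grouped together with commit), here is a self-contained sketch. The `Instant` class, the `State` enum and the action strings are simplified stand-ins assumed for this example; this is not the refactored code itself.

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DedupeSketch {
  // Declared in lifecycle order so the enum's natural ordering matches it.
  enum State { REQUESTED, INFLIGHT, COMPLETED }

  // Hypothetical, simplified stand-in for HoodieInstant.
  static final class Instant {
    final String timestamp;
    final String action;
    final State state;

    Instant(String timestamp, String action, State state) {
      this.timestamp = timestamp;
      this.action = action;
      this.state = state;
    }

    // Grouping key: a compaction is grouped with the commit it eventually becomes.
    String key() {
      String groupedAction = "compaction".equals(action) ? "commit" : action;
      return timestamp + "|" + groupedAction;
    }

    @Override
    public String toString() {
      return timestamp + "." + action + "." + state;
    }
  }

  /** Keeps one instant per key, the one in the highest state, sorted by timestamp. */
  static List<Instant> dedupe(Stream<Instant> instants) {
    return instants
        .collect(Collectors.toMap(Instant::key, Function.identity(),
            BinaryOperator.maxBy(Comparator.comparing((Instant i) -> i.state))))
        .values().stream()
        .sorted(Comparator.comparing((Instant i) -> i.timestamp))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    System.out.println(dedupe(Stream.of(
        new Instant("001", "commit", State.REQUESTED),
        new Instant("001", "commit", State.INFLIGHT),
        new Instant("001", "commit", State.COMPLETED),
        new Instant("002", "compaction", State.REQUESTED))));
  }
}
```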


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r350891309
 
 

 ##
 File path: hudi-client/src/main/java/org/apache/hudi/HoodieWriteClient.java
 ##
 @@ -731,6 +739,12 @@ public boolean rollbackToSavepoint(String savepointTime) {
* file,
*/
   public boolean rollback(final String commitTime) throws 
HoodieRollbackException {
+try {
 
 Review comment:
   Done


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r351017008
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieMetadataVersion.java
 ##
 @@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.model;
+
+import com.google.common.base.Preconditions;
+import java.io.Serializable;
+import java.util.Objects;
+
+/**
+ * Metadata Layout Version. Add new version when timeline format changes
+ */
+public class HoodieMetadataVersion implements Serializable, 
Comparable<HoodieMetadataVersion> {
+
+  public static Integer VERSION_0 = 0; // pre 0.5.1  version format
+  public static Integer VERSION_1 = 1; // current version with no renames
+
+  public static Integer CURR_VERSION = VERSION_1;
+
+  private Integer version;
+
+  public HoodieMetadataVersion(Integer version) {
+Preconditions.checkArgument(version <= CURR_VERSION);
+Preconditions.checkArgument(version >= VERSION_0);
+this.version = version;
+  }
+
+  /**
+   * For Pre 0.5.1 release, there was no metadata version. This method is used 
to detect
+   * this case.
+   * @return
+   */
+  public boolean isNullVersion() {
+return Objects.equals(version, VERSION_0);
+  }
+
+  public Integer getVersion() {
+return version;
+  }
+
+  @Override
+  public boolean equals(Object o) {
+if (this == o) {
 
 Review comment:
   If all the callers use Objects.equals(), then we should be OK. But there 
are still equals() calls inside collections (e.g. HashMap), which use 
obj1.equals(obj2).
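
   To make the point concrete, here is a small sketch showing why the class itself needs equals() and hashCode(): HashMap calls them on the key objects directly. `Version` is a hypothetical, cut-down stand-in for HoodieMetadataVersion, assumed only for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class VersionEqualsSketch {

  // Hypothetical, cut-down stand-in for HoodieMetadataVersion.
  static final class Version {
    private final Integer version;

    Version(Integer version) {
      this.version = version;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (o == null || getClass() != o.getClass()) {
        return false;
      }
      return Objects.equals(version, ((Version) o).version);
    }

    @Override
    public int hashCode() {
      // Must be consistent with equals(), otherwise hash-based lookups miss.
      return Objects.hashCode(version);
    }
  }

  /** Looks up a label with a fresh key object; HashMap relies on hashCode() and equals(). */
  static String lookup(Integer v) {
    Map<Version, String> labels = new HashMap<>();
    labels.put(new Version(1), "no renames");
    return labels.get(new Version(v));
  }

  public static void main(String[] args) {
    System.out.println(lookup(1));
    System.out.println(lookup(0));
  }
}
```

   Without the overrides, the lookup with a fresh `Version(1)` would fall back to identity comparison and return null, regardless of how carefully callers use Objects.equals() elsewhere.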


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354204782
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTimeline.java
 ##
 @@ -250,7 +253,10 @@ static HoodieInstant getCompactionInflightInstant(final 
String timestamp) {
 return new HoodieInstant(State.INFLIGHT, COMPACTION_ACTION, timestamp);
   }
 
-  static HoodieInstant getInflightInstant(final HoodieInstant instant) {
+  static HoodieInstant getInflightInstant(final HoodieInstant instant, final 
HoodieTableType tableType) {
 
 Review comment:
   I thought about keeping this in the metaclient, but in the current structure 
all state-dependent file-name generation lives in HoodieTimeline. Metaclient 
APIs currently sit one level higher, dealing with timelines, tables and other 
objects. So, leaving it this way.


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r350897448
 
 

 ##
 File path: 
hudi-client/src/main/java/org/apache/hudi/io/HoodieCommitArchiveLog.java
 ##
 @@ -192,6 +210,7 @@ private boolean deleteArchivedInstants(List 
archivedInstants) thr
   return i.isCompleted() && 
(i.getAction().equals(HoodieTimeline.COMMIT_ACTION)
   || (i.getAction().equals(HoodieTimeline.DELTA_COMMIT_ACTION)));
 }).max(Comparator.comparing(HoodieInstant::getTimestamp)));
+log.info("Last Committed Instant =" + latestCommitted);
 
 Review comment:
   Done


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r351014321
 
 

 ##
 File path: hudi-client/src/test/java/org/apache/hudi/TestClientRollback.java
 ##
 @@ -217,10 +217,12 @@ public void testRollbackCommit() throws Exception {
 
   // simulate partial failure, where .inflight was not deleted, but data 
files were.
   HoodieTestUtils.createInflightCommitFiles(basePath, commitTime3);
+  Thread.sleep(1000);
 
 Review comment:
   Modified the rollback() code to use the non-conflicting createNewCommitTime 
API and also removed these lines.


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354196674
 
 

 ##
 File path: hudi-client/src/main/java/org/apache/hudi/HoodieWriteClient.java
 ##
 @@ -345,6 +346,9 @@ public static SparkConf registerClasses(SparkConf conf) {
 final List fileIDPrefixes =
 IntStream.range(0, parallelism).mapToObj(i -> 
FSUtils.createNewFileIdPfx()).collect(Collectors.toList());
 
+table.getActiveTimeline().transitionRequestedToInflight(new 
HoodieInstant(State.REQUESTED,
 
 Review comment:
   Agreed. This would make state transitions more manageable. Filed a Jira to 
track this as a follow-up: https://jira.apache.org/jira/browse/HUDI-383


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r351022082
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstant.java
 ##
 @@ -30,7 +34,23 @@
  *
  * @see HoodieTimeline
  */
-public class HoodieInstant implements Serializable {
+public class HoodieInstant implements Serializable, Comparable<HoodieInstant> {
+
+  /**
+   * A COMPACTION action eventually becomes COMMIT when completed. So, when 
grouping instants
+   * for state transitions, this needs to be taken into account
+   */
+  private static final Set<String> comparableActions = Arrays.stream(new 
String[] { HoodieTimeline.COMMIT_ACTION,
+  HoodieTimeline.COMPACTION_ACTION}).collect(Collectors.toSet());
+
+  public static final Comparator<HoodieInstant> comparator = 
Comparator.comparing(HoodieInstant::getTimestamp)
 
 Review comment:
   Enums have a natural ordering defined by their ordinal. In our case, State 
is an enum and Requested < Inflight < Completed.
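
   A short sketch of what this means for sorting: because enum constants compare by declaration order, a comparator can order instants by timestamp first and by state second without any custom state ranking. The `Instant` class and the exact comparator chain below are assumptions for this example; the hunk above only shows the start of the real comparator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class InstantOrderSketch {
  // Declaration order defines the natural ordering: REQUESTED < INFLIGHT < COMPLETED.
  enum State { REQUESTED, INFLIGHT, COMPLETED }

  // Hypothetical, simplified stand-in for HoodieInstant.
  static final class Instant {
    final String timestamp;
    final State state;

    Instant(String timestamp, State state) {
      this.timestamp = timestamp;
      this.state = state;
    }

    @Override
    public String toString() {
      return timestamp + "." + state;
    }
  }

  // Timestamp first, then the enum's natural (ordinal) ordering.
  static final Comparator<Instant> COMPARATOR =
      Comparator.comparing((Instant i) -> i.timestamp)
          .thenComparing((Instant i) -> i.state);

  static List<Instant> sorted(List<Instant> instants) {
    List<Instant> copy = new ArrayList<>(instants);
    copy.sort(COMPARATOR);
    return copy;
  }

  public static void main(String[] args) {
    System.out.println(sorted(Arrays.asList(
        new Instant("002", State.REQUESTED),
        new Instant("001", State.COMPLETED),
        new Instant("001", State.REQUESTED),
        new Instant("001", State.INFLIGHT))));
  }
}
```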


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354200362
 
 

 ##
 File path: hudi-client/src/main/java/org/apache/hudi/HoodieWriteClient.java
 ##
 @@ -896,17 +919,22 @@ private void finishRestore(final Timer.Context context, 
Map stats = doRollbackAndGetStats(commitToRollback);
-  Map> statToCommit = new HashMap<>();
-  finishRollback(context, stats, Arrays.asList(commitToRollback), 
startRollbackTime);
+  // Create a Hoodie table which encapsulated the commits and files visible
+  HoodieTable table = HoodieTable.getHoodieTable(
+  createMetaClient(true), config, jsc);
+  List rollbackInstants = 
table.getActiveTimeline().getCommitsTimeline().getInstants()
+  .filter(instant -> 
HoodieActiveTimeline.EQUAL.test(instant.getTimestamp(), commitToRollback))
+  .collect(Collectors.toList());
+  if (!rollbackInstants.isEmpty()) {
+Preconditions.checkArgument(rollbackInstants.size() == 1);
+List stats = 
doRollbackAndGetStats(rollbackInstants.get(0));
 
 Review comment:
   Done.


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r350889000
 
 

 ##
 File path: hudi-client/src/main/java/org/apache/hudi/HoodieCleanClient.java
 ##
 @@ -92,7 +92,7 @@ protected HoodieCleanMetadata clean(String startCleanTime) 
throws HoodieIOExcept
   if ((cleanerPlan.getFilesToBeDeletedPerPartition() != null)
   && !cleanerPlan.getFilesToBeDeletedPerPartition().isEmpty()) {
 final HoodieTable hoodieTable = 
HoodieTable.getHoodieTable(createMetaClient(true), config, jsc);
-return runClean(hoodieTable, startCleanTime);
+return runClean(hoodieTable, 
HoodieTimeline.getCleanRequestedInstant(startCleanTime), cleanerPlan);
 
 Review comment:
   No, it only creates a Requested instant. The loop earlier in the same 
function will take care of inflight clean instants.


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r351019450
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java
 ##
 @@ -125,6 +133,13 @@ public HoodieTableType getTableType() {
 return DEFAULT_TABLE_TYPE;
   }
 
+  public HoodieMetadataVersion getTableVersion() {
+if (props.containsKey(HOODIE_TABLE_VERSION_PROP_NAME)) {
 
 Review comment:
   Done.


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r351017850
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java
 ##
 @@ -52,12 +53,15 @@
   public static final String HOODIE_TABLE_TYPE_PROP_NAME = "hoodie.table.type";
   public static final String HOODIE_RO_FILE_FORMAT_PROP_NAME = 
"hoodie.table.ro.file.format";
   public static final String HOODIE_RT_FILE_FORMAT_PROP_NAME = 
"hoodie.table.rt.file.format";
+  public static final String HOODIE_TABLE_VERSION_PROP_NAME = 
"hoodie.table.version";
   public static final String HOODIE_PAYLOAD_CLASS_PROP_NAME = 
"hoodie.compaction.payload.class";
   public static final String HOODIE_ARCHIVELOG_FOLDER_PROP_NAME = 
"hoodie.archivelog.folder";
 
   public static final HoodieTableType DEFAULT_TABLE_TYPE = 
HoodieTableType.COPY_ON_WRITE;
   public static final HoodieFileFormat DEFAULT_RO_FILE_FORMAT = 
HoodieFileFormat.PARQUET;
   public static final HoodieFileFormat DEFAULT_RT_FILE_FORMAT = 
HoodieFileFormat.HOODIE_LOG;
+  public static final Integer DEFAULT_TABLE_VERSION = 
HoodieMetadataVersion.VERSION_0;
 
 Review comment:
   Done.


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r351017736
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java
 ##
 @@ -52,12 +53,15 @@
   public static final String HOODIE_TABLE_TYPE_PROP_NAME = "hoodie.table.type";
   public static final String HOODIE_RO_FILE_FORMAT_PROP_NAME = 
"hoodie.table.ro.file.format";
   public static final String HOODIE_RT_FILE_FORMAT_PROP_NAME = 
"hoodie.table.rt.file.format";
+  public static final String HOODIE_TABLE_VERSION_PROP_NAME = 
"hoodie.table.version";
 
 Review comment:
   Done.


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r354211539
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieDefaultTimeline.java
 ##
 @@ -92,15 +91,15 @@ public HoodieTimeline filterInflightsAndRequested() {
   }
 
   @Override
-  public HoodieTimeline filterInflightsExcludingCompaction() {
+  public HoodieTimeline filterPendingExcludingCompaction() {
 return new HoodieDefaultTimeline(instants.stream().filter(instant -> {
-  return instant.isInflight() && 
(!instant.getAction().equals(HoodieTimeline.COMPACTION_ACTION));
+  return (!instant.isCompleted()) && 
(!instant.getAction().equals(HoodieTimeline.COMPACTION_ACTION));
 
 Review comment:
   Yes, that was the intention. One of the places where we use this is when 
rolling back pending commits.
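
   To make the renamed filter concrete, here is a minimal, self-contained 
sketch. The `Instant` class and `State` enum are simplified stand-ins invented 
for illustration (they are not Hudi's `HoodieInstant`); only the predicate, 
"not completed and not a compaction", mirrors the diff above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PendingFilterSketch {
  // Hypothetical stand-in for instant states; not Hudi's own type.
  enum State { REQUESTED, INFLIGHT, COMPLETED }

  // Assumed action name, used here only as an illustrative constant.
  static final String COMPACTION_ACTION = "compaction";

  // Hypothetical, simplified stand-in for HoodieInstant.
  static final class Instant {
    final String timestamp;
    final String action;
    final State state;

    Instant(String timestamp, String action, State state) {
      this.timestamp = timestamp;
      this.action = action;
      this.state = state;
    }

    boolean isCompleted() {
      return state == State.COMPLETED;
    }
  }

  // Returns the timestamps of instants that are not completed and are not compactions.
  static List<String> filterPendingExcludingCompaction(List<Instant> instants) {
    return instants.stream()
        .filter(i -> !i.isCompleted() && !i.action.equals(COMPACTION_ACTION))
        .map(i -> i.timestamp)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Instant> instants = Arrays.asList(
        new Instant("001", "commit", State.COMPLETED),
        new Instant("002", "commit", State.REQUESTED),
        new Instant("003", "compaction", State.INFLIGHT),
        new Instant("004", "clean", State.INFLIGHT));
    // The requested commit (002) and the inflight clean (004) are kept.
    System.out.println(filterPendingExcludingCompaction(instants));
  }
}
```

   In this sketch, a predicate based only on "is inflight" would drop the 
requested commit 002, which is the difference the rename is meant to signal.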


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r351012476
 
 

 ##
 File path: 
hudi-client/src/main/java/org/apache/hudi/io/HoodieCommitArchiveLog.java
 ##
 @@ -169,7 +180,14 @@ public boolean archiveIfRequired(final JavaSparkContext 
jsc) throws IOException
   }).limit(commitTimeline.countInstants() - minCommitsToKeep));
 }
 
-return instants;
+// For archiving and cleaning instants, we need to include intermediate 
state files if they exist
+HoodieActiveTimeline rawActiveTimeline = new 
HoodieActiveTimeline(metaClient, false);
+Map<Pair<String, String>, List<HoodieInstant>> groupByTsAction = 
rawActiveTimeline.getInstants()
+.collect(Collectors.groupingBy(x -> Pair.of(x.getTimestamp(),
+x.getAction().equals(HoodieTimeline.COMPACTION_ACTION) ? 
HoodieTimeline.COMMIT_ACTION : x.getAction())));
+
+return instants.flatMap(hoodieInstant ->
 
 Review comment:
   We would want to archive all types of actions eventually, so I am keeping 
it this way.
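
   For readers following the diff, the grouping step can be sketched in 
isolation. This is only an illustration under stated assumptions: `Instant` is 
a hypothetical stand-in for `HoodieInstant`, `Map.Entry` stands in for `Pair`, 
and the action names are assumed constants. What it mirrors from the diff is 
the key: (timestamp, action), with compaction mapped onto commit.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupByTsActionSketch {
  // Assumed action names, used here only as illustrative constants.
  static final String COMMIT_ACTION = "commit";
  static final String COMPACTION_ACTION = "compaction";

  // Hypothetical, simplified stand-in for HoodieInstant.
  static final class Instant {
    final String timestamp;
    final String action;
    final String state;

    Instant(String timestamp, String action, String state) {
      this.timestamp = timestamp;
      this.action = action;
      this.state = state;
    }
  }

  // Builds the grouping key: (timestamp, action), with compaction treated as commit.
  static Map.Entry<String, String> key(Instant x) {
    String action = x.action.equals(COMPACTION_ACTION) ? COMMIT_ACTION : x.action;
    return new SimpleImmutableEntry<>(x.timestamp, action);
  }

  // Groups all state files of the raw timeline under their (timestamp, action) key.
  static Map<Map.Entry<String, String>, List<Instant>> groupByTsAction(List<Instant> raw) {
    return raw.stream().collect(Collectors.groupingBy(GroupByTsActionSketch::key));
  }

  public static void main(String[] args) {
    List<Instant> raw = Arrays.asList(
        new Instant("001", "commit", "REQUESTED"),
        new Instant("001", "commit", "INFLIGHT"),
        new Instant("001", "commit", "COMPLETED"),
        new Instant("002", "compaction", "REQUESTED"),
        new Instant("002", "commit", "COMPLETED"));
    Map<Map.Entry<String, String>, List<Instant>> groups = groupByTsAction(raw);
    // Look up one bucket per archived instant, as the flatMap in the diff would.
    System.out.println(groups.get(new SimpleImmutableEntry<>("001", "commit")).size());
    System.out.println(groups.get(new SimpleImmutableEntry<>("002", "commit")).size());
  }
}
```

   With such a map, every instant selected for archiving can be expanded to 
all of its intermediate state files with a single lookup.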


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r351012174
 
 

 ##
 File path: 
hudi-client/src/main/java/org/apache/hudi/io/HoodieCommitArchiveLog.java
 ##
 @@ -169,7 +180,14 @@ public boolean archiveIfRequired(final JavaSparkContext 
jsc) throws IOException
   }).limit(commitTimeline.countInstants() - minCommitsToKeep));
 }
 
-return instants;
+// For archiving and cleaning instants, we need to include intermediate 
state files if they exist
+HoodieActiveTimeline rawActiveTimeline = new 
HoodieActiveTimeline(metaClient, false);
+Map<Pair<String, String>, List<HoodieInstant>> groupByTsAction = 
rawActiveTimeline.getInstants()
+.collect(Collectors.groupingBy(x -> Pair.of(x.getTimestamp(),
+x.getAction().equals(HoodieTimeline.COMPACTION_ACTION) ? 
HoodieTimeline.COMMIT_ACTION : x.getAction())));
 
 Review comment:
   Refactored a bit so that it can be used here.


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r350895583
 
 

 ##
 File path: 
hudi-client/src/main/java/org/apache/hudi/io/HoodieCommitArchiveLog.java
 ##
 @@ -169,7 +180,14 @@ public boolean archiveIfRequired(final JavaSparkContext 
jsc) throws IOException
   }).limit(commitTimeline.countInstants() - minCommitsToKeep));
 }
 
-return instants;
+// For archiving and cleaning instants, we need to include intermediate 
state files if they exist
+HoodieActiveTimeline rawActiveTimeline = new 
HoodieActiveTimeline(metaClient, false);
+Map<Pair<String, String>, List<HoodieInstant>> groupByTsAction = 
rawActiveTimeline.getInstants()
+.collect(Collectors.groupingBy(x -> Pair.of(x.getTimestamp(),
 
 Review comment:
   Done


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r351015157
 
 

 ##
 File path: hudi-client/src/test/java/org/apache/hudi/index/TestHbaseIndex.java
 ##
 @@ -168,13 +169,19 @@ public void testTagLocationAndDuplicateUpdate() throws 
Exception {
 HoodieWriteConfig config = getConfig();
 HBaseIndex index = new HBaseIndex(config);
 HoodieWriteClient writeClient = new HoodieWriteClient(jsc, config);
-writeClient.startCommit();
+writeClient.startCommitWithTime(newCommitTime);
 metaClient = HoodieTableMetaClient.reload(metaClient);
 HoodieTable hoodieTable = HoodieTable.getHoodieTable(metaClient, config, 
jsc);
 
 JavaRDD writeStatues = writeClient.upsert(writeRecords, 
newCommitTime);
 JavaRDD javaRDD1 = index.tagLocation(writeRecords, jsc, 
hoodieTable);
+
 // Duplicate upsert and ensure correctness is maintained
+// We are trying to approximately imitate the case when the RRD is 
recomputed. For RRD creating, driver code is not
 
 Review comment:
   Thanks @vinothchandar  Linkedin days :) 
   
   @n3nash : Sure, We can discuss this f2f
   


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r350888183
 
 

 ##
 File path: 
hudi-cli/src/main/java/org/apache/hudi/cli/commands/CompactionCommand.java
 ##
 @@ -96,14 +96,15 @@ public String compactionsAll(
   if (!instant.getAction().equals(HoodieTimeline.COMPACTION_ACTION)) {
 try {
   // This could be a completed compaction. Assume a compaction request 
file is present but skip if fails
-  workload = AvroUtils.deserializeCompactionPlan(activeTimeline
-  
.getInstantAuxiliaryDetails(HoodieTimeline.getCompactionRequestedInstant(instant.getTimestamp())).get());
+  workload = AvroUtils.deserializeCompactionPlan(
 
 Review comment:
   Done


[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1009: [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset

2019-12-05 Thread GitBox
bvaradar commented on a change in pull request #1009:  [HUDI-308] Avoid Renames 
for tracking state transitions of all actions on dataset
URL: https://github.com/apache/incubator-hudi/pull/1009#discussion_r350892589
 
 

 ##
 File path: hudi-client/src/main/java/org/apache/hudi/HoodieWriteClient.java
 ##
 @@ -824,14 +843,14 @@ private String startInstant() {
   "Found commits after time :" + lastCommit + ", please rollback 
greater commits first");
 }
 
-List<String> inflights =
-
inflightCommitTimeline.getInstants().map(HoodieInstant::getTimestamp).collect(Collectors.toList());
+List<String> inflights = 
inflightAndRequestedCommitTimeline.getInstants().map(HoodieInstant::getTimestamp)
 
 Review comment:
   Yes. For a requested instant, the rollback essentially just removes the 
instant file, as no side effect has happened.


[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1083: [SUPPORT]

2019-12-05 Thread GitBox
lamber-ken edited a comment on issue #1083: [SUPPORT]
URL: https://github.com/apache/incubator-hudi/issues/1083#issuecomment-562058599
 
 
   > Kafka 3.3.2
   
   Hi, the latest version listed on the Apache Kafka website is 2.3.0: 
http://kafka.apache.org/downloads


[GitHub] [incubator-hudi] lamber-ken commented on issue #1083: [SUPPORT]

2019-12-05 Thread GitBox
lamber-ken commented on issue #1083: [SUPPORT]
URL: https://github.com/apache/incubator-hudi/issues/1083#issuecomment-562058599
 
 
   > Kafka 3.3.2
   
   Hi, the latest version listed on the Apache Kafka website is 2.3.0: 
http://kafka.apache.org/downloads


[GitHub] [incubator-hudi] Neo2007 commented on issue #1083: [SUPPORT]

2019-12-05 Thread GitBox
Neo2007 commented on issue #1083: [SUPPORT]
URL: https://github.com/apache/incubator-hudi/issues/1083#issuecomment-562057619
 
 
   Kafka 3.3.2
   


[GitHub] [incubator-hudi] lamber-ken commented on issue #1083: [SUPPORT]

2019-12-05 Thread GitBox
lamber-ken commented on issue #1083: [SUPPORT]
URL: https://github.com/apache/incubator-hudi/issues/1083#issuecomment-562053728
 
 
   Hi, which version of Kafka are you using?


[jira] [Created] (HUDI-383) Introduce TransactionHandle abstraction to manage state transitions in hudi clients

2019-12-05 Thread Balaji Varadarajan (Jira)
Balaji Varadarajan created HUDI-383:
---

 Summary: Introduce TransactionHandle abstraction to manage state 
transitions in hudi clients
 Key: HUDI-383
 URL: https://issues.apache.org/jira/browse/HUDI-383
 Project: Apache Hudi (incubating)
  Issue Type: Improvement
  Components: Cleaner, Compaction, Write Client
Reporter: Balaji Varadarajan


Came up in review comment. 
https://github.com/apache/incubator-hudi/pull/1009/files#r347705820



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HUDI-363) Refactor codes based on ImportOrder code style rule

2019-12-05 Thread lamber-ken (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lamber-ken resolved HUDI-363.
-
Resolution: Resolved

> Refactor codes based on ImportOrder code style rule
> ---
>
> Key: HUDI-363
> URL: https://issues.apache.org/jira/browse/HUDI-363
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>Reporter: lamber-ken
>Assignee: lamber-ken
>Priority: Critical
>
> Refactor code based on the ImportOrder code style rule. Many places need 
> refactoring, so this rule may need some subtasks.
> Follow the steps below to fix violations:
> 1, set the severity of ImportOrder to error level in the local env.
> 2, use the following command to check the module you are working on.
> {code:java}
> mvn -pl hudi-common checkstyle:check
> {code}
> 3, remember to reset the severity to info before committing.



[jira] [Resolved] (HUDI-378) Refactor the rest codes based on new ImportOrder code style rule

2019-12-05 Thread lamber-ken (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lamber-ken resolved HUDI-378.
-
Resolution: Resolved

> Refactor the rest codes based on new ImportOrder code style rule
> 
>
> Key: HUDI-378
> URL: https://issues.apache.org/jira/browse/HUDI-378
> Project: Apache Hudi (incubating)
>  Issue Type: Sub-task
>Reporter: lamber-ken
>Assignee: lamber-ken
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Refactor the remaining code based on the new ImportOrder code style rule and 
> set its severity to error level



[GitHub] [incubator-hudi] yanghua merged pull request #1078: [HUDI-378] Refactor the rest codes based on new ImportOrder code style rule

2019-12-05 Thread GitBox
yanghua merged pull request #1078: [HUDI-378] Refactor the rest codes based on 
new ImportOrder code style rule
URL: https://github.com/apache/incubator-hudi/pull/1078
 
 
   


[incubator-hudi] branch master updated: [HUDI-378] Refactor the rest codes based on new ImportOrder code style rule (#1078)

2019-12-05 Thread vinoyang
This is an automated email from the ASF dual-hosted git repository.

vinoyang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new c06d89b  [HUDI-378] Refactor the rest codes based on new ImportOrder 
code style rule (#1078)
c06d89b is described below

commit c06d89b648bbdaaad9c68fb9d3c6d53b4d68541c
Author: lamber-ken 
AuthorDate: Thu Dec 5 17:25:03 2019 +0800

[HUDI-378] Refactor the rest codes based on new ImportOrder code style rule 
(#1078)
---
 .../apache/hudi/config/HoodieStorageConfig.java|  3 +-
 .../apache/hudi/metrics/JmxMetricsReporter.java| 16 +++---
 .../src/test/java/org/apache/hudi/TestCleaner.java |  1 +
 .../apache/hudi/metrics/TestHoodieJmxMetrics.java  | 10 ++--
 .../org/apache/hudi/common/util/CleanerUtils.java  |  6 ++-
 .../versioning/clean/CleanMetadataMigrator.java|  3 +-
 .../versioning/clean/CleanV1MigrationHandler.java  |  3 +-
 .../versioning/clean/CleanV2MigrationHandler.java  |  3 +-
 .../org/apache/hudi/hive/HoodieHiveClient.java | 58 +++---
 style/checkstyle.xml   |  1 -
 10 files changed, 57 insertions(+), 47 deletions(-)

diff --git 
a/hudi-client/src/main/java/org/apache/hudi/config/HoodieStorageConfig.java 
b/hudi-client/src/main/java/org/apache/hudi/config/HoodieStorageConfig.java
index 90fdb6c..f9c98c7 100644
--- a/hudi-client/src/main/java/org/apache/hudi/config/HoodieStorageConfig.java
+++ b/hudi-client/src/main/java/org/apache/hudi/config/HoodieStorageConfig.java
@@ -18,11 +18,12 @@
 
 package org.apache.hudi.config;
 
+import javax.annotation.concurrent.Immutable;
+
 import java.io.File;
 import java.io.FileReader;
 import java.io.IOException;
 import java.util.Properties;
-import javax.annotation.concurrent.Immutable;
 
 /**
  * Storage related config
diff --git 
a/hudi-client/src/main/java/org/apache/hudi/metrics/JmxMetricsReporter.java 
b/hudi-client/src/main/java/org/apache/hudi/metrics/JmxMetricsReporter.java
index 7bc73d2..d00ec67 100644
--- a/hudi-client/src/main/java/org/apache/hudi/metrics/JmxMetricsReporter.java
+++ b/hudi-client/src/main/java/org/apache/hudi/metrics/JmxMetricsReporter.java
@@ -18,18 +18,20 @@
 
 package org.apache.hudi.metrics;
 
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.exception.HoodieException;
+
 import com.google.common.base.Preconditions;
-import java.io.Closeable;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
 
-import java.lang.management.ManagementFactory;
-import java.rmi.registry.LocateRegistry;
 import javax.management.remote.JMXConnectorServer;
 import javax.management.remote.JMXConnectorServerFactory;
 import javax.management.remote.JMXServiceURL;
-import org.apache.hudi.config.HoodieWriteConfig;
-import org.apache.hudi.exception.HoodieException;
-import org.apache.log4j.LogManager;
-import org.apache.log4j.Logger;
+
+import java.io.Closeable;
+import java.lang.management.ManagementFactory;
+import java.rmi.registry.LocateRegistry;
 
 /**
  * Implementation of Jmx reporter, which used to report jmx metric.
diff --git a/hudi-client/src/test/java/org/apache/hudi/TestCleaner.java 
b/hudi-client/src/test/java/org/apache/hudi/TestCleaner.java
index 370021a..200575a 100644
--- a/hudi-client/src/test/java/org/apache/hudi/TestCleaner.java
+++ b/hudi-client/src/test/java/org/apache/hudi/TestCleaner.java
@@ -78,6 +78,7 @@ import java.util.TreeSet;
 import java.util.function.Predicate;
 import java.util.stream.Collectors;
 import java.util.stream.Stream;
+
 import scala.Tuple3;
 
 import static 
org.apache.hudi.common.model.HoodieTestUtils.DEFAULT_PARTITION_PATHS;
diff --git 
a/hudi-client/src/test/java/org/apache/hudi/metrics/TestHoodieJmxMetrics.java 
b/hudi-client/src/test/java/org/apache/hudi/metrics/TestHoodieJmxMetrics.java
index 7260774..b014329 100644
--- 
a/hudi-client/src/test/java/org/apache/hudi/metrics/TestHoodieJmxMetrics.java
+++ 
b/hudi-client/src/test/java/org/apache/hudi/metrics/TestHoodieJmxMetrics.java
@@ -18,16 +18,16 @@
 
 package org.apache.hudi.metrics;
 
-import static org.apache.hudi.metrics.Metrics.registerGauge;
-import static org.junit.Assert.assertTrue;
-import static org.mockito.Mockito.mock;
-import static org.mockito.Mockito.when;
-
 import org.apache.hudi.config.HoodieMetricsConfig;
 import org.apache.hudi.config.HoodieWriteConfig;
 
 import org.junit.Test;
 
+import static org.apache.hudi.metrics.Metrics.registerGauge;
+import static org.junit.Assert.assertTrue;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
 /**
  * Test for the Jmx metrics report.
  */
diff --git 
a/hudi-common/src/main/java/org/apache/hudi/common/util/CleanerUtils.java 
b/hudi-common/src/main/java/org/apache/hudi/common/util/CleanerUtils.java
index 4d4ccb9..0e8c460 100644
--- a/hudi-common/src/main/java/org/apache/hudi/common/util/CleanerUtils.java

[GitHub] [incubator-hudi] yanghua commented on a change in pull request #1078: [HUDI-378] Refactor the rest codes based on new ImportOrder code style rule

2019-12-05 Thread GitBox
yanghua commented on a change in pull request #1078: [HUDI-378] Refactor the 
rest codes based on new ImportOrder code style rule
URL: https://github.com/apache/incubator-hudi/pull/1078#discussion_r354188876
 
 

 ##
 File path: style/checkstyle.xml
 ##
 @@ -282,7 +282,6 @@
 
 
 
-
 
 Review comment:
   OK, since this is the last step, we can remove it (letting it keep the 
default value **error**). From this point on, all classes must respect this 
import order rule; otherwise, checkstyle will break the build. cc 
@vinothchandar 
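
   As an illustration of what the rule now enforces, here is a small sketch of 
a file laid out in the group order that can be inferred from the diffs in this 
thread: project imports, third-party imports, `javax`, `java`, `scala`, then 
static imports, with a blank line between groups. That order is an inference 
from the commit, not a quote of the configuration; `style/checkstyle.xml` 
remains the authoritative definition. The class name and its contents are 
hypothetical, and the first two groups appear only as a comment so that the 
file compiles with the JDK alone.

```java
// Project imports (org.apache.hudi.*) and third-party imports would form the
// first two groups; they are omitted here to keep the sketch JDK-only.

import javax.management.MBeanServer;

import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.List;

import static java.util.Objects.requireNonNull;

public class ImportOrderSketch {
  // Group order as inferred from the diffs in this thread (an assumption, see above).
  static List<String> observedGroupOrder() {
    return Arrays.asList("org.apache.hudi", "third-party", "javax", "java", "scala", "static");
  }

  public static void main(String[] args) {
    // Uses every import above so that none of them is flagged as unused.
    MBeanServer server = requireNonNull(ManagementFactory.getPlatformMBeanServer());
    System.out.println("MBean count: " + server.getMBeanCount());
    System.out.println("Group order: " + observedGroupOrder());
  }
}
```

   Running `mvn -pl <module> checkstyle:check` on a module, as described in 
HUDI-363, is the way to confirm that a real file passes the rule.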


[GitHub] [incubator-hudi] yanghua commented on a change in pull request #1078: [HUDI-378] Refactor the rest codes based on new ImportOrder code style rule

2019-12-05 Thread GitBox
yanghua commented on a change in pull request #1078: [HUDI-378] Refactor the 
rest codes based on new ImportOrder code style rule
URL: https://github.com/apache/incubator-hudi/pull/1078#discussion_r354188876
 
 

 ##
 File path: style/checkstyle.xml
 ##
 @@ -282,7 +282,6 @@
 
 
 
-
 
 Review comment:
   OK, since this is the last step, we can remove it (letting it keep the 
default value **error**). From this point on, all classes must respect this 
import order rule; otherwise, the checkstyle plugin will break the build. cc 
@vinothchandar 


[jira] [Created] (HUDI-382) Move TimestampBasedKeyGenerator to hudi-spark module from hudi-utilities

2019-12-05 Thread Gurudatt Kulkarni (Jira)
Gurudatt Kulkarni created HUDI-382:
--

 Summary: Move TimestampBasedKeyGenerator to hudi-spark module from 
hudi-utilities
 Key: HUDI-382
 URL: https://issues.apache.org/jira/browse/HUDI-382
 Project: Apache Hudi (incubating)
  Issue Type: Task
  Components: Common Core
Reporter: Gurudatt Kulkarni
Assignee: Gurudatt Kulkarni


Context,

[https://lists.apache.org/thread.html/7e65cd14f8f24308568c15de2bd68e3f4192e6caae741e76362d1de3%40%3Cdev.hudi.apache.org%3E]



[GitHub] [incubator-hudi] lamber-ken commented on a change in pull request #1078: [HUDI-378] Refactor the rest codes based on new ImportOrder code style rule

2019-12-05 Thread GitBox
lamber-ken commented on a change in pull request #1078: [HUDI-378] Refactor the 
rest codes based on new ImportOrder code style rule
URL: https://github.com/apache/incubator-hudi/pull/1078#discussion_r354185840
 
 

 ##
 File path: style/checkstyle.xml
 ##
 @@ -282,7 +282,6 @@
 
 
 
-
 
 Review comment:
   All finished.


[GitHub] [incubator-hudi] yanghua commented on a change in pull request #1078: [HUDI-378] Refactor the rest codes based on new ImportOrder code style rule

2019-12-05 Thread GitBox
yanghua commented on a change in pull request #1078: [HUDI-378] Refactor the 
rest codes based on new ImportOrder code style rule
URL: https://github.com/apache/incubator-hudi/pull/1078#discussion_r354184397
 
 

 ##
 File path: style/checkstyle.xml
 ##
 @@ -282,7 +282,6 @@
 
 
 
-
 
 Review comment:
   @lamber-ken shall we keep this config option?

