[GitHub] [incubator-hudi] bvaradar commented on issue #786: HoodieWrapperFileSystem not working with presto

2019-07-15 Thread GitBox
bvaradar commented on issue #786: HoodieWrapperFileSystem not working with presto URL: https://github.com/apache/incubator-hudi/issues/786#issuecomment-511687669 @eisig : Can you try this PR and see if it works? https://github.com/apache/incubator-hudi/pull/793

[GitHub] [incubator-hudi] bvaradar commented on issue #789: HoodieMergeOnReadTable rollback hangs

2019-07-15 Thread GitBox
bvaradar commented on issue #789: HoodieMergeOnReadTable rollback hangs URL: https://github.com/apache/incubator-hudi/issues/789#issuecomment-511649204 @eisig yeah your understanding is correct. Regarding RT query, can you enable debug in your spark shell and give the driver logs. It would

[GitHub] [incubator-hudi] garyli1019 closed issue #768: No Space Left On Device for upsert

2019-07-15 Thread GitBox
garyli1019 closed issue #768: No Space Left On Device for upsert URL: https://github.com/apache/incubator-hudi/issues/768 This is an automated message from the Apache Git Service. To respond to the message, please log on to G

[GitHub] [incubator-hudi] eisig commented on issue #789: HoodieMergeOnReadTable rollback hangs

2019-07-15 Thread GitBox
eisig commented on issue #789: HoodieMergeOnReadTable rollback hangs URL: https://github.com/apache/incubator-hudi/issues/789#issuecomment-511644424 @bvaradar (3) is caused by a file-not-found exception. Maybe your PR https://github.com/apache/incubator-hudi/pull/788 will fix this.

[GitHub] [incubator-hudi] eisig commented on issue #789: HoodieMergeOnReadTable rollback hangs

2019-07-15 Thread GitBox
eisig commented on issue #789: HoodieMergeOnReadTable rollback hangs URL: https://github.com/apache/incubator-hudi/issues/789#issuecomment-511641230 @bvaradar Here the data https://raw.githubusercontent.com/eisig/sample-files/master/hudi-789/metadata.txt https://raw.githubusercon

[GitHub] [incubator-hudi] bvaradar commented on issue #789: HoodieMergeOnReadTable rollback hangs

2019-07-15 Thread GitBox
bvaradar commented on issue #789: HoodieMergeOnReadTable rollback hangs URL: https://github.com/apache/incubator-hudi/issues/789#issuecomment-511630300 @eisig : Also regarding (3) How long did you notice it to be in hanging state ? My guess is the rollback is progressing but slowly (https:

[GitHub] [incubator-hudi] vinothchandar edited a comment on issue #751: Clean up poms, unused deps and thinning bundles

2019-07-15 Thread GitBox
vinothchandar edited a comment on issue #751: Clean up poms, unused deps and thinning bundles URL: https://github.com/apache/incubator-hudi/pull/751#issuecomment-511621749 got past that by also including `hive-exec` in the spark-bundle, but hit ``` Caused by: java.lang.ClassNotFou

[GitHub] [incubator-hudi] n3nash opened a new pull request #792: Fixing default value for avro 1.7 which assumes NULL value instead of a jsonnode that is null

2019-07-15 Thread GitBox
n3nash opened a new pull request #792: Fixing default value for avro 1.7 which assumes NULL value instead of a jsonnode that is null URL: https://github.com/apache/incubator-hudi/pull/792

[GitHub] [incubator-hudi] vinothchandar commented on issue #751: Clean up poms, unused deps and thinning bundles

2019-07-15 Thread GitBox
vinothchandar commented on issue #751: Clean up poms, unused deps and thinning bundles URL: https://github.com/apache/incubator-hudi/pull/751#issuecomment-511621749 got past that.. but hit ``` Caused by: java.lang.ClassNotFoundException: com.uber.hoodie.org.apache.hadoop.hive.ql.

[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #788: HUDI-168 Ensure getFileStatus calls for files getting written happens after close() is called

2019-07-15 Thread GitBox
vinothchandar commented on a change in pull request #788: HUDI-168 Ensure getFileStatus calls for files getting written happens after close() is called URL: https://github.com/apache/incubator-hudi/pull/788#discussion_r303691056 ## File path: hoodie-client/src/main/java/com/uber/ho

[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #788: HUDI-168 Ensure getFileStatus calls for files getting written happens after close() is called

2019-07-15 Thread GitBox
bvaradar commented on a change in pull request #788: HUDI-168 Ensure getFileStatus calls for files getting written happens after close() is called URL: https://github.com/apache/incubator-hudi/pull/788#discussion_r303690624 ## File path: hoodie-client/src/main/java/com/uber/hoodie/

[GitHub] [incubator-hudi] bvaradar opened a new pull request #791: HUDI-140 : GCS: Log File Reading not working due to difference in seek() behavior for EOF

2019-07-15 Thread GitBox
bvaradar opened a new pull request #791: HUDI-140 : GCS: Log File Reading not working due to difference in seek() behavior for EOF URL: https://github.com/apache/incubator-hudi/pull/791

[GitHub] [incubator-hudi] vinothchandar commented on issue #751: Clean up poms, unused deps and thinning bundles

2019-07-15 Thread GitBox
vinothchandar commented on issue #751: Clean up poms, unused deps and thinning bundles URL: https://github.com/apache/incubator-hudi/pull/751#issuecomment-511616887 @bvaradar testing the fix.. seems like the demo steps by hand work since Hive registration is done via tool always in the dem

[GitHub] [incubator-hudi] garyli1019 commented on issue #768: No Space Left On Device for upsert

2019-07-15 Thread GitBox
garyli1019 commented on issue #768: No Space Left On Device for upsert URL: https://github.com/apache/incubator-hudi/issues/768#issuecomment-511612476 Thanks, that worked! I am able to config `spark.local.dir` through `spark-submit` so it's not a problem. How about ``` public stat

[GitHub] [incubator-hudi] bvaradar commented on issue #751: Clean up poms, unused deps and thinning bundles

2019-07-15 Thread GitBox
bvaradar commented on issue #751: Clean up poms, unused deps and thinning bundles URL: https://github.com/apache/incubator-hudi/pull/751#issuecomment-511611679 > @bvaradar is this the error? > > ``` > Exception in thread "main" java.lang.NoClassDefFoundError: com/uber/hoodie/org/

[GitHub] [incubator-hudi] vinothchandar commented on issue #751: Clean up poms, unused deps and thinning bundles

2019-07-15 Thread GitBox
vinothchandar commented on issue #751: Clean up poms, unused deps and thinning bundles URL: https://github.com/apache/incubator-hudi/pull/751#issuecomment-511609949 @bvaradar is this the error? ``` Exception in thread "main" java.lang.NoClassDefFoundError: com/uber/hoodie/org/apa

[GitHub] [incubator-hudi] bvaradar commented on issue #751: Clean up poms, unused deps and thinning bundles

2019-07-15 Thread GitBox
bvaradar commented on issue #751: Clean up poms, unused deps and thinning bundles URL: https://github.com/apache/incubator-hudi/pull/751#issuecomment-511608119 > @bvaradar do you mean #782 ? It might be good to first cherry-pick that and test if you think its purely a bundling issue.

[GitHub] [incubator-hudi] vinothchandar commented on issue #751: Clean up poms, unused deps and thinning bundles

2019-07-15 Thread GitBox
vinothchandar commented on issue #751: Clean up poms, unused deps and thinning bundles URL: https://github.com/apache/incubator-hudi/pull/751#issuecomment-511606150 @bvaradar do you mean #782 ? It might be good to first cherry-pick that and test if you think its purely a bundling issue.

[GitHub] [incubator-hudi] bvaradar commented on issue #751: Clean up poms, unused deps and thinning bundles

2019-07-15 Thread GitBox
bvaradar commented on issue #751: Clean up poms, unused deps and thinning bundles URL: https://github.com/apache/incubator-hudi/pull/751#issuecomment-511598480 @vinothchandar : FYI - PR (https://github.com/apache/incubator-hudi/pull/790) automates running integration test. It looks like H

[GitHub] [incubator-hudi] vinothchandar commented on issue #784: Can Hudi delete records?

2019-07-15 Thread GitBox
vinothchandar commented on issue #784: Can Hudi delete records? URL: https://github.com/apache/incubator-hudi/issues/784#issuecomment-511576299 No.. when you can pass in that class `com.uber.hoodie.EmptyHoodieRecordPayload` as the payload implementation using DeltaStreamer or the Datasourc

[GitHub] [incubator-hudi] vinothchandar commented on issue #790: Update docker_demo based on new bundling

2019-07-15 Thread GitBox
vinothchandar commented on issue #790: Update docker_demo based on new bundling URL: https://github.com/apache/incubator-hudi/pull/790#issuecomment-511575100 @bvaradar the decision is around whether spark-bundle should also include input format classes. Spark can be used for writing as wel

[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #790: Update docker_demo based on new bundling

2019-07-15 Thread GitBox
bvaradar commented on a change in pull request #790: Update docker_demo based on new bundling URL: https://github.com/apache/incubator-hudi/pull/790#discussion_r303634878 ## File path: docs/docker_demo.md ## @@ -335,7 +335,7 @@ running in spark-sql ``` docker exec -it

[GitHub] [incubator-hudi] vinothchandar merged pull request #787: fix HoodieLogFileReader

2019-07-15 Thread GitBox
vinothchandar merged pull request #787: fix HoodieLogFileReader URL: https://github.com/apache/incubator-hudi/pull/787

[incubator-hudi] branch master updated: fix HoodieLogFileReader (#787)

2019-07-15 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git The following commit(s) were added to refs/heads/master by this push: new c0593e7 fix HoodieLogFileReader (#787)

[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #788: HUDI-168 Ensure getFileStatus calls for files getting written happens after close() is called

2019-07-15 Thread GitBox
vinothchandar commented on a change in pull request #788: HUDI-168 Ensure getFileStatus calls for files getting written happens after close() is called URL: https://github.com/apache/incubator-hudi/pull/788#discussion_r303619221 ## File path: hoodie-client/src/main/java/com/uber/ho

[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #788: HUDI-168 Ensure getFileStatus calls for files getting written happens after close() is called

2019-07-15 Thread GitBox
vinothchandar commented on a change in pull request #788: HUDI-168 Ensure getFileStatus calls for files getting written happens after close() is called URL: https://github.com/apache/incubator-hudi/pull/788#discussion_r303618603 ## File path: hoodie-client/src/main/java/com/uber/ho

[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #788: HUDI-168 Ensure getFileStatus calls for files getting written happens after close() is called

2019-07-15 Thread GitBox
vinothchandar commented on a change in pull request #788: HUDI-168 Ensure getFileStatus calls for files getting written happens after close() is called URL: https://github.com/apache/incubator-hudi/pull/788#discussion_r303617738 ## File path: hoodie-client/src/main/java/com/uber/ho

[GitHub] [incubator-hudi] vinothchandar commented on issue #751: Clean up poms, unused deps and thinning bundles

2019-07-15 Thread GitBox
vinothchandar commented on issue #751: Clean up poms, unused deps and thinning bundles URL: https://github.com/apache/incubator-hudi/pull/751#issuecomment-511553594 @bvaradar @n3nash please review

[GitHub] [incubator-hudi] vinothchandar commented on issue #768: No Space Left On Device for upsert

2019-07-15 Thread GitBox
vinothchandar commented on issue #768: No Space Left On Device for upsert URL: https://github.com/apache/incubator-hudi/issues/768#issuecomment-511544979 the /tmp is more for spark, to shuffle the data. Hudi only writes to the `basePath` you configure. One place that we do use the /tmp fol

[GitHub] [incubator-hudi] vinothchandar commented on issue #780: Cleanup Maven POM/Classpath

2019-07-15 Thread GitBox
vinothchandar commented on issue #780: Cleanup Maven POM/Classpath URL: https://github.com/apache/incubator-hudi/pull/780#issuecomment-511531857 @thesuperzapper thanks for testing it out. I am planning to land #751 to master today. Feel free to rebase off that and rework this PR if you can

[GitHub] [incubator-hudi] vinothchandar opened a new pull request #790: Update docker_demo based on new bundling

2019-07-15 Thread GitBox
vinothchandar opened a new pull request #790: Update docker_demo based on new bundling URL: https://github.com/apache/incubator-hudi/pull/790 - spark shell commands need to also include mr bundle - small usability fixes to commands

[GitHub] [incubator-hudi] garyli1019 commented on issue #768: No Space Left On Device for upsert

2019-07-15 Thread GitBox
garyli1019 commented on issue #768: No Space Left On Device for upsert URL: https://github.com/apache/incubator-hudi/issues/768#issuecomment-511510001 Hello @vinothchandar , sure I will use email to ask future questions. This issue was caused by the `/tmp` folder being too small. ``` [2019

[GitHub] [incubator-hudi] vinothchandar commented on issue #690: Fixing default value for avro 1.7 which assumes NULL value instead of a jsonnode that is null

2019-07-15 Thread GitBox
vinothchandar commented on issue #690: Fixing default value for avro 1.7 which assumes NULL value instead of a jsonnode that is null URL: https://github.com/apache/incubator-hudi/pull/690#issuecomment-511509967 @n3nash I have not seen them usually?

[GitHub] [incubator-hudi] vinothchandar commented on issue #714: Performance Comparison of HoodieDeltaStreamer and DataSourceAPI

2019-07-15 Thread GitBox
vinothchandar commented on issue #714: Performance Comparison of HoodieDeltaStreamer and DataSourceAPI URL: https://github.com/apache/incubator-hudi/issues/714#issuecomment-511508649 Hmmm.. what I notice is that every stage (3, 15, 16) that goes one pass over the input takes like 3 minutes

[GitHub] [incubator-hudi] n3nash commented on issue #690: Fixing default value for avro 1.7 which assumes NULL value instead of a jsonnode that is null

2019-07-15 Thread GitBox
n3nash commented on issue #690: Fixing default value for avro 1.7 which assumes NULL value instead of a jsonnode that is null URL: https://github.com/apache/incubator-hudi/pull/690#issuecomment-511507038 @vinothchandar These warnings aren't as a result of this PR, we should suppress them t

[GitHub] [incubator-hudi] n3nash commented on a change in pull request #788: HUDI-168 Ensure getFileStatus calls for files getting written happens after close() is called

2019-07-15 Thread GitBox
n3nash commented on a change in pull request #788: HUDI-168 Ensure getFileStatus calls for files getting written happens after close() is called URL: https://github.com/apache/incubator-hudi/pull/788#discussion_r303562388 ## File path: hoodie-client/src/main/java/com/uber/hoodie/ta

[GitHub] [incubator-hudi] vinothchandar commented on issue #786: HoodieWrapperFileSystem not working with presto

2019-07-15 Thread GitBox
vinothchandar commented on issue #786: HoodieWrapperFileSystem not working with presto URL: https://github.com/apache/incubator-hudi/issues/786#issuecomment-511502930 @bhasudha can you triage this and take over here?

[GitHub] [incubator-hudi] vinothchandar commented on issue #787: fix HoodieLogFileReader

2019-07-15 Thread GitBox
vinothchandar commented on issue #787: fix HoodieLogFileReader URL: https://github.com/apache/incubator-hudi/pull/787#issuecomment-511502434 @bvaradar I think we can squash and merge? (I started doing that.) Anyway, CI seems to have errors? Something related to HBase that I have not seen

[GitHub] [incubator-hudi] vinothchandar commented on issue #779: HoodieDeltaStreamer may insert duplicate record?

2019-07-15 Thread GitBox
vinothchandar commented on issue #779: HoodieDeltaStreamer may insert duplicate record? URL: https://github.com/apache/incubator-hudi/issues/779#issuecomment-511500766 @eisig hmmm. looks like you are issuing upserts, since I see the log files.. is it possible to provide a sample file with

[GitHub] [incubator-hudi] bvaradar commented on issue #789: HoodieMergeOnReadTable rollback hangs

2019-07-15 Thread GitBox
bvaradar commented on issue #789: HoodieMergeOnReadTable rollback hangs URL: https://github.com/apache/incubator-hudi/issues/789#issuecomment-511489868 @eisig : Regarding (2), Can you list the timeline (.hoodie folder) and all partition path folders ? Also, for the latest committed

[GitHub] [incubator-hudi] eisig opened a new issue #789: HoodieMergeOnReadTable rollback hangs

2019-07-15 Thread GitBox
eisig opened a new issue #789: HoodieMergeOnReadTable rollback hangs URL: https://github.com/apache/incubator-hudi/issues/789 There seem to be two bugs with the master branch (commit: ae3c02fb3). My steps: 1. use HDFSParquetImporter to import from hive to hudi 2. use HoodieDeltaStr

[GitHub] [incubator-hudi] eisig commented on issue #786: HoodieWrapperFileSystem not working with presto

2019-07-15 Thread GitBox
eisig commented on issue #786: HoodieWrapperFileSystem not working with presto URL: https://github.com/apache/incubator-hudi/issues/786#issuecomment-511432993 @bvaradar It's the latest master. I made it fall back to fileSystem.getUri().getScheme() if it's not implemented. Or should

[GitHub] [incubator-hudi] bvaradar opened a new pull request #788: HUDI-168 Ensure getFileStatus calls for files getting written happens after close() is called

2019-07-15 Thread GitBox
bvaradar opened a new pull request #788: HUDI-168 Ensure getFileStatus calls for files getting written happens after close() is called URL: https://github.com/apache/incubator-hudi/pull/788 @vinothchandar @n3nash : Grepped the code base and verified no other similar pattern exists

[GitHub] [incubator-hudi] bvaradar commented on issue #788: HUDI-168 Ensure getFileStatus calls for files getting written happens after close() is called

2019-07-15 Thread GitBox
bvaradar commented on issue #788: HUDI-168 Ensure getFileStatus calls for files getting written happens after close() is called URL: https://github.com/apache/incubator-hudi/pull/788#issuecomment-511427510 @n3nash @vinothchandar : Please review and merge

[GitHub] [incubator-hudi] bvaradar commented on issue #771: fix error: java.lang.IllegalArgumentException: Can not create a Path from an empty string

2019-07-15 Thread GitBox
bvaradar commented on issue #771: fix error: java.lang.IllegalArgumentException: Can not create a Path from an empty string URL: https://github.com/apache/incubator-hudi/pull/771#issuecomment-511424296 @cdmikechen : PR-775, which is merged, addressed this along with other changes. Can yo

[GitHub] [incubator-hudi] bvaradar commented on issue #786: HoodieWrapperFileSystem not working with presto

2019-07-15 Thread GitBox
bvaradar commented on issue #786: HoodieWrapperFileSystem not working with presto URL: https://github.com/apache/incubator-hudi/issues/786#issuecomment-511403138 @eisig : Is this with latest master? Have you already worked around/fixed this?

[GitHub] [incubator-hudi] NetsanetGeb edited a comment on issue #714: Performance Comparison of HoodieDeltaStreamer and DataSourceAPI

2019-07-15 Thread GitBox
NetsanetGeb edited a comment on issue #714: Performance Comparison of HoodieDeltaStreamer and DataSourceAPI URL: https://github.com/apache/incubator-hudi/issues/714#issuecomment-511349113 I changed the driver memory and number of executors to be: spark.driver.memory = 7168m spark.exec

[GitHub] [incubator-hudi] NetsanetGeb edited a comment on issue #714: Performance Comparison of HoodieDeltaStreamer and DataSourceAPI

2019-07-15 Thread GitBox
NetsanetGeb edited a comment on issue #714: Performance Comparison of HoodieDeltaStreamer and DataSourceAPI URL: https://github.com/apache/incubator-hudi/issues/714#issuecomment-511349113 I changed the driver memory and number of executors to be: spark.driver.memory = 7168m spark.exec

[GitHub] [incubator-hudi] NetsanetGeb edited a comment on issue #714: Performance Comparison of HoodieDeltaStreamer and DataSourceAPI

2019-07-15 Thread GitBox
NetsanetGeb edited a comment on issue #714: Performance Comparison of HoodieDeltaStreamer and DataSourceAPI URL: https://github.com/apache/incubator-hudi/issues/714#issuecomment-511349113 I changed the driver memory and number of executors to be: spark.driver.memory = 7168m spark.exec

[GitHub] [incubator-hudi] NetsanetGeb edited a comment on issue #714: Performance Comparison of HoodieDeltaStreamer and DataSourceAPI

2019-07-15 Thread GitBox
NetsanetGeb edited a comment on issue #714: Performance Comparison of HoodieDeltaStreamer and DataSourceAPI URL: https://github.com/apache/incubator-hudi/issues/714#issuecomment-511349113 I changed the driver memory and number of executors to be: spark.driver.memory = 7168m spark.exec

[GitHub] [incubator-hudi] NetsanetGeb commented on issue #714: Performance Comparison of HoodieDeltaStreamer and DataSourceAPI

2019-07-15 Thread GitBox
NetsanetGeb commented on issue #714: Performance Comparison of HoodieDeltaStreamer and DataSourceAPI URL: https://github.com/apache/incubator-hudi/issues/714#issuecomment-511349113 I changed the driver memory and number of executors to be: spark.driver.memory = 7168m spark.executor.me

[GitHub] [incubator-hudi] zhangxinjian123 commented on issue #784: Can Hudi delete records?

2019-07-15 Thread GitBox
zhangxinjian123 commented on issue #784: Can Hudi delete records? URL: https://github.com/apache/incubator-hudi/issues/784#issuecomment-511332025 public class TestMain { public static void main(String[] args) { System.setProperty("hadoop.home.dir", "E:\\hadoop-common-2.2.0