[GitHub] [incubator-hudi] n3nash edited a comment on pull request #1100: [HUDI-289] Implement a test suite to support long running test for Hudi writing and querying end-end

2020-04-23 Thread GitBox
n3nash edited a comment on pull request #1100: URL: https://github.com/apache/incubator-hudi/pull/1100#issuecomment-618816317 @yanghua no worries, thanks for trying, I've pushed the changes. We will continue to have conflicting files given lots of new commits every day, we should merge

[GitHub] [incubator-hudi] n3nash edited a comment on pull request #1100: [HUDI-289] Implement a test suite to support long running test for Hudi writing and querying end-end

2020-04-23 Thread GitBox
n3nash edited a comment on pull request #1100: URL: https://github.com/apache/incubator-hudi/pull/1100#issuecomment-618816317 @yanghua no worries, thanks for trying, I've pushed the changes This is an automated message from

[GitHub] [incubator-hudi] n3nash commented on pull request #1100: [HUDI-289] Implement a test suite to support long running test for Hudi writing and querying end-end

2020-04-23 Thread GitBox
n3nash commented on pull request #1100: URL: https://github.com/apache/incubator-hudi/pull/1100#issuecomment-618816317 @yanghua no worries, I've pushed the changes.

[incubator-hudi] branch hudi_test_suite_refactor updated (95283be -> 08f9a76)

2020-04-23 Thread nagarwal
This is an automated email from the ASF dual-hosted git repository. nagarwal pushed a change to branch hudi_test_suite_refactor in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git. discard 95283be Testing running 3 builds to limit total build time omit c13e885 [HUDI-394]

[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1553: [HUDI-810] Migrate ClientTestHarness to JUnit 5

2020-04-23 Thread GitBox
codecov-io edited a comment on pull request #1553: URL: https://github.com/apache/incubator-hudi/pull/1553#issuecomment-618814181 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1553?src=pr=h1) Report > Merging

[GitHub] [incubator-hudi] codecov-io commented on pull request #1553: [HUDI-810] Migrate ClientTestHarness to JUnit 5

2020-04-23 Thread GitBox
codecov-io commented on pull request #1553: URL: https://github.com/apache/incubator-hudi/pull/1553#issuecomment-618814181 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1553?src=pr=h1) Report > Merging

[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1433: [HUDI-728]: Implement custom key generator

2020-04-23 Thread GitBox
vinothchandar commented on a change in pull request #1433: URL: https://github.com/apache/incubator-hudi/pull/1433#discussion_r411494931 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/KeyGenerator.java ## @@ -40,4 +40,22 @@ protected KeyGenerator(TypedProperties

[GitHub] [incubator-hudi] n3nash edited a comment on issue #1549: Potential issue when using Deltastreamer with DMS

2020-04-23 Thread GitBox
n3nash edited a comment on issue #1549: URL: https://github.com/apache/incubator-hudi/issues/1549#issuecomment-618811796 @vinothchandar We do invoke the same payload when combining records during merge/compaction. For deletes, the payload has to be an empty payload and then the record

[GitHub] [incubator-hudi] n3nash commented on issue #1549: Potential issue when using Deltastreamer with DMS

2020-04-23 Thread GitBox
n3nash commented on issue #1549: URL: https://github.com/apache/incubator-hudi/issues/1549#issuecomment-618811796 We do invoke the same payload when combining records during merge/compaction. For deletes, the payload has to be an empty payload and then the record should be skipped ->
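
The delete handling described in this comment can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not Hudi's Java API: `combine`, `merge`, and the dict-based records are hypothetical names made up for this example, and an empty payload is modelled as `None`.

```python
from typing import Dict, Optional

Record = Dict[str, object]


def combine(stored: Optional[Record], incoming: Optional[Record]) -> Optional[Record]:
    """Hypothetical combine step: an empty (None) incoming payload means delete."""
    if incoming is None:
        return None  # empty payload: the combined result is "no record"
    return incoming  # otherwise the incoming value replaces the stored one


def merge(base: Dict[str, Record], updates: Dict[str, Optional[Record]]) -> Dict[str, Record]:
    """Run the same combine logic for every key and skip records that come back empty."""
    merged: Dict[str, Record] = {}
    for key in sorted(set(base) | set(updates)):
        if key in updates:
            result = combine(base.get(key), updates[key])
        else:
            result = base[key]  # untouched records are carried over
        if result is not None:  # an empty result is skipped, i.e. dropped
            merged[key] = result
    return merged


base = {"k1": {"v": 1}, "k2": {"v": 2}}
updates = {"k1": {"v": 10}, "k2": None}  # k2 carries an empty payload (a delete)
merged = merge(base, updates)
```

In this toy model `k1` is overwritten and `k2` disappears from the merged output, which is the behaviour the comment expects for a delete.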

[GitHub] [incubator-hudi] n3nash edited a comment on issue #1549: Potential issue when using Deltastreamer with DMS

2020-04-23 Thread GitBox
n3nash edited a comment on issue #1549: URL: https://github.com/apache/incubator-hudi/issues/1549#issuecomment-618811796 @vinothchandar We do invoke the same payload when combining records during merge/compaction. For deletes, the payload has to be an empty payload and then the record

[GitHub] [incubator-hudi] bvaradar commented on issue #1555: [SUPPORT] Meet java.lang.IllegalAccessError: class org.apache.hadoop.hdfs.web.HftpFileSystem

2020-04-23 Thread GitBox
bvaradar commented on issue #1555: URL: https://github.com/apache/incubator-hudi/issues/1555#issuecomment-618808782 @allenzhg : The exception strongly suggests you have 2 different versions of hadoop (likely 3.x and 2.x brought by Spark). Spark 2.4.x comes pre-built with Hadoop 2.7 which

[GitHub] [incubator-hudi] vinothchandar commented on issue #1549: Potential issue when using Deltastreamer with DMS

2020-04-23 Thread GitBox
vinothchandar commented on issue #1549: URL: https://github.com/apache/incubator-hudi/issues/1549#issuecomment-618806837 As long as the records are the same and you are using the payload, it shouldn't matter... Let me try to repro this myself. I am puzzled since I do see the

[GitHub] [incubator-hudi] PhatakN1 edited a comment on issue #1549: Potential issue when using Deltastreamer with DMS

2020-04-23 Thread GitBox
PhatakN1 edited a comment on issue #1549: URL: https://github.com/apache/incubator-hudi/issues/1549#issuecomment-618803454 These are the contents of hoodie.properties ```

[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1540: [HUDI-819] Fix a bug with MergeOnReadLazyInsertIterable.

2020-04-23 Thread GitBox
vinothchandar commented on a change in pull request #1540: URL: https://github.com/apache/incubator-hudi/pull/1540#discussion_r414295319 ## File path: hudi-client/src/main/java/org/apache/hudi/io/AppendHandleFactory.java ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache

[GitHub] [incubator-hudi] PhatakN1 commented on issue #1549: Potential issue when using Deltastreamer with DMS

2020-04-23 Thread GitBox
PhatakN1 commented on issue #1549: URL: https://github.com/apache/incubator-hudi/issues/1549#issuecomment-618803454 These are the contents of hoodie.properties

[GitHub] [incubator-hudi] bvaradar commented on issue #1555: [SUPPORT] Meet java.lang.IllegalAccessError: class org.apache.hadoop.hdfs.web.HftpFileSystem

2020-04-23 Thread GitBox
bvaradar commented on issue #1555: URL: https://github.com/apache/incubator-hudi/issues/1555#issuecomment-618800754 Yes @vinothchandar I will handle it.

[GitHub] [incubator-hudi] vinothchandar commented on issue #1549: Potential issue when using Deltastreamer with DMS

2020-04-23 Thread GitBox
vinothchandar commented on issue #1549: URL: https://github.com/apache/incubator-hudi/issues/1549#issuecomment-618800202 @PhatakN1 ah okay.. Since Hudi itself is not aware of DMS or the `"Op": "D"`, it does log a data block with the deleted record.. I suspect the `AwsDMSPayload` is not

[GitHub] [incubator-hudi] PhatakN1 edited a comment on issue #1549: Potential issue when using Deltastreamer with DMS

2020-04-23 Thread GitBox
PhatakN1 edited a comment on issue #1549: URL: https://github.com/apache/incubator-hudi/issues/1549#issuecomment-618292312 If MOR inserts go to a parquet file but updates go to a log file, then a query on the _ro table will show the inserts since the last compaction but not the updates.
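
The scenario in this comment can be sketched with a toy model. The Python below only illustrates the premise as stated (inserts land in a base file, updates land in a log); it is not Hudi's implementation, and `ToyMorTable` and its methods are hypothetical names introduced for this example.

```python
from typing import Dict, List, Tuple


class ToyMorTable:
    """Toy stand-in for a merge-on-read table; not Hudi's implementation."""

    def __init__(self) -> None:
        self.base: Dict[str, str] = {}        # plays the role of the parquet file
        self.log: List[Tuple[str, str]] = []  # plays the role of the log file

    def write(self, key: str, value: str) -> None:
        if key in self.base:
            self.log.append((key, value))  # update: appended to the log
        else:
            self.base[key] = value         # insert: written to the base file

    def read_optimized(self) -> Dict[str, str]:
        """Read only the base file, as a query on the _ro table would in this model."""
        return dict(self.base)

    def merged(self) -> Dict[str, str]:
        """Read the base file with the logged updates applied on top."""
        view = dict(self.base)
        for key, value in self.log:
            view[key] = value
        return view

    def compact(self) -> None:
        """Fold the logged updates into the base file and empty the log."""
        self.base = self.merged()
        self.log.clear()


table = ToyMorTable()
table.write("id1", "inserted")
table.write("id1", "updated")
before = table.read_optimized()  # the update is still only in the log
table.compact()
after = table.read_optimized()   # compaction has folded the update into the base
```

Under these assumptions the read-optimized view returns the inserted value until `compact()` runs and the updated value afterwards.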

[GitHub] [incubator-hudi] vinothchandar commented on issue #1556: [SUPPORT] Input path in s3 doesn't exist if the write multiple datasets to s3 in a single execution

2020-04-23 Thread GitBox
vinothchandar commented on issue #1556: URL: https://github.com/apache/incubator-hudi/issues/1556#issuecomment-618798845 trying to understand, are you concurrently writing to the same dataset using two writers?

[GitHub] [incubator-hudi] vinothchandar commented on issue #1555: [SUPPORT] Meet java.lang.IllegalAccessError: class org.apache.hadoop.hdfs.web.HftpFileSystem

2020-04-23 Thread GitBox
vinothchandar commented on issue #1555: URL: https://github.com/apache/incubator-hudi/issues/1555#issuecomment-618797288 @bvaradar are you able to tackle this one?

Build failed in Jenkins: hudi-snapshot-deployment-0.5 #257

2020-04-23 Thread Apache Jenkins Server
See Changes: -- [...truncated 2.31 KB...] /home/jenkins/tools/maven/apache-maven-3.5.4/conf: logging settings.xml toolchains.xml

[jira] [Assigned] (HUDI-836) Implement datadog metrics reporter

2020-04-23 Thread lamber-ken (Jira)
[ https://issues.apache.org/jira/browse/HUDI-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lamber-ken reassigned HUDI-836: --- Assignee: Raymond Xu > Implement datadog metrics reporter > -- > >

[jira] [Commented] (HUDI-836) Implement datadog metrics reporter

2020-04-23 Thread lamber-ken (Jira)
[ https://issues.apache.org/jira/browse/HUDI-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091134#comment-17091134 ] lamber-ken commented on HUDI-836: - (y) > Implement datadog metrics reporter >

[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1100: [HUDI-289] Implement a test suite to support long running test for Hudi writing and querying end-end

2020-04-23 Thread GitBox
codecov-io edited a comment on pull request #1100: URL: https://github.com/apache/incubator-hudi/pull/1100#issuecomment-61645 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1100?src=pr=h1) Report > Merging

[GitHub] [incubator-hudi] yanghua commented on pull request #1100: [HUDI-289] Implement a test suite to support long running test for Hudi writing and querying end-end

2020-04-23 Thread GitBox
yanghua commented on pull request #1100: URL: https://github.com/apache/incubator-hudi/pull/1100#issuecomment-618771761 @n3nash Still conflicting files... I tried to fix it yesterday. You may need to `pull --rebase` before force-pushing?

[GitHub] [incubator-hudi] umehrot2 commented on issue #1550: Hudi 0.5.2 inability save complex type with nullable = true [SUPPORT]

2020-04-23 Thread GitBox
umehrot2 commented on issue #1550: URL: https://github.com/apache/incubator-hudi/issues/1550#issuecomment-618769492 @badion yeah, the fix for this did not make it into 0.5.2. You can either build your custom Hudi with this patch applied on top of 0.5.2 or wait until the next release.

[incubator-hudi] branch hudi_test_suite_refactor updated (7313a22 -> 95283be)

2020-04-23 Thread nagarwal
This is an automated email from the ASF dual-hosted git repository. nagarwal pushed a change to branch hudi_test_suite_refactor in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git. omit 7313a22 trigger rebuild omit 908e57c [HUDI-397]Normalize log print statement

[incubator-hudi] branch hudi_test_suite_refactor updated (908e57c -> 7313a22)

2020-04-23 Thread vinoyang
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch hudi_test_suite_refactor in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git. from 908e57c [HUDI-397]Normalize log print statement (#1224) add 7313a22 trigger

[jira] [Created] (HUDI-836) Implement datadog metrics reporter

2020-04-23 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-836: --- Summary: Implement datadog metrics reporter Key: HUDI-836 URL: https://issues.apache.org/jira/browse/HUDI-836 Project: Apache Hudi (incubating) Issue Type: New

[jira] [Commented] (HUDI-773) Hudi On Azure Data Lake Storage V2

2020-04-23 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091042#comment-17091042 ] Yanjia Gary Li commented on HUDI-773: - Hello [~sasikumar.venkat], could you try the following: mount

[jira] [Assigned] (HUDI-620) Hive Sync Integration of bootstrapped table

2020-04-23 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-620: -- Assignee: Udit Mehrotra > Hive Sync Integration of bootstrapped table >

[GitHub] [incubator-hudi] satishkotha commented on pull request #1540: [HUDI-819] Fix a bug with MergeOnReadLazyInsertIterable.

2020-04-23 Thread GitBox
satishkotha commented on pull request #1540: URL: https://github.com/apache/incubator-hudi/pull/1540#issuecomment-618717557 > @satishkotha let's then break that up into a separate JIRA (tagged with Code Cleanup component). We can limit scope to these insert related handles and move on..

[jira] [Created] (HUDI-835) refactor HoodieMergeHandle into factory pattern

2020-04-23 Thread satish (Jira)
satish created HUDI-835: --- Summary: refactor HoodieMergeHandle into factory pattern Key: HUDI-835 URL: https://issues.apache.org/jira/browse/HUDI-835 Project: Apache Hudi (incubating) Issue Type:

[GitHub] [incubator-hudi] hmatu commented on pull request #1557: [HUDI-834] Concrete signature of HoodieRecordPayload#combineAndGetUpdateValue & HoodieRecordPayload#getInsertValue

2020-04-23 Thread GitBox
hmatu commented on pull request #1557: URL: https://github.com/apache/incubator-hudi/pull/1557#issuecomment-618691728 +1, LGTM

[GitHub] [incubator-hudi] lamber-ken commented on issue #1552: Time taken for upserting hudi table is increasing with increase in number of partitions

2020-04-23 Thread GitBox
lamber-ken commented on issue #1552: URL: https://github.com/apache/incubator-hudi/issues/1552#issuecomment-618683897 Hi @harshi2506, we need more of the Spark info log; you can upload the log file to Google Drive, e.g. https://drive.google.com/file/d/1zzyaySDJqPgAdTSLnKwOG667QGvZhd03

[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1552: Time taken for upserting hudi table is increasing with increase in number of partitions

2020-04-23 Thread GitBox
lamber-ken edited a comment on issue #1552: URL: https://github.com/apache/incubator-hudi/issues/1552#issuecomment-618657572 User report: upsert hoodie log ``` Started at 20/04/22 20:12:14 20/04/22 20:15:30 INFO HoodieTableMetaClient: Finished Loading Table of type

[GitHub] [incubator-hudi] lamber-ken edited a comment on issue #1552: Time taken for upserting hudi table is increasing with increase in number of partitions

2020-04-23 Thread GitBox
lamber-ken edited a comment on issue #1552: URL: https://github.com/apache/incubator-hudi/issues/1552#issuecomment-618657572 User report: upsert hoodie log, cost about 30min ``` Started at 20/04/22 20:12:14 20/04/22 20:15:30 INFO HoodieTableMetaClient: Finished Loading

[GitHub] [incubator-hudi] lamber-ken commented on issue #1552: Time taken for upserting hudi table is increasing with increase in number of partitions

2020-04-23 Thread GitBox
lamber-ken commented on issue #1552: URL: https://github.com/apache/incubator-hudi/issues/1552#issuecomment-618657572 Upsert hoodie log, cost about 30min ``` Started at 20/04/22 20:12:14 20/04/22 20:15:30 INFO HoodieTableMetaClient: Finished Loading Table of type

[GitHub] [incubator-hudi] TisonKun commented on issue #1557: [HUDI-834] Concrete signature of HoodieRecordPayload#combineAndGetUpdateValue & HoodieRecordPayload#getInsertValue

2020-04-23 Thread GitBox
TisonKun commented on issue #1557: URL: https://github.com/apache/incubator-hudi/pull/1557#issuecomment-618436893 Hold on: if `HoodieRecordPayload` is already user-facing, we cannot change the signature of the interface.

[GitHub] [incubator-hudi] codecov-io commented on issue #1557: [HUDI-834] Concrete signature of HoodieRecordPayload#combineAndGetUpdateValue & HoodieRecordPayload#getInsertValue

2020-04-23 Thread GitBox
codecov-io commented on issue #1557: URL: https://github.com/apache/incubator-hudi/pull/1557#issuecomment-618435496 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1557?src=pr=h1) Report > Merging [#1557](https://codecov.io/gh/apache/incubator-hudi/pull/1557?src=pr=desc)

[jira] [Resolved] (HUDI-761) Organize Rollback/Savepoint/Restore action implementation under a single package

2020-04-23 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-761. - Resolution: Fixed > Organize Rollback/Savepoint/Restore action implementation under a single >

[GitHub] [incubator-hudi] badion edited a comment on issue #1550: Hudi 0.5.2 inability save complex type with nullable = true [SUPPORT]

2020-04-23 Thread GitBox
badion edited a comment on issue #1550: URL: https://github.com/apache/incubator-hudi/issues/1550#issuecomment-618411983 @vinothchandar Seems like the issue is gone after building the .jar file from commit (merge) _ce0a4c64d07d6eea926d1bfb92b69ae387b88f50_, which was apparently after release of

[GitHub] [incubator-hudi] badion commented on issue #1550: Hudi 0.5.2 inability save complex type with nullable = true [SUPPORT]

2020-04-23 Thread GitBox
badion commented on issue #1550: URL: https://github.com/apache/incubator-hudi/issues/1550#issuecomment-618411983 @vinothchandar Seems like the issue is gone after building the .jar file from commit (merge) _ce0a4c64d07d6eea926d1bfb92b69ae387b88f50_, which was apparently after release of _Hudi

[jira] [Updated] (HUDI-834) Concrete signature of HoodieRecordPayload#combineAndGetUpdateValue & HoodieRecordPayload#getInsertValue

2020-04-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-834: Labels: pull-request-available (was: ) > Concrete signature of

[GitHub] [incubator-hudi] jenu9417 commented on issue #1528: [SUPPORT] Issue while writing to HDFS via hudi. Only `/.hoodie` folder is written.

2020-04-23 Thread GitBox
jenu9417 commented on issue #1528: URL: https://github.com/apache/incubator-hudi/issues/1528#issuecomment-618336942 @lamber-ken @vinothchandar The above-mentioned suggestions work fine. The time to write has now dropped drastically. Thank you for the continued support. Closing

[GitHub] [incubator-hudi] PhatakN1 commented on issue #1549: Potential issue when using Deltastreamer with DMS

2020-04-23 Thread GitBox
PhatakN1 commented on issue #1549: URL: https://github.com/apache/incubator-hudi/issues/1549#issuecomment-618292312 If MOR inserts go to a parquet file but updates go to a log file, then a query on the _ro table will show the inserts since the last compaction but not the updates. Isn't

[GitHub] [incubator-hudi] yanghua commented on issue #1100: [HUDI-289] Implement a test suite to support long running test for Hudi writing and querying end-end

2020-04-23 Thread GitBox
yanghua commented on issue #1100: URL: https://github.com/apache/incubator-hudi/pull/1100#issuecomment-618251852 > @yanghua need you to lead the Azure pipelines for the test suite and other tickets assigned to you under the umbrella ticket. @n3nash Thanks for your hard work. I will

[jira] [Assigned] (HUDI-396) Provide an documentation to describe how to use test suite

2020-04-23 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reassigned HUDI-396: - Assignee: wangxianghu > Provide an documentation to describe how to use test suite >

[jira] [Closed] (HUDI-591) Support Spark version upgrade

2020-04-23 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-591. - Resolution: Fixed > Support Spark version upgrade > - > > Key:

[jira] [Closed] (HUDI-592) Remove duplicated dependencies in the pom file of test suite module

2020-04-23 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-592. - Resolution: Fixed > Remove duplicated dependencies in the pom file of test suite module >

[jira] [Updated] (HUDI-592) Remove duplicated dependencies in the pom file of test suite module

2020-04-23 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-592: -- Status: Open (was: New) > Remove duplicated dependencies in the pom file of test suite module >

[jira] [Updated] (HUDI-591) Support Spark version upgrade

2020-04-23 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-591: -- Status: Open (was: New) > Support Spark version upgrade > - > >

[incubator-hudi] branch hudi_test_suite_refactor updated (e7b1474 -> 908e57c)

2020-04-23 Thread vinoyang
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch hudi_test_suite_refactor in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git. discard e7b1474 [HUDI-397]Normalize log print statement (#1224) omit da3232e Testing

[jira] [Updated] (HUDI-704) Add unit test for RepairsCommand

2020-04-23 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-704: -- Status: Open (was: New) > Add unit test for RepairsCommand > > >

[jira] [Closed] (HUDI-397) Normalize log print statement

2020-04-23 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-397. - Fix Version/s: 0.6.0 Resolution: Done Done via hudi_test_suite_refactor branch: 

[incubator-hudi] branch hudi_test_suite_refactor updated (da3232e -> e7b1474)

2020-04-23 Thread vinoyang
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch hudi_test_suite_refactor in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git. from da3232e Testing running 3 builds to limit total build time add e7b1474