[GitHub] [hudi] wangxianghu commented on a change in pull request #1827: [HUDI-1089] Refactor hudi-client to support multi-engine

2020-09-21 Thread GitBox
wangxianghu commented on a change in pull request #1827: URL: https://github.com/apache/hudi/pull/1827#discussion_r492468540 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/common/HoodieEngineContext.java ## @@ -0,0 +1,66 @@ +/* + * Licensed to the

[GitHub] [hudi] prashantwason edited a comment on pull request #2064: WIP - [HUDI-842] Implementation of HUDI RFC-15.

2020-09-21 Thread GitBox
prashantwason edited a comment on pull request #2064: URL: https://github.com/apache/hudi/pull/2064#issuecomment-686688968 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [hudi] vinothchandar commented on a change in pull request #1827: [HUDI-1089] Refactor hudi-client to support multi-engine

2020-09-21 Thread GitBox
vinothchandar commented on a change in pull request #1827: URL: https://github.com/apache/hudi/pull/1827#discussion_r492474495 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/common/HoodieEngineContext.java ## @@ -0,0 +1,66 @@ +/* + * Licensed to

[GitHub] [hudi] vinothchandar commented on a change in pull request #1827: [HUDI-1089] Refactor hudi-client to support multi-engine

2020-09-21 Thread GitBox
vinothchandar commented on a change in pull request #1827: URL: https://github.com/apache/hudi/pull/1827#discussion_r492474108 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/common/HoodieEngineContext.java ## @@ -0,0 +1,66 @@ +/* + * Licensed to

[GitHub] [hudi] vinothchandar commented on pull request #1760: [HUDI-1040] Update apis for spark3 compatibility

2020-09-21 Thread GitBox
vinothchandar commented on pull request #1760: URL: https://github.com/apache/hudi/pull/1760#issuecomment-696264459 @bschell our spark install may be 2.11 on these images. As for hudi_spark_2.12 bundle, if we run integ-test with 2_12, I think it would happen automatically? @bvaradar

[GitHub] [hudi] vinothchandar commented on pull request #2096: [HUDI-1284] preCombine all HoodieRecords and update all fields(which is not DefaultValue) according to orderingVal

2020-09-21 Thread GitBox
vinothchandar commented on pull request #2096: URL: https://github.com/apache/hudi/pull/2096#issuecomment-696182257 @Karl-WangSK Will do!

[GitHub] [hudi] liujinhui1994 closed pull request #1984: [HUDI-1200] Fix NullPointerException, CustomKeyGenerator does not work

2020-09-21 Thread GitBox
liujinhui1994 closed pull request #1984: URL: https://github.com/apache/hudi/pull/1984

[GitHub] [hudi] vinothchandar commented on a change in pull request #2064: WIP - [HUDI-842] Implementation of HUDI RFC-15.

2020-09-21 Thread GitBox
vinothchandar commented on a change in pull request #2064: URL: https://github.com/apache/hudi/pull/2064#discussion_r492405639 ## File path: hudi-client/src/main/java/org/apache/hudi/metadata/HoodieMetadata.java ## @@ -0,0 +1,272 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] vishalpathak1986 commented on issue #2095: Inserts in partitioned MoR RO view visible without compaction

2020-09-21 Thread GitBox
vishalpathak1986 commented on issue #2095: URL: https://github.com/apache/hudi/issues/2095#issuecomment-696242666

[GitHub] [hudi] satishkotha edited a comment on pull request #2048: [HUDI-1072][WIP] Introduce REPLACE top level action

2020-09-21 Thread GitBox
satishkotha edited a comment on pull request #2048: URL: https://github.com/apache/hudi/pull/2048#issuecomment-696327041 > @satishkotha : Please ping me in the PR when you have updates and I can give incremental comments if needed. @bvaradar Incremental FileSystem restore is the

[GitHub] [hudi] bvaradar commented on pull request #1760: [HUDI-1040] Update apis for spark3 compatibility

2020-09-21 Thread GitBox
bvaradar commented on pull request #1760: URL: https://github.com/apache/hudi/pull/1760#issuecomment-696339666 @bschell : For running integration tests with hudi packages built with scala 2.12, we just need to change scripts/run_travis_tests.sh. The docker container should automatically

[GitHub] [hudi] vinothchandar commented on a change in pull request #1827: [HUDI-1089] Refactor hudi-client to support multi-engine

2020-09-21 Thread GitBox
vinothchandar commented on a change in pull request #1827: URL: https://github.com/apache/hudi/pull/1827#discussion_r492460719 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/common/HoodieEngineContext.java ## @@ -0,0 +1,66 @@ +/* + * Licensed to

[GitHub] [hudi] vinothchandar commented on pull request #2064: WIP - [HUDI-842] Implementation of HUDI RFC-15.

2020-09-21 Thread GitBox
vinothchandar commented on pull request #2064: URL: https://github.com/apache/hudi/pull/2064#issuecomment-696429882

[GitHub] [hudi] satishkotha commented on a change in pull request #2048: [HUDI-1072][WIP] Introduce REPLACE top level action

2020-09-21 Thread GitBox
satishkotha commented on a change in pull request #2048: URL: https://github.com/apache/hudi/pull/2048#discussion_r492303122 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/view/IncrementalTimelineSyncFileSystemView.java ## @@ -251,6 +262,28 @@ private

[GitHub] [hudi] vishalpathak1986 edited a comment on issue #2095: Inserts in partitioned MoR RO view visible without compaction

2020-09-21 Thread GitBox
vishalpathak1986 edited a comment on issue #2095: URL: https://github.com/apache/hudi/issues/2095#issuecomment-696242666 @n3nash Thanks for your comment. Can you also please elaborate on how an index will help this? Also, please let me know if you think it is possible to turn off writing

[GitHub] [hudi] n3nash commented on issue #2095: Inserts in partitioned MoR RO view visible without compaction

2020-09-21 Thread GitBox
n3nash commented on issue #2095: URL: https://github.com/apache/hudi/issues/2095#issuecomment-696212548

[GitHub] [hudi] bvaradar commented on a change in pull request #2012: [HUDI-1129] Deltastreamer Add support for schema evolution

2020-09-21 Thread GitBox
bvaradar commented on a change in pull request #2012: URL: https://github.com/apache/hudi/pull/2012#discussion_r492198773 ## File path: hudi-spark/src/main/scala/org/apache/hudi/AvroConversionHelper.scala ## @@ -364,4 +366,40 @@ object AvroConversionHelper { } }

[GitHub] [hudi] wangxianghu edited a comment on pull request #1827: [HUDI-1089] Refactor hudi-client to support multi-engine

2020-09-21 Thread GitBox
wangxianghu edited a comment on pull request #1827: URL: https://github.com/apache/hudi/pull/1827#issuecomment-696120725 @vinothchandar I have refactored `org.apache.hudi.table.SparkMarkerFiles` with the `parallelDo` function; it works OK locally (`org.apache.hudi.table.TestMarkerFiles`

[GitHub] [hudi] leesf commented on pull request #2099: [HUDI-1268] fix UpgradeDowngrade fs Rename issue for hdfs and aliyun oss

2020-09-21 Thread GitBox
leesf commented on pull request #2099: URL: https://github.com/apache/hudi/pull/2099#issuecomment-696461170 @nsivabalan would you please review this PR?

[GitHub] [hudi] wangxianghu commented on pull request #1827: [HUDI-1089] Refactor hudi-client to support multi-engine

2020-09-21 Thread GitBox
wangxianghu commented on pull request #1827: URL: https://github.com/apache/hudi/pull/1827#issuecomment-695785338

[GitHub] [hudi] bvaradar merged pull request #2097: [MINOR] Add description to remind users that Hudis docker images have mounted the projects workspace

2020-09-21 Thread GitBox
bvaradar merged pull request #2097: URL: https://github.com/apache/hudi/pull/2097

[GitHub] [hudi] prashantwason edited a comment on pull request #2064: WIP - [HUDI-842] Implementation of HUDI RFC-15.

2020-09-21 Thread GitBox
prashantwason edited a comment on pull request #2064: URL: https://github.com/apache/hudi/pull/2064#issuecomment-686688968 Remaining work items: - [ ] 1. Support for rollbacks in MOR Table - [ ] 2. Rollback of metadata if commit eventually fails on dataset - [x] 3. HUDI-CLI

[GitHub] [hudi] wangxianghu commented on a change in pull request #1827: [HUDI-1089] Refactor hudi-client to support multi-engine

2020-09-21 Thread GitBox
wangxianghu commented on a change in pull request #1827: URL: https://github.com/apache/hudi/pull/1827#discussion_r492434405 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/common/HoodieEngineContext.java ## @@ -0,0 +1,66 @@ +/* + * Licensed to the

[GitHub] [hudi] vinothchandar commented on pull request #1827: [HUDI-1089] Refactor hudi-client to support multi-engine

2020-09-21 Thread GitBox
vinothchandar commented on pull request #1827: URL: https://github.com/apache/hudi/pull/1827#issuecomment-696181800 @wangxianghu let me check that out and circle back by your morning time/EOD PST. we can go from there. Thanks!

[GitHub] [hudi] vinothchandar commented on a change in pull request #1827: [HUDI-1089] Refactor hudi-client to support multi-engine

2020-09-21 Thread GitBox
vinothchandar commented on a change in pull request #1827: URL: https://github.com/apache/hudi/pull/1827#discussion_r492262964 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/common/HoodieSparkEngineContext.java ## @@ -0,0 +1,56 @@ +/* + * Licensed

[GitHub] [hudi] bvaradar commented on pull request #2048: [HUDI-1072][WIP] Introduce REPLACE top level action

2020-09-21 Thread GitBox
bvaradar commented on pull request #2048: URL: https://github.com/apache/hudi/pull/2048#issuecomment-696208094 @satishkotha : Please ping me in the PR when you have updates and I can give incremental comments if needed.

[GitHub] [hudi] vishalpathak1986 closed issue #2095: Inserts in partitioned MoR RO view visible without compaction

2020-09-21 Thread GitBox
vishalpathak1986 closed issue #2095: URL: https://github.com/apache/hudi/issues/2095

[GitHub] [hudi] satishkotha commented on pull request #2048: [HUDI-1072][WIP] Introduce REPLACE top level action

2020-09-21 Thread GitBox
satishkotha commented on pull request #2048: URL: https://github.com/apache/hudi/pull/2048#issuecomment-696327041 > @satishkotha : Please ping me in the PR when you have updates and I can give incremental comments if needed. @bvaradar IncrementalTimeline restore is the only big

[GitHub] [hudi] bvaradar edited a comment on pull request #1760: [HUDI-1040] Update apis for spark3 compatibility

2020-09-21 Thread GitBox
bvaradar edited a comment on pull request #1760: URL: https://github.com/apache/hudi/pull/1760#issuecomment-696339666 @bschell : For running integration tests with hudi packages built with scala 2.12, we just need to change scripts/run_travis_tests.sh. The docker container should

[GitHub] [hudi] bvaradar commented on a change in pull request #2097: [MINOR] Add description to remind users that Hudis docker images have mounted the projects workspace

2020-09-21 Thread GitBox
bvaradar commented on a change in pull request #2097: URL: https://github.com/apache/hudi/pull/2097#discussion_r492303063 ## File path: docs/_docs/0_4_docker_demo.cn.md ## @@ -1106,6 +1106,9 @@ and compose scripts are carefully implemented so that they serve dual-purpose 2.

[GitHub] [hudi] yanghua commented on a change in pull request #1968: [HUDI-1192] Make create hive database automatically configurable

2020-09-21 Thread GitBox
yanghua commented on a change in pull request #1968: URL: https://github.com/apache/hudi/pull/1968#discussion_r491761859 ## File path: hudi-spark/src/main/scala/org/apache/hudi/DataSourceOptions.scala ## @@ -290,6 +290,7 @@ object DataSourceWriteOptions { val

[GitHub] [hudi] hddong commented on pull request #1242: [HUDI-544] Archived commits command code cleanup

2020-09-21 Thread GitBox
hddong commented on pull request #1242: URL: https://github.com/apache/hudi/pull/1242#issuecomment-695916250 @n3nash: I have rebased this.

[jira] [Updated] (HUDI-1291) integration of replace with consolidated metadata

2020-09-21 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1291: - Status: Open (was: New) > integration of replace with consolidated metadata >

[GitHub] [hudi] vinothchandar commented on a change in pull request #1827: [HUDI-1089] Refactor hudi-client to support multi-engine

2020-09-21 Thread GitBox
vinothchandar commented on a change in pull request #1827: URL: https://github.com/apache/hudi/pull/1827#discussion_r492449543 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/common/HoodieEngineContext.java ## @@ -0,0 +1,66 @@ +/* + * Licensed to

[GitHub] [hudi] liujinhui1994 commented on a change in pull request #1968: [HUDI-1192] Make create hive database automatically configurable

2020-09-21 Thread GitBox
liujinhui1994 commented on a change in pull request #1968: URL: https://github.com/apache/hudi/pull/1968#discussion_r492448849 ## File path: hudi-spark/src/main/scala/org/apache/hudi/DataSourceOptions.scala ## @@ -290,6 +290,7 @@ object DataSourceWriteOptions { val

[hudi] branch asf-site updated: Travis CI build asf-site

2020-09-21 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 6cc1c37 Travis CI build asf-site 6cc1c37 is

[GitHub] [hudi] wangxianghu commented on a change in pull request #1827: [HUDI-1089] Refactor hudi-client to support multi-engine

2020-09-21 Thread GitBox
wangxianghu commented on a change in pull request #1827: URL: https://github.com/apache/hudi/pull/1827#discussion_r492435531 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/SparkMarkerFiles.java ## @@ -0,0 +1,124 @@ +/* + * Licensed to the

[GitHub] [hudi] wangxianghu commented on pull request #1827: [HUDI-1089] Refactor hudi-client to support multi-engine

2020-09-21 Thread GitBox
wangxianghu commented on pull request #1827: URL: https://github.com/apache/hudi/pull/1827#issuecomment-696467101 > My primary motive of suggesting parallelDo model, is to avoid splitting the classes and still reap benefits of parallel execution, provided by each engine. I don't think we

[hudi] branch asf-site updated: [MINOR] Add description to remind users that Hudis docker images have mounted the projects workspace (#2097)

2020-09-21 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new e786ae4 [MINOR] Add description to remind

[jira] [Updated] (HUDI-1291) integration of replace with consolidated metadata

2020-09-21 Thread satish (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] satish updated HUDI-1291: - Fix Version/s: 0.7.0 > integration of replace with consolidated metadata >

[jira] [Created] (HUDI-1291) integration of replace with consolidated metadata

2020-09-21 Thread satish (Jira)
satish created HUDI-1291: Summary: integration of replace with consolidated metadata Key: HUDI-1291 URL: https://issues.apache.org/jira/browse/HUDI-1291 Project: Apache Hudi Issue Type: Sub-task

[GitHub] [hudi] vinothchandar commented on a change in pull request #2064: WIP - [HUDI-842] Implementation of HUDI RFC-15.

2020-09-21 Thread GitBox
vinothchandar commented on a change in pull request #2064: URL: https://github.com/apache/hudi/pull/2064#discussion_r492406923 ## File path: hudi-client/src/main/java/org/apache/hudi/metadata/HoodieMetadataImpl.java ## @@ -0,0 +1,1064 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] vinothchandar commented on pull request #2064: WIP - [HUDI-842] Implementation of HUDI RFC-15.

2020-09-21 Thread GitBox
vinothchandar commented on pull request #2064: URL: https://github.com/apache/hudi/pull/2064#issuecomment-696430277 Food for thought: we should also think about how we are going to add new metadata partitions in the background, as writers/cleaner/compactors keep running.

[GitHub] [hudi] vinothchandar commented on pull request #2064: WIP - [HUDI-842] Implementation of HUDI RFC-15.

2020-09-21 Thread GitBox
vinothchandar commented on pull request #2064: URL: https://github.com/apache/hudi/pull/2064#issuecomment-696429882 cc @bvaradar @n3nash as well @prashantwason Here is a corner case with syncing completed compaction from data timeline to metadata timeline. Consider the following

[GitHub] [hudi] vishalpathak1986 commented on issue #2095: Inserts in partitioned MoR RO view visible without compaction

2020-09-21 Thread GitBox
vishalpathak1986 commented on issue #2095: URL: https://github.com/apache/hudi/issues/2095#issuecomment-696265004 Got it. Thanks.

[GitHub] [hudi] n3nash commented on issue #2095: Inserts in partitioned MoR RO view visible without compaction

2020-09-21 Thread GitBox
n3nash commented on issue #2095: URL: https://github.com/apache/hudi/issues/2095#issuecomment-696259058 @vishalpathak1986 Hudi internally maintains an Index to tag the incoming records with the fileId that it maps to. If inserts are written to log files, we require a way from the index to

[GitHub] [hudi] vishalpathak1986 commented on issue #2095: Inserts in partitioned MoR RO view visible without compaction

2020-09-21 Thread GitBox
vishalpathak1986 commented on issue #2095: URL: https://github.com/apache/hudi/issues/2095#issuecomment-696242666 @n3nash Thanks for your comment. Can you also please elaborate on how an index will help this?

[jira] [Commented] (HUDI-1290) Implement Debezium avro source for Delta Streamer

2020-09-21 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199499#comment-17199499 ] liwei commented on HUDI-1290: - I have met some users who want Hudi to support collecting MySQL binlogs with Debezium or Canal >

[jira] [Updated] (HUDI-1268) Fix UpgradeDowngrade fs Rename issue for hdfs and aliyun oss

2020-09-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1268: - Labels: pull-request-available (was: ) > Fix UpgradeDowngrade fs Rename issue for hdfs and

[GitHub] [hudi] lw309637554 opened a new pull request #2099: [HUDI-1268] fix UpgradeDowngrade fs Rename issue for hdfs and aliyun oss

2020-09-21 Thread GitBox
lw309637554 opened a new pull request #2099: URL: https://github.com/apache/hudi/pull/2099 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of

[jira] [Updated] (HUDI-1268) Fix UpgradeDowngrade fs Rename issue for hdfs and aliyun oss

2020-09-21 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liwei updated HUDI-1268: Description: 1. Issue: in UpgradeDowngrade.run(), fs.rename(updatedPropsFilePath, propsFilePath); the fs.rename

[jira] [Updated] (HUDI-1268) Fix UpgradeDowngrade fs Rename issue for hdfs and aliyun oss

2020-09-21 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liwei updated HUDI-1268: Summary: Fix UpgradeDowngrade fs Rename issue for hdfs and aliyun oss (was: Fix UpgradeDowngrade Rename Exception

[GitHub] [hudi] n3nash commented on issue #2095: Inserts in partitioned MoR RO view visible without compaction

2020-09-21 Thread GitBox
n3nash commented on issue #2095: URL: https://github.com/apache/hudi/issues/2095#issuecomment-696212548 @vishalpathak1986 Currently, Hudi supports writing inserts in columnar file format (parquet) for MOR tables. All inserts go to parquet while updates go to AVRO files. This is done for 2

[GitHub] [hudi] n3nash commented on issue #2098: [SUPPORT] File does not exist (parquet) while reading Hudi Table from Spark

2020-09-21 Thread GitBox
n3nash commented on issue #2098: URL: https://github.com/apache/hudi/issues/2098#issuecomment-696209149 @RajasekarSribalan A FileNotFound error indicates that you are reading a version of the parquet file that has been deleted or no longer exists. This can happen due to the following

[jira] [Updated] (HUDI-1290) Implement Debezium avro source for Delta Streamer

2020-09-21 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1290: - Status: Open (was: New) > Implement Debezium avro source for Delta Streamer >

[jira] [Assigned] (HUDI-1290) Implement Debezium avro source for Delta Streamer

2020-09-21 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan reassigned HUDI-1290: Assignee: Balaji Varadarajan > Implement Debezium avro source for Delta Streamer >

[jira] [Created] (HUDI-1290) Implement Debezium avro source for Delta Streamer

2020-09-21 Thread Balaji Varadarajan (Jira)
Balaji Varadarajan created HUDI-1290: Summary: Implement Debezium avro source for Delta Streamer Key: HUDI-1290 URL: https://issues.apache.org/jira/browse/HUDI-1290 Project: Apache Hudi

[GitHub] [hudi] wangxianghu commented on pull request #1827: [HUDI-1089] Refactor hudi-client to support multi-engine

2020-09-21 Thread GitBox
wangxianghu commented on pull request #1827: URL: https://github.com/apache/hudi/pull/1827#issuecomment-696120725 @vinothchandar I have refactored `org.apache.hudi.table.SparkMarkerFiles` with the `parallelDo` function; it works OK locally (`org.apache.hudi.table.TestMarkerFiles` passed),

[GitHub] [hudi] Karl-WangSK commented on pull request #2096: [HUDI-1284] preCombine all HoodieRecords and update all fields(which is not DefaultValue) according to orderingVal

2020-09-21 Thread GitBox
Karl-WangSK commented on pull request #2096: URL: https://github.com/apache/hudi/pull/2096#issuecomment-696089004 @vinothchandar Hi, can you look at this PR when you are free?

[GitHub] [hudi] RajasekarSribalan opened a new issue #2098: [SUPPORT]

2020-09-21 Thread GitBox
RajasekarSribalan opened a new issue #2098: URL: https://github.com/apache/hudi/issues/2098 Hi Team, We are getting a Parquet not found error while reading a Hudi table from Spark. What we do: We read a Hudi table from Spark (Select * from table) and do an insert and
