[GitHub] [hudi] hudi-bot removed a comment on pull request #3998: [HUDI-2759] extract HoodieCatalogTable as a bridge between spark cata…

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #3998: URL: https://github.com/apache/hudi/pull/3998#issuecomment-974763247 ## CI report: * fc63a8f7b736b09d3f7593963c11438730e96793 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #3998: [HUDI-2759] extract HoodieCatalogTable as a bridge between spark cata…

2021-11-20 Thread GitBox
hudi-bot commented on pull request #3998: URL: https://github.com/apache/hudi/pull/3998#issuecomment-974768543 ## CI report: * be86e20b85317383e41db5895aa5aa71dfdf85ea Azure:

[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4013: URL: https://github.com/apache/hudi/pull/4013#issuecomment-974768072 ## CI report: * b9383b77280419d54fa09206c768ca17a3683fb4 Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #4013: URL: https://github.com/apache/hudi/pull/4013#issuecomment-974767804 ## CI report: * b9383b77280419d54fa09206c768ca17a3683fb4 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4013: URL: https://github.com/apache/hudi/pull/4013#issuecomment-974767804 ## CI report: * b9383b77280419d54fa09206c768ca17a3683fb4 Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #4013: URL: https://github.com/apache/hudi/pull/4013#issuecomment-972637957 ## CI report: * b9383b77280419d54fa09206c768ca17a3683fb4 Azure:

[GitHub] [hudi] xiarixiaoyao commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

2021-11-20 Thread GitBox
xiarixiaoyao commented on pull request #4013: URL: https://github.com/apache/hudi/pull/4013#issuecomment-974767554 @leesf addressed all comments. thanks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [hudi] xiarixiaoyao commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

2021-11-20 Thread GitBox
xiarixiaoyao commented on a change in pull request #4013: URL: https://github.com/apache/hudi/pull/4013#discussion_r753758254 ## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/TestOptimizeTable.scala ## @@ -88,19 +88,28 @@ class TestOptimizeTable

[jira] [Closed] (HUDI-1932) Hive Sync should not always update last_commit_time_sync

2021-11-20 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-1932. > Hive Sync should not always update last_commit_time_sync >

[jira] [Resolved] (HUDI-1932) Hive Sync should not always update last_commit_time_sync

2021-11-20 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu resolved HUDI-1932. -- > Hive Sync should not always update last_commit_time_sync >

[jira] [Updated] (HUDI-1932) Hive Sync should not always update last_commit_time_sync

2021-11-20 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1932: - Status: Closed (was: Patch Available) > Hive Sync should not always update last_commit_time_sync >

[jira] [Reopened] (HUDI-1932) Hive Sync should not always update last_commit_time_sync

2021-11-20 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reopened HUDI-1932: -- > Hive Sync should not always update last_commit_time_sync >

[GitHub] [hudi] xiarixiaoyao commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

2021-11-20 Thread GitBox
xiarixiaoyao commented on pull request #4013: URL: https://github.com/apache/hudi/pull/4013#issuecomment-974766598 @alexeykudinkin Thank you very much for your testing/bug fixing and code optimization. Due to the existence of rfc-27, data skipping was not considered too much in the

[GitHub] [hudi] xiarixiaoyao commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

2021-11-20 Thread GitBox
xiarixiaoyao commented on a change in pull request #4013: URL: https://github.com/apache/hudi/pull/4013#discussion_r753757069 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java ## @@ -30,16 +28,21 @@ private final String

[hudi] branch master updated (520538b -> 887787e)

2021-11-20 Thread codope
This is an automated email from the ASF dual-hosted git repository. codope pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 520538b [HUDI-2392] Make flink parquet reader compatible with decimal BINARY encoding (#4057) add 887787e

[GitHub] [hudi] codope merged pull request #3053: [HUDI-1932] Update Hive sync timestamp when change detected

2021-11-20 Thread GitBox
codope merged pull request #3053: URL: https://github.com/apache/hudi/pull/3053

[GitHub] [hudi] hudi-bot commented on pull request #3998: [HUDI-2759] extract HoodieCatalogTable as a bridge between spark cata…

2021-11-20 Thread GitBox
hudi-bot commented on pull request #3998: URL: https://github.com/apache/hudi/pull/3998#issuecomment-974763247 ## CI report: * fc63a8f7b736b09d3f7593963c11438730e96793 Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #3998: [HUDI-2759] extract HoodieCatalogTable as a bridge between spark cata…

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #3998: URL: https://github.com/apache/hudi/pull/3998#issuecomment-974763042 ## CI report: * fc63a8f7b736b09d3f7593963c11438730e96793 Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #3998: [HUDI-2759] extract HoodieCatalogTable as a bridge between spark cata…

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #3998: URL: https://github.com/apache/hudi/pull/3998#issuecomment-968575437 ## CI report: * fc63a8f7b736b09d3f7593963c11438730e96793 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #3998: [HUDI-2759] extract HoodieCatalogTable as a bridge between spark cata…

2021-11-20 Thread GitBox
hudi-bot commented on pull request #3998: URL: https://github.com/apache/hudi/pull/3998#issuecomment-974763042 ## CI report: * fc63a8f7b736b09d3f7593963c11438730e96793 Azure:

[GitHub] [hudi] YannByron commented on a change in pull request #3998: [HUDI-2759] extract HoodieCatalogTable as a bridge between spark cata…

2021-11-20 Thread GitBox
YannByron commented on a change in pull request #3998: URL: https://github.com/apache/hudi/pull/3998#discussion_r753752076 ## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/catalyst/catalog/HoodieCatalogTable.scala ## @@ -0,0 +1,291 @@ +/* + *

[GitHub] [hudi] YannByron commented on a change in pull request #3998: [HUDI-2759] extract HoodieCatalogTable as a bridge between spark cata…

2021-11-20 Thread GitBox
YannByron commented on a change in pull request #3998: URL: https://github.com/apache/hudi/pull/3998#discussion_r753751992 ## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/catalyst/catalog/HoodieCatalogTable.scala ## @@ -0,0 +1,291 @@ +/* + *

[jira] [Commented] (HUDI-2649) Kick off all the Hive query issues for 0.10.0

2021-11-20 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17446929#comment-17446929 ] Sagar Sumit commented on HUDI-2649: --- This PR should help to handle differences in multiple hive

[jira] [Commented] (HUDI-2810) Make flink parquet reader compatible with decimal BINARY encoding

2021-11-20 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17446928#comment-17446928 ] Danny Chen commented on HUDI-2810: -- Fixed via master branch: 520538b15dd83af47c32113aeebba96149493ffa >

[jira] [Resolved] (HUDI-2810) Make flink parquet reader compatible with decimal BINARY encoding

2021-11-20 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen resolved HUDI-2810. -- > Make flink parquet reader compatible with decimal BINARY encoding >

[hudi] branch master updated (0411f73 -> 520538b)

2021-11-20 Thread danny0405
danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 0411f73 [HUDI-2804] Add option to skip compaction instants for streaming read (#4051) add 520538b

[GitHub] [hudi] danny0405 merged pull request #4057: [HUDI-2392] Make flink parquet reader compatible with decimal BINARY …

2021-11-20 Thread GitBox
danny0405 merged pull request #4057: URL: https://github.com/apache/hudi/pull/4057

[GitHub] [hudi] hudi-bot removed a comment on pull request #4053: [MINOR] Fix typos

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #4053: URL: https://github.com/apache/hudi/pull/4053#issuecomment-974754737 ## CI report: * 24f0a566dea29ff4d5e8cdda47b89f708e59b63b Azure:

[GitHub] [hudi] hudi-bot commented on pull request #4053: [MINOR] Fix typos

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4053: URL: https://github.com/apache/hudi/pull/4053#issuecomment-974758462 ## CI report: * 0663463dd90a2c6d12b233042522121a75493281 Azure:

[GitHub] [hudi] xushiyan commented on a change in pull request #3998: [HUDI-2759] extract HoodieCatalogTable as a bridge between spark cata…

2021-11-20 Thread GitBox
xushiyan commented on a change in pull request #3998: URL: https://github.com/apache/hudi/pull/3998#discussion_r753747739 ## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/catalyst/catalog/HoodieCatalogTable.scala ## @@ -0,0 +1,291 @@ +/* + *

[GitHub] [hudi] hudi-bot commented on pull request #4057: [HUDI-2392] Make flink parquet reader compatible with decimal BINARY …

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4057: URL: https://github.com/apache/hudi/pull/4057#issuecomment-974756515 ## CI report: * 8bf846efc6ef3f3f9b0c10abad314014e2128eb0 Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #4057: [HUDI-2392] Make flink parquet reader compatible with decimal BINARY …

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #4057: URL: https://github.com/apache/hudi/pull/4057#issuecomment-974749188 ## CI report: * fb2f4aace12d1c574e5e75ce8e7bcdb85d47e439 Azure:

[jira] [Commented] (HUDI-2804) Add option to skip compaction instants for streaming read

2021-11-20 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17446926#comment-17446926 ] Danny Chen commented on HUDI-2804: -- Fixed via master branch: 0411f73c7d33e36bd8d12e7eae948518704a9548 >

[jira] [Resolved] (HUDI-2804) Add option to skip compaction instants for streaming read

2021-11-20 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen resolved HUDI-2804. -- > Add option to skip compaction instants for streaming read >

[GitHub] [hudi] hudi-bot commented on pull request #4053: [MINOR] Fix typos

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4053: URL: https://github.com/apache/hudi/pull/4053#issuecomment-974754737 ## CI report: * 24f0a566dea29ff4d5e8cdda47b89f708e59b63b Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #4053: [MINOR] Fix typos

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #4053: URL: https://github.com/apache/hudi/pull/4053#issuecomment-974754519 ## CI report: * 24f0a566dea29ff4d5e8cdda47b89f708e59b63b Azure:

[hudi] branch master updated (74b59a4 -> 0411f73)

2021-11-20 Thread danny0405
danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 74b59a4 [HUDI-2813] Claim RFC number for RFC for spark datasource V2 Integration (#4059) add 0411f73

[GitHub] [hudi] danny0405 merged pull request #4051: [HUDI-2804] Add option to skip compaction instants for streaming read

2021-11-20 Thread GitBox
danny0405 merged pull request #4051: URL: https://github.com/apache/hudi/pull/4051

[GitHub] [hudi] hudi-bot commented on pull request #4053: [MINOR] Fix typos

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4053: URL: https://github.com/apache/hudi/pull/4053#issuecomment-974754519 ## CI report: * 24f0a566dea29ff4d5e8cdda47b89f708e59b63b Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #4053: [MINOR] Fix typos

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #4053: URL: https://github.com/apache/hudi/pull/4053#issuecomment-974627966 ## CI report: * 24f0a566dea29ff4d5e8cdda47b89f708e59b63b Azure:

[GitHub] [hudi] dongkelun commented on a change in pull request #4053: [MINOR] Fix typos

2021-11-20 Thread GitBox
dongkelun commented on a change in pull request #4053: URL: https://github.com/apache/hudi/pull/4053#discussion_r753745898 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HiveIncrementalPuller.java ## @@ -106,14 +106,14 @@ private Connection

[GitHub] [hudi] hudi-bot commented on pull request #4060: [HUDI-2814] Addressing issues w/ Z-order Layout Optimization

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4060: URL: https://github.com/apache/hudi/pull/4060#issuecomment-974753563 ## CI report: * 99e4f12213a255c52fb8a992b70d37e0f1ce947a Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #4060: [HUDI-2814] Addressing issues w/ Z-order Layout Optimization

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #4060: URL: https://github.com/apache/hudi/pull/4060#issuecomment-974745380 ## CI report: * 99e4f12213a255c52fb8a992b70d37e0f1ce947a Azure:

[jira] [Commented] (HUDI-2661) java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.catalog.CatalogTable.copy

2021-11-20 Thread Yann Byron (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17446921#comment-17446921 ] Yann Byron commented on HUDI-2661: -- [~wordcount]  I can't reproduce this, hudi built with master and

[GitHub] [hudi] hudi-bot commented on pull request #3053: [HUDI-1932] Update Hive sync timestamp when change detected

2021-11-20 Thread GitBox
hudi-bot commented on pull request #3053: URL: https://github.com/apache/hudi/pull/3053#issuecomment-974750959 ## CI report: * 0a286a3f3e82dbaf6247d0da446e4ea6a01e15c3 Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #3053: [HUDI-1932] Update Hive sync timestamp when change detected

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #3053: URL: https://github.com/apache/hudi/pull/3053#issuecomment-974744499 ## CI report: * 6e4611a024633a1a1ed7414e7c5b3d73f9267dcb Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #4056: [HUDI-2808] Supports deduplication for streaming write

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #4056: URL: https://github.com/apache/hudi/pull/4056#issuecomment-974744002 ## CI report: * a2b674ed53626dfa20985ecdd960084af3c9d770 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #4056: [HUDI-2808] Supports deduplication for streaming write

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4056: URL: https://github.com/apache/hudi/pull/4056#issuecomment-974750448 ## CI report: * 0c2140a4ea6de79fb0a993c07ba3175d371e0902 Azure:

[GitHub] [hudi] danny0405 commented on issue #4030: [SUPPORT] Flink uses updated fields to update data

2021-11-20 Thread GitBox
danny0405 commented on issue #4030: URL: https://github.com/apache/hudi/issues/4030#issuecomment-974750173 I see that the payload `OverwriteNonDefaultsWithLatestAvroPayload` solves part of the problem, but it has room to improve: `#combineAndGetUpdateValue` should compare

[jira] [Assigned] (HUDI-2808) Supports deduplication for streaming write

2021-11-20 Thread WangMinChao (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangMinChao reassigned HUDI-2808: - Assignee: WangMinChao > Supports deduplication for streaming write >

[GitHub] [hudi] danny0405 commented on issue #4030: [SUPPORT] Flink uses updated fields to update data

2021-11-20 Thread GitBox
danny0405 commented on issue #4030: URL: https://github.com/apache/hudi/issues/4030#issuecomment-974749801 @dik111, your use case is reasonable and valid. I have created a JIRA issue to track this request: https://issues.apache.org/jira/browse/HUDI-2815

[jira] [Updated] (HUDI-2392) Do not send partition delete record when changelog mode enabled

2021-11-20 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-2392: - Fix Version/s: (was: 0.10.0) > Do not send partition delete record when changelog mode enabled >

[jira] [Updated] (HUDI-2392) Do not send partition delete record when changelog mode enabled

2021-11-20 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-2392: - Fix Version/s: 0.10.0 > Do not send partition delete record when changelog mode enabled >

[jira] [Created] (HUDI-2815) Support partial update for streaming change logs

2021-11-20 Thread Danny Chen (Jira)
Danny Chen created HUDI-2815: Summary: Support partial update for streaming change logs Key: HUDI-2815 URL: https://issues.apache.org/jira/browse/HUDI-2815 Project: Apache Hudi Issue Type: New

[jira] [Updated] (HUDI-2811) Support Spark 3.2 and Parquet 1.12.x

2021-11-20 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2811: - Summary: Support Spark 3.2 and Parquet 1.12.x (was: Support Spark 3.2) > Support Spark 3.2 and Parquet

[GitHub] [hudi] xushiyan commented on issue #3841: Schema evolution improvement in 0.9.0 brakes existing applications

2021-11-20 Thread GitBox
xushiyan commented on issue #3841: URL: https://github.com/apache/hudi/issues/3841#issuecomment-974749469 @umehrot2 have you filed the jira? I created this https://issues.apache.org/jira/browse/HUDI-2811 to track Spark/Parquet upgrade related issues and tasks. cc @nsivabalan

[GitHub] [hudi] hudi-bot commented on pull request #4057: [HUDI-2392] Make flink parquet reader compatible with decimal BINARY …

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4057: URL: https://github.com/apache/hudi/pull/4057#issuecomment-974749188 ## CI report: * fb2f4aace12d1c574e5e75ce8e7bcdb85d47e439 Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #4057: [HUDI-2392] Make flink parquet reader compatible with decimal BINARY …

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #4057: URL: https://github.com/apache/hudi/pull/4057#issuecomment-974749000 ## CI report: * fb2f4aace12d1c574e5e75ce8e7bcdb85d47e439 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #4057: [HUDI-2392] Make flink parquet reader compatible with decimal BINARY …

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4057: URL: https://github.com/apache/hudi/pull/4057#issuecomment-974749000 ## CI report: * fb2f4aace12d1c574e5e75ce8e7bcdb85d47e439 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #4012: [HUDI-2777] Data import performance deteriorates because multiple Spark jobs are started when data is written to disks.

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4012: URL: https://github.com/apache/hudi/pull/4012#issuecomment-974748990 ## CI report: * 638ca6f8424de9523e683e2af55b5c2e42fa024a Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #4057: [HUDI-2392] Make flink parquet reader compatible with decimal BINARY …

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #4057: URL: https://github.com/apache/hudi/pull/4057#issuecomment-974689002 ## CI report: * fb2f4aace12d1c574e5e75ce8e7bcdb85d47e439 Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #4012: [HUDI-2777] Data import performance deteriorates because multiple Spark jobs are started when data is written to disks.

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #4012: URL: https://github.com/apache/hudi/pull/4012#issuecomment-974744735 ## CI report: * 638ca6f8424de9523e683e2af55b5c2e42fa024a Azure:

[jira] [Updated] (HUDI-2811) Support Spark 3.2

2021-11-20 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2811: - Description: Reported issues * [https://github.com/apache/hudi/issues/4001] *

[GitHub] [hudi] xushiyan commented on a change in pull request #3289: [HUDI-2187] Add a shim layer to support multiple hive version

2021-11-20 Thread GitBox
xushiyan commented on a change in pull request #3289: URL: https://github.com/apache/hudi/pull/3289#discussion_r753740519 ## File path: hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncConfig.java ## @@ -120,6 +120,9 @@ @Parameter(names =

[hudi] branch master updated (305d160 -> 74b59a4)

2021-11-20 Thread xushiyan
xushiyan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 305d160 [MINOR] optimize in constructor of inputbatch class (#4040) add 74b59a4 [HUDI-2813] Claim RFC number

[GitHub] [hudi] hudi-bot removed a comment on pull request #4059: [HUDI-2813] Claim RFC number for RFC for spark datasource V2 Integration

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #4059: URL: https://github.com/apache/hudi/pull/4059#issuecomment-974743332 ## CI report: * 48e680fbb7a5d249930f21ad4789fcdc8705b056 Azure:

[GitHub] [hudi] xushiyan merged pull request #4059: [HUDI-2813] Claim RFC number for RFC for spark datasource V2 Integration

2021-11-20 Thread GitBox
xushiyan merged pull request #4059: URL: https://github.com/apache/hudi/pull/4059

[GitHub] [hudi] hudi-bot commented on pull request #4059: [HUDI-2813] Claim RFC number for RFC for spark datasource V2 Integration

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4059: URL: https://github.com/apache/hudi/pull/4059#issuecomment-974746805 ## CI report: * 48e680fbb7a5d249930f21ad4789fcdc8705b056 Azure:

[GitHub] [hudi] alexeykudinkin edited a comment on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

2021-11-20 Thread GitBox
alexeykudinkin edited a comment on pull request #4013: URL: https://github.com/apache/hudi/pull/4013#issuecomment-974745454 @xiarixiaoyao thanks for addressing the issues! After our testing we've also tried to squash some bugs in https://github.com/apache/hudi/pull/4026 and

[GitHub] [hudi] alexeykudinkin commented on pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

2021-11-20 Thread GitBox
alexeykudinkin commented on pull request #4013: URL: https://github.com/apache/hudi/pull/4013#issuecomment-974745454 @xiarixiaoyao thanks for addressing the issues! After our testing we've also tried to squash some bugs in https://github.com/apache/hudi/pull/4026 and

[GitHub] [hudi] hudi-bot commented on pull request #4060: [HUDI-2814] Addressing issues w/ Z-order Layout Optimization

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4060: URL: https://github.com/apache/hudi/pull/4060#issuecomment-974745380 ## CI report: * 99e4f12213a255c52fb8a992b70d37e0f1ce947a Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #4060: [HUDI-2814] Addressing issues w/ Z-order Layout Optimization

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #4060: URL: https://github.com/apache/hudi/pull/4060#issuecomment-974745161 ## CI report: * 99e4f12213a255c52fb8a992b70d37e0f1ce947a UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run

[GitHub] [hudi] hudi-bot commented on pull request #4060: [HUDI-2814] Addressing issues w/ Z-order Layout Optimization

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4060: URL: https://github.com/apache/hudi/pull/4060#issuecomment-974745161 ## CI report: * 99e4f12213a255c52fb8a992b70d37e0f1ce947a UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure`

[jira] [Updated] (HUDI-2814) Address issues w/ Z-order Layout Optimization

2021-11-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-2814: - Labels: pull-request-available (was: ) > Address issues w/ Z-order Layout Optimization >

[GitHub] [hudi] alexeykudinkin opened a new pull request #4060: [HUDI-2814] Addressing issues w/ Z-order Layout Optimization

2021-11-20 Thread GitBox
alexeykudinkin opened a new pull request #4060: URL: https://github.com/apache/hudi/pull/4060

[GitHub] [hudi] hudi-bot commented on pull request #4012: [HUDI-2777] Data import performance deteriorates because multiple Spark jobs are started when data is written to disks.

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4012: URL: https://github.com/apache/hudi/pull/4012#issuecomment-974744735 ## CI report: * 638ca6f8424de9523e683e2af55b5c2e42fa024a Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #4012: [HUDI-2777] Data import performance deteriorates because multiple Spark jobs are started when data is written to disks.

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #4012: URL: https://github.com/apache/hudi/pull/4012#issuecomment-973663414 ## CI report: * 638ca6f8424de9523e683e2af55b5c2e42fa024a Azure:

[GitHub] [hudi] leesf commented on pull request #4012: [HUDI-2777] Data import performance deteriorates because multiple Spark jobs are started when data is written to disks.

2021-11-20 Thread GitBox
leesf commented on pull request #4012: URL: https://github.com/apache/hudi/pull/4012#issuecomment-974744663 @hudi-bot run azure

[GitHub] [hudi] hudi-bot commented on pull request #3053: [HUDI-1932] Update Hive sync timestamp when change detected

2021-11-20 Thread GitBox
hudi-bot commented on pull request #3053: URL: https://github.com/apache/hudi/pull/3053#issuecomment-974744499 ## CI report: * 6e4611a024633a1a1ed7414e7c5b3d73f9267dcb Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #3053: [HUDI-1932] Update Hive sync timestamp when change detected

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #3053: URL: https://github.com/apache/hudi/pull/3053#issuecomment-974744321 ## CI report: * 6e4611a024633a1a1ed7414e7c5b3d73f9267dcb Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #3053: [HUDI-1932] Update Hive sync timestamp when change detected

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #3053: URL: https://github.com/apache/hudi/pull/3053#issuecomment-974738464 ## CI report: * 6e4611a024633a1a1ed7414e7c5b3d73f9267dcb Azure:

[GitHub] [hudi] hudi-bot commented on pull request #3053: [HUDI-1932] Update Hive sync timestamp when change detected

2021-11-20 Thread GitBox
hudi-bot commented on pull request #3053: URL: https://github.com/apache/hudi/pull/3053#issuecomment-974744321 ## CI report: * 6e4611a024633a1a1ed7414e7c5b3d73f9267dcb Azure:

[jira] [Created] (HUDI-2814) Address issues w/ Z-order Layout Optimization

2021-11-20 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-2814: - Summary: Address issues w/ Z-order Layout Optimization Key: HUDI-2814 URL: https://issues.apache.org/jira/browse/HUDI-2814 Project: Apache Hudi Issue

[jira] [Assigned] (HUDI-2814) Address issues w/ Z-order Layout Optimization

2021-11-20 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin reassigned HUDI-2814: - Assignee: Alexey Kudinkin > Address issues w/ Z-order Layout Optimization >

[GitHub] [hudi] hudi-bot removed a comment on pull request #4056: [HUDI-2808] Supports deduplication for streaming write

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #4056: URL: https://github.com/apache/hudi/pull/4056#issuecomment-974743775 ## CI report: * a2b674ed53626dfa20985ecdd960084af3c9d770 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #4056: [HUDI-2808] Supports deduplication for streaming write

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4056: URL: https://github.com/apache/hudi/pull/4056#issuecomment-974744002 ## CI report: * a2b674ed53626dfa20985ecdd960084af3c9d770 Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #4056: [HUDI-2808] Supports deduplication for streaming write

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #4056: URL: https://github.com/apache/hudi/pull/4056#issuecomment-974671377 ## CI report: * a2b674ed53626dfa20985ecdd960084af3c9d770 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #4056: [HUDI-2808] Supports deduplication for streaming write

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4056: URL: https://github.com/apache/hudi/pull/4056#issuecomment-974743775 ## CI report: * a2b674ed53626dfa20985ecdd960084af3c9d770 Azure:

[GitHub] [hudi] leesf commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

2021-11-20 Thread GitBox
leesf commented on a change in pull request #4013: URL: https://github.com/apache/hudi/pull/4013#discussion_r753737063 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java ## @@ -30,16 +28,21 @@ private final String

[GitHub] [hudi] leesf commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

2021-11-20 Thread GitBox
leesf commented on a change in pull request #4013: URL: https://github.com/apache/hudi/pull/4013#discussion_r753737137 ## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/TestOptimizeTable.scala ## @@ -88,19 +88,28 @@ class TestOptimizeTable extends

[GitHub] [hudi] leesf commented on a change in pull request #4013: [HUDI-2778] Optimize statistics collection related codes and add some docs for z-order add fix some bugs

2021-11-20 Thread GitBox
leesf commented on a change in pull request #4013: URL: https://github.com/apache/hudi/pull/4013#discussion_r753737063 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieColumnRangeMetadata.java ## @@ -30,16 +28,21 @@ private final String

[GitHub] [hudi] leesf commented on a change in pull request #4018: Add the GooseFS integration document

2021-11-20 Thread GitBox
leesf commented on a change in pull request #4018: URL: https://github.com/apache/hudi/pull/4018#discussion_r753736855 ## File path: website/docs/goosefs_hoodie.md ## @@ -0,0 +1,46 @@ +--- +title: GooseFS Filesystem +keywords: [ hudi, hive, tencent, goosefs, spark, presto]

[GitHub] [hudi] hudi-bot commented on pull request #4059: [HUDI-2813] Claim RFC number for RFC for spark datasource V2 Integration

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4059: URL: https://github.com/apache/hudi/pull/4059#issuecomment-974743332 ## CI report: * 48e680fbb7a5d249930f21ad4789fcdc8705b056 Azure:

[GitHub] [hudi] hudi-bot removed a comment on pull request #4059: [HUDI-2813] Claim RFC number for RFC for spark datasource V2 Integration

2021-11-20 Thread GitBox
hudi-bot removed a comment on pull request #4059: URL: https://github.com/apache/hudi/pull/4059#issuecomment-974743082 ## CI report: * 48e680fbb7a5d249930f21ad4789fcdc8705b056 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run

[GitHub] [hudi] leesf commented on a change in pull request #4053: [MINOR] Fix typos

2021-11-20 Thread GitBox
leesf commented on a change in pull request #4053: URL: https://github.com/apache/hudi/pull/4053#discussion_r753736718 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/HiveIncrementalPuller.java ## @@ -106,14 +106,14 @@ private Connection connection;

[hudi] branch master updated (1a5484d -> 305d160)

2021-11-20 Thread leesf
This is an automated email from the ASF dual-hosted git repository. leesf pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 1a5484d [MINOR] Claim RFC number for RFC for debezium source for deltastreamer (#4047) add 305d160 [MINOR]

[GitHub] [hudi] leesf merged pull request #4040: [MINOR] optimize in constructor of inputbatch class

2021-11-20 Thread GitBox
leesf merged pull request #4040: URL: https://github.com/apache/hudi/pull/4040 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [hudi] hudi-bot commented on pull request #4059: [HUDI-2813] Claim RFC number for RFC for spark datasource V2 Integration

2021-11-20 Thread GitBox
hudi-bot commented on pull request #4059: URL: https://github.com/apache/hudi/pull/4059#issuecomment-974743082 ## CI report: * 48e680fbb7a5d249930f21ad4789fcdc8705b056 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure`

[jira] [Updated] (HUDI-2813) Claim RFC number for RFC for spark datasource V2 Integration

2021-11-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-2813: - Labels: pull-request-available (was: ) > Claim RFC number for RFC for spark datasource V2

[GitHub] [hudi] leesf opened a new pull request #4059: [HUDI-2813] Claim RFC number for RFC for spark datasource V2 Integration

2021-11-20 Thread GitBox
leesf opened a new pull request #4059: URL: https://github.com/apache/hudi/pull/4059 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the purpose
