[GitHub] [hudi] hudi-bot commented on pull request #5419: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-04-25 Thread GitBox
hudi-bot commented on PR #5419: URL: https://github.com/apache/hudi/pull/5419#issuecomment-1108113032 ## CI report: * 901cf10311e7b2f0cba88c71bf1d8c6998bbd953 UNKNOWN * 1016017e28187458084dce16286142202b21af26 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #4441: [HUDI-3085] improve bulk insert partitioner abstraction

2022-04-25 Thread GitBox
hudi-bot commented on PR #4441: URL: https://github.com/apache/hudi/pull/4441#issuecomment-1108114623 ## CI report: * 2243884a4617e11f0c2d675bc973c644744c3a1e UNKNOWN * f08642c2eefa4ca6b568eda052f968a39986f8a2 UNKNOWN * de00fb3133f06340c3bbac590dee2a19ec531748 Azure:

[GitHub] [hudi] onlywangyh commented on issue #5394: flink cdc sink hudi failed to add hive partition fields for hive sync

2022-04-25 Thread GitBox
onlywangyh commented on issue #5394: URL: https://github.com/apache/hudi/issues/5394#issuecomment-1108132152 If I keep the same params, like `--partition-path-field=timestamp16, --hive-sync-partition-fields=timestamp16`, there will be some questions: 1. In the schema the _timestamp16_ is a

[GitHub] [hudi] wxplovecc closed pull request #5185: [HUDI-3758] Optimize flink partition table with BucketIndex

2022-04-25 Thread GitBox
wxplovecc closed pull request #5185: [HUDI-3758] Optimize flink partition table with BucketIndex URL: https://github.com/apache/hudi/pull/5185 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [hudi] wxplovecc opened a new pull request, #5185: [HUDI-3758] Optimize flink partition table with BucketIndex

2022-04-25 Thread GitBox
wxplovecc opened a new pull request, #5185: URL: https://github.com/apache/hudi/pull/5185 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the

[GitHub] [hudi] hudi-bot commented on pull request #5419: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-04-25 Thread GitBox
hudi-bot commented on PR #5419: URL: https://github.com/apache/hudi/pull/5419#issuecomment-1109191313 ## CI report: * 901cf10311e7b2f0cba88c71bf1d8c6998bbd953 UNKNOWN * b8529d91bd8c7eae03c3c6c41374fa6625aadfc0 UNKNOWN * 96e73e9bea606cc38a9ef65896bfebfc24164a50 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5427: [HUDI-3974] Fix schema projection to skip non-existent preCombine field

2022-04-25 Thread GitBox
hudi-bot commented on PR #5427: URL: https://github.com/apache/hudi/pull/5427#issuecomment-1109213490 ## CI report: * fe6cc9d4d51c6a8a6f2b8cbd969a06d835a4b8e0 Azure:

[GitHub] [hudi] alexeykudinkin opened a new pull request, #5430: [WIP] Optimize out mandatory columns when no merging is performed

2022-04-25 Thread GitBox
alexeykudinkin opened a new pull request, #5430: URL: https://github.com/apache/hudi/pull/5430 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the

[GitHub] [hudi] yihua commented on a diff in pull request #5430: [WIP][Stacked on 5428] Optimize out mandatory columns when no merging is performed

2022-04-25 Thread GitBox
yihua commented on code in PR #5430: URL: https://github.com/apache/hudi/pull/5430#discussion_r858186088 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/MergeOnReadSnapshotRelation.scala: ## @@ -144,6 +186,15 @@ class

[GitHub] [hudi] hudi-bot commented on pull request #5424: [HUDI-3972] Fixing hoodie.properties/tableConfig for no preCombine field with writes

2022-04-25 Thread GitBox
hudi-bot commented on PR #5424: URL: https://github.com/apache/hudi/pull/5424#issuecomment-1109244099 ## CI report: * db866f91efcb820f77f86962f6263c80bebb7db8 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5419: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-04-25 Thread GitBox
hudi-bot commented on PR #5419: URL: https://github.com/apache/hudi/pull/5419#issuecomment-1109244061 ## CI report: * 901cf10311e7b2f0cba88c71bf1d8c6998bbd953 UNKNOWN * b8529d91bd8c7eae03c3c6c41374fa6625aadfc0 UNKNOWN * 4c42f0c2d4fc7af4be3d7247faf5dc087a54fbac Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-04-25 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1109244009 ## CI report: * 81356d0c5251f745dff71ea22bd4a4ad29f07561 Azure:

[GitHub] [hudi] nsivabalan merged pull request #5424: [HUDI-3972] Fixing hoodie.properties/tableConfig for no preCombine field with writes

2022-04-25 Thread GitBox
nsivabalan merged PR #5424: URL: https://github.com/apache/hudi/pull/5424

[hudi] branch master updated (f2ba0fead2 -> 762623a15c)

2022-04-25 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from f2ba0fead2 [HUDI-3085] Improve bulk insert partitioner abstraction (#4441) add 762623a15c [HUDI-3972] Fixing

[jira] [Created] (HUDI-3977) Flink hudi table with date type partition path throws HoodieNotSupportedException

2022-04-25 Thread Danny Chen (Jira)
Danny Chen created HUDI-3977: Summary: Flink hudi table with date type partition path throws HoodieNotSupportedException Key: HUDI-3977 URL: https://issues.apache.org/jira/browse/HUDI-3977 Project:

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-04-25 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1109342212 ## CI report: * 56068124025de8998ffd1c87b65ca67e80f2d62b Azure:

[GitHub] [hudi] rahil-c commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-04-25 Thread GitBox
rahil-c commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1109344571 @hudi-bot run azure

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-04-25 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1109347481 ## CI report: * 8c6f6e19940ce7ac04dfcfce52da3ccdaf3a8b0f UNKNOWN * dd2fea49a3161ed270b3f8f7e598beb6800178d8 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5432: [HUDI-3977] Flink hudi table with date type partition path throws Hoo…

2022-04-25 Thread GitBox
hudi-bot commented on PR #5432: URL: https://github.com/apache/hudi/pull/5432#issuecomment-1109375020 ## CI report: * 1a53ea2b021079025b6a3fe6ebb1184d26a3aa64 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5419: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-04-25 Thread GitBox
hudi-bot commented on PR #5419: URL: https://github.com/apache/hudi/pull/5419#issuecomment-1109194738 ## CI report: * 901cf10311e7b2f0cba88c71bf1d8c6998bbd953 UNKNOWN * b8529d91bd8c7eae03c3c6c41374fa6625aadfc0 UNKNOWN * 96e73e9bea606cc38a9ef65896bfebfc24164a50 Azure:

[GitHub] [hudi] yihua commented on a diff in pull request #5427: [HUDI-3974] Fix schema projection to skip non-existent preCombine field

2022-04-25 Thread GitBox
yihua commented on code in PR #5427: URL: https://github.com/apache/hudi/pull/5427#discussion_r858156982 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/HoodieSparkUtils.scala: ## @@ -324,7 +326,14 @@ object HoodieSparkUtils extends SparkAdapterSupport {

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-04-25 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1109216635 ## CI report: * 65774002326a060b49e294793f4414fe2f31d812 Azure:

[GitHub] [hudi] YuangZhang opened a new issue, #5431: [SUPPORT] Flink Date type as partition field

2022-04-25 Thread GitBox
YuangZhang opened a new issue, #5431: URL: https://github.com/apache/hudi/issues/5431 flink sql can't use date as partition field `create TABLE hudi_sink( role_id string, log_id string, origin_json string, origin_log string, ts timestamp(3), ds

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-04-25 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1109299801 ## CI report: * aeb42e6848d1d5b53700e92f44c95fd18283bb14 Azure:

[GitHub] [hudi] alexeykudinkin commented on pull request #5430: [WIP][Stacked on 5428] Optimize out mandatory columns when no merging is performed

2022-04-25 Thread GitBox
alexeykudinkin commented on PR #5430: URL: https://github.com/apache/hudi/pull/5430#issuecomment-1109310149 @nsivabalan this is for 0.12 not for 0.11

[GitHub] [hudi] chaplinthink commented on issue #3657: [SUPPORT] Failed to insert data by flink-sql

2022-04-25 Thread GitBox
chaplinthink commented on issue #3657: URL: https://github.com/apache/hudi/issues/3657#issuecomment-1109324024 @danny0405 Hi, I tried Hudi 0.10.0 with Flink 1.12.2 and 1.13.1, and the problem is still there. I am testing Flink CDC to Hudi, but it does not work.

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-04-25 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1109345634 ## CI report: * 56068124025de8998ffd1c87b65ca67e80f2d62b Azure:

[jira] [Updated] (HUDI-3582) Introduce Secondary Index to Improve HUDI Query Performance

2022-04-25 Thread shibei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shibei updated HUDI-3582: - Summary: Introduce Secondary Index to Improve HUDI Query Performance (was: Support record level index based on

[jira] [Updated] (HUDI-3907) RFC for Introduce Secondary Index to Improve HUDI Query Performance

2022-04-25 Thread shibei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shibei updated HUDI-3907: - Summary: RFC for Introduce Secondary Index to Improve HUDI Query Performance (was: RFC for lucene based record

[GitHub] [hudi] suryaprasanna commented on issue #5223: [SUPPORT] - HUDI clustering - read issues

2022-04-25 Thread GitBox
suryaprasanna commented on issue #5223: URL: https://github.com/apache/hudi/issues/5223#issuecomment-1109245586 @nsivabalan I tried out both 0.8.0 and 0.10.1 versions. My job is not returning duplicates and considering only the latest files. I tried on both partitioned and

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-04-25 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1109305034 ## CI report: * 56068124025de8998ffd1c87b65ca67e80f2d62b Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5432: [HUDI-3977] Flink hudi table with date type partition path throws Hoo…

2022-04-25 Thread GitBox
hudi-bot commented on PR #5432: URL: https://github.com/apache/hudi/pull/5432#issuecomment-1109352618 ## CI report: * 1a53ea2b021079025b6a3fe6ebb1184d26a3aa64 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] sharathkola commented on issue #5223: [SUPPORT] - HUDI clustering - read issues

2022-04-25 Thread GitBox
sharathkola commented on issue #5223: URL: https://github.com/apache/hudi/issues/5223#issuecomment-1109352692 @suryaprasanna Can you please verify the commit_files.zip that I have attached above (it has 20220404094047.commit and 20220404094203.replacecommit files) to confirm if it has

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-04-25 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1109371737 ## CI report: * 8c6f6e19940ce7ac04dfcfce52da3ccdaf3a8b0f UNKNOWN * dd2fea49a3161ed270b3f8f7e598beb6800178d8 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5427: [HUDI-3974] Fix schema projection to skip non-existent preCombine field

2022-04-25 Thread GitBox
hudi-bot commented on PR #5427: URL: https://github.com/apache/hudi/pull/5427#issuecomment-1109371819 ## CI report: * c71805c763f244e9e59832b9d67f48d74f1e9c64 Azure:

[GitHub] [hudi] nsivabalan commented on a diff in pull request #5427: [HUDI-3974] Fix schema projection to skip non-existent preCombine field

2022-04-25 Thread GitBox
nsivabalan commented on code in PR #5427: URL: https://github.com/apache/hudi/pull/5427#discussion_r858305310 ## hudi-common/src/main/java/org/apache/hudi/internal/schema/utils/InternalSchemaUtils.java: ## @@ -54,29 +58,75 @@ private InternalSchemaUtils() { */ public

[GitHub] [hudi] yihua commented on a diff in pull request #5427: [HUDI-3974] Fix schema projection to skip non-existent preCombine field

2022-04-25 Thread GitBox
yihua commented on code in PR #5427: URL: https://github.com/apache/hudi/pull/5427#discussion_r858155285 ## hudi-common/src/main/java/org/apache/hudi/internal/schema/utils/InternalSchemaUtils.java: ## @@ -54,13 +58,16 @@ private InternalSchemaUtils() { */ public static

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-04-25 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1109215084 ## CI report: * 65774002326a060b49e294793f4414fe2f31d812 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5430: [WIP][Stacked on 5428] Optimize out mandatory columns when no merging is performed

2022-04-25 Thread GitBox
hudi-bot commented on PR #5430: URL: https://github.com/apache/hudi/pull/5430#issuecomment-1109223888 ## CI report: * e494e1f8865a09ff4be7fe5390cc6d348671be09 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] nsivabalan commented on a diff in pull request #5304: [DOCS] Add faq for async compaction options

2022-04-25 Thread GitBox
nsivabalan commented on code in PR #5304: URL: https://github.com/apache/hudi/pull/5304#discussion_r858192316 ## website/learn/faq.md: ## @@ -253,6 +253,25 @@ Simplest way to run compaction on MOR dataset is to run the [compaction inline]( That said, for obvious reasons of

[GitHub] [hudi] hudi-bot commented on pull request #5428: [WIP][HUDI-3896] Porting Nested Schema Pruning optimization for Hudi's custom Relations

2022-04-25 Thread GitBox
hudi-bot commented on PR #5428: URL: https://github.com/apache/hudi/pull/5428#issuecomment-1109254506 ## CI report: * fd9570efbb7448e73976aaa8a14771f2e4daf67a Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5427: [HUDI-3974] Fix schema projection to skip non-existent preCombine field

2022-04-25 Thread GitBox
hudi-bot commented on PR #5427: URL: https://github.com/apache/hudi/pull/5427#issuecomment-1109299948 ## CI report: * fe6cc9d4d51c6a8a6f2b8cbd969a06d835a4b8e0 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-04-25 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1109303294 ## CI report: * aeb42e6848d1d5b53700e92f44c95fd18283bb14 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5432: [HUDI-3977] Flink hudi table with date type partition path throws Hoo…

2022-04-25 Thread GitBox
hudi-bot commented on PR #5432: URL: https://github.com/apache/hudi/pull/5432#issuecomment-1109376729 ## CI report: * 1a53ea2b021079025b6a3fe6ebb1184d26a3aa64 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-04-25 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1109222178 ## CI report: * 65774002326a060b49e294793f4414fe2f31d812 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5430: [WIP][Stacked on 5428] Optimize out mandatory columns when no merging is performed

2022-04-25 Thread GitBox
hudi-bot commented on PR #5430: URL: https://github.com/apache/hudi/pull/5430#issuecomment-1109244114 ## CI report: * e494e1f8865a09ff4be7fe5390cc6d348671be09 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5428: [WIP][HUDI-3896] Porting Nested Schema Pruning optimization for Hudi's custom Relations

2022-04-25 Thread GitBox
hudi-bot commented on PR #5428: URL: https://github.com/apache/hudi/pull/5428#issuecomment-1109246005 ## CI report: * 1e47f0288921b821bbf29d3b72b1156e82aefd5c Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5430: [WIP][Stacked on 5428] Optimize out mandatory columns when no merging is performed

2022-04-25 Thread GitBox
hudi-bot commented on PR #5430: URL: https://github.com/apache/hudi/pull/5430#issuecomment-1109246021 ## CI report: * e494e1f8865a09ff4be7fe5390cc6d348671be09 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5428: [WIP][HUDI-3896] Porting Nested Schema Pruning optimization for Hudi's custom Relations

2022-04-25 Thread GitBox
hudi-bot commented on PR #5428: URL: https://github.com/apache/hudi/pull/5428#issuecomment-1109247528 ## CI report: * 1e47f0288921b821bbf29d3b72b1156e82aefd5c Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5428: [WIP][HUDI-3896] Porting Nested Schema Pruning optimization for Hudi's custom Relations

2022-04-25 Thread GitBox
hudi-bot commented on PR #5428: URL: https://github.com/apache/hudi/pull/5428#issuecomment-1109249115 ## CI report: * fd9570efbb7448e73976aaa8a14771f2e4daf67a Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-04-25 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1109249051 ## CI report: * 81356d0c5251f745dff71ea22bd4a4ad29f07561 Azure:

[GitHub] [hudi] nsivabalan commented on a diff in pull request #5430: [WIP][Stacked on 5428] Optimize out mandatory columns when no merging is performed

2022-04-25 Thread GitBox
nsivabalan commented on code in PR #5430: URL: https://github.com/apache/hudi/pull/5430#discussion_r858193317 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieMergeOnReadRDD.scala: ## @@ -127,9 +130,9 @@ class HoodieMergeOnReadRDD(@transient sc:

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-04-25 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1109296580 ## CI report: * aeb42e6848d1d5b53700e92f44c95fd18283bb14 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5430: [WIP][Stacked on 5428] Optimize out mandatory columns when no merging is performed

2022-04-25 Thread GitBox
hudi-bot commented on PR #5430: URL: https://github.com/apache/hudi/pull/5430#issuecomment-1109296967 ## CI report: * 968ca518a9b54ecf294387e4b3d3d761c8f8a3cd Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5427: [HUDI-3974] Fix schema projection to skip non-existent preCombine field

2022-04-25 Thread GitBox
hudi-bot commented on PR #5427: URL: https://github.com/apache/hudi/pull/5427#issuecomment-1109301948 ## CI report: * fe6cc9d4d51c6a8a6f2b8cbd969a06d835a4b8e0 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5428: [WIP][HUDI-3896] Porting Nested Schema Pruning optimization for Hudi's custom Relations

2022-04-25 Thread GitBox
hudi-bot commented on PR #5428: URL: https://github.com/apache/hudi/pull/5428#issuecomment-1109301959 ## CI report: * fd9570efbb7448e73976aaa8a14771f2e4daf67a Azure:

[GitHub] [hudi] rahil-c commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-04-25 Thread GitBox
rahil-c commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1109330190 @hudi-bot run azure

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-04-25 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1109340628 ## CI report: * 56068124025de8998ffd1c87b65ca67e80f2d62b Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-04-25 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1109343824 ## CI report: * 56068124025de8998ffd1c87b65ca67e80f2d62b Azure:

[GitHub] [hudi] danny0405 opened a new pull request, #5432: [HUDI-3977] Flink hudi table with date type partition path throws Hoo…

2022-04-25 Thread GitBox
danny0405 opened a new pull request, #5432: URL: https://github.com/apache/hudi/pull/5432 …dieNotSupportedException ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull

[jira] [Updated] (HUDI-3977) Flink hudi table with date type partition path throws HoodieNotSupportedException

2022-04-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3977: - Labels: pull-request-available (was: ) > Flink hudi table with date type partition path throws

[GitHub] [hudi] hudi-bot commented on pull request #5432: [HUDI-3977] Flink hudi table with date type partition path throws Hoo…

2022-04-25 Thread GitBox
hudi-bot commented on PR #5432: URL: https://github.com/apache/hudi/pull/5432#issuecomment-1109371861 ## CI report: * 1a53ea2b021079025b6a3fe6ebb1184d26a3aa64 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5185: [HUDI-3758] Fix duplicate fileId error in MOR table type with flink bucket hash Index

2022-04-25 Thread GitBox
hudi-bot commented on PR #5185: URL: https://github.com/apache/hudi/pull/5185#issuecomment-1108205535 ## CI report: * 64743cf541772b9addb74add001e5cd57916bc9d Azure:

[jira] [Updated] (HUDI-3965) Spark sql dml w/ spark2 and scala12 fails w/ ClassNotFoundException for SparlSQLCLIDriver

2022-04-25 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3965: -- Fix Version/s: 0.12.0 > Spark sql dml w/ spark2 and scala12 fails w/

[jira] [Updated] (HUDI-3965) Spark sql dml w/ spark2 and scala12 fails w/ ClassNotFoundException for SparlSQLCLIDriver

2022-04-25 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3965: -- Sprint: Hudi-Sprint-Apr-19 > Spark sql dml w/ spark2 and scala12 fails w/

[jira] [Updated] (HUDI-3965) Spark sql dml w/ spark2 and scala12 fails w/ ClassNotFoundException for SparlSQLCLIDriver

2022-04-25 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3965: -- Priority: Blocker (was: Major) > Spark sql dml w/ spark2 and scala12 fails w/

[jira] [Created] (HUDI-3965) Spark sql dml w/ spark2 and scala12 fails w/ ClassNotFoundException for SparlSQLCLIDriver

2022-04-25 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-3965: - Summary: Spark sql dml w/ spark2 and scala12 fails w/ ClassNotFoundException for SparlSQLCLIDriver Key: HUDI-3965 URL: https://issues.apache.org/jira/browse/HUDI-3965

[jira] [Updated] (HUDI-3965) Spark sql dml w/ spark2 and scala12 fails w/ ClassNotFoundException for SparlSQLCLIDriver

2022-04-25 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3965: -- Sprint: (was: Hudi-Sprint-Apr-19) > Spark sql dml w/ spark2 and scala12 fails w/

[GitHub] [hudi] XuQianJin-Stars commented on a diff in pull request #5238: [DOC]Add schema evolution doc for sparksql

2022-04-25 Thread GitBox
XuQianJin-Stars commented on code in PR #5238: URL: https://github.com/apache/hudi/pull/5238#discussion_r857356824 ## website/docs/quick-start-guide.md: ## @@ -1095,6 +1095,178 @@ Currently, the result of `show partitions` is based on the filesystem table pat ::: +##

[GitHub] [hudi] hudi-bot commented on pull request #5410: [HUDI-3953]Flink Hudi module should support low-level read and write…

2022-04-25 Thread GitBox
hudi-bot commented on PR #5410: URL: https://github.com/apache/hudi/pull/5410#issuecomment-1108293556 ## CI report: * 0cccfe39468ec699aae59d0028dcc949e9161a6d UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] danny0405 commented on pull request #5087: [HUDI-3614] [DO_NOT_MERGE]Replace List with HoodieData in HoodieFlink/JavaTable and commit executors

2022-04-25 Thread GitBox
danny0405 commented on PR #5087: URL: https://github.com/apache/hudi/pull/5087#issuecomment-1108391806 > I see there has to be more code design / refactoring work to achieve it. Before we do that, my suggestion is to keep the current code as it is. It is an API refactoring and we should

[GitHub] [hudi] danny0405 commented on pull request #5420: Flink hive sync got ClassNotFoundException (org/apache/hudi/org/apache/hadoop/hive/ql/metadata/Hive)

2022-04-25 Thread GitBox
danny0405 commented on PR #5420: URL: https://github.com/apache/hudi/pull/5420#issuecomment-1108363673 > Yes, I use the `release 0.10.1` branch to build the project. And I found that when I only build the `hudi-flink-bundle` module (in path packaging/hudi-flink-bundle) the class can be found,

[GitHub] [hudi] JerryYue-M commented on pull request #5410: [HUDI-3953]Flink Hudi module should support low-level read and write…

2022-04-25 Thread GitBox
JerryYue-M commented on PR #5410: URL: https://github.com/apache/hudi/pull/5410#issuecomment-1108399547 > Can we give some explanation about why we need this change? Looks like it is a code refactor but I see no gains. > > One rule is that we should not copy new

[GitHub] [hudi] JerryYue-M commented on pull request #5410: [HUDI-3953]Flink Hudi module should support low-level read and write…

2022-04-25 Thread GitBox
JerryYue-M commented on PR #5410: URL: https://github.com/apache/hudi/pull/5410#issuecomment-1108403085 @danny0405 One rule is that we should not copy new clazz with similar API with existing public APIs. Thanks for the reminder. I will make some changes later.

[GitHub] [hudi] SabyasachiDasTR opened a new issue, #5422: Enabling metadata on MOR table causes FileNotFound exception [SUPPORT]

2022-04-25 Thread GitBox
SabyasachiDasTR opened a new issue, #5422: URL: https://github.com/apache/hudi/issues/5422 **Describe the problem you faced** We are incrementally upserting data into our Hudi table/s every 5 minutes. As we begin to process this data we notice the mentioned error occurs and the upserts

[GitHub] [hudi] wxplovecc commented on pull request #5185: [HUDI-3758] Fix duplicate fileId error in MOR table type with flink bucket hash Index

2022-04-25 Thread GitBox
wxplovecc commented on PR #5185: URL: https://github.com/apache/hudi/pull/5185#issuecomment-1108211728 Done @garyli1019

[GitHub] [hudi] hudi-bot commented on pull request #5185: [HUDI-3758] Fix duplicate fileId error in MOR table type with flink bucket hash Index

2022-04-25 Thread GitBox
hudi-bot commented on PR #5185: URL: https://github.com/apache/hudi/pull/5185#issuecomment-1108339791 ## CI report: * 8b79ca5bfb14ff32fd84a52ea9432862da104d50 Azure:

[GitHub] [hudi] chrischnweiss commented on issue #4887: [SUPPORT] Unexpected behaviour with partitioned hudi tables with impala as query engine

2022-04-25 Thread GitBox
chrischnweiss commented on issue #4887: URL: https://github.com/apache/hudi/issues/4887#issuecomment-1108370616 Hi guys, so it seems that we fixed the problem of reading partitioned Hudi tables with Impala by changing the Hudi-Operation from `insert_overwrite` to `upsert`. Maybe

[GitHub] [hudi] danny0405 commented on pull request #5410: [HUDI-3953]Flink Hudi module should support low-level read and write…

2022-04-25 Thread GitBox
danny0405 commented on PR #5410: URL: https://github.com/apache/hudi/pull/5410#issuecomment-1108384622 Can we give some explanation about why we need this change? Looks like it is a code refactor but I see no gains.

[GitHub] [hudi] hadonchen commented on pull request #5420: Flink hive sync got ClassNotFoundException (org/apache/hudi/org/apache/hadoop/hive/ql/metadata/Hive)

2022-04-25 Thread GitBox
hadonchen commented on PR #5420: URL: https://github.com/apache/hudi/pull/5420#issuecomment-1108430967 > > Yes, I use the `release 0.10.1` branch to build the project. And I found that when I only build the `hudi-flink-bundle` module (in path packaging/hudi-flink-bundle) the class can be found

[jira] [Closed] (HUDI-3906) Prepare RC3 and run basic tests

2022-04-25 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-3906. Resolution: Done > Prepare RC3 and run basic tests > --- > >

[GitHub] [hudi] hudi-bot commented on pull request #5185: [HUDI-3758] Optimize flink partition table with BucketIndex

2022-04-25 Thread GitBox
hudi-bot commented on PR #5185: URL: https://github.com/apache/hudi/pull/5185#issuecomment-1108201850 ## CI report: * 64743cf541772b9addb74add001e5cd57916bc9d Azure:

[GitHub] [hudi] danny0405 commented on pull request #5087: [HUDI-3614] [DO_NOT_MERGE]Replace List with HoodieData in HoodieFlink/JavaTable and commit executors

2022-04-25 Thread GitBox
danny0405 commented on PR #5087: URL: https://github.com/apache/hudi/pull/5087#issuecomment-1108388472 There are some conflicts; can you rebase the code onto the latest master?

[GitHub] [hudi] hudi-bot commented on pull request #5406: [HUDI-3954] When ASYNC_CLEAN is false,don't keep the last commit before the earliest commit to retain

2022-04-25 Thread GitBox
hudi-bot commented on PR #5406: URL: https://github.com/apache/hudi/pull/5406#issuecomment-1108918215 ## CI report: * b8a84374628e95b881205e6dc80db3de03599898 Azure:

[jira] [Created] (HUDI-3974) Fix upgrade step wrt precombine field

2022-04-25 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-3974: --- Summary: Fix upgrade step wrt precombine field Key: HUDI-3974 URL: https://issues.apache.org/jira/browse/HUDI-3974 Project: Apache Hudi Issue Type: Bug

[jira] [Updated] (HUDI-3675) Add support to shutdown Deltastreamer gracefully on certain conditions w/ continuous mode

2022-04-25 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3675: -- Status: Patch Available (was: In Progress) > Add support to shutdown Deltastreamer

[jira] [Updated] (HUDI-3675) Add support to shutdown Deltastreamer gracefully on certain conditions w/ continuous mode

2022-04-25 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3675: -- Status: In Progress (was: Open) > Add support to shutdown Deltastreamer gracefully on

[jira] [Updated] (HUDI-3675) Add support to shutdown Deltastreamer gracefully on certain conditions w/ continuous mode

2022-04-25 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3675: -- Reviewers: Sagar Sumit Story Points: 2 > Add support to shutdown Deltastreamer

[GitHub] [hudi] kazdy opened a new pull request, #5425: Add properties description for CloudWatch logs

2022-04-25 Thread GitBox
kazdy opened a new pull request, #5425: URL: https://github.com/apache/hudi/pull/5425 Add documentation describing the properties for CloudWatch logs, based on the AWS blog: https://aws.amazon.com/blogs/big-data/new-features-from-apache-hudi-0-7-0-and-0-8-0-available-on-amazon-emr/

[GitHub] [hudi] hudi-bot commented on pull request #5419: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-04-25 Thread GitBox
hudi-bot commented on PR #5419: URL: https://github.com/apache/hudi/pull/5419#issuecomment-1108972998 ## CI report: * 901cf10311e7b2f0cba88c71bf1d8c6998bbd953 UNKNOWN * b43a42e21e9f29a38739bbd464bea23b6f7fea72 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5424: [HUDI-3972] Fixing hoodie.properties/tableConfig for no preCombine field with writes

2022-04-25 Thread GitBox
hudi-bot commented on PR #5424: URL: https://github.com/apache/hudi/pull/5424#issuecomment-1108973035 ## CI report: * 83bba83cbe6d1d1001b47c67b9ee9b9cb71ea854 Azure:

[GitHub] [hudi] rahil-c commented on pull request #5419: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-04-25 Thread GitBox
rahil-c commented on PR #5419: URL: https://github.com/apache/hudi/pull/5419#issuecomment-1108975553 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [hudi] hudi-bot commented on pull request #5419: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-04-25 Thread GitBox
hudi-bot commented on PR #5419: URL: https://github.com/apache/hudi/pull/5419#issuecomment-1108986967 ## CI report: * 901cf10311e7b2f0cba88c71bf1d8c6998bbd953 UNKNOWN * b8529d91bd8c7eae03c3c6c41374fa6625aadfc0 UNKNOWN * Unknown: [CANCELED](TBD) *

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #5424: [HUDI-3972] Fixing hoodie.properties/tableConfig for no preCombine field with writes

2022-04-25 Thread GitBox
alexeykudinkin commented on code in PR #5424: URL: https://github.com/apache/hudi/pull/5424#discussion_r857987774 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala: ## @@ -151,7 +154,7 @@ object HoodieSparkSqlWriter {

[GitHub] [hudi] hudi-bot commented on pull request #5419: [WIP][HUDI-3088] Use Spark 3.2 as default Spark version

2022-04-25 Thread GitBox
hudi-bot commented on PR #5419: URL: https://github.com/apache/hudi/pull/5419#issuecomment-1108991542 ## CI report: * 901cf10311e7b2f0cba88c71bf1d8c6998bbd953 UNKNOWN * b8529d91bd8c7eae03c3c6c41374fa6625aadfc0 UNKNOWN * Unknown: [CANCELED](TBD) *

[GitHub] [hudi] nsivabalan commented on issue #5223: [SUPPORT] - HUDI clustering - read issues

2022-04-25 Thread GitBox
nsivabalan commented on issue #5223: URL: https://github.com/apache/hudi/issues/5223#issuecomment-1109018000 @suryaprasanna : thanks for assisting here. Can you take hudi 0.8.0 oss and try out clustering after a few commits? Then trigger a query and paste what you see in the spark sql tab in spark

[jira] [Assigned] (HUDI-3973) Implement GENERATE manifest command for Snowflake integration

2022-04-25 Thread Joyan Sil (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joyan Sil reassigned HUDI-3973: --- Assignee: Joyan Sil > Implement GENERATE manifest command for Snowflake integration >

[jira] [Created] (HUDI-3973) Implement GENERATE manifest command for Snowflake integration

2022-04-25 Thread Joyan Sil (Jira)
Joyan Sil created HUDI-3973: --- Summary: Implement GENERATE manifest command for Snowflake integration Key: HUDI-3973 URL: https://issues.apache.org/jira/browse/HUDI-3973 Project: Apache Hudi Issue

[jira] [Commented] (HUDI-3961) Encounter NoClassDefFoundError when using Spark 3.1 bundle and utilities slim bundle

2022-04-25 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17527718#comment-17527718 ] Ethan Guo commented on HUDI-3961: - so after discussion, the quick fix is these usage changes: *

[jira] [Created] (HUDI-3975) Checksum can be wrong after table upgrade from version 3 to 4

2022-04-25 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-3975: --- Summary: Checksum can be wrong after table upgrade from version 3 to 4 Key: HUDI-3975 URL: https://issues.apache.org/jira/browse/HUDI-3975 Project: Apache Hudi Issue
