[PR] [HUDI-7130] Adding support for configuring value serializer with JsonKafkaSource [hudi]

2023-11-20 Thread via GitHub
nsivabalan opened a new pull request, #10149: URL: https://github.com/apache/hudi/pull/10149 ### Change Logs Adding support for configuring value serializer with JsonKafkaSource ### Impact Adding support for configuring value serializer with JsonKafkaSource ###

[jira] [Updated] (HUDI-7130) Add support to configure value serializer with JsonKafkaSource

2023-11-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7130: - Labels: pull-request-available (was: ) > Add support to configure value serializer with

[jira] [Created] (HUDI-7130) Add support to configure value serializer with JsonKafkaSource

2023-11-20 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-7130: - Summary: Add support to configure value serializer with JsonKafkaSource Key: HUDI-7130 URL: https://issues.apache.org/jira/browse/HUDI-7130 Project: Apache

[jira] [Updated] (HUDI-7128) DeleteProcedures support delete in batch mode

2023-11-20 Thread xy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xy updated HUDI-7128: - Description: DeleteMarkerProcedures support delete in batch mode, e.g. if a user wants to delete 100 or more markers, before

Re: [PR] [HUDI-7128] DeleteProcedures support batch mode [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10148: URL: https://github.com/apache/hudi/pull/10148#issuecomment-1820369599 ## CI report: * 785bae873c7e3a67c03b7516ba1bdf2cd18718c9 Azure:

Re: [PR] [HUDI-7129] Fix bug when upgrade from table version three using UpgradeOrDowngradeProcedure [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10147: URL: https://github.com/apache/hudi/pull/10147#issuecomment-1820369550 ## CI report: * 8b2189bea8fc0d58b17656bc429442f240530bc1 Azure:

Re: [PR] [HUDI-7120] Performance improvements in deltastreamer executor code path [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10135: URL: https://github.com/apache/hudi/pull/10135#issuecomment-1820369436 ## CI report: * 2e26a7d1b87f4ca4e1f818612decfe0eb130a5fb Azure:

Re: [PR] [HUDI-7097] Fixing instantiation of Hms Uri with HiveSync tool [hudi]

2023-11-20 Thread via GitHub
xushiyan commented on code in PR #10099: URL: https://github.com/apache/hudi/pull/10099#discussion_r1400123695 ## hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncTool.java: ## @@ -103,15 +103,29 @@ public class HiveSyncTool extends HoodieSyncTool implements

Re: [PR] [HUDI-7128] DeleteProcedures support batch mode [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10148: URL: https://github.com/apache/hudi/pull/10148#issuecomment-1820361063 ## CI report: * 785bae873c7e3a67c03b7516ba1bdf2cd18718c9 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-7129] Fix bug when upgrade from table version three using UpgradeOrDowngradeProcedure [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10147: URL: https://github.com/apache/hudi/pull/10147#issuecomment-1820360990 ## CI report: * 8b2189bea8fc0d58b17656bc429442f240530bc1 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-7120] Performance improvements in deltastreamer executor code path [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10135: URL: https://github.com/apache/hudi/pull/10135#issuecomment-1820360875 ## CI report: * 2e26a7d1b87f4ca4e1f818612decfe0eb130a5fb Azure:

Re: [PR] [HUDI-7083] Adding support for multiple tables with Prometheus Reporter [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10068: URL: https://github.com/apache/hudi/pull/10068#issuecomment-1820360630 ## CI report: * b91463465fe0eee81d69706909a877a8d4737556 Azure:

(hudi) branch master updated (eaba1146afc -> 0c4f3a3164c)

2023-11-20 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from eaba1146afc [HUDI-7107] Reused MetricsReporter fails to publish metrics in Spark streaming job (#10132) add

Re: [PR] [HUDI-7127] Fixing set up and tear down in tests [hudi]

2023-11-20 Thread via GitHub
nsivabalan merged PR #10146: URL: https://github.com/apache/hudi/pull/10146 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] [HUDI-7127] Fixing set up and tear down in tests [hudi]

2023-11-20 Thread via GitHub
nsivabalan commented on PR #10146: URL: https://github.com/apache/hudi/pull/10146#issuecomment-1820359743 sg. I was about to suggest the same. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] [HUDI-7003] Add option to fallback to full table scan if files are deleted due to… [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #9941: URL: https://github.com/apache/hudi/pull/9941#issuecomment-182035 ## CI report: * 0a42984cc0d6d1e21b7e40b0fc08a8d6e902414c Azure:

Re: [PR] [HUDI-7083] Adding support for multiple tables with Prometheus Reporter [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10068: URL: https://github.com/apache/hudi/pull/10068#issuecomment-1820352430 ## CI report: * b91463465fe0eee81d69706909a877a8d4737556 Azure:

Re: [PR] [HUDI-7127] Fixing set up and tear down in tests [hudi]

2023-11-20 Thread via GitHub
codope commented on PR #10146: URL: https://github.com/apache/hudi/pull/10146#issuecomment-1820346125 The failed test `testMultiWriterWithAsyncTableServicesWithConflict` is a flaky one which is unrelated to the fix here. Since this PR fixes a more critical problem with test setup, I am

Re: [PR] [HUDI-7083] Adding support for multiple tables with Prometheus Reporter [hudi]

2023-11-20 Thread via GitHub
xushiyan commented on code in PR #10068: URL: https://github.com/apache/hudi/pull/10068#discussion_r1400108246 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metrics/prometheus/PrometheusReporter.java: ## @@ -18,42 +18,75 @@ package

[jira] [Updated] (HUDI-7128) DeleteProcedures support delete in batch mode

2023-11-20 Thread xy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xy updated HUDI-7128: - Summary: DeleteProcedures support delete in batch mode (was: DeleteMarkerProcedures support delete in batch mode) >

[jira] [Updated] (HUDI-7128) DeleteMarkerProcedures support delete in batch mode

2023-11-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7128: - Labels: pull-request-available (was: ) > DeleteMarkerProcedures support delete in batch mode >

[PR] [HUDI-7128] DeleteMarkerProcedures support batch mode [hudi]

2023-11-20 Thread via GitHub
xuzifu666 opened a new pull request, #10148: URL: https://github.com/apache/hudi/pull/10148 ### Change Logs DeleteMarkerProcedures support batch mode ### Impact none ### Risk level (write none, low medium or high below) low ### Documentation Update

[jira] [Updated] (HUDI-7129) Fail to upgrade from table version 3 to table version 4 using UpgradeOrDowngradeProcedure

2023-11-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7129: - Labels: pull-request-available (was: ) > Fail to upgrade from table version 3 to table version 4

[PR] [HUDI-7129] Fix bug when upgrade from table version three using UpgradeOrDowngradeProcedure [hudi]

2023-11-20 Thread via GitHub
beyond1920 opened a new pull request, #10147: URL: https://github.com/apache/hudi/pull/10147 ### Change Logs When upgrading from table version 3 to table version 4 using UpgradeOrDowngradeProcedure, the following exception is thrown. >

Re: [PR] [HUDI-7083] Adding support for multiple tables with Prometheus Reporter [hudi]

2023-11-20 Thread via GitHub
codope commented on code in PR #10068: URL: https://github.com/apache/hudi/pull/10068#discussion_r1400096785 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metrics/prometheus/PrometheusReporter.java: ## @@ -18,42 +18,75 @@ package

[jira] [Created] (HUDI-7129) Fail to upgrade from table version 3 to table version 4 using UpgradeOrDowngradeProcedure

2023-11-20 Thread Jing Zhang (Jira)
Jing Zhang created HUDI-7129: Summary: Fail to upgrade from table version 3 to table version 4 using UpgradeOrDowngradeProcedure Key: HUDI-7129 URL: https://issues.apache.org/jira/browse/HUDI-7129

Re: [PR] [HUDI-7120] Performance improvements in deltastreamer executor code path [hudi]

2023-11-20 Thread via GitHub
lokeshj1703 commented on code in PR #10135: URL: https://github.com/apache/hudi/pull/10135#discussion_r1400093042 ## hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java: ## @@ -474,8 +474,11 @@ public static boolean isLogFile(Path logPath) { } public static

Re: [PR] [HUDI-7120] Performance improvements in deltastreamer executor code path [hudi]

2023-11-20 Thread via GitHub
lokeshj1703 commented on code in PR #10135: URL: https://github.com/apache/hudi/pull/10135#discussion_r1400093409 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/AvroConversionUtils.scala: ## @@ -242,4 +243,57 @@ object AvroConversionUtils { val nameParts =

[jira] [Created] (HUDI-7128) DeleteMarkerProcedures support delete in batch mode

2023-11-20 Thread xy (Jira)
xy created HUDI-7128: Summary: DeleteMarkerProcedures support delete in batch mode Key: HUDI-7128 URL: https://issues.apache.org/jira/browse/HUDI-7128 Project: Apache Hudi Issue Type: Improvement

Re: [PR] [HUDI-7127] Fixing set up and tear down in tests [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10146: URL: https://github.com/apache/hudi/pull/10146#issuecomment-1820301455 ## CI report: * b16bf3dc24cbf5fa63d14454adfce5f6117e9d5f Azure:

Re: [PR] [HUDI-7004] Add support of snapshotLoadQuerySplitter(interface) in s3/gcs sources [hudi]

2023-11-20 Thread via GitHub
nsivabalan commented on PR #9943: URL: https://github.com/apache/hudi/pull/9943#issuecomment-1820294545 please do rebase. you can also wait for https://github.com/apache/hudi/pull/10146 before rebasing since that fixes CI. -- This is an automated message from the Apache Git Service. To

Re: [PR] [HUDI-7003] Add option to fallback to full table scan if files are deleted due to… [hudi]

2023-11-20 Thread via GitHub
nsivabalan commented on PR #9941: URL: https://github.com/apache/hudi/pull/9941#issuecomment-1820293486 please rebase once https://github.com/apache/hudi/pull/10146 is landed -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] [HUDI-7107] Reused MetricsReporter fails to publish metrics in Spark streaming job [hudi]

2023-11-20 Thread via GitHub
aajisaka commented on PR #10132: URL: https://github.com/apache/hudi/pull/10132#issuecomment-1820265381 Thank you @danny0405 ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [HUDI-7003] Add option to fallback to full table scan if files are deleted due to… [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #9941: URL: https://github.com/apache/hudi/pull/9941#issuecomment-1820254481 ## CI report: * bd1873d4b2244e5bff5419482590999158f1ce28 Azure:

Re: [PR] [HUDI-7003] Add option to fallback to full table scan if files are deleted due to… [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #9941: URL: https://github.com/apache/hudi/pull/9941#issuecomment-1820249052 ## CI report: * bd1873d4b2244e5bff5419482590999158f1ce28 Azure:

Re: [PR] [MINOR] Build failed using master [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #9726: URL: https://github.com/apache/hudi/pull/9726#issuecomment-1820248878 ## CI report: * 08350f3ef185fc7509ecba7f2780f6ae0edddc86 Azure:

Re: [PR] [HUDI-7102] Fix a bug for time travel queries on MOR tables [hudi]

2023-11-20 Thread via GitHub
linliu-code commented on code in PR #10102: URL: https://github.com/apache/hudi/pull/10102#discussion_r1400029309 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/BaseHoodieLogRecordReader.java: ## @@ -260,7 +260,7 @@ private void scanInternalV1(Option keySpecOpt)

[jira] [Assigned] (HUDI-7102) A file slice bug for the time travel queries for MOR tables

2023-11-20 Thread Lin Liu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu reassigned HUDI-7102: - Assignee: Danny Chen (was: Lin Liu) > A file slice bug for the time travel queries for MOR tables >

[jira] [Updated] (HUDI-7102) A file slice bug for the time travel queries for MOR tables

2023-11-20 Thread Lin Liu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu updated HUDI-7102: -- Summary: A file slice bug for the time travel queries for MOR tables (was: A bug for the time travel queries

[jira] [Updated] (HUDI-7102) A bug for the time travel queries for MOR tables

2023-11-20 Thread Lin Liu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu updated HUDI-7102: -- Description: Issue: # Based on the provided TIMESTAMP_AS_OF, a list of file slices are returned. However,

Re: [PR] [HUDI-7120] Performance improvements in deltastreamer executor code path [hudi]

2023-11-20 Thread via GitHub
codope commented on code in PR #10135: URL: https://github.com/apache/hudi/pull/10135#discussion_r1400018858 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/AvroConversionUtils.scala: ## @@ -242,4 +243,57 @@ object AvroConversionUtils { val nameParts =

Re: [PR] [HUDI-7127] Adding framework based setup and clean up of spark context. [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10145: URL: https://github.com/apache/hudi/pull/10145#issuecomment-1820220154 ## CI report: * ec1dd82d5d4d2fb2857935bba129ef7746234572 Azure:

Re: [PR] [HUDI-7105] support filesystem view configuable to avoid clean oom [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10116: URL: https://github.com/apache/hudi/pull/10116#issuecomment-1820220066 ## CI report: * ed40a83a0a42ce92dca8a613afb33ac4cbcd4ab0 Azure:

[jira] [Updated] (HUDI-7102) A bug for the time travel queries for MOR tables

2023-11-20 Thread Lin Liu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu updated HUDI-7102: -- Description: Issue: # Based on the provided TIMESTAMP_AS_OF, a list of file slices are returned. However,

[jira] [Updated] (HUDI-7126) Bug where every instant in archived timeline are loaded to memory

2023-11-20 Thread Yongkyun Daniel Lee (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongkyun Daniel Lee updated HUDI-7126: -- Component/s: archiving > Bug where every instant in archived timeline are loaded to

Re: [PR] [HUDI-7105] support filesystem view configuable to avoid clean oom [hudi]

2023-11-20 Thread via GitHub
ksmou commented on PR #10116: URL: https://github.com/apache/hudi/pull/10116#issuecomment-1820163720 > This PR may fix the similar issue: [#10002 (comment)](https://github.com/apache/hudi/pull/10002#issuecomment-1819075803) Nice~ That PR is the root cause of the clean OOM problem. But

[jira] [Updated] (HUDI-7126) Bug where every instant in archived timeline are loaded to memory

2023-11-20 Thread Yongkyun Daniel Lee (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongkyun Daniel Lee updated HUDI-7126: -- Summary: Bug where every instant in archived timeline are loaded to memory (was:

Re: [PR] [HUDI-5823][RFC-65] RFC for Partition Lifecycle Management [hudi]

2023-11-20 Thread via GitHub
stream2000 commented on code in PR #8062: URL: https://github.com/apache/hudi/pull/8062#discussion_r1399985395 ## rfc/rfc-65/rfc-65.md: ## @@ -0,0 +1,248 @@ +## Proposers + +- @stream2000 +- @hujincalrin +- @huberylee +- @YuweiXiao + +## Approvers + +## Status + +JIRA:

Re: [PR] [HUDI-7041] Optimize the mem usage of partitionToFileGroupsMap during the cleaning [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10002: URL: https://github.com/apache/hudi/pull/10002#issuecomment-1820151050 ## CI report: * 0d2b1cf6af86fed4821e9ec5b41e8477ad915b64 Azure:

Re: [PR] [HUDI-7041] Optimize the mem usage of partitionToFileGroupsMap during the cleaning [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10002: URL: https://github.com/apache/hudi/pull/10002#issuecomment-1820145804 ## CI report: * 0d2b1cf6af86fed4821e9ec5b41e8477ad915b64 Azure:

Re: [PR] [HUDI-7127] Fixing set up and tear down in tests [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10146: URL: https://github.com/apache/hudi/pull/10146#issuecomment-1820140262 ## CI report: * b16bf3dc24cbf5fa63d14454adfce5f6117e9d5f Azure:

Re: [PR] [HUDI-7106] Fix sqs deletes, deltasync service close and error table default configs. [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10117: URL: https://github.com/apache/hudi/pull/10117#issuecomment-1820140147 ## CI report: * d34454306916251f8548db0e2729afbbf178e025 Azure:

Re: [PR] [HUDI-7110] Add call procedure for show column stats information [hudi]

2023-11-20 Thread via GitHub
stream2000 commented on code in PR #10120: URL: https://github.com/apache/hudi/pull/10120#discussion_r1399973545 ## hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/procedures/ShowMetadataTableColumnStatsProcedure.scala: ## @@ -0,0 +1,109 @@ +/*

Re: [PR] [HUDI-7110] Add call procedure for show column stats information [hudi]

2023-11-20 Thread via GitHub
danny0405 commented on PR #10120: URL: https://github.com/apache/hudi/pull/10120#issuecomment-1820131169 There are some compile errors: https://github.com/apache/hudi/actions/runs/6926101159/job/18837735144?pr=10120 -- This is an automated message from the Apache Git Service. To respond

Re: [PR] [HUDI-7041] Optimize the mem usage of partitionToFileGroupsMap during the cleaning [hudi]

2023-11-20 Thread via GitHub
zhuanshenbsj1 commented on code in PR #10002: URL: https://github.com/apache/hudi/pull/10002#discussion_r1399962836 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java: ## @@ -491,10 +491,13 @@ public Pair> getDeletePaths(String

[jira] [Closed] (HUDI-7107) Reused MetricsReporter fails to publish metrics in Spark streaming job

2023-11-20 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-7107. Resolution: Fixed Fixed via master branch: eaba1146afc83e5e70ef520704a76a15a75c9aad > Reused

[jira] [Updated] (HUDI-7107) Reused MetricsReporter fails to publish metrics in Spark streaming job

2023-11-20 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7107: - Fix Version/s: 0.14.1 > Reused MetricsReporter fails to publish metrics in Spark streaming job >

(hudi) branch master updated: [HUDI-7107] Reused MetricsReporter fails to publish metrics in Spark streaming job (#10132)

2023-11-20 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new eaba1146afc [HUDI-7107] Reused MetricsReporter

Re: [PR] [HUDI-7107] Reused MetricsReporter fails to publish metrics in Spark streaming job [hudi]

2023-11-20 Thread via GitHub
danny0405 merged PR #10132: URL: https://github.com/apache/hudi/pull/10132 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] [MINOR] support log index [hudi]

2023-11-20 Thread via GitHub
danny0405 commented on PR #10143: URL: https://github.com/apache/hudi/pull/10143#issuecomment-1820120616 Can you write up a general design of the changes, so that it is easier to reach consensus on the general direction? -- This is an automated message from the Apache Git Service.

Re: [PR] [HUDI-7105] support filesystem view configuable to avoid clean oom [hudi]

2023-11-20 Thread via GitHub
danny0405 commented on PR #10116: URL: https://github.com/apache/hudi/pull/10116#issuecomment-1820119203 This PR may fix the similar issue: https://github.com/apache/hudi/pull/10002#issuecomment-1819075803 -- This is an automated message from the Apache Git Service. To respond to the

Re: [PR] [HUDI-7112] Reuse existing timeline server and performance improvements [hudi]

2023-11-20 Thread via GitHub
danny0405 commented on code in PR #10122: URL: https://github.com/apache/hudi/pull/10122#discussion_r1399959143 ## hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/TestWriteMergeOnReadWithCompact.java: ## @@ -159,6 +159,8 @@ public void

Re: [PR] [HUDI-7112] Reuse existing timeline server and performance improvements [hudi]

2023-11-20 Thread via GitHub
danny0405 commented on code in PR #10122: URL: https://github.com/apache/hudi/pull/10122#discussion_r1399958120 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/embedded/EmbeddedTimelineService.java: ## @@ -146,19 +214,65 @@ public FileSystemViewManager

Re: [PR] [HUDI-7127] Fixing set up and tear down in tests [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10146: URL: https://github.com/apache/hudi/pull/10146#issuecomment-1820111266 ## CI report: * b16bf3dc24cbf5fa63d14454adfce5f6117e9d5f UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-7105] support filesystem view configuable to avoid clean oom [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10116: URL: https://github.com/apache/hudi/pull/10116#issuecomment-182015 ## CI report: * c5b222a65a99b5d9717cf29eb45a6b56e60a8ca1 Azure:

Re: [PR] [HUDI-7127] Adding framework based setup and clean up of spark context. [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10145: URL: https://github.com/apache/hudi/pull/10145#issuecomment-1820111228 ## CI report: * ec1dd82d5d4d2fb2857935bba129ef7746234572 Azure:

Re: [PR] [MINOR] Build failed using master [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #9726: URL: https://github.com/apache/hudi/pull/9726#issuecomment-1820110748 ## CI report: * 9d7049828d1817720d3cb944de32e109304dae34 Azure:

[jira] [Updated] (HUDI-7127) Fix closure of Spark context in tests

2023-11-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7127: - Labels: pull-request-available (was: ) > Fix closure of Spark context in tests >

Re: [PR] [HUDI-7127] Adding framework based setup and clean up of spark context. [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10145: URL: https://github.com/apache/hudi/pull/10145#issuecomment-1820105413 ## CI report: * ec1dd82d5d4d2fb2857935bba129ef7746234572 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

[jira] [Updated] (HUDI-7127) Fix closure of Spark context in tests

2023-11-20 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-7127: -- Epic Link: HUDI-4302 Labels: (was: pull-request-available) > Fix closure of

[jira] [Updated] (HUDI-7127) Fix closure of Spark context in tests

2023-11-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7127: - Labels: pull-request-available (was: ) > Fix closure of Spark context in tests >

Re: [PR] [HUDI-7105] support filesystem view configuable to avoid clean oom [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10116: URL: https://github.com/apache/hudi/pull/10116#issuecomment-1820105273 ## CI report: * c5b222a65a99b5d9717cf29eb45a6b56e60a8ca1 Azure:

[PR] [HUDI-7127] Fixing set up and tear down in tests [hudi]

2023-11-20 Thread via GitHub
nsivabalan opened a new pull request, #10146: URL: https://github.com/apache/hudi/pull/10146 ### Change Logs Fixing set up and tear down in tests ### Impact Unblock CI instability ### Risk level (write none, low medium or high below) low ###

Re: [PR] [MINOR] Build failed using master [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #9726: URL: https://github.com/apache/hudi/pull/9726#issuecomment-1820104836 ## CI report: * 9d7049828d1817720d3cb944de32e109304dae34 Azure:

[jira] [Created] (HUDI-7127) Fix closure of Spark context in tests

2023-11-20 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-7127: - Summary: Fix closure of Spark context in tests Key: HUDI-7127 URL: https://issues.apache.org/jira/browse/HUDI-7127 Project: Apache Hudi Issue

[jira] [Closed] (HUDI-7118) Conf 'spark.sql.parquet.enableVectorizedReader' does not work properly

2023-11-20 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-7118. Resolution: Fixed Fixed via master branch: 578e756cee6117cca223325bac8f350c0644547b > Conf

[jira] [Updated] (HUDI-7118) Conf 'spark.sql.parquet.enableVectorizedReader' does not work properly

2023-11-20 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7118: - Fix Version/s: 0.14.1 > Conf 'spark.sql.parquet.enableVectorizedReader' does not work properly >

(hudi) branch master updated (d24220a4804 -> 578e756cee6)

2023-11-20 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from d24220a4804 [HUDI-7111] Fix performance regression of tag when written into simple bucket index table (#10130)

Re: [PR] [HUDI-7118] set conf 'spark.sql.parquet.enableVectorizedReader' to true automatically only if the value is not explicitly set [hudi]

2023-11-20 Thread via GitHub
danny0405 merged PR #10134: URL: https://github.com/apache/hudi/pull/10134 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] [HUDI-7112] Reuse existing timeline server and performance improvements [hudi]

2023-11-20 Thread via GitHub
danny0405 commented on code in PR #10122: URL: https://github.com/apache/hudi/pull/10122#discussion_r1399943850 ## hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/TestWriteMergeOnReadWithCompact.java: ## @@ -159,6 +159,8 @@ public void

[PR] [MINOR] Adding framework based setup and clean up of spark context. [hudi]

2023-11-20 Thread via GitHub
lokesh-lingarajan-0310 opened a new pull request, #10145: URL: https://github.com/apache/hudi/pull/10145 This file is missing cleanup code for the Spark context. ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact

[jira] [Updated] (HUDI-7111) Performance regression of spark job which written into simple bucket index table

2023-11-20 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7111: - Fix Version/s: 0.14.1 > Performance regression of spark job which written into simple bucket index >

[jira] [Closed] (HUDI-7111) Performance regression of spark job which written into simple bucket index table

2023-11-20 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-7111. Resolution: Fixed Fixed via master branch: d24220a4804ee6e04346a03a4ddbf2d2711ae301 > Performance

(hudi) branch master updated: [HUDI-7111] Fix performance regression of tag when written into simple bucket index table (#10130)

2023-11-20 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new d24220a4804 [HUDI-7111] Fix performance

Re: [PR] [HUDI-7111] Fix performance regression of tag when written into simple bucket index table [hudi]

2023-11-20 Thread via GitHub
danny0405 merged PR #10130: URL: https://github.com/apache/hudi/pull/10130 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[jira] [Updated] (HUDI-7126) Timestamp filter is not being applied when loading archived timeline to memory

2023-11-20 Thread Yongkyun Daniel Lee (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongkyun Daniel Lee updated HUDI-7126: -- Status: In Progress (was: Open) > Timestamp filter is not being applied when loading

[jira] [Created] (HUDI-7126) Timestamp filter is not being applied when loading archived timeline to memory

2023-11-20 Thread Yongkyun Daniel Lee (Jira)
Yongkyun Daniel Lee created HUDI-7126: - Summary: Timestamp filter is not being applied when loading archived timeline to memory Key: HUDI-7126 URL: https://issues.apache.org/jira/browse/HUDI-7126

Re: [PR] [HUDI-7111] Fix performance regression of tag when written into simple bucket index table [hudi]

2023-11-20 Thread via GitHub
beyond1920 commented on PR #10130: URL: https://github.com/apache/hudi/pull/10130#issuecomment-1820080037 @danny0405 This regression has existed since version 0.12. Should we also address the problem in versions 0.12, 0.13, and 0.14? -- This is an automated message from the Apache Git

Re: [PR] [HUDI-7071] Throw exceptions when clustering/index job fail [hudi]

2023-11-20 Thread via GitHub
ksmou commented on code in PR #10050: URL: https://github.com/apache/hudi/pull/10050#discussion_r1399931880 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieIndexer.java: ## @@ -149,19 +149,18 @@ public static void main(String[] args) { if (cfg.help ||

Re: [PR] [HUDI-7106] Fix sqs deletes, deltasync service close and error table default configs. [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10117: URL: https://github.com/apache/hudi/pull/10117#issuecomment-1820065670 ## CI report: * a5e668a9ac87c52da2ddcade0f6d36784124a734 Azure:

Re: [PR] [HUDI-7106] Fix sqs deletes, deltasync service close and error table default configs. [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10117: URL: https://github.com/apache/hudi/pull/10117#issuecomment-1820059971 ## CI report: * a5e668a9ac87c52da2ddcade0f6d36784124a734 Azure:

Re: [PR] [HUDI-7112] Reuse existing timeline server and performance improvements [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10122: URL: https://github.com/apache/hudi/pull/10122#issuecomment-1820054386 ## CI report: * c78d9570c87afce7a27725b37a3ebb77913199b7 Azure:

Re: [PR] [HUDI-7106] Fix sqs deletes, deltasync service close and error table default configs. [hudi]

2023-11-20 Thread via GitHub
rmahindra123 commented on code in PR #10117: URL: https://github.com/apache/hudi/pull/10117#discussion_r1399899229 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/HoodieStreamer.java: ## @@ -844,8 +844,6 @@ public void ingestOnce() {

Re: [PR] [DOCS] Added video resources to Concepts and Services Sections [hudi]

2023-11-20 Thread via GitHub
ckonehouse commented on code in PR #10080: URL: https://github.com/apache/hudi/pull/10080#discussion_r1399898447 ## website/docs/transforms.md: ## @@ -64,3 +64,9 @@ Set the config as: ### Custom Transformer Implementation You can write your own custom transformer by

[jira] [Updated] (HUDI-7034) Refresh view does not work(due to cache)

2023-11-20 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7034: Fix Version/s: 0.14.1 > Refresh view does not work(due to cache) >

Re: [PR] [HUDI-7112] Reuse existing timeline server and performance improvements [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10122: URL: https://github.com/apache/hudi/pull/10122#issuecomment-1820002829 ## CI report: * faf61fb4c40584fd9dbdd4aafc85e699c3d9d8ba Azure:

Re: [PR] [HUDI-7103] Support time travel queies for COW tables [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10109: URL: https://github.com/apache/hudi/pull/10109#issuecomment-1820002729 ## CI report: * 6db6a7a27667641c51208dc20a2ac50d2211dc66 Azure:

Re: [PR] [HUDI-7112] Reuse existing timeline server and performance improvements [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10122: URL: https://github.com/apache/hudi/pull/10122#issuecomment-1819962855 ## CI report: * faf61fb4c40584fd9dbdd4aafc85e699c3d9d8ba Azure:

Re: [PR] [HUDI-7103] Support time travel queies for COW tables [hudi]

2023-11-20 Thread via GitHub
hudi-bot commented on PR #10109: URL: https://github.com/apache/hudi/pull/10109#issuecomment-1819948466 ## CI report: * 06b1fa3a36d83a607cd13e832d67a76b324cbb82 Azure:
