[GitHub] [hudi] danny0405 commented on pull request #8555: [HUDI-6130] Adding docs for 0.12.3

2023-04-23 Thread via GitHub
danny0405 commented on PR #8555: URL: https://github.com/apache/hudi/pull/8555#issuecomment-1519491589 Checked the links and all are good, only one question: why do we leave both 0.12.2 and 0.12.3 on the page? -- This is an automated message from the Apache Git Service. To respond to the mess

[GitHub] [hudi] ad1happy2go commented on issue #8532: [SUPPORT]org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 11 partition 1

2023-04-23 Thread via GitHub
ad1happy2go commented on issue #8532: URL: https://github.com/apache/hudi/issues/8532#issuecomment-1519489519 @gtwuser You can open the Spark UI and see the stages and jobs running. Try to follow this guide - https://hudi.apache.org/docs/tuning-guide/ and set the parameters mentioned to optim

[GitHub] [hudi] ad1happy2go commented on issue #8532: [SUPPORT]org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 11 partition 1

2023-04-23 Thread via GitHub
ad1happy2go commented on issue #8532: URL: https://github.com/apache/hudi/issues/8532#issuecomment-1519488103 @gtwuser You can open the Spark UI and see the stages and jobs running. Can you provide a small reproducible script with mock-up data so that we can reproduce and look into it further?

[jira] [Updated] (HUDI-6132) Fix multiple streaming writers w/ streaming sink

2023-04-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6132: - Labels: pull-request-available (was: ) > Fix multiple streaming writers w/ streaming sink > -

[GitHub] [hudi] nsivabalan opened a new pull request, #8558: [HUDI-6132] Fixing checkpoint management for multiple streaming writers

2023-04-23 Thread via GitHub
nsivabalan opened a new pull request, #8558: URL: https://github.com/apache/hudi/pull/8558 ### Change Logs - Fixing checkpoint management for multiple streaming writers. Fix is that, each writer updates the checkpoint in commit metadata with its own batchId info only. When checking t

[jira] [Created] (HUDI-6132) Fix multiple streaming writers w/ streaming sink

2023-04-23 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-6132: - Summary: Fix multiple streaming writers w/ streaming sink Key: HUDI-6132 URL: https://issues.apache.org/jira/browse/HUDI-6132 Project: Apache Hudi

[GitHub] [hudi] yihua opened a new pull request, #8557: [HUDI-5895] Remove bootstrap key generator configs

2023-04-23 Thread via GitHub
yihua opened a new pull request, #8557: URL: https://github.com/apache/hudi/pull/8557 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performance i

[jira] [Updated] (HUDI-5895) Simplify write configs for bootstrap

2023-04-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-5895: - Labels: pull-request-available (was: ) > Simplify write configs for bootstrap > -

[GitHub] [hudi] littleeleventhwolf commented on pull request #8536: [MINOR](hudi-metaserver) fix typos in README.md

2023-04-23 Thread via GitHub
littleeleventhwolf commented on PR #8536: URL: https://github.com/apache/hudi/pull/8536#issuecomment-1519459027 @hudi-bot run azure

[GitHub] [hudi] hudi-bot commented on pull request #8490: [HUDI-5968] Fix global index duplicate and handle custom payload when update partition

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8490: URL: https://github.com/apache/hudi/pull/8490#issuecomment-1519457915 ## CI report: * dbe76d556e272ce45d700bfd33253583fd1eda38 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1658

[GitHub] [hudi] hudi-bot commented on pull request #8480: [HUDI-6090] Optimise payload size for list of FileGroupDTO

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8480: URL: https://github.com/apache/hudi/pull/8480#issuecomment-1519457851 ## CI report: * aadc218154cc2a80344fee1e18c02dd3f19ed4f0 UNKNOWN * 5ea64f04d9d1e008cef03e3c92e22f02ab961ae8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #8478: [HUDI-6086] Improve HiveSchemaUtil#generateCreateDDL With ST

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8478: URL: https://github.com/apache/hudi/pull/8478#issuecomment-1519457813 ## CI report: * d169a9929d6c8dcf1ce3b350687592ad6e12314f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1659

[GitHub] [hudi] hudi-bot commented on pull request #8488: [HUDI-5957] Fix table not exist when using 'db.table' in createHoodieClientFromPath

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8488: URL: https://github.com/apache/hudi/pull/8488#issuecomment-1519450737 ## CI report: * d242b57634551d7d4dcdd37a810613c178760232 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1654

[GitHub] [hudi] hudi-bot commented on pull request #8490: [HUDI-5968] Fix global index duplicate and handle custom payload when update partition

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8490: URL: https://github.com/apache/hudi/pull/8490#issuecomment-1519450797 ## CI report: * dbe76d556e272ce45d700bfd33253583fd1eda38 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1658

[GitHub] [hudi] wkhappy1 commented on issue #8485: [SUPPORT] hudi connector partition filter only support equal

2023-04-23 Thread via GitHub
wkhappy1 commented on issue #8485: URL: https://github.com/apache/hudi/issues/8485#issuecomment-1519450211 > @wkhappy1 your analysis seems to be correct. i'm going to see if using HivePartitionManager in Presto makes that difference (break for other types). In the meantime, if you have a fi

[GitHub] [hudi] rohan-uptycs commented on a diff in pull request #8503: [HUDI-6047] Clustering operation on consistent hashing index resulting in duplicate data

2023-04-23 Thread via GitHub
rohan-uptycs commented on code in PR #8503: URL: https://github.com/apache/hudi/pull/8503#discussion_r1174790151 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/index/bucket/HoodieSparkConsistentBucketIndex.java: ## @@ -275,4 +278,46 @@ public Option getRecordLoc

[GitHub] [hudi] lokeshj1703 commented on a diff in pull request #8480: [HUDI-6090] Optimise payload size for list of FileGroupDTO

2023-04-23 Thread via GitHub
lokeshj1703 commented on code in PR #8480: URL: https://github.com/apache/hudi/pull/8480#discussion_r1174827172 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/dto/FileGroupDTO.java: ## @@ -46,17 +48,28 @@ public class FileGroupDTO { TimelineDTO timeline;

[GitHub] [hudi] lokeshj1703 commented on a diff in pull request #8480: [HUDI-6090] Optimise payload size for list of FileGroupDTO

2023-04-23 Thread via GitHub
lokeshj1703 commented on code in PR #8480: URL: https://github.com/apache/hudi/pull/8480#discussion_r1174827034 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/dto/DTOUtils.java: ## @@ -0,0 +1,63 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[GitHub] [hudi] hudi-bot commented on pull request #8488: [HUDI-5957] Fix table not exist when using 'db.table' in createHoodieClientFromPath

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8488: URL: https://github.com/apache/hudi/pull/8488#issuecomment-1519443253 ## CI report: * d242b57634551d7d4dcdd37a810613c178760232 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1654

[GitHub] [hudi] hudi-bot commented on pull request #8355: [HUDI-6016] HoodieCLIUtils supports creating HoodieClient with non-default database

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8355: URL: https://github.com/apache/hudi/pull/8355#issuecomment-1519442821 ## CI report: * 084ba2be0afd9343ae97d25e5d6edd2029b909fc Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1659

[GitHub] [hudi] codope commented on issue #8485: [SUPPORT] hudi connector partition filter only support equal

2023-04-23 Thread via GitHub
codope commented on issue #8485: URL: https://github.com/apache/hudi/issues/8485#issuecomment-1519437480 @wkhappy1 your analysis seems to be correct. i'm going to see if using HivePartitionManager in Presto makes that difference (break for other types). In the meantime, if you have a fix, c

[GitHub] [hudi] danny0405 commented on a diff in pull request #8550: [HUDI-6127]Flink Hudi Write support commit on an empty batch

2023-04-23 Thread via GitHub
danny0405 commented on code in PR #8550: URL: https://github.com/apache/hudi/pull/8550#discussion_r1174794009 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteOperatorCoordinator.java: ## @@ -522,7 +522,10 @@ private boolean commitInstant(String

[GitHub] [hudi] hudi-bot commented on pull request #8556: [HUDI-6131] Refactor getWritePathsOfInstants in Flink WriteProfiles

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8556: URL: https://github.com/apache/hudi/pull/8556#issuecomment-1519405443 ## CI report: * 93f3c14ff5951673d5a9805781aa2d50bd3e679c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1659

[GitHub] [hudi] hudi-bot commented on pull request #8488: [HUDI-5957] Fix table not exist when using 'db.table' in createHoodieClientFromPath

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8488: URL: https://github.com/apache/hudi/pull/8488#issuecomment-1519405161 ## CI report: * d242b57634551d7d4dcdd37a810613c178760232 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1654

[GitHub] [hudi] lokeshj1703 commented on a diff in pull request #8480: [HUDI-6090] Optimise payload size for list of FileGroupDTO

2023-04-23 Thread via GitHub
lokeshj1703 commented on code in PR #8480: URL: https://github.com/apache/hudi/pull/8480#discussion_r1174787456 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/dto/FileGroupDTO.java: ## @@ -46,17 +49,50 @@ public class FileGroupDTO { TimelineDTO timeline;

[GitHub] [hudi] XuQianJin-Stars commented on pull request #8488: [HUDI-5957] Fix table not exist when using 'db.table' in createHoodieClientFromPath

2023-04-23 Thread via GitHub
XuQianJin-Stars commented on PR #8488: URL: https://github.com/apache/hudi/pull/8488#issuecomment-1519401807 @hudi-bot run azure

[GitHub] [hudi] XuQianJin-Stars commented on pull request #8488: [HUDI-5957] Fix table not exist when using 'db.table' in createHoodieClientFromPath

2023-04-23 Thread via GitHub
XuQianJin-Stars commented on PR #8488: URL: https://github.com/apache/hudi/pull/8488#issuecomment-1519401522 > @XuQianJin-Stars Are we ready to land it? The CI has failed.

[GitHub] [hudi] hudi-bot commented on pull request #8556: [HUDI-6131] Refactor getWritePathsOfInstants in Flink WriteProfiles

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8556: URL: https://github.com/apache/hudi/pull/8556#issuecomment-1519399103 ## CI report: * 93f3c14ff5951673d5a9805781aa2d50bd3e679c UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] Zouxxyy commented on pull request #8488: [HUDI-5957] Fix table not exist when using 'db.table' in createHoodieClientFromPath

2023-04-23 Thread via GitHub
Zouxxyy commented on PR #8488: URL: https://github.com/apache/hudi/pull/8488#issuecomment-1519393901 @XuQianJin-Stars Are we ready to land it?

[GitHub] [hudi] stream2000 commented on a diff in pull request #8550: [HUDI-6127]Flink Hudi Write support commit on an empty batch

2023-04-23 Thread via GitHub
stream2000 commented on code in PR #8550: URL: https://github.com/apache/hudi/pull/8550#discussion_r1174773967 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteOperatorCoordinator.java: ## @@ -522,7 +522,10 @@ private boolean commitInstant(String

[GitHub] [hudi] SteNicholas commented on a diff in pull request #8503: [HUDI-6047] Clustering operation on consistent hashing index resulting in duplicate data

2023-04-23 Thread via GitHub
SteNicholas commented on code in PR #8503: URL: https://github.com/apache/hudi/pull/8503#discussion_r1174772541 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/index/bucket/HoodieSparkConsistentBucketIndex.java: ## @@ -275,4 +278,46 @@ public Option getRecordLoca

[GitHub] [hudi] Zouxxyy commented on pull request #8556: [HUDI-6131] Refactor getWritePathsOfInstants in Flink WriteProfiles

2023-04-23 Thread via GitHub
Zouxxyy commented on PR #8556: URL: https://github.com/apache/hudi/pull/8556#issuecomment-1519377175 In fact, I have a question: why do we have to convert to a full scan when a file is found to have been cleaned during incremental reading?

[jira] [Updated] (HUDI-6131) Refactor getWritePathsOfInstants in Flink WriteProfiles

2023-04-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6131: - Labels: pull-request-available (was: ) > Refactor getWritePathsOfInstants in Flink WriteProfiles

[GitHub] [hudi] Zouxxyy opened a new pull request, #8556: [HUDI-6131] Refactor getWritePathsOfInstants in Flink WriteProfiles

2023-04-23 Thread via GitHub
Zouxxyy opened a new pull request, #8556: URL: https://github.com/apache/hudi/pull/8556 ### Change Logs Refactor `getWritePathsOfInstants` and `getRawWritePathsOfInstants` in Flink WriteProfiles, combine them into one function and return early when file doesn't exist to reduce the co

[jira] [Created] (HUDI-6131) Refactor getWritePathsOfInstants in Flink WriteProfiles

2023-04-23 Thread zouxxyy (Jira)
zouxxyy created HUDI-6131: - Summary: Refactor getWritePathsOfInstants in Flink WriteProfiles Key: HUDI-6131 URL: https://issues.apache.org/jira/browse/HUDI-6131 Project: Apache Hudi Issue Type: Impro

[jira] [Updated] (HUDI-6130) Update website docs for 0.12.3

2023-04-23 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6130: - Labels: pull-request-available (was: ) > Update website docs for 0.12.3 > ---

[GitHub] [hudi] nsivabalan opened a new pull request, #8555: [HUDI-6130] Adding docs for 0.12.3

2023-04-23 Thread via GitHub
nsivabalan opened a new pull request, #8555: URL: https://github.com/apache/hudi/pull/8555 ### Change Logs Adding docs for 0.12.3 ### Impact Adding docs for 0.12.3 ### Risk level (write none, low medium or high below) none ### Documentation Update

[jira] [Created] (HUDI-6130) Update website docs for 0.12.3

2023-04-23 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-6130: - Summary: Update website docs for 0.12.3 Key: HUDI-6130 URL: https://issues.apache.org/jira/browse/HUDI-6130 Project: Apache Hudi Issue Type: Improv

[GitHub] [hudi] xicm commented on pull request #7355: [HUDI-5308] Hive query returns null when the where clause has a partition field

2023-04-23 Thread via GitHub
xicm commented on PR #7355: URL: https://github.com/apache/hudi/pull/7355#issuecomment-1519368205 > Is the problematic table partitioning in Hive style: `par1=val1`? I was astonished that the partition path queries for Hive return nulls. Hive style and non-Hive style both return nul

[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8505: URL: https://github.com/apache/hudi/pull/8505#issuecomment-1519367667 ## CI report: * 55007361a8c01779a883cee54ecf45ce94e25dce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1647

[GitHub] [hudi] xccui opened a new issue, #8554: [SUPPORT] Some resources should be reset after failure recovery of Flink

2023-04-23 Thread via GitHub
xccui opened a new issue, #8554: URL: https://github.com/apache/hudi/issues/8554 We hit some S3 http connection pool issues when running a Flink writer job and it caused the connection pool on `StreamWriteOperatorCoordinator` to close. However, after failure recovery, the connection pool wo

[GitHub] [hudi] hudi-bot commented on pull request #8550: [HUDI-6127]Flink Hudi Write support commit on an empty batch

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8550: URL: https://github.com/apache/hudi/pull/8550#issuecomment-1519363323 ## CI report: * 563e10e0492a8194d789772de6bb9ced9f8c0721 UNKNOWN * 25a2ebf3646b2abf99bfba54d947066d3fc16c6b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8505: URL: https://github.com/apache/hudi/pull/8505#issuecomment-1519363195 ## CI report: * 55007361a8c01779a883cee54ecf45ce94e25dce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1647

[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-23 Thread via GitHub
zhuanshenbsj1 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1174758600 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java: ## @@ -292,7 +292,9 @@ protected void completeCompaction(HoodieCommit

[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8505: URL: https://github.com/apache/hudi/pull/8505#issuecomment-1519357440 ## CI report: * 55007361a8c01779a883cee54ecf45ce94e25dce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1647

[GitHub] [hudi] danny0405 commented on pull request #7355: [HUDI-5308] Hive query returns null when the where clause has a partition field

2023-04-23 Thread via GitHub
danny0405 commented on PR #7355: URL: https://github.com/apache/hudi/pull/7355#issuecomment-1519347743 Is the table partitioning in Hive style: `par1=val1`? I was astonished that the partition path queries for Hive return nulls.

[GitHub] [hudi] danny0405 commented on pull request #8503: [HUDI-6047] Clustering operation on consistent hashing index resulting in duplicate data

2023-04-23 Thread via GitHub
danny0405 commented on PR #8503: URL: https://github.com/apache/hudi/pull/8503#issuecomment-1519343709 cc @SteNicholas for reviewing.

[GitHub] [hudi] danny0405 commented on a diff in pull request #7627: [HUDI-5517] HoodieTimeline support filter instants by state transition time

2023-04-23 Thread via GitHub
danny0405 commented on code in PR #7627: URL: https://github.com/apache/hudi/pull/7627#discussion_r1174745393 ## hudi-common/src/test/java/org/apache/hudi/common/table/timeline/TestHoodieInstant.java: ## @@ -0,0 +1,79 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

[GitHub] [hudi] hudi-bot commented on pull request #8553: [MINOR] Updating DOAP file for 0.12.3

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8553: URL: https://github.com/apache/hudi/pull/8553#issuecomment-1519328792 ## CI report: * 7d8294c2c877a6b12c47dbe6cd46ed9946ef6281 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1659

[GitHub] [hudi] hudi-bot commented on pull request #8478: [HUDI-6086] Improve HiveSchemaUtil#generateCreateDDL With ST

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8478: URL: https://github.com/apache/hudi/pull/8478#issuecomment-1519328613 ## CI report: * 893e94ecf28bff27889da5a95117c9cb894a4e24 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1644

[GitHub] [hudi] hudi-bot commented on pull request #8355: [HUDI-6016] HoodieCLIUtils supports creating HoodieClient with non-default database

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8355: URL: https://github.com/apache/hudi/pull/8355#issuecomment-1519328391 ## CI report: * 86db1f4f7be8f8b8dcd56e13bd945977fc804591 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1654

[GitHub] [hudi] danny0405 commented on a diff in pull request #7627: [HUDI-5517] HoodieTimeline support filter instants by state transition time

2023-04-23 Thread via GitHub
danny0405 commented on code in PR #7627: URL: https://github.com/apache/hudi/pull/7627#discussion_r1174743866 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/MergeOnReadIncrementalRelation.scala: ## @@ -185,7 +185,11 @@ trait HoodieIncrementalRelationTr

[GitHub] [hudi] SteNicholas commented on a diff in pull request #8550: [HUDI-6127]Flink Hudi Write support commit on an empty batch

2023-04-23 Thread via GitHub
SteNicholas commented on code in PR #8550: URL: https://github.com/apache/hudi/pull/8550#discussion_r1174733402 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteOperatorCoordinator.java: ## @@ -522,7 +522,10 @@ private boolean commitInstant(Strin

[GitHub] [hudi] danny0405 commented on a diff in pull request #7627: [HUDI-5517] HoodieTimeline support filter instants by state transition time

2023-04-23 Thread via GitHub
danny0405 commented on code in PR #7627: URL: https://github.com/apache/hudi/pull/7627#discussion_r1174743356 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstant.java: ## @@ -83,7 +85,7 @@ public static String getTimelineFileExtension(String fileNam

[GitHub] [hudi] hudi-bot commented on pull request #8553: [MINOR] Updating DOAP file for 0.12.3

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8553: URL: https://github.com/apache/hudi/pull/8553#issuecomment-1519324493 ## CI report: * 7d8294c2c877a6b12c47dbe6cd46ed9946ef6281 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #8478: [HUDI-6086] Improve HiveSchemaUtil#generateCreateDDL With ST

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8478: URL: https://github.com/apache/hudi/pull/8478#issuecomment-1519324282 ## CI report: * 893e94ecf28bff27889da5a95117c9cb894a4e24 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1644

[GitHub] [hudi] danny0405 commented on a diff in pull request #7627: [HUDI-5517] HoodieTimeline support filter instants by state transition time

2023-04-23 Thread via GitHub
danny0405 commented on code in PR #7627: URL: https://github.com/apache/hudi/pull/7627#discussion_r1174742688 ## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieArchivedTimeline.java: ## @@ -179,6 +179,8 @@ public HoodieArchivedTimeline reload() { privat

[GitHub] [hudi] hudi-bot commented on pull request #8355: [HUDI-6016] HoodieCLIUtils supports creating HoodieClient with non-default database

2023-04-23 Thread via GitHub
hudi-bot commented on PR #8355: URL: https://github.com/apache/hudi/pull/8355#issuecomment-1519324054 ## CI report: * 86db1f4f7be8f8b8dcd56e13bd945977fc804591 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1654

[GitHub] [hudi] danny0405 commented on pull request #8536: [MINOR](hudi-metaserver) fix typos in README.md

2023-04-23 Thread via GitHub
danny0405 commented on PR #8536: URL: https://github.com/apache/hudi/pull/8536#issuecomment-1519322360 Hi @littleeleventhwolf, can you rebase with the latest master code and force-push to fix the test failures?

[GitHub] [hudi] danny0405 commented on pull request #8361: [HUDI-6018] set owner of table created by Flink to Kerberos ShortUserName

2023-04-23 Thread via GitHub
danny0405 commented on PR #8361: URL: https://github.com/apache/hudi/pull/8361#issuecomment-1519321051 Will cc @xiarixiaoyao for a final review.

[GitHub] [hudi] xuzifu666 closed issue #8541: [SUPPORT] write to a mor table 12times, but no compaction instance with default configuration

2023-04-23 Thread via GitHub
xuzifu666 closed issue #8541: [SUPPORT] write to a mor table 12times, but no compaction instance with default configuration URL: https://github.com/apache/hudi/issues/8541

[GitHub] [hudi] xuzifu666 commented on issue #8541: [SUPPORT] write to a mor table 12times, but no compaction instance with default configuration

2023-04-23 Thread via GitHub
xuzifu666 commented on issue #8541: URL: https://github.com/apache/hudi/issues/8541#issuecomment-1519310644 > > at is exact the behavior you want for th > > OK, no problem, we will set inline compaction from false to true, thanks @danny0405

[GitHub] [hudi] xuzifu666 commented on issue #8541: [SUPPORT] write to a mor table 12times, but no compaction instance with default configuration

2023-04-23 Thread via GitHub
xuzifu666 commented on issue #8541: URL: https://github.com/apache/hudi/issues/8541#issuecomment-1519310465 > at is exact the behavior you want for th OK, no problem, we will set inline compaction from false to true, thanks

[GitHub] [hudi] huangxiaopingRD commented on pull request #8355: [HUDI-6016] HoodieCLIUtils supports creating HoodieClient with non-default database

2023-04-23 Thread via GitHub
huangxiaopingRD commented on PR #8355: URL: https://github.com/apache/hudi/pull/8355#issuecomment-1519304518 > Can you rebase with latest master and force push again~ Done. Please approve the workflows. BTW, why can't ordinary contributors trigger the workflows? I remember it was poss

[GitHub] [hudi] danny0405 commented on a diff in pull request #8546: [MINOR] Add log in flink compact/cluster commit sink for troubleshoot…

2023-04-23 Thread via GitHub
danny0405 commented on code in PR #8546: URL: https://github.com/apache/hudi/pull/8546#discussion_r1174732825 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/clustering/ClusteringCommitSink.java: ## @@ -120,6 +120,11 @@ private void commitIfNecessary(Strin

[jira] [Created] (HUDI-6129) Rate limit for streaming source

2023-04-23 Thread Danny Chen (Jira)
Danny Chen created HUDI-6129: Summary: Rate limit for streaming source Key: HUDI-6129 URL: https://issues.apache.org/jira/browse/HUDI-6129 Project: Apache Hudi Issue Type: New Feature C

[GitHub] [hudi] danny0405 commented on issue #8544: [SUPPORT] Support rate limit when reading Hudi table

2023-04-23 Thread via GitHub
danny0405 commented on issue #8544: URL: https://github.com/apache/hudi/issues/8544#issuecomment-1519301587 Thanks, this is a useful feature. I have created a JIRA issue: https://issues.apache.org/jira/browse/HUDI-6129

[GitHub] [hudi] danny0405 commented on pull request #8355: [HUDI-6016] HoodieCLIUtils supports creating HoodieClient with non-default database

2023-04-23 Thread via GitHub
danny0405 commented on PR #8355: URL: https://github.com/apache/hudi/pull/8355#issuecomment-1519298606 Can you rebase with latest master and force push again~

[GitHub] [hudi] nsivabalan opened a new pull request, #8553: [MINOR] Updating DOAP file for 0.12.3

2023-04-23 Thread via GitHub
nsivabalan opened a new pull request, #8553: URL: https://github.com/apache/hudi/pull/8553 ### Change Logs Updating DOAP file for 0.12.3 ### Impact Updating DOAP file for 0.12.3 ### Risk level (write none, low medium or high below) none ### Documentat

[jira] [Updated] (HUDI-5728) HoodieTimelineArchiver archives the latest instant before inflight replacecommit

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5728: -- Fix Version/s: 0.13.1 (was: 0.12.3) > HoodieTimelineArchiver arch

[jira] [Updated] (HUDI-5979) Replace individual hudi modules by hudi-trino-bundle in Trino Hudi connector

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5979: -- Fix Version/s: (was: 0.12.3) > Replace individual hudi modules by hudi-trino-bundle

[jira] [Updated] (HUDI-5329) spark reads hudi table error when flink creates the table without preCombine fields

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5329: -- Fix Version/s: (was: 0.12.3) > spark reads hudi table error when flink creates the t

[jira] [Updated] (HUDI-5688) schema field of EmptyRelation subtype of BaseRelation should not be null

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5688: -- Fix Version/s: (was: 0.12.3) > schema field of EmptyRelation subtype of BaseRelation

[jira] [Updated] (HUDI-5292) Exclude the test resources from every module packaging

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5292: -- Fix Version/s: (was: 0.12.3) > Exclude the test resources from every module packagin

[jira] [Updated] (HUDI-5507) SparkSQL can not read the latest change data without execute "refresh table xxx"

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5507: -- Fix Version/s: (was: 0.12.3) > SparkSQL can not read the latest change data without

[jira] [Updated] (HUDI-4557) Support validation of column stats of avro log files in tests

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4557: -- Fix Version/s: (was: 0.12.3) > Support validation of column stats of avro log files

[jira] [Updated] (HUDI-5609) Hudi table not queryable by SQL on Databricks Spark

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5609: -- Fix Version/s: (was: 0.12.3) > Hudi table not queryable by SQL on Databricks Spark >

[jira] [Updated] (HUDI-3114) Kafka Connect can not connect Hive by jdbc

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3114: -- Fix Version/s: (was: 0.12.3) > Kafka Connect can not connect Hive by jdbc >

[jira] [Updated] (HUDI-2782) Fix marker based strategy for structured streaming

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2782: -- Fix Version/s: (was: 0.12.3) > Fix marker based strategy for structured streaming >

[jira] [Updated] (HUDI-3646) The Hudi update syntax should not modify the nullability attribute of a column

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3646: -- Fix Version/s: (was: 0.12.3) > The Hudi update syntax should not modify the nullabil

[jira] [Updated] (HUDI-5095) Flink: Stores a special watermark(flag) to identify the current progress of writing data

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5095: -- Fix Version/s: (was: 0.12.3) > Flink: Stores a special watermark(flag) to identify t

[jira] [Updated] (HUDI-5721) Add Github actions on more validations

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5721: -- Fix Version/s: (was: 0.12.3) > Add Github actions on more validations >

[jira] [Updated] (HUDI-5444) FileNotFound issue w/ metadata enabled

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5444: -- Fix Version/s: (was: 0.12.3) > FileNotFound issue w/ metadata enabled >

[jira] [Updated] (HUDI-4090) Fix flaky IT tests ITTestHoodieDataSource.testStreamReadAppendData

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4090: -- Fix Version/s: (was: 0.12.3) > Fix flaky IT tests ITTestHoodieDataSource.testStreamR

[jira] [Updated] (HUDI-5835) spark cannot read mor table after execute update statement

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5835: -- Fix Version/s: (was: 0.12.3) > spark cannot read mor table after execute update stat

[jira] [Updated] (HUDI-3517) Unicode in partition path causes it to be resolved wrongly

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3517: -- Fix Version/s: (was: 0.12.3) > Unicode in partition path causes it to be resolved wr

[jira] [Updated] (HUDI-3674) Remove unnecessary HBase-related dependencies from bundles if there is any

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3674: -- Fix Version/s: (was: 0.12.3) > Remove unnecessary HBase-related dependencies from bu

[jira] [Updated] (HUDI-1779) Fail to bootstrap/upsert a table which contains timestamp column

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1779: -- Fix Version/s: (was: 0.12.3) > Fail to bootstrap/upsert a table which contains times

[jira] [Updated] (HUDI-2458) Relax compaction in metadata being fenced based on inflight requests in data table

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2458: -- Fix Version/s: (was: 0.12.3) > Relax compaction in metadata being fenced based on in

[jira] [Updated] (HUDI-1369) Bootstrap datasource jobs from hanging via spark-submit

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1369: -- Fix Version/s: (was: 0.12.3) > Bootstrap datasource jobs from hanging via spark-subm

[jira] [Updated] (HUDI-3879) Suppress exceptions that are not fatal in HoodieMetadataTableValidator

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3879: -- Fix Version/s: (was: 0.12.3) > Suppress exceptions that are not fatal in HoodieMetad

[jira] [Updated] (HUDI-4585) Optimize query performance on Presto Hudi connector

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4585: -- Fix Version/s: 0.13.1 (was: 0.12.3) > Optimize query performance

[jira] [Updated] (HUDI-992) For hive-style partitioned source data, partition columns synced with Hive will always have String type

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-992: - Fix Version/s: (was: 0.12.3) > For hive-style partitioned source data, partition column

[jira] [Updated] (HUDI-5498) Update docs for reading Hudi tables on Databricks runtime

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5498: -- Fix Version/s: 0.13.1 (was: 0.12.3) > Update docs for reading Hud

[jira] [Updated] (HUDI-5824) COMBINE_BEFORE_UPSERT=false option does not work for upsert

2023-04-23 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-5824: -- Fix Version/s: (was: 0.12.3) > COMBINE_BEFORE_UPSERT=false option does not work for
