[GitHub] [hudi] hudi-bot commented on pull request #6625: [HUDI-4799] improve analyzer exception tip when can not resolve expre…

2022-09-07 Thread GitBox
hudi-bot commented on PR #6625: URL: https://github.com/apache/hudi/pull/6625#issuecomment-1240256644 ## CI report: * a6d1f537e3a4fee7b9fb913de0ab531fc8d4be83 Azure:

[GitHub] [hudi] alexeykudinkin commented on pull request #6525: [HUDI-4237] should not sync partition parameters when create non-partition table in spark

2022-09-07 Thread GitBox
alexeykudinkin commented on PR #6525: URL: https://github.com/apache/hudi/pull/6525#issuecomment-1240241085 Approved already. @nsivabalan can you please help landing this one? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [hudi] alexeykudinkin commented on pull request #6046: [HUDI-4363] Support Clustering row writer to improve performance

2022-09-07 Thread GitBox
alexeykudinkin commented on PR #6046: URL: https://github.com/apache/hudi/pull/6046#issuecomment-1240240348 @boneanxs will do

[GitHub] [hudi] hudi-bot commented on pull request #6502: HUDI-4722 Added locking metrics for Hudi

2022-09-07 Thread GitBox
hudi-bot commented on PR #6502: URL: https://github.com/apache/hudi/pull/6502#issuecomment-1240221361 ## CI report: * fbedf9a29c4c574ad4d69406416dbb057c080345 UNKNOWN * 8b1585464429a60d9eff4cfa2cb9f937b1ac6f0d Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6631: [HUDI-4810] Fixing Hudi bundles requiring log4j2 on the classpath

2022-09-07 Thread GitBox
hudi-bot commented on PR #6631: URL: https://github.com/apache/hudi/pull/6631#issuecomment-1240221583 ## CI report: * e8e8c4d8047b5985764f7534bd84e82763c3ad28 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5478: [HUDI-3998] Fix getCommitsSinceLastCleaning failed when async cleaning

2022-09-07 Thread GitBox
hudi-bot commented on PR #5478: URL: https://github.com/apache/hudi/pull/5478#issuecomment-1240220669 ## CI report: * 7a9f87cb94043c2447da84ff07ff93009c891174 Azure:

[jira] [Updated] (HUDI-4810) Fix Hudi bundles requiring log4j2 on the classpath

2022-09-07 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4810: - Labels: pull-request-available (was: ) > Fix Hudi bundles requiring log4j2 on the classpath >

[GitHub] [hudi] hudi-bot commented on pull request #6631: [HUDI-4810] Fixing Hudi bundles requiring log4j2 on the classpath

2022-09-07 Thread GitBox
hudi-bot commented on PR #6631: URL: https://github.com/apache/hudi/pull/6631#issuecomment-1240218501 ## CI report: * e8e8c4d8047b5985764f7534bd84e82763c3ad28 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #6502: HUDI-4722 Added locking metrics for Hudi

2022-09-07 Thread GitBox
hudi-bot commented on PR #6502: URL: https://github.com/apache/hudi/pull/6502#issuecomment-1240218305 ## CI report: * fbedf9a29c4c574ad4d69406416dbb057c080345 UNKNOWN * 8b1585464429a60d9eff4cfa2cb9f937b1ac6f0d Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5478: [HUDI-3998] Fix getCommitsSinceLastCleaning failed when async cleaning

2022-09-07 Thread GitBox
hudi-bot commented on PR #5478: URL: https://github.com/apache/hudi/pull/5478#issuecomment-1240217636 ## CI report: * 7a9f87cb94043c2447da84ff07ff93009c891174 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6616: Add Postgres Schema Name to Postgres Debezium Source

2022-09-07 Thread GitBox
hudi-bot commented on PR #6616: URL: https://github.com/apache/hudi/pull/6616#issuecomment-1240215471 ## CI report: * 25a5a5c619d56e686e6fb38e20e841ef9a1e Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6628: [HUDI-4806] Use Avro version from the root pom for Flink bundle

2022-09-07 Thread GitBox
hudi-bot commented on PR #6628: URL: https://github.com/apache/hudi/pull/6628#issuecomment-1240215510 ## CI report: * 2504fd6b17a7a3fb2a77f755d7fe6b6c7f83c96f Azure:

[GitHub] [hudi] praveenkmr commented on issue #6623: [SUPPORT] java.lang.ClassNotFoundException: Class org.apache.hadoop.hbase.client.ClusterStatusListener$MulticastListener with HBase Index

2022-09-07 Thread GitBox
praveenkmr commented on issue #6623: URL: https://github.com/apache/hudi/issues/6623#issuecomment-1240213423 @yihua Thanks a lot, Ethan. I tried the suggestion and it worked fine. Still, I am wondering whether, during further upgrades, we need to follow the same approach of loading all the jars

[GitHub] [hudi] jsbali commented on a diff in pull request #6502: HUDI-4722 Added locking metrics for Hudi

2022-09-07 Thread GitBox
jsbali commented on code in PR #6502: URL: https://github.com/apache/hudi/pull/6502#discussion_r965494212 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/metrics/HoodieMetricsConfig.java: ## @@ -83,6 +83,11 @@ public class HoodieMetricsConfig extends

[GitHub] [hudi] LXin96 commented on a diff in pull request #6614: [DOCS] Asf site update flink option 'read.tasks & write.tasks' description

2022-09-07 Thread GitBox
LXin96 commented on code in PR #6614: URL: https://github.com/apache/hudi/pull/6614#discussion_r965490917 ## website/docs/configurations.md: ## @@ -978,8 +978,8 @@ Actual value obtained by invoking .toString(), default '' --- > write.tasks -> Parallelism of tasks that

[GitHub] [hudi] jsbali commented on a diff in pull request #6502: HUDI-4722 Added locking metrics for Hudi

2022-09-07 Thread GitBox
jsbali commented on code in PR #6502: URL: https://github.com/apache/hudi/pull/6502#discussion_r965487305 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metrics/HoodieMetrics.java: ## @@ -130,6 +140,13 @@ public Timer.Context getIndexCtx() { return

[GitHub] [hudi] Gump518 commented on issue #6609: hudi upsert occured data duplication by spark streaming (cow table)

2022-09-07 Thread GitBox
Gump518 commented on issue #6609: URL: https://github.com/apache/hudi/issues/6609#issuecomment-1240196482 clustering causes data duplication or presto engine adapter issue

[jira] [Created] (HUDI-4810) Fix Hudi bundles requiring log4j2 on the classpath

2022-09-07 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-4810: - Summary: Fix Hudi bundles requiring log4j2 on the classpath Key: HUDI-4810 URL: https://issues.apache.org/jira/browse/HUDI-4810 Project: Apache Hudi Issue

[jira] [Updated] (HUDI-4810) Fix Hudi bundles requiring log4j2 on the classpath

2022-09-07 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4810: -- Status: In Progress (was: Open) > Fix Hudi bundles requiring log4j2 on the classpath >

[jira] [Updated] (HUDI-4810) Fix Hudi bundles requiring log4j2 on the classpath

2022-09-07 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4810: -- Sprint: 2022/09/05 > Fix Hudi bundles requiring log4j2 on the classpath >

[GitHub] [hudi] alexeykudinkin opened a new pull request, #6631: [WIP] Fixing Hudi bundles requiring log4j2 on the classpath

2022-09-07 Thread GitBox
alexeykudinkin opened a new pull request, #6631: URL: https://github.com/apache/hudi/pull/6631 ### Change Logs In XXX, we've rebased Hudi to instead mostly rely on Log4j2 bridge and implementations (in tests). However we actually missed the fact that `log4j-1.2-api` isn't

[GitHub] [hudi] jsbali commented on a diff in pull request #6502: HUDI-4722 Added locking metrics for Hudi

2022-09-07 Thread GitBox
jsbali commented on code in PR #6502: URL: https://github.com/apache/hudi/pull/6502#discussion_r965477406 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/metrics/HoodieMetricsConfig.java: ## @@ -83,6 +83,11 @@ public class HoodieMetricsConfig extends

[GitHub] [hudi] hudi-bot commented on pull request #5269: [HUDI-3636] Create new write clients for async table services in DeltaStreamer

2022-09-07 Thread GitBox
hudi-bot commented on PR #5269: URL: https://github.com/apache/hudi/pull/5269#issuecomment-1240185610 ## CI report: * 6f8d22ccc5efbd87ff993a46ea1977355842602f Azure:

[GitHub] [hudi] jsbali commented on a diff in pull request #6502: HUDI-4722 Added locking metrics for Hudi

2022-09-07 Thread GitBox
jsbali commented on code in PR #6502: URL: https://github.com/apache/hudi/pull/6502#discussion_r965475615 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/lock/LockManager.java: ## @@ -64,13 +69,18 @@ public void lock() { boolean

[GitHub] [hudi] hudi-bot commented on pull request #5269: [HUDI-3636] Create new write clients for async table services in DeltaStreamer

2022-09-07 Thread GitBox
hudi-bot commented on PR #5269: URL: https://github.com/apache/hudi/pull/5269#issuecomment-1240183216 ## CI report: * 6f8d22ccc5efbd87ff993a46ea1977355842602f Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6630: [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo…

2022-09-07 Thread GitBox
hudi-bot commented on PR #6630: URL: https://github.com/apache/hudi/pull/6630#issuecomment-1240181354 ## CI report: * 85a8f5166c17ec5ce9fa00e2c38846f440582acf Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6629: [HUDI-4807] Use base table instant for metadata table initialization

2022-09-07 Thread GitBox
hudi-bot commented on PR #6629: URL: https://github.com/apache/hudi/pull/6629#issuecomment-1240181342 ## CI report: * c88a869d5d8e748edac75698c7c504176a06e47d Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6574: Keep a clustering running at the same time.#6573

2022-09-07 Thread GitBox
hudi-bot commented on PR #6574: URL: https://github.com/apache/hudi/pull/6574#issuecomment-1240181238 ## CI report: * 7ced8cc1e89594e2a074a546a165ce3ef744841f Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6630: [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo…

2022-09-07 Thread GitBox
hudi-bot commented on PR #6630: URL: https://github.com/apache/hudi/pull/6630#issuecomment-1240178449 ## CI report: * 85a8f5166c17ec5ce9fa00e2c38846f440582acf UNKNOWN

[GitHub] [hudi] hudi-bot commented on pull request #6574: Keep a clustering running at the same time.#6573

2022-09-07 Thread GitBox
hudi-bot commented on PR #6574: URL: https://github.com/apache/hudi/pull/6574#issuecomment-1240178259 ## CI report: * 7ced8cc1e89594e2a074a546a165ce3ef744841f Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6629: [HUDI-4807] Use base table instant for metadata table initialization

2022-09-07 Thread GitBox
hudi-bot commented on PR #6629: URL: https://github.com/apache/hudi/pull/6629#issuecomment-1240178412 ## CI report: * c88a869d5d8e748edac75698c7c504176a06e47d UNKNOWN

[GitHub] [hudi] Gatsby-Lee closed issue #6024: [SUPPORT] DELETE_PARTITION causes AWS Athena Query failure

2022-09-07 Thread GitBox
Gatsby-Lee closed issue #6024: [SUPPORT] DELETE_PARTITION causes AWS Athena Query failure URL: https://github.com/apache/hudi/issues/6024

[GitHub] [hudi] Gatsby-Lee commented on issue #6024: [SUPPORT] DELETE_PARTITION causes AWS Athena Query failure

2022-09-07 Thread GitBox
Gatsby-Lee commented on issue #6024: URL: https://github.com/apache/hudi/issues/6024#issuecomment-1240177021 Hi, let's close issue if I am the only one facing the issue. Let me write more details before I forget. A couple of months ago, I tried DELETE_PARTITION operation with

[GitHub] [hudi] yihua commented on issue #6590: [SUPPORT] HoodieDeltaStreamer AWSDmsAvroPayload fails to handle deletes in MySQL

2022-09-07 Thread GitBox
yihua commented on issue #6590: URL: https://github.com/apache/hudi/issues/6590#issuecomment-1240175913 This is the same issue as #6552. cc @rahil-c

[GitHub] [hudi] yihua commented on issue #6552: [SUPPORT] AWSDmsAvroPayload does not work correctly with any version above 0.10.0

2022-09-07 Thread GitBox
yihua commented on issue #6552: URL: https://github.com/apache/hudi/issues/6552#issuecomment-1240175667 @rahil-c and I discussed this today. The proper fix is to call the corresponding API instead of repeating the invocation of `handleDeleteOperation`: ``` FIXED -> @Override

[GitHub] [hudi] xiarixiaoyao commented on pull request #6322: [HUDI-4559] Support hiveSync command based on Call Produce Command

2022-09-07 Thread GitBox
xiarixiaoyao commented on PR #6322: URL: https://github.com/apache/hudi/pull/6322#issuecomment-1240173783 @XuQianJin-Stars please resolve the conflicts, thanks

[GitHub] [hudi] wangp-nhlab commented on pull request #6544: When Hudi choose Append save mode in Spark , the basepath may be error codes

2022-09-07 Thread GitBox
wangp-nhlab commented on PR #6544: URL: https://github.com/apache/hudi/pull/6544#issuecomment-1240166688 > @wangp-nhlab Can you follow the process [here](https://hudi.apache.org/contribute/developer-setup#filing-jiras) to create and claim a JIRA ticket and attach the ticket number to the PR? Okay

[GitHub] [hudi] TJX2014 commented on pull request #6630: [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo…

2022-09-07 Thread GitBox
TJX2014 commented on PR #6630: URL: https://github.com/apache/hudi/pull/6630#issuecomment-1240166454 Hi, @danny0405 , this is another patch for https://github.com/apache/hudi/pull/6595

[GitHub] [hudi] wangp-nhlab commented on a diff in pull request #6544: When Hudi choose Append save mode in Spark , the basepath may be error codes

2022-09-07 Thread GitBox
wangp-nhlab commented on code in PR #6544: URL: https://github.com/apache/hudi/pull/6544#discussion_r965463084 ## hudi-common/src/main/java/org/apache/hudi/common/table/view/RemoteHoodieTableFileSystemView.java: ## @@ -176,7 +178,8 @@ private T executeRequest(String

[GitHub] [hudi] TJX2014 commented on pull request #6630: [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo…

2022-09-07 Thread GitBox
TJX2014 commented on PR #6630: URL: https://github.com/apache/hudi/pull/6630#issuecomment-1240165726 @minihippo hi, please help me check this, I think this patch could fix the HoodieSimpleBucketIndex firstly.

[jira] [Updated] (HUDI-4808) HoodieSimpleBucketIndex should also consider bucket num in log file not in base file which written by flink mor table

2022-09-07 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4808: - Labels: pull-request-available (was: ) > HoodieSimpleBucketIndex should also consider bucket num

[GitHub] [hudi] codope commented on a diff in pull request #6016: [HUDI-4465] Optimizing file-listing sequence of Metadata Table

2022-09-07 Thread GitBox
codope commented on code in PR #6016: URL: https://github.com/apache/hudi/pull/6016#discussion_r965462152 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/keygen/SimpleKeyGenerator.java: ## @@ -46,6 +47,12 @@ public SimpleKeyGenerator(TypedProperties props) {

[GitHub] [hudi] TJX2014 opened a new pull request, #6630: [HUDI-4808] Fix HoodieSimpleBucketIndex not consider bucket num in lo…

2022-09-07 Thread GitBox
TJX2014 opened a new pull request, #6630: URL: https://github.com/apache/hudi/pull/6630 ### Change Logs Make HoodieSimpleBucketIndex also load bucket index from log file ### Impact Spark will read bucket index correctly where the log file is written by flink to mor table.

[GitHub] [hudi] dongkelun commented on a diff in pull request #5478: [HUDI-3998] Fix getCommitsSinceLastCleaning failed when async cleaning

2022-09-07 Thread GitBox
dongkelun commented on code in PR #5478: URL: https://github.com/apache/hudi/pull/5478#discussion_r965462117 ## hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/RequestHandler.java: ## @@ -539,4 +543,19 @@ public void handle(@NotNull Context context) throws

[jira] [Created] (HUDI-4809) Hudi Support AWS Glue DropPartitions

2022-09-07 Thread XixiHua (Jira)
XixiHua created HUDI-4809: - Summary: Hudi Support AWS Glue DropPartitions Key: HUDI-4809 URL: https://issues.apache.org/jira/browse/HUDI-4809 Project: Apache Hudi Issue Type: New Feature

[jira] [Created] (HUDI-4808) HoodieSimpleBucketIndex should also consider bucket num in log file not in base file which written by flink mor table

2022-09-07 Thread JinxinTang (Jira)
JinxinTang created HUDI-4808: Summary: HoodieSimpleBucketIndex should also consider bucket num in log file not in base file which written by flink mor table Key: HUDI-4808 URL:

[GitHub] [hudi] dongkelun commented on a diff in pull request #5478: [HUDI-3998] Fix getCommitsSinceLastCleaning failed when async cleaning

2022-09-07 Thread GitBox
dongkelun commented on code in PR #5478: URL: https://github.com/apache/hudi/pull/5478#discussion_r965460402 ## hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/RequestHandler.java: ## @@ -539,4 +543,19 @@ public void handle(@NotNull Context context) throws

[GitHub] [hudi] TJX2014 commented on pull request #6595: [HUDI-4777] Fix flink gen bucket index of mor table not consistent wi…

2022-09-07 Thread GitBox
TJX2014 commented on PR #6595: URL: https://github.com/apache/hudi/pull/6595#issuecomment-1240160173 > I will submit a PR to fix the Spark side too, but on the Flink side, I think deduplication should also be enabled by default for MOR tables; when duplicates are written to the log file, it is very hard for

[GitHub] [hudi] Gump518 commented on issue #6609: hudi upsert occured data duplication by spark streaming (cow table)

2022-09-07 Thread GitBox
Gump518 commented on issue #6609: URL: https://github.com/apache/hudi/issues/6609#issuecomment-1240156176 > Remove these config, then data duplication disappeared. why? > > ``` > // option("hoodie.clustering.inline", "true"). > //

[GitHub] [hudi] Gump518 commented on issue #6609: hudi upsert occured data duplication by spark streaming (cow table)

2022-09-07 Thread GitBox
Gump518 commented on issue #6609: URL: https://github.com/apache/hudi/issues/6609#issuecomment-1240156022 > still repeated according to the new patch

[GitHub] [hudi] arunb2w commented on issue #6626: [SUPPORT] HUDI merge into via spark sql not working

2022-09-07 Thread GitBox
arunb2w commented on issue #6626: URL: https://github.com/apache/hudi/issues/6626#issuecomment-1240154803 @nsivabalan Can you please provide some help on this issue

[GitHub] [hudi] Gump518 commented on issue #6609: hudi upsert occured data duplication by spark streaming (cow table)

2022-09-07 Thread GitBox
Gump518 commented on issue #6609: URL: https://github.com/apache/hudi/issues/6609#issuecomment-1240154151 Remove these config, then data duplication disappeared. why? ``` // option("hoodie.clustering.inline", "true"). // option("hoodie.clustering.inline.max.commits",

[GitHub] [hudi] Gump518 commented on issue #6609: hudi upsert occured data duplication by spark streaming (cow table)

2022-09-07 Thread GitBox
Gump518 commented on issue #6609: URL: https://github.com/apache/hudi/issues/6609#issuecomment-1240151995 still repeated according to the new patch

[jira] [Updated] (HUDI-4722) Add support for metrics for locking infra

2022-09-07 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4722: -- Status: Patch Available (was: In Progress) > Add support for metrics for locking infra

[jira] [Updated] (HUDI-4722) Add support for metrics for locking infra

2022-09-07 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4722: -- Status: In Progress (was: Open) > Add support for metrics for locking infra >

[jira] [Updated] (HUDI-4807) Use correct instant in metadata initialization

2022-09-07 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4807: - Labels: pull-request-available (was: ) > Use correct instant in metadata initialization >

[GitHub] [hudi] YuweiXiao opened a new pull request, #6629: [HUDI-4807] Use base table instant for metadata table initialization

2022-09-07 Thread GitBox
YuweiXiao opened a new pull request, #6629: URL: https://github.com/apache/hudi/pull/6629 ### Change Logs Use base table instant for metadata table initialization ### Impact No public API change. **Risk level: none | low | medium | high** None. ###

[jira] [Closed] (HUDI-4615) Fix empty commits being made by deltastreamer with S3EventsSource when there is no data in SQS on starting a new pipeline

2022-09-07 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan closed HUDI-4615. - Resolution: Fixed > Fix empty commits being made by deltastreamer with S3EventsSource

[GitHub] [hudi] nsivabalan commented on a diff in pull request #5030: [HUDI-3617] MOR compact improve

2022-09-07 Thread GitBox
nsivabalan commented on code in PR #5030: URL: https://github.com/apache/hudi/pull/5030#discussion_r965448515 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieMergedLogRecordScanner.java: ## @@ -123,25 +133,24 @@ public long getNumMergedRecordsInLog() {

[GitHub] [hudi] hudi-bot commented on pull request #5091: [HUDI-3453] Fix HoodieBackedTableMetadata concurrent reading issue

2022-09-07 Thread GitBox
hudi-bot commented on PR #5091: URL: https://github.com/apache/hudi/pull/5091#issuecomment-1240147859 ## CI report: * c0dc922eec0ffe4c93f250dcf91dd313713057db Azure:

[jira] [Created] (HUDI-4807) Use correct instant in metadata initialization

2022-09-07 Thread Yuwei Xiao (Jira)
Yuwei Xiao created HUDI-4807: Summary: Use correct instant in metadata initialization Key: HUDI-4807 URL: https://issues.apache.org/jira/browse/HUDI-4807 Project: Apache Hudi Issue Type: Bug

[GitHub] [hudi] hudi-bot commented on pull request #5091: [HUDI-3453] Fix HoodieBackedTableMetadata concurrent reading issue

2022-09-07 Thread GitBox
hudi-bot commented on PR #5091: URL: https://github.com/apache/hudi/pull/5091#issuecomment-1240145111 ## CI report: * c0dc922eec0ffe4c93f250dcf91dd313713057db Azure:

[GitHub] [hudi] nsivabalan commented on a diff in pull request #5030: [HUDI-3617] MOR compact improve

2022-09-07 Thread GitBox
nsivabalan commented on code in PR #5030: URL: https://github.com/apache/hudi/pull/5030#discussion_r965446760 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/HoodieCompactor.java: ## @@ -280,8 +281,11 @@ HoodieCompactionPlan

[GitHub] [hudi] dongkelun commented on a diff in pull request #5478: [HUDI-3998] Fix getCommitsSinceLastCleaning failed when async cleaning

2022-09-07 Thread GitBox
dongkelun commented on code in PR #5478: URL: https://github.com/apache/hudi/pull/5478#discussion_r965446536 ## hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/RequestHandler.java: ## @@ -539,4 +543,19 @@ public void handle(@NotNull Context context) throws

[GitHub] [hudi] hudi-bot commented on pull request #6615: [HUDI-4758] Add validations to java spark examples

2022-09-07 Thread GitBox
hudi-bot commented on PR #6615: URL: https://github.com/apache/hudi/pull/6615#issuecomment-1240143195 ## CI report: * 3b37307093cf2c6eb20a4e5f738f8bac38f1dba7 Azure:

[GitHub] [hudi] dongkelun commented on pull request #5478: [HUDI-3998] Fix getCommitsSinceLastCleaning failed when async cleaning

2022-09-07 Thread GitBox
dongkelun commented on PR #5478: URL: https://github.com/apache/hudi/pull/5478#issuecomment-1240139114 > also, a good practice to follow. whenever you are addressing feedback, try to add it as new commits. Easier for reviewer to re-review just the new changes. if not, I have to review

[GitHub] [hudi] nsivabalan commented on a diff in pull request #6502: HUDI-4722 Added locking metrics for Hudi

2022-09-07 Thread GitBox
nsivabalan commented on code in PR #6502: URL: https://github.com/apache/hudi/pull/6502#discussion_r965438383 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metrics/HoodieMetrics.java: ## @@ -130,6 +140,13 @@ public Timer.Context getIndexCtx() { return

[GitHub] [hudi] nsivabalan commented on a diff in pull request #5478: [HUDI-3998] Fix getCommitsSinceLastCleaning failed when async cleaning

2022-09-07 Thread GitBox
nsivabalan commented on code in PR #5478: URL: https://github.com/apache/hudi/pull/5478#discussion_r965436152 ## hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/RequestHandler.java: ## @@ -539,4 +543,19 @@ public void handle(@NotNull Context context) throws

[GitHub] [hudi] nsivabalan commented on pull request #6536: [HUDI-4736] Fix inflight clean action preventing clean service to continue when multiple cleans are not allowed

2022-09-07 Thread GitBox
nsivabalan commented on PR #6536: URL: https://github.com/apache/hudi/pull/6536#issuecomment-1240127060 @yihua : can you check the CI failure. please file a tracking jira for enhancing tests. once CI succeeds, you can go ahead and land it in.

[GitHub] [hudi] nsivabalan commented on pull request #5478: [HUDI-3998] Fix getCommitsSinceLastCleaning failed when async cleaning

2022-09-07 Thread GitBox
nsivabalan commented on PR #5478: URL: https://github.com/apache/hudi/pull/5478#issuecomment-1240126607 also, a good practice to follow. whenever you are addressing feedback, try to add it as new commits. Easier for reviewer to re-review just the new changes. if not, I have to review

[GitHub] [hudi] nsivabalan commented on a diff in pull request #6031: [HUDI-4282] Repair IOException in some other dfs, except hdfs,when check block corrupted in HoodieLogFileReader

2022-09-07 Thread GitBox
nsivabalan commented on code in PR #6031: URL: https://github.com/apache/hudi/pull/6031#discussion_r965431934 ## hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java: ## @@ -632,6 +635,15 @@ public static boolean isGCSFileSystem(FileSystem fs) { return

[GitHub] [hudi] Gump518 commented on issue #6609: hudi upsert occured data duplication by spark streaming (cow table)

2022-09-07 Thread GitBox
Gump518 commented on issue #6609: URL: https://github.com/apache/hudi/issues/6609#issuecomment-1240123729 > Thanks, today we'll test according to the new patch. If there's any news, we'll sync it with you again

[GitHub] [hudi] Gump518 commented on issue #6609: hudi upsert occured data duplication by spark streaming (cow table)

2022-09-07 Thread GitBox
Gump518 commented on issue #6609: URL: https://github.com/apache/hudi/issues/6609#issuecomment-1240123350 Thanks, today we'll test according to the new patch. If there's any news, we'll sync it with you again

[GitHub] [hudi] nsivabalan commented on a diff in pull request #6031: [HUDI-4282] Repair IOException in some other dfs, except hdfs,when check block corrupted in HoodieLogFileReader

2022-09-07 Thread GitBox
nsivabalan commented on code in PR #6031: URL: https://github.com/apache/hudi/pull/6031#discussion_r965431369 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFileReader.java: ## @@ -516,4 +521,23 @@ private static FSDataInputStream

[GitHub] [hudi] santoshsb opened a new issue, #5452: Schema Evolution: Missing column for previous records when new entry does not have the same while upsert.

2022-09-07 Thread GitBox
santoshsb opened a new issue, #5452: URL: https://github.com/apache/hudi/issues/5452 Hi Team, We are currently evaluating Hudi for our analytical use cases and as part of this exercise we are facing few issues with schema evolution and data loss. The current issue which we have

[GitHub] [hudi] xiarixiaoyao closed issue #5452: Schema Evolution: Missing column for previous records when new entry does not have the same while upsert.

2022-09-07 Thread GitBox
xiarixiaoyao closed issue #5452: Schema Evolution: Missing column for previous records when new entry does not have the same while upsert. URL: https://github.com/apache/hudi/issues/5452

[GitHub] [hudi] xiarixiaoyao commented on issue #5452: Schema Evolution: Missing column for previous records when new entry does not have the same while upsert.

2022-09-07 Thread GitBox
xiarixiaoyao commented on issue #5452: URL: https://github.com/apache/hudi/issues/5452#issuecomment-1240122505 @santoshsb you need to use schema evolution and hoodie.datasource.write.reconcile.schema; see the following code: ``` def perf(spark: SparkSession) = { import

[GitHub] [hudi] xushiyan commented on a diff in pull request #6476: [HUDI-3478] Support CDC for Spark in Hudi

2022-09-07 Thread GitBox
xushiyan commented on code in PR #6476: URL: https://github.com/apache/hudi/pull/6476#discussion_r965411452 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java: ## @@ -399,9 +451,65 @@ protected void writeIncomingRecords() throws

[GitHub] [hudi] nsivabalan commented on pull request #5406: [HUDI-3954] Don't keep the last commit before the earliest commit to retain

2022-09-07 Thread GitBox
nsivabalan commented on PR #5406: URL: https://github.com/apache/hudi/pull/5406#issuecomment-1240120146 hey @danny0405 : maybe there is some rationale behind the original intent. It's just deducting 1 commit from what the user wants, right? As of now, I don't feel this is giving us much or

[GitHub] [hudi] yuzhaojing commented on a diff in pull request #4309: [HUDI-3016][RFC-43] Proposal to implement Table Management Service

2022-09-07 Thread GitBox
yuzhaojing commented on code in PR #4309: URL: https://github.com/apache/hudi/pull/4309#discussion_r965426924 ## rfc/rfc-43/rfc-43.md: ## @@ -0,0 +1,316 @@ + + +# RFC-43: Implement Table Management ServiceTable Management Service for Hudi + +## Proposers + +- @yuzhaojing + +##

[GitHub] [hudi] yuzhaojing commented on a diff in pull request #4309: [HUDI-3016][RFC-43] Proposal to implement Table Management Service

2022-09-07 Thread GitBox
yuzhaojing commented on code in PR #4309: URL: https://github.com/apache/hudi/pull/4309#discussion_r965424691 ## rfc/rfc-43/rfc-43.md: ## @@ -0,0 +1,316 @@ + + +# RFC-43: Implement Table Management ServiceTable Management Service for Hudi + +## Proposers + +- @yuzhaojing + +##

[GitHub] [hudi] yuzhaojing commented on a diff in pull request #4309: [HUDI-3016][RFC-43] Proposal to implement Table Management Service

2022-09-07 Thread GitBox
yuzhaojing commented on code in PR #4309: URL: https://github.com/apache/hudi/pull/4309#discussion_r965424454 ## rfc/rfc-43/rfc-43.md: ## @@ -0,0 +1,316 @@ + + +# RFC-43: Implement Table Management ServiceTable Management Service for Hudi + +## Proposers + +- @yuzhaojing + +##

[GitHub] [hudi] nsivabalan commented on issue #6552: [SUPPORT] AWSDmsAvroPayload does not work correctly with any version above 0.10.0

2022-09-07 Thread GitBox
nsivabalan commented on issue #6552: URL: https://github.com/apache/hudi/issues/6552#issuecomment-1240113529 Yeah, Udit pointed out the right commit. Here is the fix that worked for me locally. ``` diff --git

[GitHub] [hudi] yuzhaojing commented on a diff in pull request #4309: [HUDI-3016][RFC-43] Proposal to implement Table Management Service

2022-09-07 Thread GitBox
yuzhaojing commented on code in PR #4309: URL: https://github.com/apache/hudi/pull/4309#discussion_r965423222 ## rfc/rfc-43/rfc-43.md: ## @@ -0,0 +1,316 @@ + + +# RFC-43: Implement Table Management ServiceTable Management Service for Hudi + +## Proposers + +- @yuzhaojing + +##

[GitHub] [hudi] danny0405 commented on a diff in pull request #6628: [HUDI-4806] Use Avro version from the root pom for Flink bundle

2022-09-07 Thread GitBox
danny0405 commented on code in PR #6628: URL: https://github.com/apache/hudi/pull/6628#discussion_r965422148 ## packaging/hudi-flink-bundle/pom.xml: ## @@ -501,8 +501,7 @@ org.apache.avro avro - - 1.10.0 + ${avro.version} Review Comment:

[GitHub] [hudi] hudi-bot commented on pull request #6548: [HUDI-4749] Fixing full cleaning to leverage metadata table

2022-09-07 Thread GitBox
hudi-bot commented on PR #6548: URL: https://github.com/apache/hudi/pull/6548#issuecomment-1240110458 ## CI report: * 78d0f8bb6487e55b91443dcade5285e4a2412e3b Azure:

[jira] [Updated] (HUDI-4766) Fix HoodieFlinkClusteringJob

2022-09-07 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-4766: - Fix Version/s: 0.12.1 > Fix HoodieFlinkClusteringJob > > >

[jira] [Resolved] (HUDI-4766) Fix HoodieFlinkClusteringJob

2022-09-07 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen resolved HUDI-4766. -- > Fix HoodieFlinkClusteringJob > > > Key: HUDI-4766 >

[jira] [Commented] (HUDI-4766) Fix HoodieFlinkClusteringJob

2022-09-07 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601571#comment-17601571 ] Danny Chen commented on HUDI-4766: -- Fixed via master branch: adf36093d2454c7e3cd7090a0cb3fd5af140b919 >

[GitHub] [hudi] yuzhaojing commented on a diff in pull request #4309: [HUDI-3016][RFC-43] Proposal to implement Table Management Service

2022-09-07 Thread GitBox
yuzhaojing commented on code in PR #4309: URL: https://github.com/apache/hudi/pull/4309#discussion_r965421349 ## rfc/rfc-43/rfc-43.md: ## @@ -0,0 +1,316 @@ + + +# RFC-43: Implement Table Management ServiceTable Management Service for Hudi + +## Proposers + +- @yuzhaojing + +##

[hudi] branch master updated (e8aee84c7c -> adf36093d2)

2022-09-07 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from e8aee84c7c [HUDI-4793] Fixing ScalaTest tests to properly respect Log4j2 configs (#6617) add adf36093d2

[GitHub] [hudi] danny0405 merged pull request #6566: [HUDI-4766] Strengthen flink clustering job

2022-09-07 Thread GitBox
danny0405 merged PR #6566: URL: https://github.com/apache/hudi/pull/6566 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [hudi] hudi-bot commented on pull request #6625: [HUDI-4799] improve analyzer exception tip when can not resolve expre…

2022-09-07 Thread GitBox
hudi-bot commented on PR #6625: URL: https://github.com/apache/hudi/pull/6625#issuecomment-1240107831 ## CI report: * 5f385a174df1fa344b87a3a4ada3f3f6d61f1d76 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6625: [HUDI-4799] improve analyzer exception tip when can not resolve expre…

2022-09-07 Thread GitBox
hudi-bot commented on PR #6625: URL: https://github.com/apache/hudi/pull/6625#issuecomment-1240104388 ## CI report: * 5f385a174df1fa344b87a3a4ada3f3f6d61f1d76 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6575: [HUDI-4754] Add compliance check in github actions

2022-09-07 Thread GitBox
hudi-bot commented on PR #6575: URL: https://github.com/apache/hudi/pull/6575#issuecomment-1240093961 ## CI report: * 1600e31836157c8d05e3bc8b9e08e1717471f1a6 UNKNOWN * 4d02f2c64a5fc4b89889677ee639a20b53cec26a UNKNOWN * 48147d19c835e7868102fd2d083659e6ee2ac343 UNKNOWN *

[GitHub] [hudi] xushiyan commented on a diff in pull request #6476: [HUDI-3478] Support CDC for Spark in Hudi

2022-09-07 Thread GitBox
xushiyan commented on code in PR #6476: URL: https://github.com/apache/hudi/pull/6476#discussion_r956951504 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieSortedMergeHandle.java: ## @@ -116,12 +125,18 @@ public List close() { String key =

[jira] [Commented] (HUDI-4485) Hudi cli got empty result for command show fsview all

2022-09-07 Thread Yao Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601562#comment-17601562 ] Yao Zhang commented on HUDI-4485: - Hi [~codope] , Finally all unit test issues have been resolved and CI

[GitHub] [hudi] paul8263 commented on pull request #6489: [HUDI-4485] [cli] Bumped spring shell to 2.1.1. Updated the default …

2022-09-07 Thread GitBox
paul8263 commented on PR #6489: URL: https://github.com/apache/hudi/pull/6489#issuecomment-1240058067 > Hi @codope and @yihua , Errors of hudi-integ-test are almost cleared. The only one left is: > >

[GitHub] [hudi] hudi-bot commented on pull request #6520: [HUDI-4726] Incremental input splits result is not as expected when f…

2022-09-07 Thread GitBox
hudi-bot commented on PR #6520: URL: https://github.com/apache/hudi/pull/6520#issuecomment-1240042837 ## CI report: * e55d28bdafa64d4a5180fd46191a420e702a58dc UNKNOWN * 360821a2d0110a82ac3c56eb65bcc3ad9b9659bf Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6628: [HUDI-4806] Use Avro version from the root pom for Flink bundle

2022-09-07 Thread GitBox
hudi-bot commented on PR #6628: URL: https://github.com/apache/hudi/pull/6628#issuecomment-1240039267 ## CI report: * 2504fd6b17a7a3fb2a77f755d7fe6b6c7f83c96f Azure:
