[GitHub] [hudi] hudi-bot commented on pull request #6396: [HUDI-4621] all data fill in the same bucket because not check INDEX_KEY_FIELD

2022-08-15 Thread GitBox
hudi-bot commented on PR #6396: URL: https://github.com/apache/hudi/pull/6396#issuecomment-1216168371 ## CI report: * c75abbdcd21abaf0348aa98eecb1922f608b7932 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6312: [HUDI-4551] The default value of READ_TASKS, WRITE_TASKS, CLUSTERING_TASKS is the parallelism of the execution environment

2022-08-15 Thread GitBox
hudi-bot commented on PR #6312: URL: https://github.com/apache/hudi/pull/6312#issuecomment-1216168173 ## CI report: * 2d1989eb70e240a765c072d553c4aaa5fc47222c Azure:

[GitHub] [hudi] nsivabalan commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

2022-08-15 Thread GitBox
nsivabalan commented on issue #5765: URL: https://github.com/apache/hudi/issues/5765#issuecomment-1216145403 @yihua : do you think we can document the solution proposed by @dohongdayi above in some FAQ. ``` I resolved this by my own, by packaging a new version of hbase 2.4.9 with

[GitHub] [hudi] nsivabalan commented on issue #5741: [SUPPORT] Hudi table copy failed for some partitions in 0.11.0

2022-08-15 Thread GitBox
nsivabalan commented on issue #5741: URL: https://github.com/apache/hudi/issues/5741#issuecomment-1216143866 @bkosuru : do you have any updates here. if you are not looking for any further assistance, feel free to close out the issue. If not, let us know how we can help. -- This is an

[GitHub] [hudi] nsivabalan commented on issue #5735: No hudi dataset was saved to s3

2022-08-15 Thread GitBox
nsivabalan commented on issue #5735: URL: https://github.com/apache/hudi/issues/5735#issuecomment-1216143296 @pratyakshsharma : can you follow up here please.

[GitHub] [hudi] nsivabalan commented on issue #5717: [SUPPORT] Hudi 0.10.1 Reconcile schema not working

2022-08-15 Thread GitBox
nsivabalan commented on issue #5717: URL: https://github.com/apache/hudi/issues/5717#issuecomment-1216142966 I guess the root cause is that, when you upgraded the table schema with new columns, they should have been nullable. Likely in your case the new column was non-nullable, which is not backwards
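
The nullability point in the comment above can be illustrated with a minimal sketch. It assumes Avro-style field definitions held as plain Python dicts; the helper names `is_nullable` and `non_nullable_additions` are hypothetical and are not part of Hudi or Avro. It only flags newly added fields whose type is not a union containing `"null"`, which is the kind of column the comment suggests may be the problem.

```python
from typing import Any, Dict, List

Field = Dict[str, Any]


def is_nullable(field: Field) -> bool:
    """Treat a field as nullable when its type is a union that includes "null"."""
    field_type = field["type"]
    return isinstance(field_type, list) and "null" in field_type


def non_nullable_additions(old_fields: List[Field], new_fields: List[Field]) -> List[str]:
    """Return the names of fields added in new_fields that are not nullable."""
    old_names = {f["name"] for f in old_fields}
    return [
        f["name"]
        for f in new_fields
        if f["name"] not in old_names and not is_nullable(f)
    ]


# Hypothetical schemas, for illustration only.
old_schema = [{"name": "id", "type": "string"}]
new_schema = old_schema + [
    {"name": "note", "type": ["null", "string"], "default": None},  # nullable addition
    {"name": "count", "type": "int"},                               # non-nullable addition
]

suspect_fields = non_nullable_additions(old_schema, new_schema)
```

Running such a check before a schema upgrade would list the added columns worth reviewing; whether a given addition is actually rejected depends on the Hudi version and settings in use.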

[GitHub] [hudi] nsivabalan commented on issue #5715: [SUPPORT] Hudi 0.11.0 time travel not working

2022-08-15 Thread GitBox
nsivabalan commented on issue #5715: URL: https://github.com/apache/hudi/issues/5715#issuecomment-1216136018 @mgerlach : any updates on this regard. If you got the issue resolved, feel free to close out the issue. thanks!

[GitHub] [hudi] nsivabalan commented on issue #5689: [SUPPORT] Duplicates appears on the some attempts of rewrite

2022-08-15 Thread GitBox
nsivabalan commented on issue #5689: URL: https://github.com/apache/hudi/issues/5689#issuecomment-1216135653 @eshu : can you update w/ the requested info. We won't be able to help much w/o further info. If you got the issue resolved, feel free to close out the issue. thanks!

svn commit: r56305 - in /dev/hudi/hudi-0.12.0: ./ hudi-0.12.0.src.tgz hudi-0.12.0.src.tgz.asc hudi-0.12.0.src.tgz.sha512

2022-08-15 Thread codope
Author: codope Date: Tue Aug 16 04:37:14 2022 New Revision: 56305 Log: Add source distribution for hudi-0.12.0 Added: dev/hudi/hudi-0.12.0/ dev/hudi/hudi-0.12.0/hudi-0.12.0.src.tgz (with props) dev/hudi/hudi-0.12.0/hudi-0.12.0.src.tgz.asc

[GitHub] [hudi] nsivabalan commented on issue #5599: [SUPPORT] File names in S3 do not match the file names in the latest .commit file

2022-08-15 Thread GitBox
nsivabalan commented on issue #5599: URL: https://github.com/apache/hudi/issues/5599#issuecomment-1216134630 thanks. going ahead and closing the github issue. feel free to open new one if you run into any issues.

[GitHub] [hudi] nsivabalan closed issue #5599: [SUPPORT] File names in S3 do not match the file names in the latest .commit file

2022-08-15 Thread GitBox
nsivabalan closed issue #5599: [SUPPORT] File names in S3 do not match the file names in the latest .commit file URL: https://github.com/apache/hudi/issues/5599

[GitHub] [hudi] hudi-bot commented on pull request #6312: [HUDI-4551] The default value of READ_TASKS, WRITE_TASKS, CLUSTERING_TASKS is the parallelism of the execution environment

2022-08-15 Thread GitBox
hudi-bot commented on PR #6312: URL: https://github.com/apache/hudi/pull/6312#issuecomment-1216130106 ## CI report: * c13c77b28b3dcd95461dbb19ab6d0caf2c0c0dc7 Azure:

[GitHub] [hudi] nsivabalan commented on issue #6305: Hudi Delta Streamer unable to read Older Dates

2022-08-15 Thread GitBox
nsivabalan commented on issue #6305: URL: https://github.com/apache/hudi/issues/6305#issuecomment-1216129737 CC @rmahindra123 gentle ping @alexeykudinkin

[jira] [Closed] (HUDI-4565) Docs writing for 0.12.0: archival beyond savepoint, bundle changes, presto update

2022-08-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-4565. - Resolution: Done > Docs writing for 0.12.0: archival beyond savepoint, bundle changes, presto > update >

[jira] [Closed] (HUDI-4307) Document version where replaced filegroups are not being filtered out

2022-08-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-4307. - Resolution: Done > Document version where replaced filegroups are not being filtered out >

[jira] [Commented] (HUDI-3013) Docs for Presto and Hudi

2022-08-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17580035#comment-17580035 ] Sagar Sumit commented on HUDI-3013: --- [https://prestodb.io/docs/current/connector/hudi.html]

[jira] [Closed] (HUDI-3013) Docs for Presto and Hudi

2022-08-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-3013. - Resolution: Done > Docs for Presto and Hudi > > > Key: HUDI-3013

[jira] [Updated] (HUDI-3013) Docs for Presto and Hudi

2022-08-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3013: -- Status: Patch Available (was: In Progress) > Docs for Presto and Hudi > > >

[jira] [Closed] (HUDI-4580) [DOCS] Update quickstart: Spark SQL create table statement fails with "partitioned by"

2022-08-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-4580. - Resolution: Done > [DOCS] Update quickstart: Spark SQL create table statement fails with > "partitioned

[jira] [Updated] (HUDI-4580) [DOCS] Update quickstart: Spark SQL create table statement fails with "partitioned by"

2022-08-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-4580: -- Status: Patch Available (was: In Progress) > [DOCS] Update quickstart: Spark SQL create table

[jira] [Updated] (HUDI-4580) [DOCS] Update quickstart: Spark SQL create table statement fails with "partitioned by"

2022-08-15 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-4580: -- Status: In Progress (was: Open) > [DOCS] Update quickstart: Spark SQL create table statement fails

[GitHub] [hudi] nsivabalan commented on issue #6226: [SUPPORT] OCC locks with data on S3 and DynamoDB fails to acquire

2022-08-15 Thread GitBox
nsivabalan commented on issue #6226: URL: https://github.com/apache/hudi/issues/6226#issuecomment-1216128068 @atharvai : can you respond to @zhedoubushishi 's request above when you get a chance.

[GitHub] [hudi] nsivabalan commented on issue #6224: [SUPPORT] Caused by: java.lang.IllegalArgumentException: Cannot use marker based rollback strategy on completed instant

2022-08-15 Thread GitBox
nsivabalan commented on issue #6224: URL: https://github.com/apache/hudi/issues/6224#issuecomment-1216127634 closing the issue as we have the fix with 0.12. thanks for reaching out. appreciate you keeping the community buzzing.

[GitHub] [hudi] nsivabalan closed issue #6224: [SUPPORT] Caused by: java.lang.IllegalArgumentException: Cannot use marker based rollback strategy on completed instant

2022-08-15 Thread GitBox
nsivabalan closed issue #6224: [SUPPORT] Caused by: java.lang.IllegalArgumentException: Cannot use marker based rollback strategy on completed instant URL: https://github.com/apache/hudi/issues/6224

[GitHub] [hudi] codope merged pull request #6370: [HUDI-4565][HUDI-4307][HUDI-3013] Updated docs for presto-hudi integration

2022-08-15 Thread GitBox
codope merged PR #6370: URL: https://github.com/apache/hudi/pull/6370

[GitHub] [hudi] nsivabalan commented on issue #6342: [SUPPORT] Reconcile schema fails when multiple fields missing

2022-08-15 Thread GitBox
nsivabalan commented on issue #6342: URL: https://github.com/apache/hudi/issues/6342#issuecomment-1216126792 closing this as not feasible. Feel free to raise new issue if you have more. thanks for keeping the community buzzing.

[GitHub] [hudi] nsivabalan closed issue #6342: [SUPPORT] Reconcile schema fails when multiple fields missing

2022-08-15 Thread GitBox
nsivabalan closed issue #6342: [SUPPORT] Reconcile schema fails when multiple fields missing URL: https://github.com/apache/hudi/issues/6342

[GitHub] [hudi] nsivabalan closed issue #6343: [SUPPORT] Reconcile schema fails with promoting field

2022-08-15 Thread GitBox
nsivabalan closed issue #6343: [SUPPORT] Reconcile schema fails with promoting field URL: https://github.com/apache/hudi/issues/6343

[GitHub] [hudi] nsivabalan commented on issue #6343: [SUPPORT] Reconcile schema fails with promoting field

2022-08-15 Thread GitBox
nsivabalan commented on issue #6343: URL: https://github.com/apache/hudi/issues/6343#issuecomment-1216126705 closing this as not feasible. Feel free to raise new issue if you have more. thanks for keeping the community buzzing.

[GitHub] [hudi] nsivabalan closed issue #6316: [SUPPORT] Running `--continuous` mode with HoodieMultiTableDeltaStreamer seems to only ingest first table

2022-08-15 Thread GitBox
nsivabalan closed issue #6316: [SUPPORT] Running `--continuous` mode with HoodieMultiTableDeltaStreamer seems to only ingest first table URL: https://github.com/apache/hudi/issues/6316

[GitHub] [hudi] nsivabalan commented on issue #6316: [SUPPORT] Running `--continuous` mode with HoodieMultiTableDeltaStreamer seems to only ingest first table

2022-08-15 Thread GitBox
nsivabalan commented on issue #6316: URL: https://github.com/apache/hudi/issues/6316#issuecomment-1216125929 https://issues.apache.org/jira/browse/HUDI-1881 Let's track the updates from jira. thanks for reporting.

[GitHub] [hudi] nsivabalan commented on issue #6316: [SUPPORT] Running `--continuous` mode with HoodieMultiTableDeltaStreamer seems to only ingest first table

2022-08-15 Thread GitBox
nsivabalan commented on issue #6316: URL: https://github.com/apache/hudi/issues/6316#issuecomment-1216125257 @yihua : if you have a tracking jira, can you post it here.

[GitHub] [hudi] nsivabalan closed issue #6367: [SUPPORT] Failed Job - doing partition and writing data - in Hudi 0.11.0

2022-08-15 Thread GitBox
nsivabalan closed issue #6367: [SUPPORT] Failed Job - doing partition and writing data - in Hudi 0.11.0 URL: https://github.com/apache/hudi/issues/6367

[GitHub] [hudi] nsivabalan commented on issue #6367: [SUPPORT] Failed Job - doing partition and writing data - in Hudi 0.11.0

2022-08-15 Thread GitBox
nsivabalan commented on issue #6367: URL: https://github.com/apache/hudi/issues/6367#issuecomment-1216124839 Will close out the github issue for now. let us know if you have any more questions. thanks for reaching out.

[GitHub] [hudi] nsivabalan commented on issue #6403: [SUPPORT] java.lang.IllegalStateException: Duplicate key Option{val=org.apache.hudi.common.HoodiePendingRollbackInfo

2022-08-15 Thread GitBox
nsivabalan commented on issue #6403: URL: https://github.com/apache/hudi/issues/6403#issuecomment-1216124267 We have made some fixes around this. https://github.com/apache/hudi/pull/4123 https://github.com/apache/hudi/pull/4971 If you upgrade to 0.11, you should not see above

[GitHub] [hudi] nsivabalan commented on issue #6389: [SUPPORT] HELP :: Using TWO FIELDS to precombine :: 'hoodie.datasource.write.precombine.field': "column1,column2"

2022-08-15 Thread GitBox
nsivabalan commented on issue #6389: URL: https://github.com/apache/hudi/issues/6389#issuecomment-1216119319 Unfortunately, there is no out of the box solution to use two fields as preCombine for now.
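
Since the reply names no built-in option, one possible workaround (an assumption here, not something the thread confirms) is to derive a single comparable column from the two fields before writing and point the precombine field at that column. The sketch below packs two non-negative integers into one integer whose ordering matches the ordering of the pair; `composite_ordering_key` and its `secondary_bits` parameter are hypothetical names, not Hudi APIs.

```python
def composite_ordering_key(primary: int, secondary: int, secondary_bits: int = 32) -> int:
    """Pack (primary, secondary) into one int that sorts like the tuple.

    Both values must be non-negative, and secondary must fit in secondary_bits.
    """
    if primary < 0 or secondary < 0:
        raise ValueError("both fields must be non-negative")
    if secondary >= (1 << secondary_bits):
        raise ValueError("secondary does not fit in secondary_bits")
    # The primary field occupies the high bits, so it dominates comparisons;
    # the secondary field only breaks ties between equal primaries.
    return (primary << secondary_bits) | secondary


# Hypothetical (column1, column2) pairs, for illustration only.
pairs = [(2, 1), (1, 5), (2, 0), (1, 0)]
by_composite = sorted(pairs, key=lambda p: composite_ordering_key(*p))
```

The derived value would be written as an extra column and used as the single precombine field; this only works when both inputs are bounded non-negative integers (for example, epoch timestamps and sequence numbers).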

[GitHub] [hudi] JoshuaZhuCN opened a new issue, #6405: [SUPPORT] Hoodie table not found in path Unable to find a hudi table for the user provided paths.

2022-08-15 Thread GitBox
JoshuaZhuCN opened a new issue, #6405: URL: https://github.com/apache/hudi/issues/6405 What causes this error when using Spark SQL to create a table? Error: Hoodie table not found in path Unable to find a hudi table for the user provided paths. SQL DDL like: ``` CREATE

[GitHub] [hudi] Hexiaoqiao commented on a diff in pull request #6384: [HUDI-4613] Avoid the use of regex expressions when call hoodieFileGroup#addLogFile function

2022-08-15 Thread GitBox
Hexiaoqiao commented on code in PR #6384: URL: https://github.com/apache/hudi/pull/6384#discussion_r946309161 ## hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java: ## @@ -64,19 +64,17 @@ import java.util.function.Function; import java.util.function.Predicate;

[GitHub] [hudi] hudi-bot commented on pull request #6312: [HUDI-4551] The default value of READ_TASKS, WRITE_TASKS, CLUSTERING_TASKS is the parallelism of the execution environment

2022-08-15 Thread GitBox
hudi-bot commented on PR #6312: URL: https://github.com/apache/hudi/pull/6312#issuecomment-1216100427 ## CI report: * c13c77b28b3dcd95461dbb19ab6d0caf2c0c0dc7 Azure:

[GitHub] [hudi] SteNicholas commented on pull request #6312: [HUDI-4551] The default value of READ_TASKS, WRITE_TASKS, CLUSTERING_TASKS is the parallelism of the execution environment

2022-08-15 Thread GitBox
SteNicholas commented on PR #6312: URL: https://github.com/apache/hudi/pull/6312#issuecomment-1216098971 @danny0405, I have applied the above patch. PTAL.

[GitHub] [hudi] hudi-bot commented on pull request #6396: [HUDI-4621] all data fill in the same bucket because not check INDEX_KEY_FIELD

2022-08-15 Thread GitBox
hudi-bot commented on PR #6396: URL: https://github.com/apache/hudi/pull/6396#issuecomment-1216098307 ## CI report: * b8b7d966955c6b731ddc00b28300c18c2a630651 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6312: [HUDI-4551] The default value of READ_TASKS, WRITE_TASKS, CLUSTERING_TASKS is the parallelism of the execution environment

2022-08-15 Thread GitBox
hudi-bot commented on PR #6312: URL: https://github.com/apache/hudi/pull/6312#issuecomment-1216098215 ## CI report: * c13c77b28b3dcd95461dbb19ab6d0caf2c0c0dc7 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6396: [HUDI-4621] all data fill in the same bucket because not check INDEX_KEY_FIELD

2022-08-15 Thread GitBox
hudi-bot commented on PR #6396: URL: https://github.com/apache/hudi/pull/6396#issuecomment-1216096302 ## CI report: * b8b7d966955c6b731ddc00b28300c18c2a630651 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6312: [HUDI-4551] The default value of READ_TASKS, WRITE_TASKS, CLUSTERING_TASKS is the parallelism of the execution environment

2022-08-15 Thread GitBox
hudi-bot commented on PR #6312: URL: https://github.com/apache/hudi/pull/6312#issuecomment-1216096212 ## CI report: * c13c77b28b3dcd95461dbb19ab6d0caf2c0c0dc7 Azure:

[GitHub] [hudi] Zhangshunyu commented on issue #6398: [SUPPORT] Metadata table throws hbase exceptions

2022-08-15 Thread GitBox
Zhangshunyu commented on issue #6398: URL: https://github.com/apache/hudi/issues/6398#issuecomment-1216095469 the version of hbase api is 2.4.9

[GitHub] [hudi] Zhangshunyu closed issue #6380: [SUPPORT] Will clustering update metadata table?

2022-08-15 Thread GitBox
Zhangshunyu closed issue #6380: [SUPPORT] Will clustering update metadata table? URL: https://github.com/apache/hudi/issues/6380

[GitHub] [hudi] Zhangshunyu commented on issue #6380: [SUPPORT] Will clustering update metadata table?

2022-08-15 Thread GitBox
Zhangshunyu commented on issue #6380: URL: https://github.com/apache/hudi/issues/6380#issuecomment-1216093360 > yes, metadata table is updated. clustering has been tested w/ metadata as well. Do you see any strange behavior ? which version of hudi are you using. OK, thanks for

[GitHub] [hudi] SteNicholas commented on pull request #6312: [HUDI-4551] The default value of READ_TASKS, WRITE_TASKS, CLUSTERING_TASKS is the parallelism of the execution environment

2022-08-15 Thread GitBox
SteNicholas commented on PR #6312: URL: https://github.com/apache/hudi/pull/6312#issuecomment-1216081958 @danny0405, I have applied the above patch. PTAL.

[GitHub] [hudi] danny0405 commented on pull request #6401: [HUDI-4623] Write mor log by suffix for different flink jobs

2022-08-15 Thread GitBox
danny0405 commented on PR #6401: URL: https://github.com/apache/hudi/pull/6401#issuecomment-1216080119 Can we explain a little why we need a special suffix for the log? We do not even support multi-writer concurrency on the Flink side now.

[jira] [Commented] (HUDI-4574) Failed to create timeline-server marker due to HoodieRemoteException

2022-08-15 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17580011#comment-17580011 ] Danny Chen commented on HUDI-4574: -- One fix via master branch: bad954c3b27b9e5236b93c6b0e2be219337fa179

[GitHub] [hudi] danny0405 merged pull request #6383: [HUDI-4574] Fixed timeline based marker thread safety issue

2022-08-15 Thread GitBox
danny0405 merged PR #6383: URL: https://github.com/apache/hudi/pull/6383

[hudi] branch master updated: [HUDI-4574] Fixed timeline based marker thread safety issue (#6383)

2022-08-15 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new bad954c3b2 [HUDI-4574] Fixed timeline based

[jira] [Updated] (HUDI-4574) Failed to create timeline-server marker due to HoodieRemoteException

2022-08-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4574: - Labels: pull-request-available (was: ) > Failed to create timeline-server marker due to

[GitHub] [hudi] novisfff commented on a diff in pull request #6383: [HUDI-4574] Fixed timeline based marker thread safety issue

2022-08-15 Thread GitBox
novisfff commented on code in PR #6383: URL: https://github.com/apache/hudi/pull/6383#discussion_r946276905 ## hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/handlers/marker/MarkerCreationDispatchingRunnable.java: ## @@ -66,8 +66,9 @@ public void run() {

[GitHub] [hudi] danny0405 commented on a diff in pull request #6396: [HUDI-4621] all data fill in the same bucket because not check INDEX_KEY_FIELD

2022-08-15 Thread GitBox
danny0405 commented on code in PR #6396: URL: https://github.com/apache/hudi/pull/6396#discussion_r946274840 ## hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/TestHoodieTableFactory.java: ## @@ -293,6 +293,47 @@ void testSetupReadOptionsForSource() {

[GitHub] [hudi] danny0405 commented on a diff in pull request #6396: [HUDI-4621] all data fill in the same bucket because not check INDEX_KEY_FIELD

2022-08-15 Thread GitBox
danny0405 commented on code in PR #6396: URL: https://github.com/apache/hudi/pull/6396#discussion_r946273670 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java: ## @@ -207,10 +208,23 @@ private static void

[GitHub] [hudi] KnightChess commented on issue #6400: [SUPPORT] MergeInto syntax merge_condition does not support Non-Equal condition

2022-08-15 Thread GitBox
KnightChess commented on issue #6400: URL: https://github.com/apache/hudi/issues/6400#issuecomment-1216049140 Currently, Hudi does not support `Non-Equal` merge conditions in `merge sql`.

[jira] [Commented] (HUDI-741) Fix Hoodie's schema evolution checks

2022-08-15 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579978#comment-17579978 ] Alexey Kudinkin commented on HUDI-741: -- +1 on [~yx3...@gmail.com] comment: Dropping columns should not

[jira] [Updated] (HUDI-4586) Address S3 timeouts in Bloom Index with metadata table

2022-08-15 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4586: Description: For partitioned tables, there is a significant number of S3 request timeouts causing the

[GitHub] [hudi] hudi-bot commented on pull request #6358: [HUDI-4588] Fixing `HoodieParquetReader` to properly specify projected schema when reading Parquet file

2022-08-15 Thread GitBox
hudi-bot commented on PR #6358: URL: https://github.com/apache/hudi/pull/6358#issuecomment-1216025522 ## CI report: * 9cb5a7a62af7c2a6bf418b7556caa56348522a00 Azure:

[jira] [Updated] (HUDI-4586) Address S3 timeouts in Bloom Index with metadata table

2022-08-15 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4586: Attachment: Screen Shot 2022-08-15 at 17.39.01.png > Address S3 timeouts in Bloom Index with metadata table

[jira] [Updated] (HUDI-4586) Address S3 timeouts in Bloom Index with metadata table

2022-08-15 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4586: Description: For partitioned tables, there is a significant number of S3 request timeouts causing the

[jira] [Created] (HUDI-4626) Partitioning table by `_hoodie_partition_path` fails

2022-08-15 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-4626: - Summary: Partitioning table by `_hoodie_partition_path` fails Key: HUDI-4626 URL: https://issues.apache.org/jira/browse/HUDI-4626 Project: Apache Hudi

[GitHub] [hudi] hudi-bot commented on pull request #6386: [HUDI-4616] Adding `PulsarSource` to `DeltaStreamer` to support ingesting from Apache Pulsar

2022-08-15 Thread GitBox
hudi-bot commented on PR #6386: URL: https://github.com/apache/hudi/pull/6386#issuecomment-1215988196 ## CI report: * 2dae584421add6a89fbf4c7b51775581d37c03c5 Azure:

[hudi] branch master updated (2633e88a39 -> f5bb78bbef)

2022-08-15 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 2633e88a39 [HUDI-4608] Fix upgrade command in Hudi CLI (#6374) add f5bb78bbef [HUDI-4609] Improve usability of

[GitHub] [hudi] yihua merged pull request #6377: [HUDI-4609] Improve usability of upgrade/downgrade commands in Hudi CLI

2022-08-15 Thread GitBox
yihua merged PR #6377: URL: https://github.com/apache/hudi/pull/6377

[GitHub] [hudi] nsivabalan commented on a diff in pull request #5632: [HUDI-4122] Fix NPE caused by adding kafka nodes

2022-08-15 Thread GitBox
nsivabalan commented on code in PR #5632: URL: https://github.com/apache/hudi/pull/5632#discussion_r946216914 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java: ## @@ -287,6 +292,32 @@ public OffsetRange[] getNextOffsetRanges(Option

[GitHub] [hudi] hudi-bot commented on pull request #6358: [HUDI-4588] Fixing `HoodieParquetReader` to properly specify projected schema when reading Parquet file

2022-08-15 Thread GitBox
hudi-bot commented on PR #6358: URL: https://github.com/apache/hudi/pull/6358#issuecomment-1215948792 ## CI report: * 17ca1f72569520fecd9eeb509b9925da9134f898 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6358: [HUDI-4588] Fixing `HoodieParquetReader` to properly specify projected schema when reading Parquet file

2022-08-15 Thread GitBox
hudi-bot commented on PR #6358: URL: https://github.com/apache/hudi/pull/6358#issuecomment-1215945372 ## CI report: * 17ca1f72569520fecd9eeb509b9925da9134f898 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6377: [HUDI-4609] Improve usability of upgrade/downgrade commands in Hudi CLI

2022-08-15 Thread GitBox
hudi-bot commented on PR #6377: URL: https://github.com/apache/hudi/pull/6377#issuecomment-1215876750 ## CI report: * ee73326b4a8468f5460cca1a3b525eea42096b20 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6386: [HUDI-4616] Adding `PulsarSource` to `DeltaStreamer` to support ingesting from Apache Pulsar

2022-08-15 Thread GitBox
hudi-bot commented on PR #6386: URL: https://github.com/apache/hudi/pull/6386#issuecomment-1215876870 ## CI report: * 2dae584421add6a89fbf4c7b51775581d37c03c5 Azure:

[hudi] branch asf-site updated: [HUDI-4580][DOCS] Update docs of Spark SQL create table statement (#6402)

2022-08-15 Thread xushiyan
This is an automated email from the ASF dual-hosted git repository. xushiyan pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 0d73341513 [HUDI-4580][DOCS] Update docs of

[GitHub] [hudi] xushiyan merged pull request #6402: [HUDI-4580][DOCS] Update docs of Spark SQL create table statement

2022-08-15 Thread GitBox
xushiyan merged PR #6402: URL: https://github.com/apache/hudi/pull/6402

[GitHub] [hudi] hudi-bot commented on pull request #6386: [HUDI-4616] Adding `PulsarSource` to `DeltaStreamer` to support ingesting from Apache Pulsar

2022-08-15 Thread GitBox
hudi-bot commented on PR #6386: URL: https://github.com/apache/hudi/pull/6386#issuecomment-1215864807 ## CI report: * 2dae584421add6a89fbf4c7b51775581d37c03c5 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #6377: [HUDI-4609] Improve usability of upgrade/downgrade commands in Hudi CLI

2022-08-15 Thread GitBox
hudi-bot commented on PR #6377: URL: https://github.com/apache/hudi/pull/6377#issuecomment-1215864616 ## CI report: * ee73326b4a8468f5460cca1a3b525eea42096b20 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6377: [HUDI-4609] Improve usability of upgrade/downgrade commands in Hudi CLI

2022-08-15 Thread GitBox
hudi-bot commented on PR #6377: URL: https://github.com/apache/hudi/pull/6377#issuecomment-1215852245 ## CI report: * ee73326b4a8468f5460cca1a3b525eea42096b20 Azure:

[GitHub] [hudi] yihua commented on pull request #6377: [HUDI-4609] Improve usability of upgrade/downgrade commands in Hudi CLI

2022-08-15 Thread GitBox
yihua commented on PR #6377: URL: https://github.com/apache/hudi/pull/6377#issuecomment-1215780358 @hudi-bot run azure

[hudi] branch master updated (997200f27f -> 2633e88a39)

2022-08-15 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 997200f27f [MINOR] fix progress field calculate logic in HoodieLogRecordReader (#6291) add 2633e88a39 [HUDI-4608]

[GitHub] [hudi] yihua merged pull request #6374: [HUDI-4608] Fix upgrade command in Hudi CLI

2022-08-15 Thread GitBox
yihua merged PR #6374: URL: https://github.com/apache/hudi/pull/6374 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[hudi] branch asf-site updated: [HUDI-4579] Add docs on upgrading and downgrading table through CLI (#6376)

2022-08-15 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new a05eac89fd [HUDI-4579] Add docs on upgrading

[GitHub] [hudi] yihua merged pull request #6376: [HUDI-4579] Add docs on upgrading and downgrading table through CLI

2022-08-15 Thread GitBox
yihua merged PR #6376: URL: https://github.com/apache/hudi/pull/6376 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [hudi] alexeykudinkin commented on pull request #5111: [HUDI-3695] Add a ORC reader in HoodieBaseRelation

2022-08-15 Thread GitBox
alexeykudinkin commented on PR #5111: URL: https://github.com/apache/hudi/pull/5111#issuecomment-1215754809 @miomiocat can you please rebase this one? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [hudi] alexeykudinkin commented on pull request #6386: [WIP][HUDI-4616] Adding `PulsarSource` to `DeltaStreamer` to support ingesting from Apache Pulsar

2022-08-15 Thread GitBox
alexeykudinkin commented on PR #6386: URL: https://github.com/apache/hudi/pull/6386#issuecomment-1215750513 One issue we currently have is that Pulsar's current Spark Connector doesn't shut down cleanly (it doesn't shut down Netty's EVL thread-group, which keeps running even after client

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6386: [WIP][HUDI-4616] Adding `PulsarSource` to `DeltaStreamer` to support ingesting from Apache Pulsar

2022-08-15 Thread GitBox
alexeykudinkin commented on code in PR #6386: URL: https://github.com/apache/hudi/pull/6386#discussion_r946069343 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java: ## @@ -290,6 +292,7 @@ public OffsetRange[]

[jira] [Created] (HUDI-4625) Clean up KafkaOffsetGen

2022-08-15 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-4625: - Summary: Clean up KafkaOffsetGen Key: HUDI-4625 URL: https://issues.apache.org/jira/browse/HUDI-4625 Project: Apache Hudi Issue Type: Bug

[jira] [Created] (HUDI-4624) Make sure all DeltaStreamer Sources are Closeable

2022-08-15 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-4624: - Summary: Make sure all DeltaStreamer Sources are Closeable Key: HUDI-4624 URL: https://issues.apache.org/jira/browse/HUDI-4624 Project: Apache Hudi Issue

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #5629: [HUDI-3384][HUDI-3385] Spark specific file reader/writer.

2022-08-15 Thread GitBox
alexeykudinkin commented on code in PR #5629: URL: https://github.com/apache/hudi/pull/5629#discussion_r946077270 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/common/table/log/HoodieFileSliceReader.java: ## @@ -20,62 +20,33 @@ package

[GitHub] [hudi] ankitchandnani opened a new issue, #6404: [SUPPORT] Hudi Deltastreamer CSV ingestion issue

2022-08-15 Thread GitBox
ankitchandnani opened a new issue, #6404: URL: https://github.com/apache/hudi/issues/6404 ### Tips before filing an issue Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? Yes ### Describe the problem you faced Below is a sample chunk from a csv that

[GitHub] [hudi] yihua commented on a diff in pull request #6374: [HUDI-4608] Fix upgrade command in Hudi CLI

2022-08-15 Thread GitBox
yihua commented on code in PR #6374: URL: https://github.com/apache/hudi/pull/6374#discussion_r946055947 ## hudi-cli/src/test/java/org/apache/hudi/cli/commands/TestUpgradeDowngradeCommand.java: ## @@ -83,10 +88,32 @@ public void init() throws Exception {

[GitHub] [hudi] Armelabdelkbir opened a new issue, #6403: [SUPPORT] java.lang.IllegalStateException: Duplicate key Option{val=org.apache.hudi.common.HoodiePendingRollbackInfo

2022-08-15 Thread GitBox
Armelabdelkbir opened a new issue, #6403: URL: https://github.com/apache/hudi/issues/6403 Hello community, I'm using Hudi for change data capture with Spark Structured Streaming + Kafka + Debezium. My jobs work well, but for the past few days I have been getting errors on some of my streams :

[GitHub] [hudi] yihua commented on a diff in pull request #6376: [HUDI-4579] Add docs on upgrading and downgrading table through CLI

2022-08-15 Thread GitBox
yihua commented on code in PR #6376: URL: https://github.com/apache/hudi/pull/6376#discussion_r946050350 ## website/docs/cli.md: ## @@ -419,4 +487,66 @@ savepoints show savepoint rollback --savepoint 20220128160245447 --sparkMaster local[2] ``` +### Upgrade and Downgrade

[GitHub] [hudi] yihua opened a new pull request, #6402: [HUDI-4580][DOCS] Update docs of Spark SQL create table statement

2022-08-15 Thread GitBox
yihua opened a new pull request, #6402: URL: https://github.com/apache/hudi/pull/6402 ### Change Logs This PR updates the docs of Spark SQL create table statement. Based on #4584, the create table statement for an existing Hudi table does not require `partitioned by` or `options`

[jira] [Updated] (HUDI-4580) [DOCS] Update quickstart: Spark SQL create table statement fails with "partitioned by"

2022-08-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4580: - Labels: pull-request-available (was: ) > [DOCS] Update quickstart: Spark SQL create table

[GitHub] [hudi] yihua commented on a diff in pull request #6383: [HUDI-4607]fixed timeline based marker thread safaty issue

2022-08-15 Thread GitBox
yihua commented on code in PR #6383: URL: https://github.com/apache/hudi/pull/6383#discussion_r946041860 ## hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/handlers/marker/MarkerCreationDispatchingRunnable.java: ## @@ -66,8 +66,9 @@ public void run() {

[GitHub] [hudi] nsivabalan commented on issue #6367: [SUPPORT] Failed Job - doing partition and writing data - in Hudi 0.11.0

2022-08-15 Thread GitBox
nsivabalan commented on issue #6367: URL: https://github.com/apache/hudi/issues/6367#issuecomment-1215514513 thanks for confirming. wrt enabling debug logs, I know the usual way to enable for any spark job. ``` --conf

[GitHub] [hudi] nsivabalan commented on issue #6342: [SUPPORT] Reconcile schema fails when multiple fields missing

2022-08-15 Thread GitBox
nsivabalan commented on issue #6342: URL: https://github.com/apache/hudi/issues/6342#issuecomment-1215479286 Reconciliation of schemas works only if the missing columns are at the end. For example: Commit1: Schema: col1, col2, col3, col4 Commit2: Schema: col1, col2, col3,

[GitHub] [hudi] nsivabalan commented on issue #6343: [SUPPORT] Reconcile schema fails with promoting field

2022-08-15 Thread GitBox
nsivabalan commented on issue #6343: URL: https://github.com/apache/hudi/issues/6343#issuecomment-1215471603 Long to Integer is not a backward-compatible data type promotion. I don't think Hudi supports this promotion. Let us know if you have any more questions. -- This is an

[GitHub] [hudi] hudi-bot commented on pull request #6401: [HUDI-4623] Write mor log by suffix for different flink jobs

2022-08-15 Thread GitBox
hudi-bot commented on PR #6401: URL: https://github.com/apache/hudi/pull/6401#issuecomment-1215362490 ## CI report: * 19f985687513ba5e5c450201fc05932ddfc189e8 UNKNOWN * 326ea240eb871d20001616da955a58ac12713ca7 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6365: [HUDI-4601] read error from MOR table after compaction

2022-08-15 Thread GitBox
hudi-bot commented on PR #6365: URL: https://github.com/apache/hudi/pull/6365#issuecomment-1215337975 ## CI report: * 3b63d1c4e110296732cb5042dd97a2fa4e715bfd Azure:
