Re: [I] Tracking ticket for folks to be added to slack group [hudi]

2024-01-29 Thread via GitHub
moweiyang0214 commented on issue #143: URL: https://github.com/apache/hudi/issues/143#issuecomment-1916257093 hi,can you add me to the slack? gabriellawang...@gmail.com -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [PR] Hudi 6868 - Support extracting passwords from credential store for Hive Sync [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10577: URL: https://github.com/apache/hudi/pull/10577#issuecomment-1916253667 ## CI report: * ff12f8a7d10731760db2cfab799618a406507979 Azure:

Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1916253424 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN *

Re: [I] [SUPPORT] Querying Hudi tables with Spark+Velox(C++), ObjectSizeCalculator.getObjectSize hangs causing about a 50-second delay in queries [hudi]

2024-01-29 Thread via GitHub
majian1998 commented on issue #10580: URL: https://github.com/apache/hudi/issues/10580#issuecomment-1916204120 @KnightChess It seems like you encountered this issue just once and got stuck for a long time, right? On my end, I can consistently reproduce the problem, but it only gets stuck

Re: [PR] Hudi 6868 - Support extracting passwords from credential store for Hive Sync [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10577: URL: https://github.com/apache/hudi/pull/10577#issuecomment-1916198453 ## CI report: * ff12f8a7d10731760db2cfab799618a406507979 Azure:

Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1916198206 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN *

Re: [PR] [HUDI-6939] Add async archiving for Flink [hudi]

2024-01-29 Thread via GitHub
zhuanshenbsj1 commented on PR #9854: URL: https://github.com/apache/hudi/pull/9854#issuecomment-1916190926 We need to add this PR to 0.14.2. @danny0405 Async archiving for Flink cannot be performed in 0.14.1.

Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1916190235 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN *

Re: [I] Upsert method not working while using "Record level index" in Apache Hudi 0.14 through EMR 6.15 [hudi]

2024-01-29 Thread via GitHub
ad1happy2go commented on issue #10587: URL: https://github.com/apache/hudi/issues/10587#issuecomment-1916163491 @SudhirSaxena As the error suggests, you are writing to an existing Hudi table for which the database name is different. Can you confirm the value of tgt_db?

[jira] [Updated] (HUDI-7359) Add serialVersionUID in org.apache.hudi.common.model.HoodieRecord

2024-01-29 Thread wangguo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangguo updated HUDI-7359: -- Affects Version/s: 0.14.1 0.14.0 0.13.0

[jira] [Created] (HUDI-7359) Add serialVersionUID in org.apache.hudi.common.model.HoodieRecord

2024-01-29 Thread wangguo (Jira)
wangguo created HUDI-7359: - Summary: Add serialVersionUID in org.apache.hudi.common.model.HoodieRecord Key: HUDI-7359 URL: https://issues.apache.org/jira/browse/HUDI-7359 Project: Apache Hudi Issue

Re: [PR] [HUDI-7340] Use spillable map for cached log records in HoodieBaseFileGroupRecordBuffer [hudi]

2024-01-29 Thread via GitHub
linliu-code commented on code in PR #10588: URL: https://github.com/apache/hudi/pull/10588#discussion_r1470621041 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieFileGroupReaderBasedParquetFileFormat.scala: ##

Re: [PR] [HUDI-7340] Use spillable map for cached log records in HoodieBaseFileGroupRecordBuffer [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10588: URL: https://github.com/apache/hudi/pull/10588#issuecomment-1916119837 ## CI report: * 5773aa616a22e32db4cb07ce361ce7e9a7c728e0 Azure:

Re: [PR] [HUDI-7357] Introduce generic StorageConfiguration [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10586: URL: https://github.com/apache/hudi/pull/10586#issuecomment-1916119795 ## CI report: * e6a99b7319648fce943abc73b460239350ff18d3 Azure:

Re: [I] [SUPPORT] AWS Athena query fail when compaction is scheduled for MOR table [hudi]

2024-01-29 Thread via GitHub
brightwon commented on issue #9907: URL: https://github.com/apache/hudi/issues/9907#issuecomment-1916086956 @ad1happy2go Sorry I'm late. The error log above is everything the Athena console provides. To fix this problem, does Athena's Hudi version need to be upgraded?

Re: [PR] [HUDI-7340] Use spillable map for cached log records in HoodieBaseFileGroupRecordBuffer [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10588: URL: https://github.com/apache/hudi/pull/10588#issuecomment-1916078225 ## CI report: * 5773aa616a22e32db4cb07ce361ce7e9a7c728e0 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-7340] Use spillable map for cached log records in HoodieBaseFileGroupRecordBuffer [hudi]

2024-01-29 Thread via GitHub
danny0405 commented on code in PR #10588: URL: https://github.com/apache/hudi/pull/10588#discussion_r1470587289 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieFileGroupReaderBasedParquetFileFormat.scala: ## @@

[jira] [Updated] (HUDI-7340) Use spillable map for cached log records in HoodieBaseFileGroupRecordBuffer

2024-01-29 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7340: - Labels: pull-request-available (was: ) > Use spillable map for cached log records in

[PR] [HUDI-7340] Use spillable map for cached log records in HoodieBaseFileGroupRecordBuffer [hudi]

2024-01-29 Thread via GitHub
danny0405 opened a new pull request, #10588: URL: https://github.com/apache/hudi/pull/10588 ### Change Logs should use the spillable map instead. ### Impact _Describe any public API or user-facing feature change or any performance impact._ ### Risk level (write
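The pull request above asks for a spillable map to hold cached log records. As a rough illustration of the general idea only, here is a minimal Java sketch: entries stay in memory up to a fixed count, and later entries are serialized to temporary files. The class name `TinySpillableMap`, its methods, and the entry-count threshold are all assumptions made for this sketch; it is not Hudi's implementation and does not reflect the code in the PR.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: keeps at most maxInMemory entries on the heap and
// writes every further entry to its own temporary file.
final class TinySpillableMap<K, V extends Serializable> {
    private final int maxInMemory;
    private final Map<K, V> memory = new HashMap<>();
    private final Map<K, Path> spilled = new HashMap<>();
    private final Path dir = createSpillDir();

    TinySpillableMap(int maxInMemory) {
        this.maxInMemory = maxInMemory;
    }

    private static Path createSpillDir() {
        try {
            return Files.createTempDirectory("spill-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void put(K key, V value) {
        // A key already held in memory is updated in place; a new key goes
        // to memory only while there is room and it was not spilled before.
        boolean fitsInMemory = memory.containsKey(key)
                || (!spilled.containsKey(key) && memory.size() < maxInMemory);
        if (fitsInMemory) {
            memory.put(key, value);
            return;
        }
        try {
            Path file = spilled.get(key);
            if (file == null) {
                file = Files.createTempFile(dir, "entry-", ".bin");
                file.toFile().deleteOnExit();
                spilled.put(key, file);
            }
            try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
                out.writeObject(value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    V get(K key) {
        V inMemory = memory.get(key);
        if (inMemory != null) {
            return inMemory;
        }
        Path file = spilled.get(key);
        if (file == null) {
            return null; // key was never stored
        }
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            @SuppressWarnings("unchecked")
            V value = (V) in.readObject();
            return value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    int inMemoryCount() {
        return memory.size();
    }

    int spilledCount() {
        return spilled.size();
    }
}

final class TinySpillableMapDemo {
    public static void main(String[] args) {
        TinySpillableMap<String, String> records = new TinySpillableMap<>(2);
        records.put("key-1", "record-1");
        records.put("key-2", "record-2");
        records.put("key-3", "record-3"); // exceeds the budget, goes to disk
        System.out.println(records.inMemoryCount() + " in memory, "
                + records.spilledCount() + " spilled");
        System.out.println(records.get("key-3"));
    }
}
```

A production spillable map would normally budget by estimated byte size rather than entry count and would clean up its files explicitly; the count-based threshold here only keeps the sketch short.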

Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1916026401 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN *

Re: [PR] [HUDI-7347][Stacked on HUDI-7335] Introduce SeekableDataInputStream for random access [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10575: URL: https://github.com/apache/hudi/pull/10575#issuecomment-1916021664 ## CI report: * 24d06d5c92ebb9ef98c4689365eabd1e197c7197 Azure:

Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1916021555 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN *

Re: [PR] [HUDI-7343] Replace Path.SEPARATOR with HoodieLocation.SEPARATOR [hudi]

2024-01-29 Thread via GitHub
danny0405 commented on code in PR #10570: URL: https://github.com/apache/hudi/pull/10570#discussion_r1470552546 ## hudi-cli/src/main/java/org/apache/hudi/cli/commands/ExportCommand.java: ## @@ -190,7 +191,7 @@ private int copyNonArchivedInstants(List instants, int limit, Str

[I] Upsert method not working while using "Record level index" in Apache Hudi 0.14 through EMR 6.15 [hudi]

2024-01-29 Thread via GitHub
SudhirSaxena opened a new issue, #10587: URL: https://github.com/apache/hudi/issues/10587 Hi, I am facing issues with upsert mode in Hudi 0.14 on EMR 6.15 (Spark 3.4.1) using the "Record level Index" (RLI). I see insert mode working as expected, but upsert mode is not working with existing

Re: [PR] [HUDI-7357] Introduce generic StorageConfiguration [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10586: URL: https://github.com/apache/hudi/pull/10586#issuecomment-1915988465 ## CI report: * e6a99b7319648fce943abc73b460239350ff18d3 Azure:

Re: [PR] [HUDI-7344] Use Java {Input/Output}Stream instead of FSData{Input/Output}Stream when possible [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10573: URL: https://github.com/apache/hudi/pull/10573#issuecomment-1915988414 ## CI report: * f0497f561388edb7339fa8afb2537fc36f3eef1a Azure:

Re: [PR] [HUDI-7343] Replace Path.SEPARATOR with HoodieLocation.SEPARATOR [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10570: URL: https://github.com/apache/hudi/pull/10570#issuecomment-1915988385 ## CI report: * 9da671c05be917089dd8d158be5900671e16a65f Azure:

Re: [PR] [HUDI-7184] Add IncrementalQueryAnalyzer for completion time based in… [hudi]

2024-01-29 Thread via GitHub
vinothchandar commented on code in PR #10255: URL: https://github.com/apache/hudi/pull/10255#discussion_r1470522333 ## hudi-common/src/main/java/org/apache/hudi/common/table/read/IncrementalQueryAnalyzer.java: ## @@ -0,0 +1,428 @@ +/* + * Licensed to the Apache Software

Re: [PR] [HUDI-7357] Introduce generic StorageConfiguration [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10586: URL: https://github.com/apache/hudi/pull/10586#issuecomment-1915982631 ## CI report: * e6a99b7319648fce943abc73b460239350ff18d3 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-7343] Replace Path.SEPARATOR with HoodieLocation.SEPARATOR [hudi]

2024-01-29 Thread via GitHub
yihua commented on code in PR #10570: URL: https://github.com/apache/hudi/pull/10570#discussion_r1470526275 ## hudi-cli/src/main/java/org/apache/hudi/cli/commands/ExportCommand.java: ## @@ -190,7 +191,7 @@ private int copyNonArchivedInstants(List instants, int limit, Str

Re: [PR] [HUDI-7045] Create parquet readers inside the reader context and implement schema.on.read in the filegroup reader in spark [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10278: URL: https://github.com/apache/hudi/pull/10278#issuecomment-1915974411 ## CI report: * d98b47625ecada36364aa02aa1496dafd330c6a9 UNKNOWN * ab0b2127349325a3c939fe65da9d8caaac0da018 UNKNOWN * dfb687aa6a2a353a93dd04d6a68eec1797103006 Azure:

Re: [PR] [WIP] [HUDI-6787] Implement the HoodieFileGroupReader API for Hive [hudi]

2024-01-29 Thread via GitHub
danny0405 commented on code in PR #10422: URL: https://github.com/apache/hudi/pull/10422#discussion_r1470524689 ## hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HiveHoodieReaderContext.java: ## @@ -0,0 +1,260 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[jira] [Updated] (HUDI-7358) Support more metadata services

2024-01-29 Thread liujinhui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liujinhui updated HUDI-7358: Description: Currently, jdbc catalog is supported. We may need more catalog support. hivecatalog, rest

Re: [PR] [HUDI-7347][Stacked on HUDI-7335] Introduce SeekableDataInputStream for random access [hudi]

2024-01-29 Thread via GitHub
danny0405 commented on code in PR #10575: URL: https://github.com/apache/hudi/pull/10575#discussion_r1470523474 ## hudi-hadoop-common/src/main/java/org/apache/hudi/hadoop/fs/HadoopSeekableDataInputStream.java: ## @@ -0,0 +1,48 @@ +/* + * Licensed to the Apache Software

[jira] [Created] (HUDI-7358) Support more metadata services

2024-01-29 Thread liujinhui (Jira)
liujinhui created HUDI-7358: --- Summary: Support more metadata services Key: HUDI-7358 URL: https://issues.apache.org/jira/browse/HUDI-7358 Project: Apache Hudi Issue Type: New Feature

Re: [PR] [HUDI-7347][Stacked on HUDI-7335] Introduce SeekableDataInputStream for random access [hudi]

2024-01-29 Thread via GitHub
yihua commented on code in PR #10575: URL: https://github.com/apache/hudi/pull/10575#discussion_r1470519754 ## hudi-hadoop-common/src/main/java/org/apache/hudi/hadoop/fs/HadoopSeekableDataInputStream.java: ## @@ -0,0 +1,48 @@ +/* + * Licensed to the Apache Software Foundation

[jira] [Updated] (HUDI-7357) Introduce generic StorageConfiguration

2024-01-29 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7357: - Labels: pull-request-available (was: ) > Introduce generic StorageConfiguration >

[PR] [HUDI-7357] Introduce generic StorageConfiguration [hudi]

2024-01-29 Thread via GitHub
yihua opened a new pull request, #10586: URL: https://github.com/apache/hudi/pull/10586 ### Change Logs This PR introduces the generic `StorageConfiguration` to store configuration for I/O with `HoodieStorage`. Given there's overhead of reinitializing Hadoop's `Configuration`

Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1915920579 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN *

Re: [PR] [HUDI-7347][Stacked on HUDI-7335] Introduce SeekableDataInputStream for random access [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10575: URL: https://github.com/apache/hudi/pull/10575#issuecomment-1915914349 ## CI report: * b990f6f3976e0e8374b159fbdb515fd154f5caf7 Azure:

Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1915914177 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN *

Re: [PR] [HUDI-7343] Replace Path.SEPARATOR with HoodieLocation.SEPARATOR [hudi]

2024-01-29 Thread via GitHub
danny0405 commented on code in PR #10570: URL: https://github.com/apache/hudi/pull/10570#discussion_r1470482734 ## hudi-cli/src/main/java/org/apache/hudi/cli/commands/ExportCommand.java: ## @@ -190,7 +191,7 @@ private int copyNonArchivedInstants(List instants, int limit, Str

Re: [PR] [HUDI-7347][Stacked on HUDI-7335] Introduce SeekableDataInputStream for random access [hudi]

2024-01-29 Thread via GitHub
danny0405 commented on code in PR #10575: URL: https://github.com/apache/hudi/pull/10575#discussion_r1470480291 ## hudi-hadoop-common/src/main/java/org/apache/hudi/hadoop/fs/HadoopSeekableDataInputStream.java: ## @@ -0,0 +1,48 @@ +/* + * Licensed to the Apache Software

Re: [PR] [HUDI-7045] Create parquet readers inside the reader context and implement schema.on.read in the filegroup reader in spark [hudi]

2024-01-29 Thread via GitHub
jonvex commented on code in PR #10278: URL: https://github.com/apache/hudi/pull/10278#discussion_r1470470333 ## hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/common/table/read/TestSpark35RecordPositionMetadataColumn.scala: ## @@ -1,147 +0,0 @@ -/* Review

Re: [PR] added new videos for hudi oss site [hudi]

2024-01-29 Thread via GitHub
nfarah86 commented on PR #10563: URL: https://github.com/apache/hudi/pull/10563#issuecomment-1915877966 @bhasudha updated. I also took the liberty of updating the other tags so they are consistent https://github.com/apache/hudi/assets/5392555/9de2fbc8-aa0d-4205-ad17-29cea62eeeb8

Re: [PR] [HUDI-7347][Stacked on HUDI-7335] Introduce SeekableDataInputStream for random access [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10575: URL: https://github.com/apache/hudi/pull/10575#issuecomment-1915865237 ## CI report: * b990f6f3976e0e8374b159fbdb515fd154f5caf7 Azure:

Re: [PR] [HUDI-7344] Use Java {Input/Output}Stream instead of FSData{Input/Output}Stream when possible [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10573: URL: https://github.com/apache/hudi/pull/10573#issuecomment-1915865170 ## CI report: * 18a71c4f2baf775d5ba4c34e289a6e0986051e7e Azure:

Re: [PR] [HUDI-7343] Replace Path.SEPARATOR with HoodieLocation.SEPARATOR [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10570: URL: https://github.com/apache/hudi/pull/10570#issuecomment-1915865121 ## CI report: * 2b57e0f308524c9495941f4c3dcff6e1e62220a3 Azure:

[jira] [Updated] (HUDI-7357) Introduce generic StorageConfiguration

2024-01-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7357: Priority: Blocker (was: Major) > Introduce generic StorageConfiguration >

[jira] [Created] (HUDI-7357) Introduce generic StorageConfiguration

2024-01-29 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-7357: --- Summary: Introduce generic StorageConfiguration Key: HUDI-7357 URL: https://issues.apache.org/jira/browse/HUDI-7357 Project: Apache Hudi Issue Type: Improvement

[jira] [Updated] (HUDI-7357) Introduce generic StorageConfiguration

2024-01-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7357: Fix Version/s: 1.0.0 > Introduce generic StorageConfiguration > -- > >

[jira] [Assigned] (HUDI-7357) Introduce generic StorageConfiguration

2024-01-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo reassigned HUDI-7357: --- Assignee: Ethan Guo > Introduce generic StorageConfiguration >

Re: [PR] [HUDI-7344] Use Java {Input/Output}Stream instead of FSData{Input/Output}Stream when possible [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10573: URL: https://github.com/apache/hudi/pull/10573#issuecomment-1915857211 ## CI report: * 18a71c4f2baf775d5ba4c34e289a6e0986051e7e Azure:

Re: [PR] [HUDI-7343] Replace Path.SEPARATOR with HoodieLocation.SEPARATOR [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10570: URL: https://github.com/apache/hudi/pull/10570#issuecomment-1915857145 ## CI report: * 2b57e0f308524c9495941f4c3dcff6e1e62220a3 Azure:

Re: [PR] [HUDI-7345] Remove usage of org.apache.hadoop.util.VersionUtil [hudi]

2024-01-29 Thread via GitHub
yihua commented on code in PR #10571: URL: https://github.com/apache/hudi/pull/10571#discussion_r1470429485 ## hudi-io/src/main/java/org/apache/hudi/common/util/StringUtils.java: ## @@ -33,8 +33,10 @@ */ public class StringUtils { - public static final char[] HEX_CHAR =

Re: [PR] added new videos for hudi oss site [hudi]

2024-01-29 Thread via GitHub
nfarah86 commented on PR #10563: URL: https://github.com/apache/hudi/pull/10563#issuecomment-1915822819 > @nfarah86 There are still tags for the previous videos with plural. For ex: > > * https://hudi.apache.org/videos/tags/deletes > *

Re: [PR] [WIP] [HUDI-6902] Create a dummy PR to trigger tests [hudi]

2024-01-29 Thread via GitHub
linliu-code commented on PR #10464: URL: https://github.com/apache/hudi/pull/10464#issuecomment-1915819242 Not needed anymore.

Re: [PR] [WIP] [HUDI-6902] Create a dummy PR to trigger tests [hudi]

2024-01-29 Thread via GitHub
linliu-code closed pull request #10464: [WIP] [HUDI-6902] Create a dummy PR to trigger tests URL: https://github.com/apache/hudi/pull/10464

Re: [PR] [HUDI-6902] Take a dump at hadoop-mr-java-client module [hudi]

2024-01-29 Thread via GitHub
linliu-code commented on PR #10534: URL: https://github.com/apache/hudi/pull/10534#issuecomment-1915818720 This is not needed anymore.

Re: [PR] [HUDI-6902] Take a dump at hadoop-mr-java-client module [hudi]

2024-01-29 Thread via GitHub
linliu-code closed pull request #10534: [HUDI-6902] Take a dump at hadoop-mr-java-client module URL: https://github.com/apache/hudi/pull/10534

(hudi) branch HUDI-7336-hoodie-storage-abstraction deleted (was 8b765537863)

2024-01-29 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a change to branch HUDI-7336-hoodie-storage-abstraction in repository https://gitbox.apache.org/repos/asf/hudi.git was 8b765537863 Address review comments and add public API annotations The revisions that

(hudi) branch master updated (565e7c566ed -> e9389ffde53)

2024-01-29 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 565e7c566ed [HUDI-6902] Disable a flaky test (#10551) add e9389ffde53 [HUDI-7346] Remove usage of

Re: [PR] [HUDI-7346] Remove usage of org.apache.hadoop.hbase.util.Bytes [hudi]

2024-01-29 Thread via GitHub
yihua merged PR #10574: URL: https://github.com/apache/hudi/pull/10574

(hudi) branch asf-site updated: took down the banner (#10585)

2024-01-29 Thread bhavanisudha
This is an automated email from the ASF dual-hosted git repository. bhavanisudha pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new a5a9c9c5533 took down the banner

Re: [PR] took down the banner [hudi]

2024-01-29 Thread via GitHub
bhasudha merged PR #10585: URL: https://github.com/apache/hudi/pull/10585

Re: [PR] [HUDI-7045] Create parquet readers inside the reader context and implement schema.on.read in the filegroup reader in spark [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10278: URL: https://github.com/apache/hudi/pull/10278#issuecomment-1915722406 ## CI report: * d98b47625ecada36364aa02aa1496dafd330c6a9 UNKNOWN * ab0b2127349325a3c939fe65da9d8caaac0da018 UNKNOWN * c95e415569c43fb60e97c4fb92b7df6fe833cf64 Azure:

Re: [PR] added new videos for hudi oss site [hudi]

2024-01-29 Thread via GitHub
bhasudha commented on PR #10563: URL: https://github.com/apache/hudi/pull/10563#issuecomment-1915715215 @nfarah86 There are still tags for the previous videos with plural. For ex: - https://hudi.apache.org/videos/tags/deletes - https://hudi.apache.org/videos/tags/bulk-inserts -

Re: [PR] [HUDI-7045] Create parquet readers inside the reader context and implement schema.on.read in the filegroup reader in spark [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10278: URL: https://github.com/apache/hudi/pull/10278#issuecomment-1915713150 ## CI report: * d98b47625ecada36364aa02aa1496dafd330c6a9 UNKNOWN * ab0b2127349325a3c939fe65da9d8caaac0da018 UNKNOWN * c95e415569c43fb60e97c4fb92b7df6fe833cf64 Azure:

Re: [PR] [HUDI-7045] Create parquet readers inside the reader context and implement schema.on.read in the filegroup reader in spark [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10278: URL: https://github.com/apache/hudi/pull/10278#issuecomment-1915705420 ## CI report: * d98b47625ecada36364aa02aa1496dafd330c6a9 UNKNOWN * ab0b2127349325a3c939fe65da9d8caaac0da018 UNKNOWN * c95e415569c43fb60e97c4fb92b7df6fe833cf64 Azure:

[PR] took down the banner [hudi]

2024-01-29 Thread via GitHub
nfarah86 opened a new pull request, #10585: URL: https://github.com/apache/hudi/pull/10585 took down the banner @bhasudha https://github.com/apache/hudi/assets/5392555/9e1994d7-4f06-43aa-abc2-94ff68940ba6

Re: [PR] [HUDI-7346] Remove usage of org.apache.hadoop.hbase.util.Bytes [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10574: URL: https://github.com/apache/hudi/pull/10574#issuecomment-1915638274 ## CI report: * 7210467ed560266f5c0f9ecb680c1b636ced0d27 UNKNOWN * 8b9054b382c2f8bd1c0246ae26a762b0c836922d UNKNOWN * bd687d133c4c1b3a2128254667c0bfd66c4fc082 Azure:

Re: [PR] [HUDI-7344] Use Java {Input/Output}Stream instead of FSData{Input/Output}Stream when possible [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10573: URL: https://github.com/apache/hudi/pull/10573#issuecomment-1915627647 ## CI report: * 18a71c4f2baf775d5ba4c34e289a6e0986051e7e Azure:

Re: [PR] [HUDI-7334] Remove EMBEDDED_KV_STORE based FSV usage in tests [hudi]

2024-01-29 Thread via GitHub
linliu-code commented on PR #10551: URL: https://github.com/apache/hudi/pull/10551#issuecomment-1915604986 > hey @linliu-code : if we have root caused it to embedded kv store, why not re-enable tests to use in-memory FSV. we should not keep them disabled. I see some of core index tests are

(hudi) branch asf-site updated: Improve left nav for better layout (#10584)

2024-01-29 Thread bhavanisudha
This is an automated email from the ASF dual-hosted git repository. bhavanisudha pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 2ca7b8faa60 Improve left nav for better

Re: [PR] Improve left nav for better layout [hudi]

2024-01-29 Thread via GitHub
bhasudha merged PR #10584: URL: https://github.com/apache/hudi/pull/10584

Re: [PR] [HUDI-7334] Remove EMBEDDED_KV_STORE based FSV usage in tests [hudi]

2024-01-29 Thread via GitHub
nsivabalan commented on PR #10551: URL: https://github.com/apache/hudi/pull/10551#issuecomment-1915561666 hey @linliu-code : if we have root caused it to embedded kv store, why not re-enable tests to use in-memory FSV. we should not keep them disabled. I see some of core index tests are

Re: [PR] [HUDI-7345] Remove usage of org.apache.hadoop.util.VersionUtil [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10571: URL: https://github.com/apache/hudi/pull/10571#issuecomment-1915534780 ## CI report: * 2efd13452af6b99766ddc95b2227353455bb2afd Azure:

Re: [PR] [HUDI-7045] Create parquet readers inside the reader context and implement schema.on.read in the filegroup reader in spark [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10278: URL: https://github.com/apache/hudi/pull/10278#issuecomment-1915533927 ## CI report: * d98b47625ecada36364aa02aa1496dafd330c6a9 UNKNOWN * ab0b2127349325a3c939fe65da9d8caaac0da018 UNKNOWN * 311548fb623934e115b27ec42bf155c484f71339 Azure:

Re: [PR] [HUDI-7045] Create parquet readers inside the reader context and implement schema.on.read in the filegroup reader in spark [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10278: URL: https://github.com/apache/hudi/pull/10278#issuecomment-1915463489 ## CI report: * d98b47625ecada36364aa02aa1496dafd330c6a9 UNKNOWN * ab0b2127349325a3c939fe65da9d8caaac0da018 UNKNOWN * 1f02516cda2e16c13c772a829f08da9095f3d357 Azure:

Re: [PR] Improve left nav for better layout [hudi]

2024-01-29 Thread via GitHub
bhasudha commented on PR #10584: URL: https://github.com/apache/hudi/pull/10584#issuecomment-1915447537 Tested locally. ![Screenshot 2024-01-29 at 11 49 21 AM](https://github.com/apache/hudi/assets/2179254/8aaaca94-7f18-4221-9231-538d58a087d8)

[PR] Improve left nav for better layout [hudi]

2024-01-29 Thread via GitHub
bhasudha opened a new pull request, #10584: URL: https://github.com/apache/hudi/pull/10584 ### Change Logs Change left navigation for better organization ### Impact site change ### Risk level (write none, low medium or high below) No risk. site update.

Re: [PR] [HUDI-7337] Implement MetricsReporter that reports metrics to M3 [hudi]

2024-01-29 Thread via GitHub
kbuci commented on code in PR #10565: URL: https://github.com/apache/hudi/pull/10565#discussion_r1470107376 ## hudi-client/hudi-client-common/pom.xml: ## @@ -120,6 +120,16 @@ io.prometheus simpleclient_pushgateway + + com.uber.m3 + tally-m3

Re: [PR] [HUDI-7346] Remove usage of org.apache.hadoop.hbase.util.Bytes [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10574: URL: https://github.com/apache/hudi/pull/10574#issuecomment-1915374757 ## CI report: * 7210467ed560266f5c0f9ecb680c1b636ced0d27 UNKNOWN * 2a39068e7f57e29a5a525e88e01c7248c0f1b649 Azure:

Re: [PR] [HUDI-7045] Create parquet readers inside the reader context and implement schema.on.read in the filegroup reader in spark [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10278: URL: https://github.com/apache/hudi/pull/10278#issuecomment-1915373820 ## CI report: * d98b47625ecada36364aa02aa1496dafd330c6a9 UNKNOWN * ab0b2127349325a3c939fe65da9d8caaac0da018 UNKNOWN * 1f02516cda2e16c13c772a829f08da9095f3d357 Azure:

Re: [PR] [HUDI-7346] Remove usage of org.apache.hadoop.hbase.util.Bytes [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10574: URL: https://github.com/apache/hudi/pull/10574#issuecomment-1915364217 ## CI report: * 7210467ed560266f5c0f9ecb680c1b636ced0d27 UNKNOWN * 2a39068e7f57e29a5a525e88e01c7248c0f1b649 Azure:

Re: [PR] [HUDI-7045] Create parquet readers inside the reader context and implement schema.on.read in the filegroup reader in spark [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10278: URL: https://github.com/apache/hudi/pull/10278#issuecomment-1915363286 ## CI report: * d98b47625ecada36364aa02aa1496dafd330c6a9 UNKNOWN * ab0b2127349325a3c939fe65da9d8caaac0da018 UNKNOWN * 1f02516cda2e16c13c772a829f08da9095f3d357 Azure:

Re: [PR] [HUDI-7346] Remove usage of org.apache.hadoop.hbase.util.Bytes [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10574: URL: https://github.com/apache/hudi/pull/10574#issuecomment-1915352965 ## CI report: * 7210467ed560266f5c0f9ecb680c1b636ced0d27 UNKNOWN * 2a39068e7f57e29a5a525e88e01c7248c0f1b649 Azure:

Re: [PR] [HUDI-7045] Create parquet readers inside the reader context and implement schema.on.read in the filegroup reader in spark [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10278: URL: https://github.com/apache/hudi/pull/10278#issuecomment-1915352131 ## CI report: * d98b47625ecada36364aa02aa1496dafd330c6a9 UNKNOWN * ab0b2127349325a3c939fe65da9d8caaac0da018 UNKNOWN * 1f02516cda2e16c13c772a829f08da9095f3d357 Azure:

Re: [I] [SUPPORT] parquet bloom filters not supported by hudi [hudi]

2024-01-29 Thread via GitHub
jonvex commented on issue #7117: URL: https://github.com/apache/hudi/issues/7117#issuecomment-1915347717 How does `accu.add(0)` increment when `override def add(v: Integer): Unit = _sum += v` -- This is an automated message from the Apache Git Service.

Re: [I] [SUPPORT] Unable to Set Database Name for Hive Metadata Sync using Flink SQL [hudi]

2024-01-29 Thread via GitHub
vkhoroshko closed issue #10583: [SUPPORT] Unable to Set Database Name for Hive Metadata Sync using Flink SQL URL: https://github.com/apache/hudi/issues/10583 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [I] [SUPPORT] Unable to Set Database Name for Hive Metadata Sync using Flink SQL [hudi]

2024-01-29 Thread via GitHub
vkhoroshko commented on issue #10583: URL: https://github.com/apache/hudi/issues/10583#issuecomment-1915333431 The correct option for Flink SQL is "hive-sync.db" and not "hive-sync.database". Closing the ticket. -- This is an automated message from the Apache Git Service. To

Re: [PR] [HUDI-7345] Remove usage of org.apache.hadoop.util.VersionUtil [hudi]

2024-01-29 Thread via GitHub
linliu-code commented on code in PR #10571: URL: https://github.com/apache/hudi/pull/10571#discussion_r1469996932 ## hudi-io/src/main/java/org/apache/hudi/common/util/StringUtils.java: ## @@ -33,8 +33,10 @@ */ public class StringUtils { - public static final char[]

Re: [PR] [HUDI-7344] Use Java {Input/Output}Stream instead of FSData{Input/Output}Stream when possible [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10573: URL: https://github.com/apache/hudi/pull/10573#issuecomment-1915274008 ## CI report: * 5d5e4e6b6b571c21ce650d45ebcc2071794d1681 Azure:

Re: [PR] [HUDI-7345] Remove usage of org.apache.hadoop.util.VersionUtil [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10571: URL: https://github.com/apache/hudi/pull/10571#issuecomment-1915273938 ## CI report: * af4ae7ecc4258f364209b40d233d8d09b1cae9d5 Azure:

Re: [PR] [HUDI-7045] Create parquet readers inside the reader context and implement schema.on.read in the filegroup reader in spark [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10278: URL: https://github.com/apache/hudi/pull/10278#issuecomment-1915273070 ## CI report: * d98b47625ecada36364aa02aa1496dafd330c6a9 UNKNOWN * ab0b2127349325a3c939fe65da9d8caaac0da018 UNKNOWN * 39f7d4b2e29686657f90a8dc497fb6093431f2dd Azure:

Re: [PR] [HUDI-7345] Remove usage of org.apache.hadoop.util.VersionUtil [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10571: URL: https://github.com/apache/hudi/pull/10571#issuecomment-1915261766 ## CI report: * af4ae7ecc4258f364209b40d233d8d09b1cae9d5 Azure:

Re: [PR] [HUDI-7344] Use Java {Input/Output}Stream instead of FSData{Input/Output}Stream when possible [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10573: URL: https://github.com/apache/hudi/pull/10573#issuecomment-1915261847 ## CI report: * 5d5e4e6b6b571c21ce650d45ebcc2071794d1681 Azure:

Re: [PR] [HUDI-7045] Create parquet readers inside the reader context and implement schema.on.read in the filegroup reader in spark [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10278: URL: https://github.com/apache/hudi/pull/10278#issuecomment-1915260790 ## CI report: * d98b47625ecada36364aa02aa1496dafd330c6a9 UNKNOWN * ab0b2127349325a3c939fe65da9d8caaac0da018 UNKNOWN * 39f7d4b2e29686657f90a8dc497fb6093431f2dd Azure:

Re: [PR] [WIP][HUDI-6472] fix spark sql does not ignore case [hudi]

2024-01-29 Thread via GitHub
hudi-bot commented on PR #10582: URL: https://github.com/apache/hudi/pull/10582#issuecomment-1915249814 ## CI report: * b2f4afe93e6c67f73bcfde03557268a047f422a1 Azure:
