Re: [PR] [HUDI-6207] spark support bucket index query for table with bucket index [hudi]

2024-01-10 Thread via GitHub
KnightChess commented on PR #10191: URL: https://github.com/apache/hudi/pull/10191#issuecomment-1886555643 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [MINOR] change hive/adb tool not auto create database default [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #9640: URL: https://github.com/apache/hudi/pull/9640#issuecomment-1886547221 ## CI report: * 74c21aab4c787bf7ec8a4e708d54e2baa62a96f8 Azure:

Re: [PR] [HUDI-7291] Pushing Down Partition Pruning Conditions to Column Stats Earlier During Data Skipping [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10485: URL: https://github.com/apache/hudi/pull/10485#issuecomment-1886526246 ## CI report: * c58ddb3ade3d8d54c0610991a0ad141330061b49 Azure:

Re: [PR] [MINOR] change hive/adb tool not auto create database default [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #9640: URL: https://github.com/apache/hudi/pull/9640#issuecomment-1886522080 ## CI report: * 74c21aab4c787bf7ec8a4e708d54e2baa62a96f8 Azure:

Re: [PR] [HUDI-6207] spark support bucket index query for table with bucket index [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10191: URL: https://github.com/apache/hudi/pull/10191#issuecomment-1886475580 ## CI report: * 2bc4ba3eac8d086da0ae5884bb0a536e3ee7957e Azure:

Re: [PR] [MINOR] change hive/adb tool not auto create database default [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #9640: URL: https://github.com/apache/hudi/pull/9640#issuecomment-1886456406 ## CI report: * 74c21aab4c787bf7ec8a4e708d54e2baa62a96f8 Azure:

Re: [PR] [HUDI-7291] Pushing Down Partition Pruning Conditions to Column Stats Earlier During Data Skipping [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10485: URL: https://github.com/apache/hudi/pull/10485#issuecomment-1886388255 ## CI report: * c58ddb3ade3d8d54c0610991a0ad141330061b49 Azure:

Re: [PR] [HUDI-6207] spark support bucket index query for table with bucket index [hudi]

2024-01-10 Thread via GitHub
KnightChess closed pull request #10191: [HUDI-6207] spark support bucket index query for table with bucket index URL: https://github.com/apache/hudi/pull/10191

Re: [I] [SUPPORT]New merger fails with NPE when schema evolution with string occurs [hudi]

2024-01-10 Thread via GitHub
ad1happy2go commented on issue #9005: URL: https://github.com/apache/hudi/issues/9005#issuecomment-1886362639 @parisni Closing out this issue as we already have solution implemented in this PR - https://github.com/apache/hudi/pull/9262

Re: [I] [SUPPORT]New merger fails with NPE when schema evolution with string occurs [hudi]

2024-01-10 Thread via GitHub
codope closed issue #9005: [SUPPORT]New merger fails with NPE when schema evolution with string occurs URL: https://github.com/apache/hudi/issues/9005

Re: [I] Hard deletion using deltastreamer [hudi]

2024-01-10 Thread via GitHub
ad1happy2go commented on issue #10483: URL: https://github.com/apache/hudi/issues/10483#issuecomment-1886353742 @Kangho-Lee This is happening because, in the new schema, `_hoodie_is_deleted` is a non-nullable column, which can't be evolved as existing data don't have this column and need to set as

Re: [PR] [HUDI-7291] Pushing Down Partition Pruning Conditions to Column Stats Earlier During Data Skipping [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10485: URL: https://github.com/apache/hudi/pull/10485#issuecomment-1886332817 ## CI report: * c58ddb3ade3d8d54c0610991a0ad141330061b49 Azure:

Re: [PR] [HUDI-7170] Implement HFile reader independent of HBase [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10241: URL: https://github.com/apache/hudi/pull/10241#issuecomment-1886332233 ## CI report: * 78c32e76253bb0db70f289bf88f9560545b5819e Azure:

Re: [PR] [] CVE-2023-44487 Upgrade jetty and exclude older jetty [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10223: URL: https://github.com/apache/hudi/pull/10223#issuecomment-1886332169 ## CI report: * 0b6fa381804f4ee6d3e1c6662da29dfd0b621603 Azure:

Re: [PR] [] CVE-2023-44487 Upgrade jetty and exclude older jetty [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10223: URL: https://github.com/apache/hudi/pull/10223#issuecomment-1886316447 ## CI report: * 0b6fa381804f4ee6d3e1c6662da29dfd0b621603 Azure:

Re: [PR] [HUDI-7291] Pushing Down Partition Pruning Conditions to Column Stats Earlier During Data Skipping [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10485: URL: https://github.com/apache/hudi/pull/10485#issuecomment-1886288752 ## CI report: * c58ddb3ade3d8d54c0610991a0ad141330061b49 Azure:

Re: [PR] [MINOR] Turning on publishing of test results to Azure Devops [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10477: URL: https://github.com/apache/hudi/pull/10477#issuecomment-1886287962 ## CI report: * b21e67457b312bea00ca5dad0255c618db7dc202 UNKNOWN * df19ca54e806eec3f9794f1548c4ba5e446adfc1 Azure:

Re: [PR] [HUDI-7286]flink get hudi index type ignore case sensitive. [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10476: URL: https://github.com/apache/hudi/pull/10476#issuecomment-1886287474 ## CI report: * 9b05b48912f52d3cd317c78be17b39af8e47225f Azure:

Re: [I] [SUPPORT] Hudi 0.13.1 on EMR, MOR table writer hangs intermittently with S3 read timeout error for column stats index [hudi]

2024-01-10 Thread via GitHub
ergophobiac commented on issue #10415: URL: https://github.com/apache/hudi/issues/10415#issuecomment-1886257556 Hello @ad1happy2go , We ran a test with the same configurations, just one addition: spark.hadoop.fs.s3a.connection.maximum=2000. (We found a resource saying the default on

Re: [I] Partitioning data into two keys is taking more time (10x) than partitioning into one key. [hudi]

2024-01-10 Thread via GitHub
maheshguptags commented on issue #10456: URL: https://github.com/apache/hudi/issues/10456#issuecomment-1886251792 @xicm Let me try to increase the number write task and for load and test the performance. thanks

Re: [PR] [HUDI-7291] Pushing Down Partition Pruning Conditions to Column Stats Earlier During Data Skipping [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10485: URL: https://github.com/apache/hudi/pull/10485#issuecomment-1886240515 ## CI report: * c58ddb3ade3d8d54c0610991a0ad141330061b49 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

[I] [SUPPORT] Flink write to COW Hudi table,hive aggregate query results has duplicate data but select * did not [hudi]

2024-01-10 Thread via GitHub
CamelliaYjli opened a new issue, #10486: URL: https://github.com/apache/hudi/issues/10486 **Describe the problem you faced** I use Flink to write a Hudi COW table and sync it to Hive, but Hive aggregate queries (e.g. count(*), row_number() over()) return duplicate data, while select *

[jira] [Updated] (HUDI-7291) Pushing Down Partition Pruning Conditions to Column Stats During Data Skipping

2024-01-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7291: - Labels: pull-request-available (was: ) > Pushing Down Partition Pruning Conditions to Column

[PR] [HUDI-7291] Pushing Down Partition Pruning Conditions to Column Stats During Data Skipping [hudi]

2024-01-10 Thread via GitHub
majian1998 opened a new pull request, #10485: URL: https://github.com/apache/hudi/pull/10485 In the current implementation of data skipping, column statistics for the entire table are read and then subjected to data skipping filtering operations based on these stats. When the table has a

Re: [PR] [HUDI-7170] Implement HFile reader independent of HBase [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10241: URL: https://github.com/apache/hudi/pull/10241#issuecomment-1886178804 ## CI report: * 78c32e76253bb0db70f289bf88f9560545b5819e Azure:

Re: [PR] [HUDI-6207] spark support bucket index query for table with bucket index [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10191: URL: https://github.com/apache/hudi/pull/10191#issuecomment-1886178722 ## CI report: * db7a22bac00d2670c5103ca28434fb9a2e1d1256 Azure:

Re: [PR] [MINOR] change hive/adb tool not auto create database default [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #9640: URL: https://github.com/apache/hudi/pull/9640#issuecomment-1886177404 ## CI report: * 29c01b1891b32969563683bf75c961a8cdf3ae7d Azure:

Re: [PR] [MINOR] change hive/adb tool not auto create database default [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #9640: URL: https://github.com/apache/hudi/pull/9640#issuecomment-1886172048 ## CI report: * 29c01b1891b32969563683bf75c961a8cdf3ae7d Azure:

Re: [PR] [HUDI-7170] Implement HFile reader independent of HBase [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10241: URL: https://github.com/apache/hudi/pull/10241#issuecomment-1886172424 ## CI report: * 78c32e76253bb0db70f289bf88f9560545b5819e UNKNOWN

Re: [PR] [HUDI-6207] spark support bucket index query for table with bucket index [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10191: URL: https://github.com/apache/hudi/pull/10191#issuecomment-1886172344 ## CI report: * db7a22bac00d2670c5103ca28434fb9a2e1d1256 Azure:

[jira] [Created] (HUDI-7291) Pushing Down Partition Pruning Conditions to Column Stats During Data Skipping

2024-01-10 Thread Ma Jian (Jira)
Ma Jian created HUDI-7291: - Summary: Pushing Down Partition Pruning Conditions to Column Stats During Data Skipping Key: HUDI-7291 URL: https://issues.apache.org/jira/browse/HUDI-7291 Project: Apache Hudi

Re: [PR] [] CVE-2023-44487 Upgrade jetty and exclude older jetty [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10223: URL: https://github.com/apache/hudi/pull/10223#issuecomment-1886166543 ## CI report: * 0b6fa381804f4ee6d3e1c6662da29dfd0b621603 Azure:

Re: [PR] [MINOR] Parallelized the check for existence of files in IncrementalRelation. [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10480: URL: https://github.com/apache/hudi/pull/10480#issuecomment-1886167359 ## CI report: * ddf449c27d0413d995d704d5b31d7da385c012a9 Azure:

[I] [SUPPORT] ClassCastException when upsert COW table with RECORD_INDEX index type [hudi]

2024-01-10 Thread via GitHub
lei-su-awx opened a new issue, #10484: URL: https://github.com/apache/hudi/issues/10484 I used Spark 3.4.1 and hudi 0.14.0 on GKE, streaming reading a hudi COW table(on GCS) and write to another hudi COW table(on GCS) with upsert(RECORD_INDEX), here is my write option:

Re: [PR] [HUDI-7288] Fix ArrayIndexOutOfBoundsException when upgrade unPartitionedTable created by 0.10/0.11 HUDI version [hudi]

2024-01-10 Thread via GitHub
bvaradar merged PR #10482: URL: https://github.com/apache/hudi/pull/10482

(hudi) branch master updated (d22bdc59843 -> 593ea85da20)

2024-01-10 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from d22bdc59843 [MINOR] Avoid resource leaks (#10345) add 593ea85da20 [HUDI-7288] Fix ArrayIndexOutOfBoundsException

Re: [PR] [HUDI-7170][WIP] Implement HFile reader independent of HBase [hudi]

2024-01-10 Thread via GitHub
yihua commented on code in PR #10241: URL: https://github.com/apache/hudi/pull/10241#discussion_r1448232660 ## hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileRootIndexBlock.java: ## @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] [HUDI-7286]flink get hudi index type ignore case sensitive. [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10476: URL: https://github.com/apache/hudi/pull/10476#issuecomment-1886124698 ## CI report: * d518fd9c2e4d3ba255429a6f922942d7b4ade1a3 Azure:

Re: [PR] [HUDI-7288] Fix ArrayIndexOutOfBoundsException when upgrade unPartitionedTable created by 0.10/0.11 HUDI version [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10482: URL: https://github.com/apache/hudi/pull/10482#issuecomment-1886124795 ## CI report: * 062a9702a05537d514eba3d7a1cb2b30521bdbd0 Azure:

Re: [PR] [MINOR] Turning on publishing of test results to Azure Devops [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10477: URL: https://github.com/apache/hudi/pull/10477#issuecomment-1886124734 ## CI report: * b21e67457b312bea00ca5dad0255c618db7dc202 UNKNOWN * 0e44d5b9308370806c9d668ed753cd57ecc75eea Azure:

Re: [PR] [MINOR] Turning on publishing of test results to Azure Devops [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10477: URL: https://github.com/apache/hudi/pull/10477#issuecomment-1886114585 ## CI report: * b21e67457b312bea00ca5dad0255c618db7dc202 UNKNOWN * 0e44d5b9308370806c9d668ed753cd57ecc75eea Azure:

Re: [PR] [HUDI-7288] Fix ArrayIndexOutOfBoundsException when upgrade unPartitionedTable created by 0.10/0.11 HUDI version [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10482: URL: https://github.com/apache/hudi/pull/10482#issuecomment-1886114658 ## CI report: * 062a9702a05537d514eba3d7a1cb2b30521bdbd0 UNKNOWN

Re: [PR] [HUDI-7286]flink get hudi index type ignore case sensitive. [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10476: URL: https://github.com/apache/hudi/pull/10476#issuecomment-1886114530 ## CI report: * d518fd9c2e4d3ba255429a6f922942d7b4ade1a3 Azure:

Re: [I] Partitioning data into two keys is taking more time (10x) than partitioning into one key. [hudi]

2024-01-10 Thread via GitHub
xicm commented on issue #10456: URL: https://github.com/apache/hudi/issues/10456#issuecomment-1886083636 Sorry for my wrong understanding of `SubTasks`. Hudi splits the input data by partition+fileGroup and then writes these partitioned data with parallelism of `write.tasks`. The job write

[I] Hard deletion using deltastreamer [hudi]

2024-01-10 Thread via GitHub
Kangho-Lee opened a new issue, #10483: URL: https://github.com/apache/hudi/issues/10483 Hello guys. This [post](https://hudi.apache.org/blog/2020/01/15/delete-support-in-hudi/#deletion-with-hoodiedeltastreamer) is from January 2020. Are there any updates about deletion with deltastreamer? Is

[jira] [Updated] (HUDI-7288) Fix ArrayIndexOutOfBoundsException when upgrade nonPartitionedTable created by 0.10/0.11 HUDI version

2024-01-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7288: - Labels: pull-request-available (was: ) > Fix ArrayIndexOutOfBoundsException when upgrade

[PR] [HUDI-7288] Fix ArrayIndexOutOfBoundsException when upgrade unPartitionedTable created by 0.10/0.11 HUDI version [hudi]

2024-01-10 Thread via GitHub
beyond1920 opened a new pull request, #10482: URL: https://github.com/apache/hudi/pull/10482 ### Change Logs When upgrading a nonPartitionedTable created by a 0.10/0.11 HUDI version, an `ArrayIndexOutOfBoundsException` is thrown, because the hoodie.table.partition.fields

Re: [PR] [] CVE-2023-44487 Upgrade jetty and exclude older jetty [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10223: URL: https://github.com/apache/hudi/pull/10223#issuecomment-1886052523 ## CI report: * 72bb1d9825b627e5266819a4836723618f8e59d4 Azure:

(hudi) branch master updated: [MINOR] Avoid resource leaks (#10345)

2024-01-10 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new d22bdc59843 [MINOR] Avoid resource leaks

Re: [PR] [MINOR] Avoid resource leaks [hudi]

2024-01-10 Thread via GitHub
nsivabalan merged PR #10345: URL: https://github.com/apache/hudi/pull/10345

Re: [PR] [MINOR] Handle parsing of all zero timestamps with MDT suffixes. [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10481: URL: https://github.com/apache/hudi/pull/10481#issuecomment-1885986048 ## CI report: * 69f5c3d06f3b4e8048e50ca04fc95aad133b87e2 Azure:

Re: [PR] [] CVE-2023-44487 Upgrade jetty and exclude older jetty [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10223: URL: https://github.com/apache/hudi/pull/10223#issuecomment-1885985451 ## CI report: * 4827a8d0481f67243920efee57eda41b8a8210a7 Azure:

Re: [PR] [MINOR] Azure binary 2024 01 a4 [hudi]

2024-01-10 Thread via GitHub
jonvex closed pull request #10453: [MINOR] Azure binary 2024 01 a4 URL: https://github.com/apache/hudi/pull/10453

Re: [PR] [MINOR] do not merge, test with just the slow tests removed [hudi]

2024-01-10 Thread via GitHub
jonvex closed pull request #10454: [MINOR] do not merge, test with just the slow tests removed URL: https://github.com/apache/hudi/pull/10454

Re: [PR] [MINOR] Azure binary 2024 01 a3 [hudi]

2024-01-10 Thread via GitHub
jonvex closed pull request #10452: [MINOR] Azure binary 2024 01 a3 URL: https://github.com/apache/hudi/pull/10452

Re: [PR] [MINOR] Azure binary 2024 01 a2 [hudi]

2024-01-10 Thread via GitHub
jonvex closed pull request #10451: [MINOR] Azure binary 2024 01 a2 URL: https://github.com/apache/hudi/pull/10451

Re: [PR] [MINOR] Azure binary 2024 01 a1 [hudi]

2024-01-10 Thread via GitHub
jonvex closed pull request #10450: [MINOR] Azure binary 2024 01 a1 URL: https://github.com/apache/hudi/pull/10450

Re: [PR] [] CVE-2023-44487 Upgrade jetty and exclude older jetty [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10223: URL: https://github.com/apache/hudi/pull/10223#issuecomment-1885938723 ## CI report: * 4827a8d0481f67243920efee57eda41b8a8210a7 Azure:

Re: [PR] [MINOR] Handle parsing of all zero timestamps with MDT suffixes. [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10481: URL: https://github.com/apache/hudi/pull/10481#issuecomment-1885932787 ## CI report: * 69f5c3d06f3b4e8048e50ca04fc95aad133b87e2 Azure:

Re: [PR] [MINOR] Parallelized the check for existence of files in IncrementalRelation. [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10480: URL: https://github.com/apache/hudi/pull/10480#issuecomment-1885932753 ## CI report: * ddf449c27d0413d995d704d5b31d7da385c012a9 Azure:

Re: [PR] [HUDI-7170][WIP] Implement HFile reader independent of HBase [hudi]

2024-01-10 Thread via GitHub
yihua commented on code in PR #10241: URL: https://github.com/apache/hudi/pull/10241#discussion_r1448111520 ## hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileBlock.java: ## @@ -0,0 +1,206 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

Re: [PR] [MINOR] Handle parsing of all zero timestamps with MDT suffixes. [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10481: URL: https://github.com/apache/hudi/pull/10481#issuecomment-1885925976 ## CI report: * 69f5c3d06f3b4e8048e50ca04fc95aad133b87e2 UNKNOWN

Re: [PR] [MINOR] Parallelized the check for existence of files in IncrementalRelation. [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10480: URL: https://github.com/apache/hudi/pull/10480#issuecomment-1885925908 ## CI report: * ddf449c27d0413d995d704d5b31d7da385c012a9 UNKNOWN

Re: [PR] [MINOR] Add permissions to the PR size labeler [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10478: URL: https://github.com/apache/hudi/pull/10478#issuecomment-1885925796 ## CI report: * 3acf3f7f5de88cc1c770644a3a04de93742a1fd9 Azure:

[PR] [MINOR] Handle parsing of all zero timestamps with MDT suffixes. [hudi]

2024-01-10 Thread via GitHub
prashantwason opened a new pull request, #10481: URL: https://github.com/apache/hudi/pull/10481 [MINOR] Handle parsing of all zero timestamps with MDT suffixes. ### Change Logs MDT uses an all zero timestamp if there are no existing commits on the dataset. If additional indexes

[PR] [MINOR] Parallelized the check for existence of files in IncrementalRelation. [hudi]

2024-01-10 Thread via GitHub
prashantwason opened a new pull request, #10480: URL: https://github.com/apache/hudi/pull/10480 [MINOR] Parallelized the check for existence of files in IncrementalRelation. ### Change Logs Parallelized the check for existence of files in IncrementalRelation. ### Impact

Re: [PR] [MINOR] Turning on publishing of test results to Azure Devops [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10477: URL: https://github.com/apache/hudi/pull/10477#issuecomment-1885844351 ## CI report: * b21e67457b312bea00ca5dad0255c618db7dc202 UNKNOWN * 0e44d5b9308370806c9d668ed753cd57ecc75eea Azure:

Re: [PR] [] CVE-2023-44487 Upgrade jetty and exclude older jetty [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10223: URL: https://github.com/apache/hudi/pull/10223#issuecomment-1885790397 ## CI report: * 4827a8d0481f67243920efee57eda41b8a8210a7 Azure:

Re: [PR] [] CVE-2023-44487 Upgrade jetty and exclude older jetty [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10223: URL: https://github.com/apache/hudi/pull/10223#issuecomment-1885776077 ## CI report: * 4827a8d0481f67243920efee57eda41b8a8210a7 Azure:

Re: [PR] [HUDI-7290] Don't assume ReplaceCommits are always Clustering [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10479: URL: https://github.com/apache/hudi/pull/10479#issuecomment-1885765613 ## CI report: * 52afba2aa7c6ec4e0f8ca0f50eaf4a0639c53432 Azure:

Re: [PR] [HUDI-7170][WIP] Implement HFile reader independent of HBase [hudi]

2024-01-10 Thread via GitHub
yihua commented on code in PR #10241: URL: https://github.com/apache/hudi/pull/10241#discussion_r1447992467 ## hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileReader.java: ## @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

Re: [PR] [HUDI-7170][WIP] Implement HFile reader independent of HBase [hudi]

2024-01-10 Thread via GitHub
yihua commented on code in PR #10241: URL: https://github.com/apache/hudi/pull/10241#discussion_r1447991303 ## hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileReader.java: ## @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

Re: [PR] [HUDI-7170][WIP] Implement HFile reader independent of HBase [hudi]

2024-01-10 Thread via GitHub
yihua commented on code in PR #10241: URL: https://github.com/apache/hudi/pull/10241#discussion_r1447989017 ## hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileDataBlock.java: ## @@ -0,0 +1,73 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

Re: [PR] [HUDI-7170][WIP] Implement HFile reader independent of HBase [hudi]

2024-01-10 Thread via GitHub
yihua commented on code in PR #10241: URL: https://github.com/apache/hudi/pull/10241#discussion_r1447986629 ## pom.xml: ## @@ -929,6 +930,13 @@ provided + + Review Comment: Actually, I fixed the bundling in this PR. See other changes in this

[jira] [Updated] (HUDI-7290) filterPendingReplaceTimeline used incorrectly in various places

2024-01-10 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler updated HUDI-7290: -- Status: Patch Available (was: In Progress) > filterPendingReplaceTimeline used incorrectly in

[jira] [Updated] (HUDI-7284) Differentiate between replacecommits in stream sync

2024-01-10 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler updated HUDI-7284: -- Status: In Progress (was: Open) > Differentiate between replacecommits in stream sync >

[jira] [Updated] (HUDI-7284) Differentiate between replacecommits in stream sync

2024-01-10 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler updated HUDI-7284: -- Status: Patch Available (was: In Progress) > Differentiate between replacecommits in stream

[jira] [Closed] (HUDI-7284) Differentiate between replacecommits in stream sync

2024-01-10 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler closed HUDI-7284. - Resolution: Fixed > Differentiate between replacecommits in stream sync >

[jira] [Updated] (HUDI-7290) filterPendingReplaceTimeline used incorrectly in various places

2024-01-10 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler updated HUDI-7290: -- Status: In Progress (was: Open) > filterPendingReplaceTimeline used incorrectly in various

Re: [PR] [HUDI-7170][WIP] Implement HFile reader independent of HBase [hudi]

2024-01-10 Thread via GitHub
yihua commented on code in PR #10241: URL: https://github.com/apache/hudi/pull/10241#discussion_r1447967202 ## hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileBlockReader.java: ## @@ -0,0 +1,84 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] [HUDI-7290] Don't assume ReplaceCommits are always Clustering [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10479: URL: https://github.com/apache/hudi/pull/10479#issuecomment-1885706932 ## CI report: * 52afba2aa7c6ec4e0f8ca0f50eaf4a0639c53432 UNKNOWN

Re: [PR] [MINOR] Add permissions to the PR size labeler [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10478: URL: https://github.com/apache/hudi/pull/10478#issuecomment-1885706866 ## CI report: * 3acf3f7f5de88cc1c770644a3a04de93742a1fd9 Azure:

Re: [PR] [MINOR] Add permissions to the PR size labeler [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10478: URL: https://github.com/apache/hudi/pull/10478#issuecomment-1885686320 ## CI report: * 3acf3f7f5de88cc1c770644a3a04de93742a1fd9 UNKNOWN

[jira] [Updated] (HUDI-7290) filterPendingReplaceTimeline used incorrectly in various places

2024-01-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7290: - Labels: pull-request-available (was: ) > filterPendingReplaceTimeline used incorrectly in

[PR] [HUDI-7290] Don't assume ReplaceCommits are always Clustering [hudi]

2024-01-10 Thread via GitHub
jonvex opened a new pull request, #10479: URL: https://github.com/apache/hudi/pull/10479 ### Change Logs Fix usage in all places not in tests ### Impact reduce hudi failures and bugs ### Risk level (write none, low medium or high below) low ###

[PR] [MINOR] Add permissions to the PR size labeler [hudi]

2024-01-10 Thread via GitHub
yihua opened a new pull request, #10478: URL: https://github.com/apache/hudi/pull/10478 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performance

Re: [PR] [MINOR] Turning on publishing of test results to Azure Devops [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10477: URL: https://github.com/apache/hudi/pull/10477#issuecomment-1885589578 ## CI report: * b21e67457b312bea00ca5dad0255c618db7dc202 UNKNOWN * 0e44d5b9308370806c9d668ed753cd57ecc75eea Azure:

Re: [PR] [MINOR] Turning on publishing of test results to Azure Devops [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10477: URL: https://github.com/apache/hudi/pull/10477#issuecomment-1885577938 ## CI report: * b21e67457b312bea00ca5dad0255c618db7dc202 UNKNOWN * 0e44d5b9308370806c9d668ed753cd57ecc75eea UNKNOWN

Re: [PR] [HUDI-7170][WIP] Implement HFile reader independent of HBase [hudi]

2024-01-10 Thread via GitHub
yihua commented on code in PR #10241: URL: https://github.com/apache/hudi/pull/10241#discussion_r1447808569 ## hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileReader.java: ## @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

Re: [PR] [MINOR] Turning on publishing of test results to Azure Devops [hudi]

2024-01-10 Thread via GitHub
hudi-bot commented on PR #10477: URL: https://github.com/apache/hudi/pull/10477#issuecomment-1885435107 ## CI report: * b21e67457b312bea00ca5dad0255c618db7dc202 UNKNOWN

[PR] [MINOR] Turning on publishing of test results to Azure Devops [hudi]

2024-01-10 Thread via GitHub
vinothchandar opened a new pull request, #10477: URL: https://github.com/apache/hudi/pull/10477 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any

Re: [PR] [HUDI-7170][WIP] Implement HFile reader independent of HBase [hudi]

2024-01-10 Thread via GitHub
yihua commented on code in PR #10241: URL: https://github.com/apache/hudi/pull/10241#discussion_r1447783314 ## hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileCompressionCodec.java: ## @@ -0,0 +1,79 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

(hudi) branch master updated: [MINOR] Fix usages of orElse (#10435)

2024-01-10 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 57a08466432 [MINOR] Fix usages of orElse (#10435)

Re: [PR] [MINOR] Fix usages of orElse [hudi]

2024-01-10 Thread via GitHub
yihua merged PR #10435: URL: https://github.com/apache/hudi/pull/10435

Re: [PR] [MINOR] Fix usages of orElse [hudi]

2024-01-10 Thread via GitHub
the-other-tim-brown commented on code in PR #10435: URL: https://github.com/apache/hudi/pull/10435#discussion_r1447758324 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java: ## @@ -775,9 +775,11 @@ private Pair, JavaRDD>

Re: [PR] [MINOR] Fix usages of orElse [hudi]

2024-01-10 Thread via GitHub
yihua commented on code in PR #10435: URL: https://github.com/apache/hudi/pull/10435#discussion_r1447741360 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/HoodieSparkUtils.scala: ## @@ -107,23 +107,19 @@ object HoodieSparkUtils extends SparkAdapterSupport with

Re: [PR] [MINOR] Fix usages of orElse [hudi]

2024-01-10 Thread via GitHub
yihua commented on code in PR #10435: URL: https://github.com/apache/hudi/pull/10435#discussion_r1447740407 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/streamer/StreamSync.java: ## @@ -775,9 +775,11 @@ private Pair, JavaRDD> writeToSinkAndDoMetaSync(Stri

(hudi) branch master updated (a338cd67028 -> 17c8bdd5f14)

2024-01-10 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from a338cd67028 [HUDI-7284] Stream sync doesn't differentiate replace commits (#10467) add 17c8bdd5f14 [HUDI-7241]

Re: [I] multi-writer jobs wait forever to finish it off (Using OPTIMISTIC_CONCURRENCY_CONTROL) [hudi]

2024-01-10 Thread via GitHub
ad1happy2go commented on issue #10468: URL: https://github.com/apache/hudi/issues/10468#issuecomment-1885334906 You can probably try Non Blocking concurrency control with hudi 1.0.0-beta. On Wed, Jan 10, 2024 at 8:30 PM Sam ***@***.***> wrote: > @SamarthRaval

Re: [PR] [HUDI-7241] Avoid always broadcast HUDI relation if not using HoodieSparkSessionExtension [hudi]

2024-01-10 Thread via GitHub
bvaradar merged PR #10373: URL: https://github.com/apache/hudi/pull/10373
