[GitHub] [hudi] hudi-bot commented on pull request #7182: [HUDI-5196]spark sql 3.2+ query support

2022-11-10 Thread GitBox
hudi-bot commented on PR #7182: URL: https://github.com/apache/hudi/pull/7182#issuecomment-1311349316 ## CI report: * 583e43a881ea68814e10961a881c5d645981aee1 Azure:

[GitHub] [hudi] xushiyan merged pull request #7184: [HUDI-4142] Claim RFC-64 for table spec APIs

2022-11-10 Thread GitBox
xushiyan merged PR #7184: URL: https://github.com/apache/hudi/pull/7184 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[hudi] branch master updated (a06b1c0c4e -> 1f0a5eb7a7)

2022-11-10 Thread xushiyan
This is an automated email from the ASF dual-hosted git repository. xushiyan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from a06b1c0c4e [MINOR] Update RFC-46 doc (#7050) add 1f0a5eb7a7 [HUDI-4142] Claim RFC-64 for table spec APIs (#7184)

[GitHub] [hudi] hudi-bot commented on pull request #7182: [HUDI-5196]spark sql 3.2+ query support

2022-11-10 Thread GitBox
hudi-bot commented on PR #7182: URL: https://github.com/apache/hudi/pull/7182#issuecomment-1311344819 ## CI report: * 583e43a881ea68814e10961a881c5d645981aee1 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #7180: [MINOR] add in minor perf wins in hudi-utilities and locking related tests

2022-11-10 Thread GitBox
hudi-bot commented on PR #7180: URL: https://github.com/apache/hudi/pull/7180#issuecomment-1311339820 ## CI report: * 3bf8572de90c275b7139f216cdd033d9dfc008e4 UNKNOWN * ebe0b24821d673634c0811bba28f0dfab2e7677a Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7003: [minor] add more test for rfc46

2022-11-10 Thread GitBox
hudi-bot commented on PR #7003: URL: https://github.com/apache/hudi/pull/7003#issuecomment-1311338719 ## CI report: * 6763e0f42e2a4474d4a5700e6b09e3928d88e83e Azure:

[GitHub] [hudi] codope opened a new pull request, #7184: [HUDI-4142] Claim RFC-64 for table spec APIs

2022-11-10 Thread GitBox
codope opened a new pull request, #7184: URL: https://github.com/apache/hudi/pull/7184 ### Change Logs This PR claims a new entry, RFC-64, for Table format APIs for query engine integrations. ### Impact Only RFC list update. ### Risk level (write none, low medium

[GitHub] [hudi] codope commented on a diff in pull request #7080: [RFC-64] New APIs to facilitate faster Query Engine integrations

2022-11-10 Thread GitBox
codope commented on code in PR #7080: URL: https://github.com/apache/hudi/pull/7080#discussion_r1019945994 ## rfc/rfc-64/rfc-64.md: ## @@ -0,0 +1,509 @@ + + +# RFC-64: New Hudi Table Spec API for Query Integrations + +## Proposers + +- @codope +- @alexeykudinkin + +## Approvers

[jira] [Updated] (HUDI-4142) RFC for new Table APIs proposal for query engine integrations

2022-11-10 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-4142: -- Summary: RFC for new Table APIs proposal for query engine integrations (was: RFC for new Table APIs

[jira] [Updated] (HUDI-5194) fix schema evolution bugs

2022-11-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-5194: - Labels: pull-request-available (was: ) > fix schema evolution bugs > - >

[GitHub] [hudi] xiarixiaoyao opened a new pull request, #7183: [HUDI-5194][WIP]Fix problems found in schema evolution in the production enviroment

2022-11-10 Thread GitBox
xiarixiaoyao opened a new pull request, #7183: URL: https://github.com/apache/hudi/pull/7183 ### Change Logs Fix the bug where history schema files cannot be cleaned by FileBasedInternalSchemaStorageManager. Fix the bug where schema evolution does not work well in non-batch read mode

[GitHub] [hudi] cdmikechen commented on pull request #7173: [HUDI-5189] Make HiveAvroSerializer compatible with hive3

2022-11-10 Thread GitBox
cdmikechen commented on PR #7173: URL: https://github.com/apache/hudi/pull/7173#issuecomment-1311315112 @xiarixiaoyao @xicm There seems to be a PR at the moment that may overlap with this issue, should we deal with this issue in a unified way? https://github.com/apache/hudi/pull/3391

[GitHub] [hudi] xiarixiaoyao commented on pull request #7173: [HUDI-5189] Make HiveAvroSerializer compatible with hive3

2022-11-10 Thread GitBox
xiarixiaoyao commented on PR #7173: URL: https://github.com/apache/hudi/pull/7173#issuecomment-1311311552 @xicm Thank you for your contribution Looks good, let me test it for presto

[GitHub] [hudi] YannByron commented on a diff in pull request #7167: [HUDI-5094] Remove partition fields before transform bytes to avro,if enable DROP_PARTITION_COLUMNS.

2022-11-10 Thread GitBox
YannByron commented on code in PR #7167: URL: https://github.com/apache/hudi/pull/7167#discussion_r1019894907 ## hudi-common/src/test/java/org/apache/hudi/common/model/TestHoodieRecordPayload.java: ## @@ -0,0 +1,96 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[jira] [Updated] (HUDI-5196) For spark with version greater than 3.2+, the query of hudi table using spark sql supports reading parameter configuration.

2022-11-10 Thread scx (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] scx updated HUDI-5196: -- Description: {code:java} Previously, when we used Spark SQL to read the hudi table, we could only read the rt table or

[jira] [Updated] (HUDI-5196) For spark with version greater than 3.2+, the query of hudi table using spark sql supports reading parameter configuration.

2022-11-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-5196: - Labels: pull-request-available (was: ) > For spark with version greater than 3.2+, the query of

[GitHub] [hudi] scxwhite opened a new pull request, #7182: [HUDI-5196]spark sql 3.2+ query support

2022-11-10 Thread GitBox
scxwhite opened a new pull request, #7182: URL: https://github.com/apache/hudi/pull/7182 ### Change Logs For spark with version greater than 3.2+, the query of hudi table using spark sql supports reading parameter configuration. on this pr,when can query hudi table

[jira] [Created] (HUDI-5196) For spark with version greater than 3.2+, the query of hudi table using spark sql supports reading parameter configuration.

2022-11-10 Thread scx (Jira)
scx created HUDI-5196: - Summary: For spark with version greater than 3.2+, the query of hudi table using spark sql supports reading parameter configuration. Key: HUDI-5196 URL: https://issues.apache.org/jira/browse/HUDI-5196

[GitHub] [hudi] nsivabalan commented on issue #6341: [SUPPORT] Hudi delete not working via spark apis

2022-11-10 Thread GitBox
nsivabalan commented on issue #6341: URL: https://github.com/apache/hudi/issues/6341#issuecomment-1311280133 but please do give it a try w/ later versions. I highly doubt delete requires entire table schema. I vaguely remember we relaxed that constraint later.

[GitHub] [hudi] nsivabalan commented on issue #6341: [SUPPORT] Hudi delete not working via spark apis

2022-11-10 Thread GitBox
nsivabalan commented on issue #6341: URL: https://github.com/apache/hudi/issues/6341#issuecomment-1311279566 Sorry to have dropped the ball on this. What the instructions say is: to write to Hudi, you might be doing something like ``` df.write.format("hudi").save(path). ```

[GitHub] [hudi] hudi-bot commented on pull request #7167: [HUDI-5094] Remove partition fields before transform bytes to avro,if enable DROP_PARTITION_COLUMNS.

2022-11-10 Thread GitBox
hudi-bot commented on PR #7167: URL: https://github.com/apache/hudi/pull/7167#issuecomment-1311276283 ## CI report: * 6b165aec634812ba8d6f4a55d0dfb8578031d25c Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6680: [HUDI-4812] Lazy fetching partition path & file slice for HoodieFileIndex

2022-11-10 Thread GitBox
hudi-bot commented on PR #6680: URL: https://github.com/apache/hudi/pull/6680#issuecomment-1311275586 ## CI report: * 8bafc71d82607b40cb99505d145bbfddb8c81ae3 Azure:

[GitHub] [hudi] the-other-tim-brown commented on pull request #7180: [MINOR] add in minor perf wins in hudi-utilities and locking related tests

2022-11-10 Thread GitBox
the-other-tim-brown commented on PR #7180: URL: https://github.com/apache/hudi/pull/7180#issuecomment-1311268344 Some results BEFORE: hudi-client-common . SUCCESS [01:53 min] hudi-utilities_2.11 SUCCESS [07:38 min]

[GitHub] [hudi] YannByron commented on pull request #5165: [HUDI-3742] Enable parquet enableVectorizedReader for spark inc query to improve peformance

2022-11-10 Thread GitBox
YannByron commented on PR #5165: URL: https://github.com/apache/hudi/pull/5165#issuecomment-1311266552 > Personally,i think we should enable vectorization by default +1. Users do not know when to enable vectorization. So provide a config to control this is meaningless.

[GitHub] [hudi] hudi-bot commented on pull request #7173: [HUDI-5189] Make HiveAvroSerializer compatible with hive3

2022-11-10 Thread GitBox
hudi-bot commented on PR #7173: URL: https://github.com/apache/hudi/pull/7173#issuecomment-1311242776 ## CI report: * 198db51b584ad83112d09ada850d97893ef1d055 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7167: [HUDI-5094] Remove partition fields before transform bytes to avro,if enable DROP_PARTITION_COLUMNS.

2022-11-10 Thread GitBox
hudi-bot commented on PR #7167: URL: https://github.com/apache/hudi/pull/7167#issuecomment-1311242732 ## CI report: * 6b165aec634812ba8d6f4a55d0dfb8578031d25c Azure:

[jira] [Updated] (HUDI-5178) Add Call show_table_properties for spark sql

2022-11-10 Thread Forward Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Forward Xu updated HUDI-5178: - Fix Version/s: 0.12.2 > Add Call show_table_properties for spark sql >

[jira] [Resolved] (HUDI-4526) improve spillableMapBasePath disk directory is full

2022-11-10 Thread Forward Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Forward Xu resolved HUDI-4526. -- > improve spillableMapBasePath disk directory is full > ---

[jira] [Resolved] (HUDI-5178) Add Call show_table_properties for spark sql

2022-11-10 Thread Forward Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Forward Xu resolved HUDI-5178. -- > Add Call show_table_properties for spark sql > > >

[GitHub] [hudi] hudi-bot commented on pull request #7180: [MINOR] add in minor perf wins in hudi-utilities and locking related tests

2022-11-10 Thread GitBox
hudi-bot commented on PR #7180: URL: https://github.com/apache/hudi/pull/7180#issuecomment-1311236582 ## CI report: * 3bf8572de90c275b7139f216cdd033d9dfc008e4 UNKNOWN * ba1ede9840ed3ae5a4ce3d74f25ff7d2895dfde4 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7180: [MINOR] add in minor perf wins in hudi-utilities and locking related tests

2022-11-10 Thread GitBox
hudi-bot commented on PR #7180: URL: https://github.com/apache/hudi/pull/7180#issuecomment-1311233825 ## CI report: * 3bf8572de90c275b7139f216cdd033d9dfc008e4 UNKNOWN * ba1ede9840ed3ae5a4ce3d74f25ff7d2895dfde4 Azure:

[GitHub] [hudi] eric9204 commented on a diff in pull request #7167: [HUDI-5094] Remove partition fields before transform bytes to avro,if enable DROP_PARTITION_COLUMNS.

2022-11-10 Thread GitBox
eric9204 commented on code in PR #7167: URL: https://github.com/apache/hudi/pull/7167#discussion_r1019834514 ## hudi-common/src/test/java/org/apache/hudi/common/model/TestHoodieRecordPayload.java: ## @@ -0,0 +1,96 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] [hudi] hudi-bot commented on pull request #7129: [MINOR] Support column type evolution for Hive

2022-11-10 Thread GitBox
hudi-bot commented on PR #7129: URL: https://github.com/apache/hudi/pull/7129#issuecomment-1311230635 ## CI report: * f6a718f53a29aeb96a4488a848b94656485a0e3d Azure:

[GitHub] [hudi] xiarixiaoyao commented on pull request #5165: [HUDI-3742] Enable parquet enableVectorizedReader for spark inc query to improve peformance

2022-11-10 Thread GitBox
xiarixiaoyao commented on PR #5165: URL: https://github.com/apache/hudi/pull/5165#issuecomment-1311201272 > IIRC the vectorized reader was disabled for a reason. somehow the filtering was depends on the regular parquet reader. garyli1019 has explained the problem: IIRC the

[GitHub] [hudi] zhangyue19921010 closed pull request #7169: [HUDI-5186] Parallelism does not take effect when hoodie.combine.before.upsert/insert false

2022-11-10 Thread GitBox
zhangyue19921010 closed pull request #7169: [HUDI-5186] Parallelism does not take effect when hoodie.combine.before.upsert/insert false URL: https://github.com/apache/hudi/pull/7169

[GitHub] [hudi] hudi-bot commented on pull request #7181: add totalRecordsDeleted metric

2022-11-10 Thread GitBox
hudi-bot commented on PR #7181: URL: https://github.com/apache/hudi/pull/7181#issuecomment-1311196376 ## CI report: * de066f2920aea9c511bef762fed033095da39482 Azure:

[GitHub] [hudi] zhangyue19921010 commented on pull request #7174: [HUDI-5190] Consuming records from Iterator directly instead of using inner message queue

2022-11-10 Thread GitBox
zhangyue19921010 commented on PR #7174: URL: https://github.com/apache/hudi/pull/7174#issuecomment-1311187308 Hi @alexeykudinkin and @nsivabalan Sorry to bother u. Would u mind to take a further look about this PR? Really appreciate it if it's possible!

[jira] [Updated] (HUDI-5195) FileIOUtils readDataFromPath will cut off data if the commit metadata which commited to HDFS is too huge

2022-11-10 Thread yuemeng (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuemeng updated HUDI-5195: -- Summary: FileIOUtils readDataFromPath will cut off data if the commit metadata which commited to HDFS is too

[jira] [Updated] (HUDI-5195) FileIOUtils readDataFromPath will cut off data if the commit metadata is too huge

2022-11-10 Thread yuemeng (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuemeng updated HUDI-5195: -- Summary: FileIOUtils readDataFromPath will cut off data if the commit metadata is too huge (was: FileIOUtils

[jira] [Assigned] (HUDI-5195) FileIOUtils readDataFromPath will cut off data if the commit data in hdfs is too huge

2022-11-10 Thread yuemeng (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuemeng reassigned HUDI-5195: - Assignee: yuemeng > FileIOUtils readDataFromPath will cut off data if the commit data in hdfs is > too

[jira] [Created] (HUDI-5195) FileIOUtils readDataFromPath will cut off data if the commit data in hdfs is too huge

2022-11-10 Thread yuemeng (Jira)
yuemeng created HUDI-5195: - Summary: FileIOUtils readDataFromPath will cut off data if the commit data in hdfs is too huge Key: HUDI-5195 URL: https://issues.apache.org/jira/browse/HUDI-5195 Project: Apache

[GitHub] [hudi] YannByron commented on pull request #5165: [HUDI-3742] Enable parquet enableVectorizedReader for spark inc query to improve peformance

2022-11-10 Thread GitBox
YannByron commented on PR #5165: URL: https://github.com/apache/hudi/pull/5165#issuecomment-1311164037 @xiarixiaoyao @alexeykudinkin why we need to disable `enableVectorizedReader` in mor inc query.

[GitHub] [hudi] eric9204 commented on issue #6966: [SUPPORT]HoodieWriteHandle: Error writing record HoodieRecord{key=HoodieKey { recordKey=id308723 partitionPath=202210141643}, currentLocation='null',

2022-11-10 Thread GitBox
eric9204 commented on issue #6966: URL: https://github.com/apache/hudi/issues/6966#issuecomment-1311163178 @fengjian428 when spark-sql was used to write data to hudi, the deltacommit action and compaction action were performed one by one, therefore, they will not influence each other. But

[GitHub] [hudi] YannByron commented on pull request #7167: [HUDI-5094] Remove partition fields before transform bytes to avro,if enable DROP_PARTITION_COLUMNS.

2022-11-10 Thread GitBox
YannByron commented on PR #7167: URL: https://github.com/apache/hudi/pull/7167#issuecomment-1311159035 @eric9204 Looks good. please improve UT to cover more cases.

[GitHub] [hudi] YannByron commented on a diff in pull request #7167: [HUDI-5094] Remove partition fields before transform bytes to avro,if enable DROP_PARTITION_COLUMNS.

2022-11-10 Thread GitBox
YannByron commented on code in PR #7167: URL: https://github.com/apache/hudi/pull/7167#discussion_r1019780844 ## hudi-common/src/test/java/org/apache/hudi/common/model/TestHoodieRecordPayload.java: ## @@ -0,0 +1,96 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] [hudi] YannByron commented on a diff in pull request #7167: [HUDI-5094] Remove partition fields before transform bytes to avro,if enable DROP_PARTITION_COLUMNS.

2022-11-10 Thread GitBox
YannByron commented on code in PR #7167: URL: https://github.com/apache/hudi/pull/7167#discussion_r1019779971 ## hudi-common/src/test/java/org/apache/hudi/common/model/TestHoodieRecordPayload.java: ## @@ -0,0 +1,96 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] [hudi] YannByron commented on a diff in pull request #7167: [HUDI-5094] Remove partition fields before transform bytes to avro,if enable DROP_PARTITION_COLUMNS.

2022-11-10 Thread GitBox
YannByron commented on code in PR #7167: URL: https://github.com/apache/hudi/pull/7167#discussion_r1019779857 ## hudi-common/src/test/java/org/apache/hudi/common/model/TestHoodieRecordPayload.java: ## @@ -0,0 +1,96 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] [hudi] Zouxxyy commented on a diff in pull request #7175: [HUDI-5191] Fix compatibility with avro 1.10

2022-11-10 Thread GitBox
Zouxxyy commented on code in PR #7175: URL: https://github.com/apache/hudi/pull/7175#discussion_r1019779340 ## hudi-common/src/test/java/org/apache/hudi/common/model/TestOverwriteNonDefaultsWithLatestAvroPayload.java: ## @@ -181,10 +181,10 @@ public void testDeletedRecord()

[GitHub] [hudi] hudi-bot commented on pull request #6680: [HUDI-4812] Lazy fetching partition path & file slice for HoodieFileIndex

2022-11-10 Thread GitBox
hudi-bot commented on PR #6680: URL: https://github.com/apache/hudi/pull/6680#issuecomment-1311152024 ## CI report: * 5341fff4dfb30afde48c370a7c6cb6e31b389539 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7003: [minor] add more test for rfc46

2022-11-10 Thread GitBox
hudi-bot commented on PR #7003: URL: https://github.com/apache/hudi/pull/7003#issuecomment-1311152713 ## CI report: * ed35ee4008f66ed0c3599ad2e8e8b79c8ea4ba6d Azure:

[jira] [Updated] (HUDI-5194) fix schema evolution bugs

2022-11-10 Thread Tao Meng (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Meng updated HUDI-5194: --- Description: # Fix the bug, history schema files cannot be cleaned by FileBasedInternalSchemaStorageManager

[jira] [Updated] (HUDI-5194) fix schema evolution bugs

2022-11-10 Thread Tao Meng (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Meng updated HUDI-5194: --- Issue Type: Bug (was: New Feature) > fix schema evolution bugs > - > >

[GitHub] [hudi] YannByron commented on a diff in pull request #7175: [HUDI-5191] Fix compatibility with avro 1.10

2022-11-10 Thread GitBox
YannByron commented on code in PR #7175: URL: https://github.com/apache/hudi/pull/7175#discussion_r1019775170 ## hudi-common/src/main/java/org/apache/hudi/common/model/debezium/AbstractDebeziumAvroPayload.java: ## @@ -76,7 +77,12 @@ private Option

[GitHub] [hudi] hudi-bot commented on pull request #6680: [HUDI-4812] Lazy fetching partition path & file slice for HoodieFileIndex

2022-11-10 Thread GitBox
hudi-bot commented on PR #6680: URL: https://github.com/apache/hudi/pull/6680#issuecomment-1311148593 ## CI report: * 522204901637ca82f8232c63d9b26adce3f484b8 Azure:

[jira] [Created] (HUDI-5194) fix schema evolution bugs

2022-11-10 Thread Tao Meng (Jira)
Tao Meng created HUDI-5194: -- Summary: fix schema evolution bugs Key: HUDI-5194 URL: https://issues.apache.org/jira/browse/HUDI-5194 Project: Apache Hudi Issue Type: New Feature

[GitHub] [hudi] hudi-bot commented on pull request #7003: [minor] add more test for rfc46

2022-11-10 Thread GitBox
hudi-bot commented on PR #7003: URL: https://github.com/apache/hudi/pull/7003#issuecomment-1311148851 ## CI report: * ed35ee4008f66ed0c3599ad2e8e8b79c8ea4ba6d Azure:

[GitHub] [hudi] YannByron commented on a diff in pull request #7175: [HUDI-5191] Fix compatibility with avro 1.10

2022-11-10 Thread GitBox
YannByron commented on code in PR #7175: URL: https://github.com/apache/hudi/pull/7175#discussion_r1019774806 ## hudi-common/src/test/java/org/apache/hudi/common/util/TestObjectSizeCalculator.java: ## @@ -73,7 +74,8 @@ public void testGetObjectSize() { assertEquals(32,

[jira] [Updated] (HUDI-5109) Source all metadata table instability issues

2022-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-5109: - Status: In Progress (was: Open) > Source all metadata table instability issues >

[jira] [Closed] (HUDI-5111) Add metadata on read support to integ tests

2022-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-5111. Resolution: Done > Add metadata on read support to integ tests >

[jira] [Updated] (HUDI-2740) Support for snapshot querying on MOR table

2022-11-10 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2740: - Status: Patch Available (was: In Progress) > Support for snapshot querying on MOR table >

[GitHub] [hudi] YannByron commented on a diff in pull request #7175: [HUDI-5191] Fix compatibility with avro 1.10

2022-11-10 Thread GitBox
YannByron commented on code in PR #7175: URL: https://github.com/apache/hudi/pull/7175#discussion_r1019774239 ## hudi-common/src/test/java/org/apache/hudi/common/model/TestOverwriteNonDefaultsWithLatestAvroPayload.java: ## @@ -181,10 +181,10 @@ public void testDeletedRecord()

[GitHub] [hudi] hudi-bot commented on pull request #7173: [HUDI-5189] Make HiveAvroSerializer compatible with hive3

2022-11-10 Thread GitBox
hudi-bot commented on PR #7173: URL: https://github.com/apache/hudi/pull/7173#issuecomment-1311144911 ## CI report: * 198db51b584ad83112d09ada850d97893ef1d055 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7129: [MINOR] Support column type evolution for Hive

2022-11-10 Thread GitBox
hudi-bot commented on PR #7129: URL: https://github.com/apache/hudi/pull/7129#issuecomment-1311144693 ## CI report: * f6a718f53a29aeb96a4488a848b94656485a0e3d Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7180: [MINOR] add in minor perf wins in hudi-utilities and locking related tests

2022-11-10 Thread GitBox
hudi-bot commented on PR #7180: URL: https://github.com/apache/hudi/pull/7180#issuecomment-1311144993 ## CI report: * 3bf8572de90c275b7139f216cdd033d9dfc008e4 UNKNOWN * ba1ede9840ed3ae5a4ce3d74f25ff7d2895dfde4 Azure:

[GitHub] [hudi] YannByron commented on a diff in pull request #7175: [HUDI-5191] Fix compatibility with avro 1.10

2022-11-10 Thread GitBox
YannByron commented on code in PR #7175: URL: https://github.com/apache/hudi/pull/7175#discussion_r1019770729 ## hudi-common/src/test/java/org/apache/hudi/avro/TestHoodieAvroUtils.java: ## @@ -253,7 +254,13 @@ public void testRemoveFields() { assertEquals("key1",

[GitHub] [hudi] danny0405 commented on pull request #6182: [DO NOT MERGE] 0.11.1 release patch branch

2022-11-10 Thread GitBox
danny0405 commented on PR #6182: URL: https://github.com/apache/hudi/pull/6182#issuecomment-1311135044 Please use release 0.12.1 directly.

[GitHub] [hudi] danny0405 closed pull request #6182: [DO NOT MERGE] 0.11.1 release patch branch

2022-11-10 Thread GitBox
danny0405 closed pull request #6182: [DO NOT MERGE] 0.11.1 release patch branch URL: https://github.com/apache/hudi/pull/6182

[GitHub] [hudi] YannByron commented on pull request #7175: [HUDI-5191] Fix compatibility with avro 1.10

2022-11-10 Thread GitBox
YannByron commented on PR #7175: URL: https://github.com/apache/hudi/pull/7175#issuecomment-1311133735 nice work. Checks and CI haven't covered spark-common and spark-client yet. cc @xushiyan

[GitHub] [hudi] fengjian428 commented on issue #6966: [SUPPORT]HoodieWriteHandle: Error writing record HoodieRecord{key=HoodieKey { recordKey=id308723 partitionPath=202210141643}, currentLocation='nul

2022-11-10 Thread GitBox
fengjian428 commented on issue #6966: URL: https://github.com/apache/hudi/issues/6966#issuecomment-1311131446 can you help to figure out why spark-SQL works fine but structure streaming cannot? @eric9204 I saw you make an API level change in #7167

[GitHub] [hudi] xiarixiaoyao commented on pull request #7129: [MINOR] Support column type evolution for Hive

2022-11-10 Thread GitBox
xiarixiaoyao commented on PR #7129: URL: https://github.com/apache/hudi/pull/7129#issuecomment-1311120298 @hudi-bot run azure

[GitHub] [hudi] xiarixiaoyao commented on pull request #7173: [HUDI-5189] Make HiveAvroSerializer compatible with hive3

2022-11-10 Thread GitBox
xiarixiaoyao commented on PR #7173: URL: https://github.com/apache/hudi/pull/7173#issuecomment-139290 @hudi-bot run azure

[GitHub] [hudi] hudi-bot commented on pull request #7181: add totalRecordsDeleted metric

2022-11-10 Thread GitBox
hudi-bot commented on PR #7181: URL: https://github.com/apache/hudi/pull/7181#issuecomment-1311101830 ## CI report: * de066f2920aea9c511bef762fed033095da39482 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6680: [HUDI-4812] Lazy fetching partition path & file slice for HoodieFileIndex

2022-11-10 Thread GitBox
hudi-bot commented on PR #6680: URL: https://github.com/apache/hudi/pull/6680#issuecomment-1311101313 ## CI report: * 522204901637ca82f8232c63d9b26adce3f484b8 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7181: add totalRecordsDeleted metric

2022-11-10 Thread GitBox
hudi-bot commented on PR #7181: URL: https://github.com/apache/hudi/pull/7181#issuecomment-1311098611 ## CI report: * de066f2920aea9c511bef762fed033095da39482 UNKNOWN

[GitHub] [hudi] hudi-bot commented on pull request #6680: [HUDI-4812] Lazy fetching partition path & file slice for HoodieFileIndex

2022-11-10 Thread GitBox
hudi-bot commented on PR #6680: URL: https://github.com/apache/hudi/pull/6680#issuecomment-1311098147 ## CI report: * 59cdd09e3190c3646e1e3ea6ca3f076526ec0473 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6680: [HUDI-4812] Lazy fetching partition path & file slice for HoodieFileIndex

2022-11-10 Thread GitBox
hudi-bot commented on PR #6680: URL: https://github.com/apache/hudi/pull/6680#issuecomment-1311094657 ## CI report: * 59cdd09e3190c3646e1e3ea6ca3f076526ec0473 Azure:

[GitHub] [hudi] hussein-awala opened a new pull request, #7181: add totalRecordsDeleted metric

2022-11-10 Thread GitBox
hussein-awala opened a new pull request, #7181: URL: https://github.com/apache/hudi/pull/7181 ### Change Logs Add missing `totalRecordsDeleted` metric. ### Impact A new metric will be logged to the different reporters. ### Risk level (write none, low medium or

[GitHub] [hudi] hudi-bot commented on pull request #7180: [MINOR] add in minor perf wins in hudi-utilities and locking related tests

2022-11-10 Thread GitBox
hudi-bot commented on PR #7180: URL: https://github.com/apache/hudi/pull/7180#issuecomment-1311054127 ## CI report: * 3bf8572de90c275b7139f216cdd033d9dfc008e4 UNKNOWN * ba1ede9840ed3ae5a4ce3d74f25ff7d2895dfde4 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6680: [HUDI-4812] Lazy fetching partition path & file slice for HoodieFileIndex

2022-11-10 Thread GitBox
hudi-bot commented on PR #6680: URL: https://github.com/apache/hudi/pull/6680#issuecomment-1311053705 ## CI report: * 59cdd09e3190c3646e1e3ea6ca3f076526ec0473 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7180: [MINOR] add in minor perf wins in hudi-utilities and locking related tests

2022-11-10 Thread GitBox
hudi-bot commented on PR #7180: URL: https://github.com/apache/hudi/pull/7180#issuecomment-1311051093 ## CI report: * 3bf8572de90c275b7139f216cdd033d9dfc008e4 UNKNOWN * ba1ede9840ed3ae5a4ce3d74f25ff7d2895dfde4 UNKNOWN

[GitHub] [hudi] hudi-bot commented on pull request #7180: [MINOR] add in minor perf wins in hudi-utilities and locking related tests

2022-11-10 Thread GitBox
hudi-bot commented on PR #7180: URL: https://github.com/apache/hudi/pull/7180#issuecomment-1311047999 ## CI report: * 3bf8572de90c275b7139f216cdd033d9dfc008e4 UNKNOWN

[GitHub] [hudi] hudi-bot commented on pull request #7179: [HUDI-5193] Enhancing spark-ds write tests for some of the core user flows

2022-11-10 Thread GitBox
hudi-bot commented on PR #7179: URL: https://github.com/apache/hudi/pull/7179#issuecomment-1311045025 ## CI report: * 990ca855cc4f09bfd7ff25b9b8a7ac3fef799d46 Azure:

[GitHub] [hudi] the-other-tim-brown opened a new pull request, #7180: add in minor perf wins in hudi-utilities

2022-11-10 Thread GitBox
the-other-tim-brown opened a new pull request, #7180: URL: https://github.com/apache/hudi/pull/7180 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any

[hudi] branch master updated (49c5fcbf49 -> a06b1c0c4e)

2022-11-10 Thread akudinkin
akudinkin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 49c5fcbf49 [HUDI-5176] Fix incremental source to consider inflight commits before completed commits (#7160)

[GitHub] [hudi] alexeykudinkin merged pull request #7050: [MINOR] update rfc46 doc

2022-11-10 Thread GitBox
alexeykudinkin merged PR #7050: URL: https://github.com/apache/hudi/pull/7050

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6442: [HUDI-4449] Support DataSourceV2 Read for Spark3.2

2022-11-10 Thread GitBox
alexeykudinkin commented on code in PR #6442: URL: https://github.com/apache/hudi/pull/6442#discussion_r1019650046 ## hudi-spark-datasource/hudi-spark3.2plus-common/src/main/scala/org/apache/spark/sql/hudi/catalog/HoodieInternalV2Table.scala: ## @@ -82,6 +85,15 @@ case class

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #5165: [HUDI-3742] Enable parquet enableVectorizedReader for spark inc query to improve peformance

2022-11-10 Thread GitBox
alexeykudinkin commented on code in PR #5165: URL: https://github.com/apache/hudi/pull/5165#discussion_r1019634073 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala: ## @@ -152,6 +152,11 @@ object DataSourceReadOptions { val

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #7039: [HUDI-5080] Fixing unpersist to consider only rdds pertaining to current write operation

2022-11-10 Thread GitBox
alexeykudinkin commented on code in PR #7039: URL: https://github.com/apache/hudi/pull/7039#discussion_r1019628050 ## hudi-common/src/main/java/org/apache/hudi/common/util/CommitUtils.java: ## @@ -44,6 +46,28 @@ public class CommitUtils { private static final Logger LOG =

[GitHub] [hudi] hudi-bot commented on pull request #7179: [HUDI-5193] Enhancing spark-ds write tests for some of the core user flows

2022-11-10 Thread GitBox
hudi-bot commented on PR #7179: URL: https://github.com/apache/hudi/pull/7179#issuecomment-1310894553 ## CI report: * 990ca855cc4f09bfd7ff25b9b8a7ac3fef799d46 Azure:

[jira] [Updated] (HUDI-5193) Enhancing core user flow tests for spark-datasource writes

2022-11-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-5193: - Labels: pull-request-available (was: ) > Enhancing core user flow tests for spark-datasource

[GitHub] [hudi] hudi-bot commented on pull request #7179: [HUDI-5193] Enhancing spark-ds write tests for some of the core user flows

2022-11-10 Thread GitBox
hudi-bot commented on PR #7179: URL: https://github.com/apache/hudi/pull/7179#issuecomment-1310889505 ## CI report: * 990ca855cc4f09bfd7ff25b9b8a7ac3fef799d46 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[jira] [Created] (HUDI-5193) Enhancing core user flow tests for spark-datasource writes

2022-11-10 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-5193: - Summary: Enhancing core user flow tests for spark-datasource writes Key: HUDI-5193 URL: https://issues.apache.org/jira/browse/HUDI-5193 Project: Apache Hudi

[hudi] branch master updated: [HUDI-5176] Fix incremental source to consider inflight commits before completed commits (#7160)

2022-11-10 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 49c5fcbf49 [HUDI-5176] Fix incremental source

[GitHub] [hudi] nsivabalan merged pull request #7160: [HUDI-5176] Fix incremental source to consider inflight commits before completed commits

2022-11-10 Thread GitBox
nsivabalan merged PR #7160: URL: https://github.com/apache/hudi/pull/7160

[GitHub] [hudi] hudi-bot commented on pull request #7175: [HUDI-5191] Fix compatibility with avro 1.10

2022-11-10 Thread GitBox
hudi-bot commented on PR #7175: URL: https://github.com/apache/hudi/pull/7175#issuecomment-1310806605 ## CI report: * fe4f220520b2b989bde89589296675f63c760e2a Azure:

[jira] [Created] (HUDI-5192) GH actions and azure ci tests run even for trivial fixes

2022-11-10 Thread Jonathan Vexler (Jira)
Jonathan Vexler created HUDI-5192: - Summary: GH actions and azure ci tests run even for trivial fixes Key: HUDI-5192 URL: https://issues.apache.org/jira/browse/HUDI-5192 Project: Apache Hudi

[jira] [Closed] (HUDI-5056) Add support to DELETE_PARTITIONS w/ wild card

2022-11-10 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler closed HUDI-5056. - Resolution: Fixed > Add support to DELETE_PARTITIONS w/ wild card >

[jira] [Closed] (HUDI-5171) Ensure validateTableConfig also checks for partition path field value switch

2022-11-10 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler closed HUDI-5171. - Resolution: Fixed > Ensure validateTableConfig also checks for partition path field value switch

[jira] [Resolved] (HUDI-4888) Add validation to block COW table to use consistent hashing bucket index

2022-11-10 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler resolved HUDI-4888. --- > Add validation to block COW table to use consistent hashing bucket index >

[jira] [Closed] (HUDI-4888) Add validation to block COW table to use consistent hashing bucket index

2022-11-10 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler closed HUDI-4888. - Resolution: Fixed > Add validation to block COW table to use consistent hashing bucket index >
