[GitHub] [hudi] felixYyu commented on a diff in pull request #5064: [HUDI-3654] Add new module `hudi-metaserver`

2022-09-29 Thread GitBox
felixYyu commented on code in PR #5064: URL: https://github.com/apache/hudi/pull/5064#discussion_r984168332 ## hudi-metaserver/src/main/resources/mybatis/DDLMapper.xml: ## @@ -0,0 +1,127 @@ + + +http://mybatis.org/dtd/mybatis-3-mapper.dtd;> + + + +CREATE TABLE dbs +

[jira] [Updated] (HUDI-4953) Typo in Hudi documentation about NonPartitionedKeyGenerator

2022-09-29 Thread Jayasheel Kalgal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jayasheel Kalgal updated HUDI-4953: --- Description: Typo in Hudi documentation for  - *NonPartitionedKeyGenerator*   URL -

[jira] [Updated] (HUDI-4953) Typo in Hudi documentation about NonPartitionedKeyGenerator

2022-09-29 Thread Jayasheel Kalgal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jayasheel Kalgal updated HUDI-4953: --- Priority: Major (was: Minor) > Typo in Hudi documentation about NonPartitionedKeyGenerator >

[GitHub] [hudi] nsivabalan commented on issue #6800: [SUPPORT]org.apache.avro.SchemaParseException: Illegal initial character: 1Min

2022-09-29 Thread GitBox
nsivabalan commented on issue #6800: URL: https://github.com/apache/hudi/issues/6800#issuecomment-1263123231 If you are using Deltastreamer, you can add a schema post processor and rename the columns. If not, I can't think of any easy solution apart from fixing it manually. -- This is an

[jira] [Updated] (HUDI-4953) Typo in Hudi documentation about NonPartitionedKeyGenerator

2022-09-29 Thread Jayasheel Kalgal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jayasheel Kalgal updated HUDI-4953: --- Description: Typo in Hudi documentation for

[GitHub] [hudi] nsivabalan commented on issue #6800: [SUPPORT]org.apache.avro.SchemaParseException: Illegal initial character: 1Min

2022-09-29 Thread GitBox
nsivabalan commented on issue #6800: URL: https://github.com/apache/hudi/issues/6800#issuecomment-1263122615 We rely on Avro's field naming conventions. It looks like the starting character cannot be a digit. https://issues.apache.org/jira/browse/AVRO-153 -- This is an automated message from
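The naming rule mentioned in the comment above can be checked before data reaches the writer. The following is a minimal sketch, assuming only the rule from the Avro specification that a name starts with `[A-Za-z_]` and continues with `[A-Za-z0-9_]`. The helper names `is_valid_avro_name` and `to_avro_safe_name`, and the choice to prefix an underscore, are hypothetical illustrations and not part of Hudi or Avro.

```python
import re

# Avro specification: a name starts with [A-Za-z_] and subsequently
# contains only [A-Za-z0-9_].
_AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_avro_name(name: str) -> bool:
    """Return True if `name` satisfies the Avro name rule."""
    return _AVRO_NAME.fullmatch(name) is not None


def to_avro_safe_name(name: str) -> str:
    """Hypothetical renaming rule: replace illegal characters with '_'
    and prefix '_' when the first character is not a letter or '_'."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    if not cleaned or not re.fullmatch(r"[A-Za-z_]", cleaned[0]):
        cleaned = "_" + cleaned
    return cleaned


if __name__ == "__main__":
    for column in ["1Min", "price", "a-b"]:
        print(column, is_valid_avro_name(column), to_avro_safe_name(column))
```

A column such as `1Min` from the issue title fails the check because it begins with a digit. Whatever renaming rule is used, it has to be applied consistently on both the write and read side, which is an assumption about the pipeline rather than something stated in the thread.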

[GitHub] [hudi] nsivabalan commented on issue #6804: [SUPPORT] Repairing the hudi table from No such file or directory of parquet file.

2022-09-29 Thread GitBox
nsivabalan commented on issue #6804: URL: https://github.com/apache/hudi/issues/6804#issuecomment-1263121682 If not for the metadata table, I can't think of an easier way to go about this. Essentially, the cleaner has cleaned up a data file that the query still requires. If you have very

[GitHub] [hudi] nsivabalan commented on issue #6825: [SUPPORT]org.apache.hudi.exception.HoodieRemoteException: *****:37568 failed to respond

2022-09-29 Thread GitBox
nsivabalan commented on issue #6825: URL: https://github.com/apache/hudi/issues/6825#issuecomment-1263120280 I guess the timeline server crashed for some reason. CC @yihua, any thoughts? -- This is an automated message from the Apache Git Service. To respond to the message, please log

[jira] [Created] (HUDI-4953) Typo in Hudi documentation about NonPartitionedKeyGenerator

2022-09-29 Thread Jayasheel Kalgal (Jira)
Jayasheel Kalgal created HUDI-4953: -- Summary: Typo in Hudi documentation about NonPartitionedKeyGenerator Key: HUDI-4953 URL: https://issues.apache.org/jira/browse/HUDI-4953 Project: Apache Hudi

[GitHub] [hudi] nsivabalan commented on issue #6835: [SUPPORT] hive doesnt support mor read now, pls confirm

2022-09-29 Thread GitBox
nsivabalan commented on issue #6835: URL: https://github.com/apache/hudi/issues/6835#issuecomment-1263117459 since we have a patch being actively worked on, closing the issue. thanks for reporting. -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [hudi] nsivabalan closed issue #6835: [SUPPORT] hive doesnt support mor read now, pls confirm

2022-09-29 Thread GitBox
nsivabalan closed issue #6835: [SUPPORT] hive doesnt support mor read now, pls confirm URL: https://github.com/apache/hudi/issues/6835 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [hudi] nsivabalan commented on issue #5582: [SUPPORT] NullPointerException in merge into Spark Sql HoodieSparkSqlWriter$.mergeParamsAndGetHoodieConfig

2022-09-29 Thread GitBox
nsivabalan commented on issue #5582: URL: https://github.com/apache/hudi/issues/5582#issuecomment-1263116299 @nitinkul @vicuna96 : gentle ping. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [hudi] nsivabalan commented on issue #6503: [SUPPORT] Hudi Merge Into with larger volume

2022-09-29 Thread GitBox
nsivabalan commented on issue #6503: URL: https://github.com/apache/hudi/issues/6503#issuecomment-1263115964 My understanding is that preCombine is a mandatory field for the MERGE INTO statement, but I will let @alexeykudinkin investigate further. -- This is an automated message

[GitHub] [hudi] nsivabalan commented on issue #5777: [SUPPORT] Hudi table has duplicate data.

2022-09-29 Thread GitBox
nsivabalan commented on issue #5777: URL: https://github.com/apache/hudi/issues/5777#issuecomment-1263114829 I see you have given test data. Is everything ingested in one single commit, or across different commits? Your reproduction script is not very clear on this. -- This is an

[GitHub] [hudi] nsivabalan commented on issue #5777: [SUPPORT] Hudi table has duplicate data.

2022-09-29 Thread GitBox
nsivabalan commented on issue #5777: URL: https://github.com/apache/hudi/issues/5777#issuecomment-1263114477 @jiangjiguang : I did not realize you had given us a reproducible code snippet. So from what you have given above, you could see duplicate data with a MOR RT query? -- This is an

[GitHub] [hudi] nsivabalan commented on issue #5777: [SUPPORT] Hudi table has duplicate data.

2022-09-29 Thread GitBox
nsivabalan commented on issue #5777: URL: https://github.com/apache/hudi/issues/5777#issuecomment-1263111919 sorry to have dropped the ball on this. again picking it up. btw, I see this config `hoodie.datasource.write.insert.drop.duplicates` was proposed earlier. do not set this to

[GitHub] [hudi] jiangbiao910 commented on issue #6462: [SUPPORT]Caused by: org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve list of partition from metadata

2022-09-29 Thread GitBox
jiangbiao910 commented on issue #6462: URL: https://github.com/apache/hudi/issues/6462#issuecomment-1263109912 @nsivabalan Thank you for your reply. If I don't set "hoodie.metadata.enable"="false", it throws "java.lang.NoSuchMethodError:

[jira] [Closed] (HUDI-4934) Cleaner cleans up files touched by clustering

2022-09-29 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan closed HUDI-4934. - Resolution: Fixed > Cleaner cleans up files touched by clustering >

[GitHub] [hudi] hudi-bot commented on pull request #6836: [HUDI-4952] Fixing reading from metadata table when there are no inflight commits

2022-09-29 Thread GitBox
hudi-bot commented on PR #6836: URL: https://github.com/apache/hudi/pull/6836#issuecomment-1263105795 ## CI report: * 77223f8b87bdfcfa75045fb622b127cc4f9e47ab Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6818: [HUDI-4948] Improve CDC Write

2022-09-29 Thread GitBox
hudi-bot commented on PR #6818: URL: https://github.com/apache/hudi/pull/6818#issuecomment-1263105772 ## CI report: * f14363a4be66f8a05ddbbe14600176da151d04ff Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6793: 【HUDI-4917】Optimized the way to get HoodieBaseFile of loadColumnRange…

2022-09-29 Thread GitBox
hudi-bot commented on PR #6793: URL: https://github.com/apache/hudi/pull/6793#issuecomment-1263105718 ## CI report: * 32cc352122d276f5bb5943a0dd420920854fdb8e Azure:

[jira] [Updated] (HUDI-4948) Support flush and rollover for CDC Write

2022-09-29 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4948: - Labels: pull-request-available (was: ) > Support flush and rollover for CDC Write >

[GitHub] [hudi] hudi-bot commented on pull request #6836: [HUDI-4952] Fixing reading from metadata table when there are no inflight commits

2022-09-29 Thread GitBox
hudi-bot commented on PR #6836: URL: https://github.com/apache/hudi/pull/6836#issuecomment-1263103453 ## CI report: * 77223f8b87bdfcfa75045fb622b127cc4f9e47ab Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6818: [HUDI-4948] Improve CDC Write

2022-09-29 Thread GitBox
hudi-bot commented on PR #6818: URL: https://github.com/apache/hudi/pull/6818#issuecomment-1263103404 ## CI report: * f14363a4be66f8a05ddbbe14600176da151d04ff Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6741: [HUDI-4898] presto/hive respect payload during merge parquet file and logfile when reading mor table

2022-09-29 Thread GitBox
hudi-bot commented on PR #6741: URL: https://github.com/apache/hudi/pull/6741#issuecomment-1263100954 ## CI report: * bff3acafde6d8a1bd5574b90ce644ef30acbf0a2 UNKNOWN * e39d50d6242e272f867c9987a8a2e97ca323568f Azure:

[GitHub] [hudi] nsivabalan commented on issue #6101: [SUPPORT] Hudi Delete Not working with EMR, AWS Glue & S3

2022-09-29 Thread GitBox
nsivabalan commented on issue #6101: URL: https://github.com/apache/hudi/issues/6101#issuecomment-1263071072 @navbalaraman : Hey, any updates for us? If you could not reproduce it, feel free to close it out. -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [hudi] nsivabalan commented on issue #6504: [SUPPORT] Hudi deletes fail in HoodieDeltaStreamer

2022-09-29 Thread GitBox
nsivabalan commented on issue #6504: URL: https://github.com/apache/hudi/issues/6504#issuecomment-1263070852 @santoshraj123 : gentle ping. if you got the issue resolved, feel free to close it out. -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [hudi] nsivabalan commented on issue #6428: [SUPPORT] S3 Deltastreamer: Block has already been inflated

2022-09-29 Thread GitBox
nsivabalan commented on issue #6428: URL: https://github.com/apache/hudi/issues/6428#issuecomment-1263070600 Since we could not reproduce this with OSS Spark, can you reach out to AWS support? CC @umehrot2 @rahil-c : Have you folks seen this issue before? It seems like a simple read from metadata

[GitHub] [hudi] nsivabalan commented on issue #6428: [SUPPORT] S3 Deltastreamer: Block has already been inflated

2022-09-29 Thread GitBox
nsivabalan commented on issue #6428: URL: https://github.com/apache/hudi/issues/6428#issuecomment-1263069837 yes, you are right. you can disable via hudi-cli as well. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [hudi] nsivabalan commented on issue #6421: [SUPPORT]Table property not working while creating table - hoodie.datasource.write.drop.partition.columns

2022-09-29 Thread GitBox
nsivabalan commented on issue #6421: URL: https://github.com/apache/hudi/issues/6421#issuecomment-1263069591 @sandip-yadav : gentle ping. did you get a chance to try 0.12. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [hudi] wwli05 commented on issue #6835: [SUPPORT] hive doesnt support mor read now, pls confirm

2022-09-29 Thread GitBox
wwli05 commented on issue #6835: URL: https://github.com/apache/hudi/issues/6835#issuecomment-1263069076 Thank you, friends, this is really timely help (JI_SHI_YU). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [hudi] hudi-bot commented on pull request #6836: [HUDI-4952] Fixing reading from metadata table when there are no inflight commits

2022-09-29 Thread GitBox
hudi-bot commented on PR #6836: URL: https://github.com/apache/hudi/pull/6836#issuecomment-1263068964 ## CI report: * 77223f8b87bdfcfa75045fb622b127cc4f9e47ab Azure:

[GitHub] [hudi] nsivabalan closed issue #6462: [SUPPORT]Caused by: org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve list of partition from metadata

2022-09-29 Thread GitBox
nsivabalan closed issue #6462: [SUPPORT]Caused by: org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve list of partition from metadata URL: https://github.com/apache/hudi/issues/6462 -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [hudi] nsivabalan commented on issue #6462: [SUPPORT]Caused by: org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve list of partition from metadata

2022-09-29 Thread GitBox
nsivabalan commented on issue #6462: URL: https://github.com/apache/hudi/issues/6462#issuecomment-1263068927 closing github issue as we have a fix. thanks for reporting. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [hudi] hudi-bot commented on pull request #6836: [HUDI-4952] Fixing reading from metadata table when there are no inflight commits

2022-09-29 Thread GitBox
hudi-bot commented on PR #6836: URL: https://github.com/apache/hudi/pull/6836#issuecomment-1263066749 ## CI report: * 77223f8b87bdfcfa75045fb622b127cc4f9e47ab UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] nsivabalan commented on issue #6462: [SUPPORT]Caused by: org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve list of partition from metadata

2022-09-29 Thread GitBox
nsivabalan commented on issue #6462: URL: https://github.com/apache/hudi/issues/6462#issuecomment-1263064400 https://github.com/apache/hudi/pull/6836 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [hudi] hudi-bot commented on pull request #6358: [HUDI-4588][HUDI-4472] Fixing `HoodieParquetReader` to properly specify projected schema when reading Parquet file

2022-09-29 Thread GitBox
hudi-bot commented on PR #6358: URL: https://github.com/apache/hudi/pull/6358#issuecomment-1263063979 ## CI report: * 288d166c49602a4593b1e97763a467811903737d UNKNOWN * ae59f6f918a5a08535b73be5c3fc2f29f5e84fb9 Azure:

[jira] [Updated] (HUDI-4952) Reading from metadata table could fail when there are no completed commits

2022-09-29 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4952: - Labels: pull-request-available (was: ) > Reading from metadata table could fail when there are

[GitHub] [hudi] nsivabalan opened a new pull request, #6836: [HUDI-4952] Fixing reading from metadata table when there are no inflight commits

2022-09-29 Thread GitBox
nsivabalan opened a new pull request, #6836: URL: https://github.com/apache/hudi/pull/6836 ### Change Logs When metadata table is just getting initialized, but first commit is not yet fully complete, reading from metadata table could fail w/ below stacktrace. Call trace that

[GitHub] [hudi] xiarixiaoyao commented on issue #6835: [SUPPORT] hive doesnt support mor read now, pls confirm

2022-09-29 Thread GitBox
xiarixiaoyao commented on issue #6835: URL: https://github.com/apache/hudi/issues/6835#issuecomment-1263059421 https://github.com/apache/hudi/pull/6741 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[jira] [Updated] (HUDI-4952) Reading from metadata table could fail when there are no completed commits

2022-09-29 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4952: -- Sprint: 2022/09/19 > Reading from metadata table could fail when there are no completed

[jira] [Updated] (HUDI-4952) Reading from metadata table could fail when there are no completed commits

2022-09-29 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4952: -- Priority: Blocker (was: Major) > Reading from metadata table could fail when there are

[jira] [Assigned] (HUDI-4952) Reading from metadata table could fail when there are no completed commits

2022-09-29 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-4952: - Assignee: sivabalan narayanan > Reading from metadata table could fail when

[jira] [Updated] (HUDI-4952) Reading from metadata table could fail when there are no completed commits

2022-09-29 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4952: -- Fix Version/s: 0.12.1 > Reading from metadata table could fail when there are no

[jira] [Created] (HUDI-4952) Reading from metadata table could fail when there are no completed commits

2022-09-29 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-4952: - Summary: Reading from metadata table could fail when there are no completed commits Key: HUDI-4952 URL: https://issues.apache.org/jira/browse/HUDI-4952

[GitHub] [hudi] wwli05 opened a new issue, #6835: [SUPPORT] hive doesnt support mor read now, pls confirm

2022-09-29 Thread GitBox
wwli05 opened a new issue, #6835: URL: https://github.com/apache/hudi/issues/6835 HoodieRealtimeRecordReader says it supports merge-on-read record reading, but in my test it only returns data from the log file. I looked at the RealtimeCompactedRecordReader, public

[GitHub] [hudi] nsivabalan commented on a diff in pull request #6705: [HUDI-4868] Fixed the issue that compaction is invalid when the last commit action is replace commit.

2022-09-29 Thread GitBox
nsivabalan commented on code in PR #6705: URL: https://github.com/apache/hudi/pull/6705#discussion_r984172615 ## hudi-common/src/main/java/org/apache/hudi/common/util/CompactionUtils.java: ## @@ -214,22 +216,22 @@ public static List

[GitHub] [hudi] hudi-bot commented on pull request #6805: [HUDI-4949] optimize cdc read to avoid the problem of reusing buffer underlying the Row

2022-09-29 Thread GitBox
hudi-bot commented on PR #6805: URL: https://github.com/apache/hudi/pull/6805#issuecomment-1263034687 ## CI report: * 573c27aef34708f1b6019f0647a0ef7093c3a96a Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6805: [HUDI-4949] optimize cdc read to avoid the problem of reusing buffer underlying the Row

2022-09-29 Thread GitBox
hudi-bot commented on PR #6805: URL: https://github.com/apache/hudi/pull/6805#issuecomment-1263032260 ## CI report: * 573c27aef34708f1b6019f0647a0ef7093c3a96a Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6830: [HUDI-2118] Skip checking corrupt log blocks for transactional write file systems

2022-09-29 Thread GitBox
hudi-bot commented on PR #6830: URL: https://github.com/apache/hudi/pull/6830#issuecomment-1263026890 ## CI report: * 6ab358154bb350a68340c9e8b9cafd0de260252c Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6751: [MINOR] Fixes to make unit tests work on m1

2022-09-29 Thread GitBox
hudi-bot commented on PR #6751: URL: https://github.com/apache/hudi/pull/6751#issuecomment-1263026731 ## CI report: * c7a1d373796e8bfce040bd79a07f68ef6b7ffc59 UNKNOWN * 287c52c6da5eb75093f3c9f7bfd5bfaf0eeb9ac0 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6793: 【HUDI-4917】Optimized the way to get HoodieBaseFile of loadColumnRange…

2022-09-29 Thread GitBox
hudi-bot commented on PR #6793: URL: https://github.com/apache/hudi/pull/6793#issuecomment-1263026777 ## CI report: * 32cc352122d276f5bb5943a0dd420920854fdb8e Azure:

[GitHub] [hudi] boneanxs commented on a diff in pull request #6793: 【HUDI-4917】Optimized the way to get HoodieBaseFile of loadColumnRange…

2022-09-29 Thread GitBox
boneanxs commented on code in PR #6793: URL: https://github.com/apache/hudi/pull/6793#discussion_r984153227 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java: ## @@ -161,19 +162,19 @@ private List>

[GitHub] [hudi] hudi-bot commented on pull request #6745: Fix comment in RFC46

2022-09-29 Thread GitBox
hudi-bot commented on PR #6745: URL: https://github.com/apache/hudi/pull/6745#issuecomment-1263026678 ## CI report: * f2823f9cfd431f63e8026cd4a4d4680cd842a660 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6815: [HUDI-4937] Fix `HoodieTable` injecting non-reusable `HoodieBackedTableMetadata` aggressively flushing MT readers

2022-09-29 Thread GitBox
hudi-bot commented on PR #6815: URL: https://github.com/apache/hudi/pull/6815#issuecomment-1263026828 ## CI report: * 12160b8c178ef5bd2721727207c41fdfa2f40e8f Azure:

[GitHub] [hudi] boneanxs commented on pull request #6793: 【HUDI-4917】Optimized the way to get HoodieBaseFile of loadColumnRange…

2022-09-29 Thread GitBox
boneanxs commented on PR #6793: URL: https://github.com/apache/hudi/pull/6793#issuecomment-1263026464 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [hudi] giftbowen commented on pull request #6830: [HUDI-2118] Skip checking corrupt log blocks for transactional write file systems

2022-09-29 Thread GitBox
giftbowen commented on PR #6830: URL: https://github.com/apache/hudi/pull/6830#issuecomment-1263021647 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [hudi] scxwhite commented on issue #6687: [SUPPORT] Poor Upsert Performance on COW table due to indexing

2022-09-29 Thread GitBox
scxwhite commented on issue #6687: URL: https://github.com/apache/hudi/issues/6687#issuecomment-1263019514 You can see how to use these indexes in the [official documents.](https://hudi.apache.org/docs/basic_configurations#index-configs) If you want to know more about bucket index. Take a

[GitHub] [hudi] yuzhaojing closed pull request #6823: [Do Not Merge] test for 0.12.1 rc1

2022-09-29 Thread GitBox
yuzhaojing closed pull request #6823: [Do Not Merge] test for 0.12.1 rc1 URL: https://github.com/apache/hudi/pull/6823 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [hudi] boneanxs commented on pull request #6725: [HUDI-4881] Push down filters if possible when syncing partitions to Hive

2022-09-29 Thread GitBox
boneanxs commented on PR #6725: URL: https://github.com/apache/hudi/pull/6725#issuecomment-1263013312 @codope @yihua @alexeykudinkin @xushiyan Hi, could you please take a look at this improvement? -- This is an automated message from the Apache Git Service. To respond to the message, please

[jira] [Closed] (HUDI-4879) MERGE INTO fails when setting "hoodie.datasource.write.payload.class"

2022-09-29 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin closed HUDI-4879. - Resolution: Fixed > MERGE INTO fails when setting "hoodie.datasource.write.payload.class" >

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #5416: [HUDI-3963] Use Lock-Free Message Queue Disruptor Improving Hoodie Writing Efficiency

2022-09-29 Thread GitBox
alexeykudinkin commented on code in PR #5416: URL: https://github.com/apache/hudi/pull/5416#discussion_r984089070 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java: ## @@ -230,6 +240,16 @@ public class HoodieWriteConfig extends

[GitHub] [hudi] hudi-bot commented on pull request #6741: [HUDI-4898] presto/hive respect payload during merge parquet file and logfile when reading mor table

2022-09-29 Thread GitBox
hudi-bot commented on PR #6741: URL: https://github.com/apache/hudi/pull/6741#issuecomment-1262994744 ## CI report: * bff3acafde6d8a1bd5574b90ce644ef30acbf0a2 UNKNOWN * e39d50d6242e272f867c9987a8a2e97ca323568f Azure:

[GitHub] [hudi] xiarixiaoyao commented on pull request #6741: [HUDI-4898] presto/hive respect payload during merge parquet file and logfile when reading mor table

2022-09-29 Thread GitBox
xiarixiaoyao commented on PR #6741: URL: https://github.com/apache/hudi/pull/6741#issuecomment-1262994330 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [hudi] hudi-bot commented on pull request #6831: [DO NOT MERGE] doing a test

2022-09-29 Thread GitBox
hudi-bot commented on PR #6831: URL: https://github.com/apache/hudi/pull/6831#issuecomment-1262989566 ## CI report: * abde5c46b45518257866a3de7914352920c8c5cf Azure:

[GitHub] [hudi] zhengyuan-cn commented on issue #6596: [SUPPORT] with Impala 4.0 Records lost

2022-09-29 Thread GitBox
zhengyuan-cn commented on issue #6596: URL: https://github.com/apache/hudi/issues/6596#issuecomment-1262986108 > > I replaced impala hudi dependency jar (hudi-common-0.5.0-incubating.jar, hudi-hadoop-mr-0.5.0-incubating.jar) with (hudi-common-0.12.0.jar, hudi-hadoop-mr-0.12.0.jar),issues

[GitHub] [hudi] zhengyuan-cn opened a new issue, #6596: [SUPPORT] with Impala 4.0 Records lost

2022-09-29 Thread GitBox
zhengyuan-cn opened a new issue, #6596: URL: https://github.com/apache/hudi/issues/6596 ENV: impala4.0+hive3.1.1 with hudi 0.12. Executing this SQL via impala shell: select count(*) from tableName; returns a row count (195264946) that is less than the actual row count, 217884008. but by spark SQL return

[GitHub] [hudi] zhengyuan-cn commented on issue #6596: [SUPPORT] with Impala 4.0 Records lost

2022-09-29 Thread GitBox
zhengyuan-cn commented on issue #6596: URL: https://github.com/apache/hudi/issues/6596#issuecomment-1262984003 > > > I replaced impala hudi dependency jar (hudi-common-0.5.0-incubating.jar, hudi-hadoop-mr-0.5.0-incubating.jar) with (hudi-common-0.12.0.jar, hudi-hadoop-mr-0.12.0.jar),issues

[GitHub] [hudi] zhengyuan-cn closed issue #6596: [SUPPORT] with Impala 4.0 Records lost

2022-09-29 Thread GitBox
zhengyuan-cn closed issue #6596: [SUPPORT] with Impala 4.0 Records lost URL: https://github.com/apache/hudi/issues/6596 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [hudi] hudi-bot commented on pull request #6817: [HUDI-4942] Fix RowSource schema provider

2022-09-29 Thread GitBox
hudi-bot commented on PR #6817: URL: https://github.com/apache/hudi/pull/6817#issuecomment-1262945708 ## CI report: * e1589ebfa7aea943040a85de3b93a4613b365d83 Azure:

[GitHub] [hudi] nsivabalan merged pull request #6355: [HUDI-4925] Should Force to use ExpressionPayload in MergeIntoTableCommand

2022-09-29 Thread GitBox
nsivabalan merged PR #6355: URL: https://github.com/apache/hudi/pull/6355 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [hudi] hudi-bot commented on pull request #6665: [HUDI-4850] Incremental Ingestion from GCS

2022-09-29 Thread GitBox
hudi-bot commented on PR #6665: URL: https://github.com/apache/hudi/pull/6665#issuecomment-1262893169 ## CI report: * 4864b65515d6e9ea5b6ba9d83241cfc310cbf3ee UNKNOWN * 5ed92a20666863315f41578a905dd6f2681a1363 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6575: [HUDI-4754] Add compliance check in github actions

2022-09-29 Thread GitBox
hudi-bot commented on PR #6575: URL: https://github.com/apache/hudi/pull/6575#issuecomment-1262892850 ## CI report: * 1600e31836157c8d05e3bc8b9e08e1717471f1a6 UNKNOWN * 4d02f2c64a5fc4b89889677ee639a20b53cec26a UNKNOWN * 48147d19c835e7868102fd2d083659e6ee2ac343 UNKNOWN *

[hudi] branch master updated: [HUDI-4925] Should Force to use ExpressionPayload in MergeIntoTableCommand (#6355)

2022-09-29 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 15ca7a3060 [HUDI-4925] Should Force to use

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6680: [HUDI-4812] lazy fetching partition path & file slice for HoodieFileIndex

2022-09-29 Thread GitBox
alexeykudinkin commented on code in PR #6680: URL: https://github.com/apache/hudi/pull/6680#discussion_r984047423 ## hudi-common/src/main/java/org/apache/hudi/BaseHoodieTableFileIndex.java: ## @@ -179,15 +197,125 @@ public void close() throws Exception { } protected

[hudi] branch asf-site updated: [DOCS] Add new blogs (#6833)

2022-09-29 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 001100a4ed [DOCS] Add new blogs (#6833)

[GitHub] [hudi] yihua merged pull request #6833: [DOCS] Add new blogs

2022-09-29 Thread GitBox
yihua merged PR #6833: URL: https://github.com/apache/hudi/pull/6833 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [hudi] yihua opened a new pull request, #6834: [DOCS] Add 1.0.0 release entry to Roadmap

2022-09-29 Thread GitBox
yihua opened a new pull request, #6834: URL: https://github.com/apache/hudi/pull/6834 ### Change Logs As above. ### Impact **Risk level: none** The website can be built and visualized. ### Documentation Update N/A. ### Contributor's checklist

[GitHub] [hudi] bhasudha commented on pull request #6833: [DOCS] Add new blogs

2022-09-29 Thread GitBox
bhasudha commented on PR #6833: URL: https://github.com/apache/hudi/pull/6833#issuecomment-1262862834 Screenshot attached from local testing https://user-images.githubusercontent.com/2179254/193150303-4a14718d-12aa-42d9-9d7d-13a4a011b385.png

[GitHub] [hudi] bhasudha opened a new pull request, #6833: [DOCS] Add new blogs

2022-09-29 Thread GitBox
bhasudha opened a new pull request, #6833: URL: https://github.com/apache/hudi/pull/6833 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any

[GitHub] [hudi] yihua commented on a diff in pull request #5113: [HUDI-3625] [RFC-60] Optimized storage layout for Cloud Object Stores

2022-09-29 Thread GitBox
yihua commented on code in PR #5113: URL: https://github.com/apache/hudi/pull/5113#discussion_r984045645 ## rfc/rfc-56/rfc-56.md: ## @@ -0,0 +1,226 @@ + + +# RFC-56: Federated Storage Layer + +## Proposers +- @umehrot2 + +## Approvers +- @vinoth +- @shivnarayan + +## Status +

[GitHub] [hudi] yihua commented on a diff in pull request #5113: [HUDI-3625] [RFC-60] Optimized storage layout for Cloud Object Stores

2022-09-29 Thread GitBox
yihua commented on code in PR #5113: URL: https://github.com/apache/hudi/pull/5113#discussion_r984024879 ## rfc/rfc-56/rfc-56.md: ## @@ -0,0 +1,226 @@ + + +# RFC-56: Federated Storage Layer + +## Proposers +- @umehrot2 + +## Approvers +- @vinoth +- @shivnarayan + +## Status +

[GitHub] [hudi] hudi-bot commented on pull request #6358: [HUDI-4588][HUDI-4472] Fixing `HoodieParquetReader` to properly specify projected schema when reading Parquet file

2022-09-29 Thread GitBox
hudi-bot commented on PR #6358: URL: https://github.com/apache/hudi/pull/6358#issuecomment-1262834872 ## CI report: * 288d166c49602a4593b1e97763a467811903737d UNKNOWN * ae59f6f918a5a08535b73be5c3fc2f29f5e84fb9 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6355: [HUDI-4925] Should Force to use ExpressionPayload in MergeIntoTableCommand

2022-09-29 Thread GitBox
hudi-bot commented on PR #6355: URL: https://github.com/apache/hudi/pull/6355#issuecomment-1262834668 ## CI report: * 51fe330035a595e4d65cdf58554077ed0916fd25 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6815: [HUDI-4937] Fix `HoodieTable` injecting non-reusable `HoodieBackedTableMetadata` aggressively flushing MT readers

2022-09-29 Thread GitBox
hudi-bot commented on PR #6815: URL: https://github.com/apache/hudi/pull/6815#issuecomment-1262827821 ## CI report: * 12160b8c178ef5bd2721727207c41fdfa2f40e8f Azure:

[hudi] branch asf-site updated: [DOCS] Add images for new blogs

2022-09-29 Thread bhavanisudha
This is an automated email from the ASF dual-hosted git repository. bhavanisudha pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 8260a6882c [DOCS] Add images for new

[GitHub] [hudi] alexeykudinkin commented on issue #6758: [SUPPORT] Will metatable support partitions inside col_stat & files?

2022-09-29 Thread GitBox
alexeykudinkin commented on issue #6758: URL: https://github.com/apache/hudi/issues/6758#issuecomment-1262814782 @Zhangshunyu we're able to do this filtering even w/o physical partitioning (thanks to relying on HFile and elaborate key encoding scheme) -- we only read the records

[GitHub] [hudi] alexeykudinkin commented on pull request #6815: [HUDI-4937] Fix `HoodieTable` injecting non-reusable `HoodieBackedTableMetadata` aggressively flushing MT readers

2022-09-29 Thread GitBox
alexeykudinkin commented on PR #6815: URL: https://github.com/apache/hudi/pull/6815#issuecomment-1262808073 @hudi-bot run azure

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6805: [HUDI-4949] optimize cdc read to avoid the problem of reusing buffer underlying the Row

2022-09-29 Thread GitBox
alexeykudinkin commented on code in PR #6805: URL: https://github.com/apache/hudi/pull/6805#discussion_r984015580 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/cdc/HoodieCDCRDD.scala: ## @@ -516,7 +515,7 @@ class HoodieCDCRDD( val iter =

[GitHub] [hudi] alexeykudinkin commented on pull request #6358: [HUDI-4588][HUDI-4472] Fixing `HoodieParquetReader` to properly specify projected schema when reading Parquet file

2022-09-29 Thread GitBox
alexeykudinkin commented on PR #6358: URL: https://github.com/apache/hudi/pull/6358#issuecomment-1262807900 @hudi-bot run azure

[GitHub] [hudi] nochimow commented on issue #6811: [SUPPORT] Slow upsert performance

2022-09-29 Thread GitBox
nochimow commented on issue #6811: URL: https://github.com/apache/hudi/issues/6811#issuecomment-1262804768 Hi @nsivabalan, ~97% of the data should be inserts and the remaining are updates. The updates only touch the latest partitions. (-1 day at max) No, we are not setting any small

[jira] [Updated] (HUDI-3204) Allow original partition column value to be retrieved when using TimestampBasedKeyGen

2022-09-29 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3204: -- Description: {color:#172b4d}Currently, b/c Spark by default omits partition values from the

[GitHub] [hudi] alexeykudinkin commented on pull request #6355: [HUDI-4925] Should Force to use ExpressionPayload in MergeIntoTableCommand

2022-09-29 Thread GitBox
alexeykudinkin commented on PR #6355: URL: https://github.com/apache/hudi/pull/6355#issuecomment-1262804005 CI is green: https://user-images.githubusercontent.com/428277/193139753-763ed18d-ee41-4e29-9eab-850c05f99912.png

[GitHub] [hudi] alexeykudinkin commented on issue #6798: [SUPPORT] - can't retrieve the partition field in stored parquet file

2022-09-29 Thread GitBox
alexeykudinkin commented on issue #6798: URL: https://github.com/apache/hudi/issues/6798#issuecomment-1262803683 @sstimmel this is a known issue due to how Spark treats partition columns (by default, Spark doesn't persist them in the data files, but instead encodes them into partition

[jira] [Updated] (HUDI-3204) Allow original partition column value to be retrieved when using TimestampBasedKeyGen

2022-09-29 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3204: -- Summary: Allow original partition column value to be retrieved when using TimestampBasedKeyGen

[jira] [Updated] (HUDI-4879) MERGE INTO fails when setting "hoodie.datasource.write.payload.class"

2022-09-29 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4879: -- Reviewers: Alexey Kudinkin > MERGE INTO fails when setting
