[GitHub] [hudi] danny0405 commented on pull request #7907: [HUDI-6495][RFC-66] Non-blocking multi writer support

2023-08-29 Thread via GitHub
danny0405 commented on PR #7907: URL: https://github.com/apache/hudi/pull/7907#issuecomment-1698506743 > I feel we need to add a lot more details here, including how new file slice generation works for all different types of queries. Make assumptions and format changes clear. Yeah, we

[GitHub] [hudi] danny0405 commented on a diff in pull request #7907: [HUDI-6495][RFC-66] Non-blocking multi writer support

2023-08-29 Thread via GitHub
danny0405 commented on code in PR #7907: URL: https://github.com/apache/hudi/pull/7907#discussion_r1309671455 ## rfc/rfc-66/rfc-66.md: ## @@ -0,0 +1,119 @@ +# RFC-66: Lockless Multi Writer + +## Proposers +- @danny0405 +- @ForwardXu +- @SteNicholas + +## Approvers +- + +##

[GitHub] [hudi] danny0405 commented on a diff in pull request #7907: [HUDI-6495][RFC-66] Non-blocking multi writer support

2023-08-29 Thread via GitHub
danny0405 commented on code in PR #7907: URL: https://github.com/apache/hudi/pull/7907#discussion_r1309670169 ## rfc/rfc-66/rfc-66.md: ## @@ -0,0 +1,119 @@ +# RFC-66: Lockless Multi Writer + +## Proposers +- @danny0405 +- @ForwardXu +- @SteNicholas + +## Approvers +- + +##

[GitHub] [hudi] KnightChess commented on a diff in pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

2023-08-29 Thread via GitHub
KnightChess commented on code in PR #8610: URL: https://github.com/apache/hudi/pull/8610#discussion_r1309663986 ## hudi-common/src/main/java/org/apache/hudi/common/fs/HoodieWrapperFileSystem.java: ## @@ -1039,21 +1039,27 @@ public void createImmutableFileInPath(Path fullPath,

[GitHub] [hudi] KnightChess commented on pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

2023-08-29 Thread via GitHub
KnightChess commented on PR #8610: URL: https://github.com/apache/hudi/pull/8610#issuecomment-1698503648 > > why not throw exception? > > Which step do you mean should throw the exception? We generally need some lock to make the instant time generation monotonically increasing. Currently we

[GitHub] [hudi] danny0405 commented on a diff in pull request #7907: [HUDI-6495][RFC-66] Non-blocking multi writer support

2023-08-29 Thread via GitHub
danny0405 commented on code in PR #7907: URL: https://github.com/apache/hudi/pull/7907#discussion_r1309669080 ## rfc/rfc-66/rfc-66.md: ## @@ -0,0 +1,119 @@ +# RFC-66: Lockless Multi Writer + +## Proposers +- @danny0405 +- @ForwardXu +- @SteNicholas + +## Approvers +- + +##

[GitHub] [hudi] KnightChess commented on pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

2023-08-29 Thread via GitHub
KnightChess commented on PR #8610: URL: https://github.com/apache/hudi/pull/8610#issuecomment-1698499775 @danny0405 sorry, I commented under the review discussion -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [hudi] danny0405 commented on pull request #8610: [HUDI-6156] Prevent leaving tmp file in timeline when multi process t…

2023-08-29 Thread via GitHub
danny0405 commented on PR #8610: URL: https://github.com/apache/hudi/pull/8610#issuecomment-1698497261 > why not throw exception? Which step do you mean should throw the exception? We generally need some lock to make the instant time generation monotonically increasing. Currently we have no

[GitHub] [hudi] hudi-bot commented on pull request #9221: [HUDI-6550] Add Hadoop conf to HiveConf for HiveSyncConfig

2023-08-29 Thread via GitHub
hudi-bot commented on PR #9221: URL: https://github.com/apache/hudi/pull/9221#issuecomment-1698493714 ## CI report: * 68f6b69e42a45349c941d386344b28a60f1c7b29 Azure:

[GitHub] [hudi] danny0405 commented on pull request #9553: [HUDI-1517][HUDI-6758][HUDI-6761] Adding support for per-logfile marker to track all log files added by a commit and to assist with rollbacks

2023-08-29 Thread via GitHub
danny0405 commented on PR #9553: URL: https://github.com/apache/hudi/pull/9553#issuecomment-1698493463 > I would caution against landing it in 0.14.0. -1 for landing it in 0.14.0.

[GitHub] [hudi] hudi-bot commented on pull request #9221: [HUDI-6550] Add Hadoop conf to HiveConf for HiveSyncConfig

2023-08-29 Thread via GitHub
hudi-bot commented on PR #9221: URL: https://github.com/apache/hudi/pull/9221#issuecomment-1698460690 ## CI report: * 68f6b69e42a45349c941d386344b28a60f1c7b29 Azure:

[GitHub] [hudi] xushiyan commented on pull request #9221: [HUDI-6550] Add Hadoop conf to HiveConf for HiveSyncConfig

2023-08-29 Thread via GitHub
xushiyan commented on PR #9221: URL: https://github.com/apache/hudi/pull/9221#issuecomment-1698457359 @CTTY will you be able to test and verify the change before we land it? it's a blocker for 0.14.0

[GitHub] [hudi] hudi-bot commented on pull request #9572: Utilize merger to replace insertValue api

2023-08-29 Thread via GitHub
hudi-bot commented on PR #9572: URL: https://github.com/apache/hudi/pull/9572#issuecomment-1698450736 ## CI report: * ad05887b523496f59ac8b6e976183d6c325ed94d UNKNOWN * 7e769e60b101466c27604ce531b95f42eab87885 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #9571: Enabling comprehensive schema evolution in delta streamer code

2023-08-29 Thread via GitHub
hudi-bot commented on PR #9571: URL: https://github.com/apache/hudi/pull/9571#issuecomment-1698418034 ## CI report: * 070278982fdd12e8f708ea22cbfc641b41d2cfc7 Azure:

[jira] [Updated] (HUDI-4358) Standardize the order field(orderingVal/eventTime) of Hudi

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4358: Fix Version/s: 1.0.0 > Standardize the order field(orderingVal/eventTime) of Hudi >

[jira] [Updated] (HUDI-4321) Fix Hudi to not write in Parquet legacy format

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4321: Fix Version/s: 1.0.0 (was: 0.14.0) > Fix Hudi to not write in Parquet legacy format

[jira] [Updated] (HUDI-3354) Rebase `HoodieRealtimeRecordReader` to return `HoodieRecord`

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3354: Fix Version/s: 1.0.0 > Rebase `HoodieRealtimeRecordReader` to return `HoodieRecord` >

[jira] [Updated] (HUDI-5249) Support MetadataColumnStatsIndex for Spark record

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5249: Fix Version/s: 1.0.0 > Support MetadataColumnStatsIndex for Spark record >

[jira] [Updated] (HUDI-5282) Support Metadata in HoodieSparkRecord

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5282: Fix Version/s: 1.0.0 > Support Metadata in HoodieSparkRecord > - > >

[jira] [Updated] (HUDI-5264) Test parquet log with avro record in spark sql test

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5264: Fix Version/s: 1.0.0 > Test parquet log with avro record in spark sql test >

[GitHub] [hudi] hudi-bot commented on pull request #9521: [HUDI-6736] Revert pr 8849

2023-08-29 Thread via GitHub
hudi-bot commented on PR #9521: URL: https://github.com/apache/hudi/pull/9521#issuecomment-1698412523 ## CI report: * a75561640dac35e42b001a35b326b6af2bfed7a4 Azure:

[jira] [Updated] (HUDI-5807) HoodieSparkParquetReader is not appending partition-path values

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5807: Fix Version/s: 1.0.0 (was: 0.14.0) > HoodieSparkParquetReader is not appending

[jira] [Updated] (HUDI-6768) Revisit HoodieRecord design and how it affects e2e row writing

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6768: Fix Version/s: 1.0.0 > Revisit HoodieRecord design and how it affects e2e row writing >

[jira] [Updated] (HUDI-6767) Simplify compatibility of HoodieRecord conversion

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6767: Fix Version/s: 1.0.0 > Simplify compatibility of HoodieRecord conversion >

[jira] [Updated] (HUDI-6752) Scope out the work for file group reading and writing with record merging in Spark

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6752: Story Points: 3 > Scope out the work for file group reading and writing with record merging in > Spark >

[jira] [Updated] (HUDI-6798) Implement event-time-based merging mode in FileGroupReader

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6798: Story Points: 3 > Implement event-time-based merging mode in FileGroupReader >

[jira] [Updated] (HUDI-6797) Implement position-based updates in FileGroupReader

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6797: Story Points: 3 > Implement position-based updates in FileGroupReader >

[jira] [Updated] (HUDI-1517) Create marker file for every log file

2023-08-29 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhang updated HUDI-1517: - Fix Version/s: 0.14.0 (was: 1.0.0) > Create marker file for every log file >

[GitHub] [hudi] beyond1920 closed issue #9090: [SUPPORT] FileNotFoundException would happen occasionally after cherrypick HUDI-1517

2023-08-29 Thread via GitHub
beyond1920 closed issue #9090: [SUPPORT] FileNotFoundException would happen occasionally after cherrypick HUDI-1517 URL: https://github.com/apache/hudi/issues/9090

[jira] [Comment Edited] (HUDI-6484) FileNotFoundException would happen occasionally during read latest snapshot of a MOR table

2023-08-29 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760200#comment-17760200 ] Jing Zhang edited comment on HUDI-6484 at 8/30/23 2:29 AM: --- The issue

[jira] [Comment Edited] (HUDI-6484) FileNotFoundException would happen occasionally during read latest snapshot of a MOR table

2023-08-29 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760200#comment-17760200 ] Jing Zhang edited comment on HUDI-6484 at 8/30/23 2:28 AM: --- The issue

[jira] [Closed] (HUDI-6484) FileNotFoundException would happen occasionally during read latest snapshot of a MOR table

2023-08-29 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhang closed HUDI-6484. Fix Version/s: (was: 1.0.0) Resolution: Workaround > FileNotFoundException would happen

[jira] [Comment Edited] (HUDI-6484) FileNotFoundException would happen occasionally during read latest snapshot of a MOR table

2023-08-29 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760200#comment-17760200 ] Jing Zhang edited comment on HUDI-6484 at 8/30/23 2:27 AM: --- The issue

[jira] [Commented] (HUDI-6484) FileNotFoundException would happen occasionally during read latest snapshot of a MOR table

2023-08-29 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760200#comment-17760200 ] Jing Zhang commented on HUDI-6484: -- The issue [HUDI-1517|https://issues.apache.org/jira/browse/HUDI-1517]

[jira] [Commented] (HUDI-1517) Create marker file for every log file

2023-08-29 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760198#comment-17760198 ] Jing Zhang commented on HUDI-1517: -- The commit [#4913|https://github.com/apache/hudi/pull/4913] is

[jira] [Updated] (HUDI-6803) Create marker file for every log file

2023-08-29 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhang updated HUDI-6803: - Description: The issue aims to introduce a marker file for log files, which is similar to

[jira] [Updated] (HUDI-6784) Support custom logic for deletion

2023-08-29 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6784: - Fix Version/s: 1.0.0 > Support custom logic for deletion > - > >

[jira] [Updated] (HUDI-6712) Implement optimized keyed lookup on parquet files

2023-08-29 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6712: - Status: Patch Available (was: In Progress) > Implement optimized keyed lookup on parquet files >

[jira] [Updated] (HUDI-6803) Create marker file for every log file

2023-08-29 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhang updated HUDI-6803: - Description: The issue aims to introduce a marker file for log files, which is similar to

[GitHub] [hudi] hudi-bot commented on pull request #9572: Utilize merger to replace insertValue api

2023-08-29 Thread via GitHub
hudi-bot commented on PR #9572: URL: https://github.com/apache/hudi/pull/9572#issuecomment-1698385621 ## CI report: * ad05887b523496f59ac8b6e976183d6c325ed94d UNKNOWN * 24e7acc1e3f5cbf039a796395dca0832e866b377 Azure:

[jira] [Updated] (HUDI-6803) Create marker file for every log file

2023-08-29 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhang updated HUDI-6803: - Fix Version/s: 1.0.0 > Create marker file for every log file > - > >

[jira] [Updated] (HUDI-1517) Create marker file for every log file

2023-08-29 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhang updated HUDI-1517: - Fix Version/s: 1.0.0 > Create marker file for every log file > - > >

[jira] [Created] (HUDI-6803) Create marker file for every log file

2023-08-29 Thread Jing Zhang (Jira)
Jing Zhang created HUDI-6803: Summary: Create marker file for every log file Key: HUDI-6803 URL: https://issues.apache.org/jira/browse/HUDI-6803 Project: Apache Hudi Issue Type: Improvement

[jira] [Updated] (HUDI-6484) FileNotFoundException would happen occasionally during read latest snapshot of a MOR table

2023-08-29 Thread Jing Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhang updated HUDI-6484: - Fix Version/s: 1.0.0 (was: 0.14.0) > FileNotFoundException would happen

[GitHub] [hudi] hudi-bot commented on pull request #9572: Utilize merger to replace insertValue api

2023-08-29 Thread via GitHub
hudi-bot commented on PR #9572: URL: https://github.com/apache/hudi/pull/9572#issuecomment-1698380243 ## CI report: * ad05887b523496f59ac8b6e976183d6c325ed94d UNKNOWN * 24e7acc1e3f5cbf039a796395dca0832e866b377 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #9553: [HUDI-1517][HUDI-6758][HUDI-6761] Adding support for per-logfile marker to track all log files added by a commit and to assist with rollbacks

2023-08-29 Thread via GitHub
hudi-bot commented on PR #9553: URL: https://github.com/apache/hudi/pull/9553#issuecomment-1698374674 ## CI report: * aeac327c3cad812fea5e2bc01c07c1314bbf1838 UNKNOWN * 390358e6f53821e8c19365974d4d1da9b2ee0e89 Azure:

[GitHub] [hudi] Zouxxyy commented on pull request #9554: [HUDI-6760] Add SelfDescribingInputFormatInterface for hive FileInput…

2023-08-29 Thread via GitHub
Zouxxyy commented on PR #9554: URL: https://github.com/apache/hudi/pull/9554#issuecomment-1698367243 Can someone help to understand why `hudi-spark-common` cannot automatically depend on `hive-exec` in `hudi-hadoop-mr` ? ```text mvn dependency:tree -pl

[jira] [Assigned] (HUDI-6641) Remove the log append and always uses the current instant time in file name

2023-08-29 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit reassigned HUDI-6641: - Assignee: Danny Chen > Remove the log append and always uses the current instant time in file

[jira] [Closed] (HUDI-6782) Instead of appending to same log file, consider one log file per commit

2023-08-29 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-6782. - Resolution: Duplicate Duplicate of HUDI-6641 > Instead of appending to same log file, consider one log

[jira] [Closed] (HUDI-6758) Avoid duplicated log blocks on the LogRecordReader

2023-08-29 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-6758. - Resolution: Fixed > Avoid duplicated log blocks on the LogRecordReader >

[jira] [Updated] (HUDI-6758) Avoid duplicated log blocks on the LogRecordReader

2023-08-29 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-6758: -- Fix Version/s: 0.14.0 > Avoid duplicated log blocks on the LogRecordReader >

[hudi] branch master updated: [HUDI-6758] Detecting and skipping Spurious log blocks with MOR reads (#9545)

2023-08-29 Thread codope
This is an automated email from the ASF dual-hosted git repository. codope pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 9425e5ac8f6 [HUDI-6758] Detecting and skipping

[GitHub] [hudi] codope merged pull request #9545: [HUDI-6758] Detecting and skipping Spurious log blocks with MOR reads

2023-08-29 Thread via GitHub
codope merged PR #9545: URL: https://github.com/apache/hudi/pull/9545

[GitHub] [hudi] hudi-bot commented on pull request #9572: Utilize merger to replace insertValue api

2023-08-29 Thread via GitHub
hudi-bot commented on PR #9572: URL: https://github.com/apache/hudi/pull/9572#issuecomment-1698343258 ## CI report: * ad05887b523496f59ac8b6e976183d6c325ed94d UNKNOWN * 24e7acc1e3f5cbf039a796395dca0832e866b377 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #9572: Utilize merger to replace insertValue api

2023-08-29 Thread via GitHub
hudi-bot commented on PR #9572: URL: https://github.com/apache/hudi/pull/9572#issuecomment-1698337862 ## CI report: * ad05887b523496f59ac8b6e976183d6c325ed94d UNKNOWN * 24e7acc1e3f5cbf039a796395dca0832e866b377 UNKNOWN Bot commands @hudi-bot supports the

[GitHub] [hudi] vinothchandar commented on a diff in pull request #9559: [HUDI-3727] Add metrics for async indexer

2023-08-29 Thread via GitHub
vinothchandar commented on code in PR #9559: URL: https://github.com/apache/hudi/pull/9559#discussion_r1309480711 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/index/RunIndexActionExecutor.java: ## @@ -100,6 +105,11 @@ public class

[jira] [Updated] (HUDI-6802) Use completion time in Spark FileIndex for listing

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6802: Fix Version/s: 1.0.0 > Use completion time in Spark FileIndex for listing >

[jira] [Created] (HUDI-6802) Use completion time in Spark FileIndex for listing

2023-08-29 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-6802: --- Summary: Use completion time in Spark FileIndex for listing Key: HUDI-6802 URL: https://issues.apache.org/jira/browse/HUDI-6802 Project: Apache Hudi Issue Type: New

[jira] [Created] (HUDI-6801) Implement merging of partial updates in FileGroupReader

2023-08-29 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-6801: --- Summary: Implement merging of partial updates in FileGroupReader Key: HUDI-6801 URL: https://issues.apache.org/jira/browse/HUDI-6801 Project: Apache Hudi Issue Type:

[jira] [Updated] (HUDI-6801) Implement merging of partial updates in FileGroupReader

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6801: Fix Version/s: 1.0.0 > Implement merging of partial updates in FileGroupReader >

[jira] [Updated] (HUDI-6800) Implement log writing with partial updates on the write path

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6800: Summary: Implement log writing with partial updates on the write path (was: Implement partial update

[jira] [Updated] (HUDI-6800) Implement log writing with partial updates on the write path

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6800: Story Points: 5 > Implement log writing with partial updates on the write path >

[jira] [Updated] (HUDI-6800) Implement partial update encoding on the write path

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6800: Fix Version/s: 1.0.0 > Implement partial update encoding on the write path >

[jira] [Created] (HUDI-6800) Implement partial update encoding on the write path

2023-08-29 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-6800: --- Summary: Implement partial update encoding on the write path Key: HUDI-6800 URL: https://issues.apache.org/jira/browse/HUDI-6800 Project: Apache Hudi Issue Type: New

[jira] [Created] (HUDI-6799) Integrate FileGroupReader with merge handle on the write path

2023-08-29 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-6799: --- Summary: Integrate FileGroupReader with merge handle on the write path Key: HUDI-6799 URL: https://issues.apache.org/jira/browse/HUDI-6799 Project: Apache Hudi

[jira] [Updated] (HUDI-6798) Implement event-time-based merging mode in FileGroupReader

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6798: Fix Version/s: 1.0.0 > Implement event-time-based merging mode in FileGroupReader >

[jira] [Updated] (HUDI-6796) Implement position-based deletes in FileGroupReader

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6796: Fix Version/s: 1.0.0 > Implement position-based deletes in FileGroupReader >

[jira] [Updated] (HUDI-6797) Implement position-based updates in FileGroupReader

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6797: Fix Version/s: 1.0.0 > Implement position-based updates in FileGroupReader >

[jira] [Created] (HUDI-6798) Implement event-time-based merging mode in FileGroupReader

2023-08-29 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-6798: --- Summary: Implement event-time-based merging mode in FileGroupReader Key: HUDI-6798 URL: https://issues.apache.org/jira/browse/HUDI-6798 Project: Apache Hudi Issue

[jira] [Created] (HUDI-6797) Implement position-based updates in FileGroupReader

2023-08-29 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-6797: --- Summary: Implement position-based updates in FileGroupReader Key: HUDI-6797 URL: https://issues.apache.org/jira/browse/HUDI-6797 Project: Apache Hudi Issue Type: New

[jira] [Created] (HUDI-6796) Implement position-based deletes in FileGroupReader

2023-08-29 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-6796: --- Summary: Implement position-based deletes in FileGroupReader Key: HUDI-6796 URL: https://issues.apache.org/jira/browse/HUDI-6796 Project: Apache Hudi Issue Type: New

[GitHub] [hudi] vinothchandar commented on a diff in pull request #9462: [HUDI-3625] Update RFC-60

2023-08-29 Thread via GitHub
vinothchandar commented on code in PR #9462: URL: https://github.com/apache/hudi/pull/9462#discussion_r1309471028 ## rfc/rfc-60/rfc-60.md: ## @@ -196,13 +195,75 @@ for metadata table to be populated. 4. If there is an error reading from Metadata table, we will not fall back

[GitHub] [hudi] hudi-bot commented on pull request #9572: Utilize merger to replace insertValue api

2023-08-29 Thread via GitHub
hudi-bot commented on PR #9572: URL: https://github.com/apache/hudi/pull/9572#issuecomment-1698306152 ## CI report: * ad05887b523496f59ac8b6e976183d6c325ed94d UNKNOWN

[GitHub] [hudi] hudi-bot commented on pull request #9571: Enabling comprehensive schema evolution in delta streamer code

2023-08-29 Thread via GitHub
hudi-bot commented on PR #9571: URL: https://github.com/apache/hudi/pull/9571#issuecomment-1698306110 ## CI report: * 070278982fdd12e8f708ea22cbfc641b41d2cfc7 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #9571: Enabling comprehensive schema evolution in delta streamer code

2023-08-29 Thread via GitHub
hudi-bot commented on PR #9571: URL: https://github.com/apache/hudi/pull/9571#issuecomment-1698300060 ## CI report: * 070278982fdd12e8f708ea22cbfc641b41d2cfc7 UNKNOWN

[GitHub] [hudi] linliu-code opened a new pull request, #9572: Utilize merger to replace insertValue api

2023-08-29 Thread via GitHub
linliu-code opened a new pull request, #9572: URL: https://github.com/apache/hudi/pull/9572 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any

[GitHub] [hudi] hudi-bot commented on pull request #9545: [HUDI-6758] Detecting and skipping Spurious log blocks with MOR reads

2023-08-29 Thread via GitHub
hudi-bot commented on PR #9545: URL: https://github.com/apache/hudi/pull/9545#issuecomment-1698299873 ## CI report: * f12633fa6d50cf56b3e2036c2fa418fbf137c7fb Azure:

[GitHub] [hudi] hudi-bot commented on pull request #9521: [HUDI-6736] Revert pr 8849

2023-08-29 Thread via GitHub
hudi-bot commented on PR #9521: URL: https://github.com/apache/hudi/pull/9521#issuecomment-1698299766 ## CI report: * 0aa97d414fd91d95e5931d108407dbc2b280b519 Azure:

[jira] [Created] (HUDI-6795) Implement generation of record_positions for updates and deletes on write path

2023-08-29 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-6795: --- Summary: Implement generation of record_positions for updates and deletes on write path Key: HUDI-6795 URL: https://issues.apache.org/jira/browse/HUDI-6795 Project: Apache

[GitHub] [hudi] hudi-bot commented on pull request #9545: [HUDI-6758] Detecting and skipping Spurious log blocks with MOR reads

2023-08-29 Thread via GitHub
hudi-bot commented on PR #9545: URL: https://github.com/apache/hudi/pull/9545#issuecomment-1698294477 ## CI report: * f12633fa6d50cf56b3e2036c2fa418fbf137c7fb UNKNOWN

[GitHub] [hudi] hudi-bot commented on pull request #9521: [HUDI-6736] Revert pr 8849

2023-08-29 Thread via GitHub
hudi-bot commented on PR #9521: URL: https://github.com/apache/hudi/pull/9521#issuecomment-1698294397 ## CI report: * 0aa97d414fd91d95e5931d108407dbc2b280b519 Azure:

[jira] [Commented] (HUDI-6768) Revisit HoodieRecord design and how it affects e2e row writing

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760168#comment-17760168 ] Ethan Guo commented on HUDI-6768: - This is coupled with e2e row writing. Let's not jump on this until we

[jira] [Updated] (HUDI-6767) Simplify compatibility of HoodieRecord conversion

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6767: Story Points: 3 > Simplify compatibility of HoodieRecord conversion >

[jira] [Updated] (HUDI-6768) Revisit HoodieRecord design and how it affects e2e row writing

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6768: Story Points: 3 > Revisit HoodieRecord design and how it affects e2e row writing >

[jira] [Updated] (HUDI-6751) Scope out remaining work for the record merging API

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6751: Story Points: 4 > Scope out remaining work for the record merging API >

[jira] [Updated] (HUDI-6784) Support custom logic for deletion

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6784: Story Points: 3 > Support custom logic for deletion > - > >

[jira] [Updated] (HUDI-5282) Support Metadata in HoodieSparkRecord

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5282: Story Points: 8 > Support Metadata in HoodieSparkRecord > - > >

[jira] [Updated] (HUDI-6765) Add merge mode to allow differentiation of dedup logic

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6765: Story Points: 2 > Add merge mode to allow differentiation of dedup logic >

[jira] [Updated] (HUDI-4321) Fix Hudi to not write in Parquet legacy format

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4321: Story Points: 2 > Fix Hudi to not write in Parquet legacy format >

[jira] [Updated] (HUDI-5264) Test parquet log with avro record in spark sql test

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5264: Story Points: 3 > Test parquet log with avro record in spark sql test >

[jira] [Updated] (HUDI-5249) Support MetadataColumnStatsIndex for Spark record

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5249: Story Points: 5 > Support MetadataColumnStatsIndex for Spark record >

[jira] [Updated] (HUDI-4358) Standardize the order field(orderingVal/eventTime) of Hudi

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4358: Story Points: 4 > Standardize the order field(orderingVal/eventTime) of Hudi >

[jira] [Updated] (HUDI-6702) Extend merge API to support all merging operations

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6702: Story Points: 6 Issue Type: New Feature (was: Task) > Extend merge API to support all merging

[jira] [Created] (HUDI-6794) Support completion-time-based file slice in FileGroupReader

2023-08-29 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-6794: --- Summary: Support completion-time-based file slice in FileGroupReader Key: HUDI-6794 URL: https://issues.apache.org/jira/browse/HUDI-6794 Project: Apache Hudi Issue

[jira] [Updated] (HUDI-6793) Support time-travel read in engine-agnostic FileGroupReader

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6793: Fix Version/s: 1.0.0 > Support time-travel read in engine-agnostic FileGroupReader >

[jira] [Created] (HUDI-6793) Support time-travel read in engine-agnostic FileGroupReader

2023-08-29 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-6793: --- Summary: Support time-travel read in engine-agnostic FileGroupReader Key: HUDI-6793 URL: https://issues.apache.org/jira/browse/HUDI-6793 Project: Apache Hudi Issue

[jira] [Updated] (HUDI-6792) Integrate FileGroupReader with NewHoodieParquetFileFormat for Spark Incremental Query

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6792: Fix Version/s: 1.0.0 > Integrate FileGroupReader with NewHoodieParquetFileFormat for Spark > Incremental

[jira] [Updated] (HUDI-6792) Integrate FileGroupReader with NewHoodieParquetFileFormat for Spark Incremental Query

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6792: Story Points: 3 (was: 2) > Integrate FileGroupReader with NewHoodieParquetFileFormat for Spark >

[jira] [Created] (HUDI-6792) Integrate FileGroupReader with NewHoodieParquetFileFormat for Spark Incremental Query

2023-08-29 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-6792: --- Summary: Integrate FileGroupReader with NewHoodieParquetFileFormat for Spark Incremental Query Key: HUDI-6792 URL: https://issues.apache.org/jira/browse/HUDI-6792 Project:

[jira] [Updated] (HUDI-6791) Integrate FileGroupReader with NewHoodieParquetFileFormat for Spark CDC Query

2023-08-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-6791: Fix Version/s: 1.0.0 > Integrate FileGroupReader with NewHoodieParquetFileFormat for Spark CDC Query >
