[jira] [Updated] (HUDI-6787) Hive Integrate FileGroupReader with HoodieMergeOnReadSnapshotReader and RealtimeCompactedRecordReader for Hive

2024-04-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6787: - Summary: Hive Integrate FileGroupReader with HoodieMergeOnReadSnapshotReader and RealtimeCompacted

[jira] [Updated] (HUDI-7045) Fix new file format and reader for schema evolution

2024-04-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7045: - Reviewers: Ethan Guo > Fix new file format and reader for schema evolution > -

[jira] [Updated] (HUDI-7507) ongoing concurrent writers with smaller timestamp can cause issues with table services

2024-04-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7507: - Fix Version/s: (was: 1.0.0) > ongoing concurrent writers with smaller timestamp can cause iss

[jira] [Updated] (HUDI-7503) Concurrent executions of table service plan should not corrupt dataset

2024-04-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7503: - Description: Some external workflow schedulers can accidentally (or) misbehave and schedule dupli

[jira] [Updated] (HUDI-7503) Concurrent executions of table service plan should not corrupt dataset

2024-04-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7503: - Fix Version/s: 0.15.0 1.0.0 > Concurrent executions of table service plan shoul

[jira] [Updated] (HUDI-7503) Concurrent executions of table service plan should not corrupt dataset

2024-04-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7503: - Summary: Concurrent executions of table service plan should not corrupt dataset (was: concurrent

[jira] [Commented] (HUDI-7559) Fix functional index (on column stats): Handle NPE in filterQueriesWithRecordKey(...)

2024-04-01 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832988#comment-17832988 ] Vinoth Chandar commented on HUDI-7559: -- [~codope] How's this different from what we te

[jira] [Updated] (HUDI-7559) Fix functional index (on column stats): Handle NPE in filterQueriesWithRecordKey(...)

2024-04-01 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7559: - Sprint: Sprint 2024-03-25 > Fix functional index (on column stats): Handle NPE in > filterQueries

[jira] [Assigned] (HUDI-7484) Fix partitioning style when partition is inferred from partitionBy

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7484: Assignee: Vinaykumar Bhat > Fix partitioning style when partition is inferred from partitio

[jira] [Assigned] (HUDI-7510) Loosen the compaction scheduling and rollback check for MDT

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7510: Assignee: Danny Chen > Loosen the compaction scheduling and rollback check for MDT > --

[jira] [Updated] (HUDI-7457) Remove runtime shutdown hook from HoodieLogFormatWriter

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7457: - Status: Patch Available (was: In Progress) > Remove runtime shutdown hook from HoodieLogFormatWri

[jira] [Assigned] (HUDI-7335) Create hudi-hadoop-common for hadoop-specific implementation

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7335: Assignee: Ethan Guo > Create hudi-hadoop-common for hadoop-specific implementation > --

[jira] [Assigned] (HUDI-7457) Remove runtime shutdown hook from HoodieLogFormatWriter

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7457: Assignee: Danny Chen > Remove runtime shutdown hook from HoodieLogFormatWriter > --

[jira] [Updated] (HUDI-7457) Remove runtime shutdown hook from HoodieLogFormatWriter

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7457: - Status: In Progress (was: Open) > Remove runtime shutdown hook from HoodieLogFormatWriter > -

[jira] [Updated] (HUDI-6699) An indexed global timeline (phase2)

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6699: - Status: In Progress (was: Open) > An indexed global timeline (phase2) > -

[jira] [Assigned] (HUDI-7065) Fix the new file group reader with COW in Spark integration

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7065: Assignee: Jonathan Vexler > Fix the new file group reader with COW in Spark integration > -

[jira] [Assigned] (HUDI-7028) Fix Spark Quick Start

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7028: Assignee: Jonathan Vexler (was: Lin Liu) > Fix Spark Quick Start > - >

[jira] [Assigned] (HUDI-7075) Fix validation of parquet column projection on HadoopFsRelation in TestParquetColumnProjection

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7075: Assignee: Jonathan Vexler > Fix validation of parquet column projection on HadoopFsRelation

[jira] [Updated] (HUDI-6700) Archiving should be time based, not this min-max and not per instant. Lets treat it like a log (Phase 2)

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6700: - Status: In Progress (was: Open) > Archiving should be time based, not this min-max and not per in

[jira] [Assigned] (HUDI-7269) Fallback to key-based merging if there is no positions in log header

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7269: Assignee: Jonathan Vexler > Fallback to key-based merging if there is no positions in log h

[jira] [Assigned] (HUDI-7547) Simplification of archival, savepoint, cleaning interplays

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7547: Assignee: Danny Chen > Simplification of archival, savepoint, cleaning interplays > ---

[jira] [Assigned] (HUDI-7408) LSM tree writer failed with compaction

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7408: Assignee: Danny Chen > LSM tree writer failed with compaction > ---

[jira] [Assigned] (HUDI-7234) Handle both inserts and updates in log blocks for partial updates

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7234: Assignee: Vinoth Chandar (was: Lin Liu) > Handle both inserts and updates in log blocks fo

[jira] [Assigned] (HUDI-6713) Redesign CDC workload to include partition column for partition pruning

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-6713: Assignee: Vinoth Chandar > Redesign CDC workload to include partition column for partition

[jira] [Assigned] (HUDI-6791) Integrate FileGroupReader with NewHoodieParquetFileFormat for Spark CDC Query

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-6791: Assignee: Jonathan Vexler > Integrate FileGroupReader with NewHoodieParquetFileFormat for S

[jira] [Assigned] (HUDI-6792) Integrate FileGroupReader with NewHoodieParquetFileFormat for Spark Incremental Query

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-6792: Assignee: Jonathan Vexler > Integrate FileGroupReader with NewHoodieParquetFileFormat for S

[jira] [Assigned] (HUDI-7545) Concurrency control for LSM timeline management and writing.

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7545: Assignee: Danny Chen (was: Vinoth Chandar) > Concurrency control for LSM timeline manageme

[jira] [Updated] (HUDI-7547) Simplification of archival, savepoint, cleaning interplays

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7547: - Status: In Progress (was: Open) > Simplification of archival, savepoint, cleaning interplays > --

[jira] [Assigned] (HUDI-7229) Enable partial updates for CDC work payload

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7229: Assignee: Vinoth Chandar (was: Lin Liu) > Enable partial updates for CDC work payload > --

[jira] [Assigned] (HUDI-6816) Remove JSON HoodieCommitMetadata altogether

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-6816: Assignee: Vinoth Chandar > Remove JSON HoodieCommitMetadata altogether > --

[jira] [Assigned] (HUDI-6909) Handle `_hoodie_operation` field in the new HoodieFileGroupReader

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-6909: Assignee: Vinoth Chandar (was: Lin Liu) > Handle `_hoodie_operation` field in the new Hood

[jira] [Assigned] (HUDI-6802) Use completion time in Spark FileIndex for listing

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-6802: Assignee: Jonathan Vexler > Use completion time in Spark FileIndex for listing > --

[jira] [Assigned] (HUDI-6794) Support completion-time-based file slice in FileGroupReader

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-6794: Assignee: Jonathan Vexler > Support completion-time-based file slice in FileGroupReader > -

[jira] [Assigned] (HUDI-7548) Close any gaps on Indexing (bloom index, col stats, agg_stats, record index with support for non-unique keys,

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7548: Assignee: Vinoth Chandar > Close any gaps on Indexing (bloom index, col stats, agg_stats,

[jira] [Assigned] (HUDI-7548) Close any gaps on Indexing (bloom index, col stats, agg_stats, record index with support for non-unique keys,

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7548: Assignee: Sagar Sumit (was: Vinoth Chandar) > Close any gaps on Indexing (bloom index, co

[jira] [Assigned] (HUDI-6768) Revisit HoodieRecord design and how it affects e2e row writing

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-6768: Assignee: Ethan Guo > Revisit HoodieRecord design and how it affects e2e row writing >

[jira] [Updated] (HUDI-1698) Multiwriting for Flink / Java

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1698: - Status: In Progress (was: Open) > Multiwriting for Flink / Java > - >

[jira] [Assigned] (HUDI-6596) Propose rollback implementation changes to guard against concurrent jobs

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-6596: Assignee: Ethan Guo > Propose rollback implementation changes to guard against concurrent

[jira] [Assigned] (HUDI-7007) Integrate functional index using bloom filter on reader side

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7007: Assignee: Vinaykumar Bhat (was: Sagar Sumit) > Integrate functional index using bloom filt

[jira] [Assigned] (HUDI-7480) initializeFunctionalIndexPartition is called multiple times

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7480: Assignee: Vinaykumar Bhat (was: Sagar Sumit) > initializeFunctionalIndexPartition is calle

[jira] [Assigned] (HUDI-7117) Functional index creation not working when table is created using datasource writer

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7117: Assignee: Vinaykumar Bhat (was: Sagar Sumit) > Functional index creation not working when

[jira] [Assigned] (HUDI-7007) Integrate functional index using bloom filter on reader side

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7007: Assignee: Sagar Sumit > Integrate functional index using bloom filter on reader side >

[jira] [Updated] (HUDI-7420) Parallelize the process of constructing `logFilesMarkerPath` in CommitMetadatautils#reconcileMetadataForMissingFiles

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7420: - Story Points: 1 > Parallelize the process of constructing `logFilesMarkerPath` in > CommitMetadat

[jira] [Updated] (HUDI-7518) Fix HoodieMetadataPayload merging logic around repeated deletes

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7518: - Story Points: 1 > Fix HoodieMetadataPayload merging logic around repeated deletes > --

[jira] [Updated] (HUDI-7007) Integrate functional index using bloom filter on reader side

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7007: - Story Points: 4 > Integrate functional index using bloom filter on reader side > -

[jira] [Updated] (HUDI-7480) initializeFunctionalIndexPartition is called multiple times

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7480: - Story Points: 2 > initializeFunctionalIndexPartition is called multiple times > --

[jira] [Assigned] (HUDI-7420) Parallelize the process of constructing `logFilesMarkerPath` in CommitMetadatautils#reconcileMetadataForMissingFiles

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-7420: Assignee: Sagar Sumit > Parallelize the process of constructing `logFilesMarkerPath` in >

[jira] [Updated] (HUDI-7527) Include instants outside commits and compaction for generating the latest instant and timeline hash for timeline server requests

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7527: - Sprint: Sprint 2024-03-25 > Include instants outside commits and compaction for generating the lat

[jira] [Updated] (HUDI-7531) Consider pending clustering when scheduling a new clustering plan

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7531: - Sprint: Sprint 2024-03-25 > Consider pending clustering when scheduling a new clustering plan > --

[jira] [Updated] (HUDI-7510) Loosen the compaction scheduling and rollback check for MDT

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7510: - Sprint: Sprint 2024-03-25 > Loosen the compaction scheduling and rollback check for MDT >

[jira] [Updated] (HUDI-7457) Remove runtime shutdown hook from HoodieLogFormatWriter

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7457: - Sprint: Sprint 2024-03-25 > Remove runtime shutdown hook from HoodieLogFormatWriter >

[jira] [Updated] (HUDI-7335) Create hudi-hadoop-common for hadoop-specific implementation

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7335: - Sprint: Sprint 2024-03-25 > Create hudi-hadoop-common for hadoop-specific implementation > ---

[jira] [Updated] (HUDI-7343) Replace Path.SEPARATOR with HoodieLocation.SEPARATOR

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7343: - Sprint: Sprint 2024-03-25 > Replace Path.SEPARATOR with HoodieLocation.SEPARATOR > ---

[jira] [Updated] (HUDI-7203) Reuse Hudi CompressionCodec enums

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7203: - Sprint: Sprint 2024-03-25 > Reuse Hudi CompressionCodec enums > -

[jira] [Updated] (HUDI-7157) Support filter pushdown for positional merging in Spark 3.5

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7157: - Sprint: Sprint 2024-03-25 > Support filter pushdown for positional merging in Spark 3.5 >

[jira] [Updated] (HUDI-7545) Concurrency control for LSM timeline management and writing.

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7545: - Sprint: Sprint 2024-03-25 > Concurrency control for LSM timeline management and writing. > ---

[jira] [Updated] (HUDI-7227) Enable completion time for File Group Reader

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7227: - Sprint: Sprint 2024-03-25 > Enable completion time for File Group Reader > ---

[jira] [Updated] (HUDI-7028) Fix Spark Quick Start

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7028: - Sprint: Sprint 2024-03-25 > Fix Spark Quick Start > - > > Key:

[jira] [Updated] (HUDI-7484) Fix partitioning style when partition is inferred from partitionBy

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7484: - Sprint: Sprint 2024-03-25 > Fix partitioning style when partition is inferred from partitionBy > -

[jira] [Updated] (HUDI-7342) Use BaseFileUtils to hide format-specific logic in HoodiePartitionMetadata

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7342: - Sprint: Sprint 2024-03-25 > Use BaseFileUtils to hide format-specific logic in HoodiePartitionMeta

[jira] [Updated] (HUDI-7344) Use Java [Input/Output]Stream instead of FSData[Input/Output]Stream when possible

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7344: - Sprint: Sprint 2024-03-25 > Use Java [Input/Output]Stream instead of FSData[Input/Output]Stream wh

[jira] [Updated] (HUDI-7221) Move Hudi Option class from hudi-common to hudi-io module

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7221: - Sprint: Sprint 2024-03-25 > Move Hudi Option class from hudi-common to hudi-io module > --

[jira] [Updated] (HUDI-7346) Remove usage of org.apache.hadoop.hbase.util.Bytes

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7346: - Sprint: Sprint 2024-03-25 > Remove usage of org.apache.hadoop.hbase.util.Bytes > -

[jira] [Updated] (HUDI-7202) Consolidate IO util methods

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7202: - Sprint: Sprint 2024-03-25 > Consolidate IO util methods > --- > >

[jira] [Updated] (HUDI-7075) Fix validation of parquet column projection on HadoopFsRelation in TestParquetColumnProjection

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7075: - Sprint: Sprint 2024-03-25 > Fix validation of parquet column projection on HadoopFsRelation in >

[jira] [Updated] (HUDI-7345) Remove usage of org.apache.hadoop.util.VersionUtil

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7345: - Sprint: Sprint 2024-03-25 > Remove usage of org.apache.hadoop.util.VersionUtil > -

[jira] [Updated] (HUDI-7217) Implement sequential read, full-key lookup, and prefix lookup in new HFile reader

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7217: - Sprint: Sprint 2024-03-25 > Implement sequential read, full-key lookup, and prefix lookup in new H

[jira] [Updated] (HUDI-7065) Fix the new file group reader with COW in Spark integration

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7065: - Sprint: Sprint 2024-03-25 > Fix the new file group reader with COW in Spark integration >

[jira] [Updated] (HUDI-6798) Implement event-time-based merging mode in FileGroupReader

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6798: - Sprint: Sprint 2024-03-25 > Implement event-time-based merging mode in FileGroupReader > -

[jira] [Updated] (HUDI-7269) Fallback to key-based merging if there is no positions in log header

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7269: - Sprint: Sprint 2024-03-25 > Fallback to key-based merging if there is no positions in log header >

[jira] [Updated] (HUDI-7544) Harden, Stress and Performance test the LSM timeline on cloud storage

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7544: - Sprint: Sprint 2024-03-25 > Harden, Stress and Performance test the LSM timeline on cloud storage

[jira] [Updated] (HUDI-6787) Integrate FileGroupReader with HoodieMergeOnReadSnapshotReader and RealtimeCompactedRecordReader for Hive

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6787: - Sprint: Sprint 2024-03-25 > Integrate FileGroupReader with HoodieMergeOnReadSnapshotReader and >

[jira] [Updated] (HUDI-7543) Implement CDC query support (MoR/CoW) for Spark on FGReader

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7543: - Sprint: Sprint 2024-03-25 > Implement CDC query support (MoR/CoW) for Spark on FGReader >

[jira] [Updated] (HUDI-7408) LSM tree writer failed with compaction

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7408: - Sprint: Sprint 2024-03-25 > LSM tree writer failed with compaction > -

[jira] [Updated] (HUDI-7497) Add a global timeline mingled with active and archived instants

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7497: - Sprint: Sprint 2024-03-25 > Add a global timeline mingled with active and archived instants >

[jira] [Updated] (HUDI-6699) An indexed global timeline (phase2)

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6699: - Sprint: Sprint 2024-03-25 > An indexed global timeline (phase2) >

[jira] [Updated] (HUDI-6700) Archiving should be time based, not this min-max and not per instant. Lets treat it like a log (Phase 2)

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6700: - Sprint: Sprint 2024-03-25 > Archiving should be time based, not this min-max and not per instant.

[jira] [Assigned] (HUDI-2461) Support lock free multi-writer for metadata table

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-2461: Assignee: Danny Chen (was: Sagar Sumit) > Support lock free multi-writer for metadata tabl

[jira] [Updated] (HUDI-2461) Support lock free multi-writer for metadata table

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2461: - Sprint: Sprint 2024-03-25 > Support lock free multi-writer for metadata table > --

[jira] [Updated] (HUDI-7547) Simplification of archival, savepoint, cleaning interplays

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7547: - Sprint: Sprint 2024-03-25 > Simplification of archival, savepoint, cleaning interplays > -

[jira] [Updated] (HUDI-6802) Use completion time in Spark FileIndex for listing

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6802: - Sprint: Sprint 2024-03-25 > Use completion time in Spark FileIndex for listing > -

[jira] [Updated] (HUDI-7218) Integrate new HFile reader with HoodieHFileReader

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7218: - Sprint: Sprint 2024-03-25 > Integrate new HFile reader with HoodieHFileReader > --

[jira] [Updated] (HUDI-7220) Benchmark new HFile reader

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7220: - Sprint: Sprint 2024-03-25 > Benchmark new HFile reader > -- > >

[jira] [Updated] (HUDI-7045) Fix new file format and reader for schema evolution

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7045: - Sprint: Sprint 2024-03-25 > Fix new file format and reader for schema evolution >

[jira] [Updated] (HUDI-7350) Introduce HoodieIOFactory to abstract the reader and writer implementation

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7350: - Sprint: Sprint 2024-03-25 > Introduce HoodieIOFactory to abstract the reader and writer implementa

[jira] [Updated] (HUDI-6909) Handle `_hoodie_operation` field in the new HoodieFileGroupReader

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6909: - Sprint: Sprint 2024-03-25 > Handle `_hoodie_operation` field in the new HoodieFileGroupReader > --

[jira] [Updated] (HUDI-7045) Fix new file format and reader for schema evolution

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7045: - Status: Patch Available (was: In Progress) > Fix new file format and reader for schema evolution

[jira] [Closed] (HUDI-7357) Introduce generic StorageConfiguration

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar closed HUDI-7357. Resolution: Fixed > Introduce generic StorageConfiguration > --

[jira] [Updated] (HUDI-7219) Implement storage and HFile block cache in the same JVM

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7219: - Sprint: Sprint 2024-03-25 > Implement storage and HFile block cache in the same JVM >

[jira] [Updated] (HUDI-7045) Fix new file format and reader for schema evolution

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7045: - Epic Link: HUDI-6243 > Fix new file format and reader for schema evolution > -

[jira] [Updated] (HUDI-6794) Support completion-time-based file slice in FileGroupReader

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6794: - Sprint: Sprint 2024-03-25 > Support completion-time-based file slice in FileGroupReader >

[jira] [Updated] (HUDI-6791) Integrate FileGroupReader with NewHoodieParquetFileFormat for Spark CDC Query

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6791: - Sprint: Sprint 2024-03-25 > Integrate FileGroupReader with NewHoodieParquetFileFormat for Spark CD

[jira] [Updated] (HUDI-6792) Integrate FileGroupReader with NewHoodieParquetFileFormat for Spark Incremental Query

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6792: - Sprint: Sprint 2024-03-25 > Integrate FileGroupReader with NewHoodieParquetFileFormat for Spark >

[jira] [Updated] (HUDI-2867) Make HoodiePartitionPath optional

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2867: - Epic Link: HUDI-7537 (was: HUDI-6243) > Make HoodiePartitionPath optional > -

[jira] [Updated] (HUDI-7542) Ensure extensibility to time-travel writes

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7542: - Sprint: Sprint 2024-03-25 > Ensure extensibility to time-travel writes > -

[jira] [Updated] (HUDI-7541) Ensure extensibility to new indexes - vectors, search and other formats (CLP, unstructured data)

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7541: - Sprint: Sprint 2024-03-25 > Ensure extensibility to new indexes - vectors, search and other format

[jira] [Updated] (HUDI-7216) Support reading bloom filter block (BLOOM_CHUNK) in HFile reader

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7216: - Sprint: Sprint 2024-03-25 > Support reading bloom filter block (BLOOM_CHUNK) in HFile reader > ---

[jira] [Updated] (HUDI-7229) Enable partial updates for CDC work payload

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7229: - Sprint: Sprint 2024-03-25 > Enable partial updates for CDC work payload >

[jira] [Updated] (HUDI-6712) Implement optimized keyed lookup on parquet files

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6712: - Sprint: Sprint 2024-03-25 > Implement optimized keyed lookup on parquet files > --

[jira] [Updated] (HUDI-7538) Consolidate the CDC Formats (changelog format, RFC-51)

2024-03-25 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-7538: - Sprint: Sprint 2024-03-25 > Consolidate the CDC Formats (changelog format, RFC-51) > -
