Aitozi opened a new issue, #7880:
URL: https://github.com/apache/paimon/issues/7880

   ### Search before asking
   
   - [x] I searched in the [issues](https://github.com/apache/paimon/issues) 
and found nothing similar.
   
   
   ### Motivation
   
   # Proposal: Logical Partition Coalescing with Metadata-Level Mapping
   
   ## Motivation
   
   Paimon users often define partition keys on columns that are frequently used 
in query predicates. This is natural because users expect partition pruning to 
reduce scan cost:
   
   ```sql
   WHERE dt = '2026-05-17' AND app = 'A'
   ```
   
   However, real workloads often have many fine-grained partition values with 
highly skewed sizes. If every logical partition value is stored as an 
independent physical partition directory, small logical partitions can produce 
many small files and excessive partition directories.
   
   The goal is to allow multiple small logical partitions to be stored under a 
shared physical partition directory, while preserving the original logical 
partition semantics and partition pruning capability.
   
   ## Core Idea
   
   Decouple two concepts that Paimon currently treats as the same:
   
   - **Logical partition value**: the value users query and reason about, for 
example `app = 'A'`.
   - **Physical partition value**: the value used in the storage directory 
layout, for example `app = '[#small]'`.
   
   Example:
   
   ```text
   Logical partitions:
     app = A
     app = B
     app = C
   
   Physical partition:
     app = [#small]
   ```
   
   Users should still query with logical values:
   
   ```sql
   WHERE app = 'A'
   ```
   
   Paimon should be able to prune files or physical partitions using metadata 
that records which logical partition values are contained in each physical 
partition or file.
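
The write side of this idea can be sketched as follows. This is a minimal illustration, not Paimon code: the function names `physical_value` and `route_rows`, the dict-based rows, and the `BIG` partition value are all hypothetical. It routes each row to a physical partition and records which logical values actually landed there.

```python
from collections import defaultdict

SMALL_BUCKET = "[#small]"  # physical value taken from the example above


def physical_value(logical_value, coalesced):
    """Map a logical partition value to the value used in the directory layout."""
    return SMALL_BUCKET if logical_value in coalesced else logical_value


def route_rows(rows, coalesced):
    """Group rows by physical partition and record the logical values written."""
    rows_by_physical = defaultdict(list)
    logical_by_physical = defaultdict(set)
    for row in rows:
        phys = physical_value(row["app"], coalesced)
        rows_by_physical[phys].append(row)
        # This set is what the proposed metadata would persist per file/partition.
        logical_by_physical[phys].add(row["app"])
    return dict(rows_by_physical), dict(logical_by_physical)


# Hypothetical input: A, B, C are small and coalesced; BIG keeps its own directory.
rows = [
    {"app": "A", "v": 1},
    {"app": "B", "v": 2},
    {"app": "C", "v": 3},
    {"app": "BIG", "v": 4},
]
rows_by_physical, logical_by_physical = route_rows(rows, coalesced={"A", "B", "C"})
```

Note that each row keeps its original `app` value; only the directory it is written to changes.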
   
   ## Why Metadata Support Is Required
   
   This cannot be implemented safely by only keeping mapping rules in table 
options.
   
   Rules may change over time:
   
   ```text
   Old rule:
     A, B, C -> [#small]
   
   New rule:
     A, D, E -> [#small]
   ```
   
   If pruning depends only on the current rule, files written under an earlier 
rule may be pruned incorrectly (dropping rows from the result) or scanned 
unnecessarily. Therefore, Paimon needs snapshot-aware metadata describing the 
actual logical partition values written into each file or physical partition.
   
   This metadata is required for correctness, not only for optimization.
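
The hazard can be reproduced with a small sketch that uses the two rules above. The helper names and the dict representation of a file are hypothetical; the point is only to contrast rule-based pruning with pruning based on recorded values.

```python
SMALL = "[#small]"
old_rule = {"A", "B", "C"}  # values coalesced when the file was written
new_rule = {"A", "D", "E"}  # values coalesced at query time

# A file written under old_rule; it lives under app=[#small].
old_file = {"physical_app": SMALL, "written_logical_apps": {"A", "B", "C"}}


def keep_by_current_rule(file, query_app, current_rule):
    """Rule-only pruning: trust the current rule to locate the logical value."""
    expected_physical = SMALL if query_app in current_rule else query_app
    return file["physical_app"] == expected_physical


def keep_by_recorded_values(file, query_app):
    """Metadata-based pruning: trust what was actually written."""
    return query_app in file["written_logical_apps"]


# app = 'B': the new rule no longer coalesces B, so rule-only pruning looks
# for a directory app=B and drops the [#small] file, losing B's rows.
wrongly_pruned = not keep_by_current_rule(old_file, "B", new_rule)

# app = 'D': the new rule maps D to [#small], so the old file is scanned
# even though it contains no D rows.
needlessly_scanned = keep_by_current_rule(old_file, "D", new_rule)
```

With recorded values, the same file is kept for `app = 'B'` and pruned for `app = 'D'`, regardless of which rule is active at query time.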
   
   ## Proposed Metadata Change
   
   Add optional logical partition mapping metadata to Paimon’s manifest layer.
   
   Conceptually:
   
   ```text
   LogicalPartitionValues:
     complete: boolean
     logicalColumns: List<String>
     logicalTuples: List<BinaryRow>
   ```
   
   Each manifest entry can describe:
   
   ```text
   physical partition:
     dt = 2026-05-17
     app = [#small]
   
   logical partition values:
     columns = [app]
     tuples = [A, B, C]
     complete = true
   ```
   
   If the metadata is missing or incomplete, readers must conservatively keep 
the file during pruning.
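
As a rough model of the structure above, the following sketch uses plain dataclasses. The class `ManifestEntry`, the helper `usable_for_pruning`, the file names, and the use of string tuples in place of `BinaryRow` are assumptions for illustration; the real manifest schema and serialization would differ.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class LogicalPartitionValues:
    complete: bool                               # False if the list may be partial
    logical_columns: Tuple[str, ...]             # e.g. ("app",)
    logical_tuples: Tuple[Tuple[str, ...], ...]  # stand-in for List<BinaryRow>


@dataclass(frozen=True)
class ManifestEntry:
    file_name: str
    physical_partition: dict
    # None models an entry written without this metadata.
    logical_partition_values: Optional[LogicalPartitionValues] = None


def usable_for_pruning(entry: ManifestEntry) -> bool:
    """Metadata may only be used to prune when it is present and complete."""
    meta = entry.logical_partition_values
    return meta is not None and meta.complete


physical = {"dt": "2026-05-17", "app": "[#small]"}

entry = ManifestEntry(
    "data-0.parquet",  # hypothetical file name
    physical,
    LogicalPartitionValues(True, ("app",), (("A",), ("B",), ("C",))),
)
legacy_entry = ManifestEntry("data-old.parquet", physical)
partial_entry = ManifestEntry(
    "data-1.parquet",
    physical,
    LogicalPartitionValues(False, ("app",), (("A",),)),
)
```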
   
   ## Why Manifest-Level Metadata
   
   Manifest-level metadata is a good first design point because:
   
   - It is managed by Paimon itself, not by external Hive Metastore partition 
parameters.
   - It is snapshot-aware and naturally follows Paimon’s commit model.
   - It can survive rule changes because it records actual written logical 
values.
   - It supports file-level pruning, which is more precise than only physical 
partition-level metadata.
   - It keeps compatibility simple: old manifest entries can have `null` 
metadata and fall back to conservative scanning.
   
   ## Query Pruning Semantics
   
   Predicate handling should distinguish:
   
   ```text
   normal partition predicates
   logical partition predicates
   residual data predicates
   ```
   
   For a query like:
   
   ```sql
   WHERE dt = '2026-05-17' AND app = 'A'
   ```
   
   Paimon can:
   
   1. Use `dt = '2026-05-17'` for normal physical partition pruning.
   2. Use logical partition metadata to check whether a file under `app = 
'[#small]'` may contain `app = 'A'`.
   3. Keep `app = 'A'` as a residual predicate to guarantee correctness.
   
   Pruning rule:
   
   ```text
   if logical metadata is missing or incomplete:
       keep file
   else if any logical tuple may match the predicate:
       keep file
   else:
       prune file
   ```
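
The three steps and the pruning rule above can be sketched together for equality predicates. Everything here is illustrative: the dict-based entries, the function names, and the assumption that the planner has already split predicates into normal and logical ones are not part of any existing Paimon API.

```python
def keep_file(entry, normal_equals, logical_equals):
    """Return True if the file must be scanned, False if it can be pruned."""
    # Step 1: normal physical partition pruning (e.g. dt).
    for col, val in normal_equals.items():
        if entry["physical_partition"].get(col) != val:
            return False

    # Step 2: logical partition pruning using recorded values.
    meta = entry.get("logical_partition_values")
    if meta is None or not meta["complete"]:
        return True  # unknown contents: keep conservatively
    for tup in meta["tuples"]:
        values = dict(zip(meta["columns"], tup))
        # A predicate on a column that was not recorded cannot exclude the tuple.
        if all(values.get(c, v) == v for c, v in logical_equals.items()):
            return True
    return False


def residual_filter(rows, equals):
    """Step 3: re-apply the logical predicate to rows read from kept files."""
    return [r for r in rows if all(r.get(c) == v for c, v in equals.items())]


def make_entry(dt, meta):
    return {
        "physical_partition": {"dt": dt, "app": "[#small]"},
        "logical_partition_values": meta,
    }


abc = {"complete": True, "columns": ["app"], "tuples": [("A",), ("B",), ("C",)]}
de = {"complete": True, "columns": ["app"], "tuples": [("D",), ("E",)]}
partial = {"complete": False, "columns": ["app"], "tuples": [("D",)]}

normal = {"dt": "2026-05-17"}
logical = {"app": "A"}

decisions = [
    keep_file(make_entry("2026-05-17", abc), normal, logical),      # contains A
    keep_file(make_entry("2026-05-17", de), normal, logical),       # no A recorded
    keep_file(make_entry("2026-05-17", None), normal, logical),     # no metadata
    keep_file(make_entry("2026-05-17", partial), normal, logical),  # incomplete
    keep_file(make_entry("2026-05-16", abc), normal, logical),      # wrong dt
]

rows = [{"app": "A", "v": 1}, {"app": "B", "v": 2}]
matching_rows = residual_filter(rows, logical)
```

The residual filter is what keeps results correct when a kept file mixes several logical values: the file with `A`, `B`, and `C` is scanned, but only `A` rows are returned.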
   
   ## Compatibility
   
   The metadata field should be optional and nullable.
   
   For old manifests or tables without this feature:
   
   ```text
   logicalPartitionValues = null
   ```
   
   Readers must treat this as unknown and scan conservatively.
   
   This preserves compatibility while allowing new writers to provide richer 
metadata for better pruning.
   
   ## Summary
   
   This proposal introduces logical partition coalescing by separating logical 
partition semantics from physical partition layout. The key metadata change is 
to record logical partition values in Paimon manifests, so that multiple 
logical partitions can share one physical directory without losing partition 
pruning correctness.
   
   ### Solution
   
   _No response_
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [x] I'm willing to submit a PR!

