Dandandan commented on code in PR #22207:
URL: https://github.com/apache/datafusion/pull/22207#discussion_r3249772156


##########
datafusion/physical-expr/src/partitioning.rs:
##########
@@ -133,13 +135,94 @@ impl Display for Partitioning {
                     .join(", ");
                 write!(f, "Hash([{phy_exprs_str}], {size})")
             }
+            Partitioning::Expr(expr) => write!(f, "{expr}"),
             Partitioning::UnknownPartitioning(size) => {
                 write!(f, "UnknownPartitioning({size})")
             }
         }
     }
 }
 
+/// Physical expression partitioning.
+///
+/// Partition `i` contains rows where `partition_exprs[i]` evaluates to true.
+/// The source declaring partitioning is responsible for ensuring that, for 
every
+/// row emitted, exactly one partition expression evaluates to true and that 
row
+/// is emitted by the corresponding partition. The expressions do not need to
+/// cover values that the plan cannot emit.
+///
+/// For example, a scan that can only emit rows for `2022` can declare two date
+/// partitions as:
+///
+/// ```text
+/// partition_exprs[0] = date >= 2022-01-01 AND date < 2022-07-01
+/// partition_exprs[1] = date >= 2022-07-01 AND date < 2023-01-01
+/// ```
+///
+/// This is valid even though values outside `2022` are not covered, as long as
+/// the source does not emit rows outside those ranges. It would not be valid
+/// for this plan to emit a row from `partition[i]` whose date is not within
+/// `partition_exprs[i]`, or to emit a row whose date matches multiple
+/// partition expressions.
+///
+/// More complex partitioning can be represented using normal expression
+/// composition. For example, one partition in a date and city range can be
+/// represented as `date >= 2021-01-01 AND date < 2022-07-01 AND city < 
'Boston'`.
+///
+/// NOTE: Optimizer and execution behavior for this partitioning is 
intentionally
+/// not implemented and will be introduced incrementally.
+#[derive(Debug, Clone)]
+pub struct ExprPartitioning {

Review Comment:
   Oh I see the issue already refers to it as range partititioning. Any reason 
of why not using the terminology here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to