gruuya commented on code in PR #7981:
URL: https://github.com/apache/arrow-datafusion/pull/7981#discussion_r1377481562


##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -2132,9 +2181,96 @@ pub struct Limit {
 
 /// Removes duplicate rows from the input
 #[derive(Clone, PartialEq, Eq, Hash)]
-pub struct Distinct {
+pub enum Distinct {
+    /// Plain `DISTINCT` referencing all selection expressions
+    All(Arc<LogicalPlan>),
+    /// The `Postgres` addition, allowing separate control over DISTINCT'd and 
selected columns
+    On(DistinctOn),
+}
+
+/// Removes duplicate rows from the input
+#[derive(Clone, PartialEq, Eq, Hash)]
+pub struct DistinctOn {
+    /// The `DISTINCT ON` clause expression list
+    pub on_expr: Vec<Expr>,
+    /// The selected projection expression list
+    pub select_expr: Vec<Expr>,
+    /// The `ORDER BY` clause, whose initial expressions must match those of 
the `ON` clause
+    pub sort_expr: Option<Vec<Expr>>,
     /// The logical plan that is being DISTINCT'd
     pub input: Arc<LogicalPlan>,
+    /// The schema description of the DISTINCT ON output
+    pub schema: DFSchemaRef,
+}
+
+impl DistinctOn {
+    /// Create a new `DistintOn` struct.
+    pub fn try_new(
+        on_expr: Vec<Expr>,
+        select_expr: Vec<Expr>,
+        sort_expr: Option<Vec<Expr>>,
+        input: Arc<LogicalPlan>,
+    ) -> Result<Self> {
+        let on_expr = normalize_cols(on_expr, input.as_ref())?;
+
+        // Create fields with any qualifier stuffed in the name itself

Review Comment:
   This is sort of a hack, but the reason I was forced to do it is so that the 
plan schema remains unchanged after `replace_distinct_aggregate`. In 
particular, in that optimizer I alias the final projected expressions with the 
selection expression display name, but that always results in a schema without 
qualifiers (with any original qualifier crammed into the field name).
   
   I feel like this case should be handled by `Projection::try_new_with_schema` 
over at `replace_distinct_aggregate`, but it seems that after #7919 the custom 
schema doesn't propagate through the optimization chain.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to