ravlio commented on issue #9011:
URL: 
https://github.com/apache/arrow-datafusion/issues/9011#issuecomment-1913588557

   > The original plan you show has a TableScan at the top -- is this a projection? Or is it a view definition somehow?
   
   No. `str_0` is an extra column that I wanted to get rid of. My actual 
logical plan is:
   
   
   ```
   logical plan: TableScan: ?table? projection=[project_id, user_id, created_at, event_id, event, str_0, str_1, str_2, str_3, str_4, str_5, str_6, str_7, str_8, str_9, str_10, str_11, str_12, str_13, str_14, str_15, str_16, str_17, str_18, str_19, str_20, str_21, str_22, str_23, str_24, ts_0, str_25, str_26, str_27, str_28, d_0, d_1, d_2, d_3, i8_0, i8_1, d_4]
   Pivot
     Sort: date_trunc(Utf8("day"), created_at) AS created_at ASC NULLS LAST, str_20 ASC NULLS LAST
       Unpivot
         PartitionedAggregate: , agg: Count { filter: None, groups: Some([(Alias(Alias { expr: ScalarFunction(ScalarFunction { func_def: BuiltIn(DateTrunc), args: [Literal(Utf8("day")), Column(Column { relation: None, name: "created_at" })] }), relation: None, name: "created_at" }), SortField { data_type: Timestamp(Nanosecond, None) }), (Column(Column { relation: None, name: "str_20" }), SortField { data_type: Utf8 })]), predicate: Column { relation: None, name: "event" }, partition_col: Column { relation: None, name: "user_id" }, distinct: false } as "0_0"
           Filter: project_id = Int64(1) AND created_at >= TimestampNanosecond(1705582297292025000, None) AND created_at <= TimestampNanosecond(1706446297292025000, None) AND event = UInt16(13)
             Sort: project_id ASC NULLS LAST, user_id ASC NULLS LAST
               Repartition: Hash(project_id, user_id) partition_count=12
                 Projection: project_id, user_id, created_at, event, str_20 <--- explicit projection
                   TableScan: ?table? projection=[project_id, user_id, created_at, event_id, event, str_0, str_1, str_2, str_3, str_4, str_5, str_6, str_7, str_8, str_9, str_10, str_11, str_12, str_13, str_14, str_15, str_16, str_17, str_18, str_19, str_20, str_21, str_22, str_23, str_24, ts_0, str_25, str_26, str_27, str_28, d_0, d_1, d_2, d_3, i8_0, i8_1, d_4]
   ```
   
   As you can see, the scan contains a lot of extra columns that I can't get rid of: even the explicit projection is not pushed down into the `TableScan`. I wanted only `project_id, user_id, created_at, event, str_20`, but I got every column. Also, the sort node is gone for some reason, even though the data is unsorted.
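
To make the expectation concrete, here is a minimal sketch of the behaviour I was hoping for. It is a toy model and not DataFusion's optimizer or its `LogicalPlan` type: `Plan`, `push_down`, `scan_columns`, `cols` and `demo` are all hypothetical names, and the table is shortened to eight columns. The sketch only shows the idea that a projection above a scan should narrow the set of columns the scan reads.

```rust
use std::collections::BTreeSet;

/// Toy logical plan (hypothetical; NOT DataFusion's `LogicalPlan`).
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String, projection: Vec<String> },
    Projection { columns: Vec<String>, input: Box<Plan> },
    Filter { columns_used: Vec<String>, input: Box<Plan> },
}

/// Rewrite `plan` so the scan reads only the columns its parents need.
/// `required == None` means "no restriction known yet".
fn push_down(plan: Plan, required: Option<BTreeSet<String>>) -> Plan {
    match plan {
        Plan::Scan { table, projection } => {
            // Keep the table's column order, drop everything not required.
            let projection = match required {
                Some(req) => projection.into_iter().filter(|c| req.contains(c)).collect(),
                None => projection,
            };
            Plan::Scan { table, projection }
        }
        Plan::Projection { columns, input } => {
            // Everything below a projection only needs the projected columns.
            let req: BTreeSet<String> = columns.iter().cloned().collect();
            Plan::Projection { columns, input: Box::new(push_down(*input, Some(req))) }
        }
        Plan::Filter { columns_used, input } => {
            // A filter additionally needs the columns in its predicate.
            let req = required.map(|mut r| {
                r.extend(columns_used.iter().cloned());
                r
            });
            Plan::Filter { columns_used, input: Box::new(push_down(*input, req)) }
        }
    }
}

/// Return the column list of the scan at the bottom of the plan.
fn scan_columns(plan: &Plan) -> &[String] {
    match plan {
        Plan::Scan { projection, .. } => projection.as_slice(),
        Plan::Projection { input, .. } | Plan::Filter { input, .. } => scan_columns(input),
    }
}

/// Helper: build an owned column list from string literals.
fn cols(names: &[&str]) -> Vec<String> {
    names.iter().map(|c| c.to_string()).collect()
}

/// Filter -> Projection -> Scan, mirroring the bottom of the plan above
/// (the table is cut down to eight columns for brevity).
fn demo() -> Vec<String> {
    let plan = Plan::Filter {
        columns_used: cols(&["project_id", "created_at", "event"]),
        input: Box::new(Plan::Projection {
            columns: cols(&["project_id", "user_id", "created_at", "event", "str_20"]),
            input: Box::new(Plan::Scan {
                table: "?table?".to_string(),
                projection: cols(&[
                    "project_id", "user_id", "created_at", "event_id",
                    "event", "str_0", "str_20", "d_0",
                ]),
            }),
        }),
    };
    let optimized = push_down(plan, None);
    scan_columns(&optimized).to_vec()
}
```

Calling `demo()` from a `main` function returns the five columns named in the explicit projection, in table order. That is the scan I expected to see in my real plan instead of the full column list.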


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

