ravlio commented on issue #9011:
URL:
https://github.com/apache/arrow-datafusion/issues/9011#issuecomment-1913588557
> The original plan you show has a TableScan at the top -- is this a
> projection? Or is it a view definition somehow?
No. `str_0` is an extra column that I wanted to get rid of. My actual
logical plan is:
```
logical plan: TableScan: ?table? projection=[project_id, user_id,
created_at, event_id, event, str_0, str_1, str_2, str_3, str_4, str_5, str_6,
str_7, str_8, str_9, str_10, str_11, str_12, str_13, str_14, str_15, str_16,
str_17, str_18, str_19, str_20, str_21, str_22, str_23, str_24, ts_0, str_25,
str_26, str_27, str_28, d_0, d_1, d_2, d_3, i8_0, i8_1, d_4]
Pivot
Sort: date_trunc(Utf8("day"), created_at) AS created_at ASC NULLS LAST,
str_20 ASC NULLS LAST
Unpivot
PartitionedAggregate: , agg: Count { filter: None, groups:
Some([(Alias(Alias { expr: ScalarFunction(ScalarFunction { func_def:
BuiltIn(DateTrunc), args: [Literal(Utf8("day")), Column(Column { relation:
None, name: "created_at" })] }), relation: None, name: "created_at" }),
SortField { data_type: Timestamp(Nanosecond, None) }), (Column(Column {
relation: None, name: "str_20" }), SortField { data_type: Utf8 })]), predicate:
Column { relation: None, name: "event" }, partition_col: Column { relation:
None, name: "user_id" }, distinct: false } as "0_0"
Filter: project_id = Int64(1) AND created_at >=
TimestampNanosecond(1705582297292025000, None) AND created_at <=
TimestampNanosecond(1706446297292025000, None) AND event = UInt16(13)
Sort: project_id ASC NULLS LAST, user_id ASC NULLS LAST
Repartition: Hash(project_id, user_id) partition_count=12
Projection: project_id, user_id, created_at, event, str_20
<--- explicit projection
TableScan: ?table? projection=[project_id, user_id,
created_at, event_id, event, str_0, str_1, str_2, str_3, str_4, str_5, str_6,
str_7, str_8, str_9, str_10, str_11, str_12, str_13, str_14, str_15, str_16,
str_17, str_18, str_19, str_20, str_21, str_22, str_23, str_24, ts_0, str_25,
str_26, str_27, str_28, d_0, d_1, d_2, d_3, i8_0, i8_1, d_4]
```
As you can see, the scan reads a lot of extra columns that I can't get rid
of. Even the explicit projection is not pushed down into the `TableScan`. I
wanted only `project_id, user_id, created_at, event, str_20`, but I got every
column. Also, the sort node has been removed for some reason, even though the
data is not sorted.
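
To make the expected behavior concrete, here is a minimal sketch of projection pushdown on a toy plan tree. It is only an illustration: `Plan`, `push_down`, `scan_columns`, and `cols` are hypothetical names and not DataFusion's API, and the toy plan models only scan, filter, and projection nodes.

```rust
use std::collections::BTreeSet;

/// Toy logical plan (hypothetical; NOT DataFusion's `LogicalPlan`).
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Reads the listed columns from a table.
    Scan { projection: Vec<String> },
    /// Filters rows; `columns` are the columns the predicate references.
    Filter { columns: Vec<String>, input: Box<Plan> },
    /// Keeps only the listed columns.
    Projection { columns: Vec<String>, input: Box<Plan> },
}

/// Helper to build an owned column list from string literals.
fn cols(names: &[&str]) -> Vec<String> {
    names.iter().map(|n| n.to_string()).collect()
}

/// Rewrites `plan` so that every scan reads only the columns in `required`,
/// where `required` is the set of columns the parent node needs.
fn push_down(plan: Plan, required: &BTreeSet<String>) -> Plan {
    match plan {
        // At the leaf, drop every column nobody above asked for,
        // keeping the scan's original column order.
        Plan::Scan { projection } => Plan::Scan {
            projection: projection
                .into_iter()
                .filter(|c| required.contains(c))
                .collect(),
        },
        // A filter needs its predicate columns in addition to
        // whatever its parent needs.
        Plan::Filter { columns, input } => {
            let mut needed = required.clone();
            needed.extend(columns.iter().cloned());
            Plan::Filter {
                input: Box::new(push_down(*input, &needed)),
                columns,
            }
        }
        // Simplification: a projection requests exactly its own columns
        // from its input, regardless of what its parent needs.
        Plan::Projection { columns, input } => {
            let needed: BTreeSet<String> = columns.iter().cloned().collect();
            Plan::Projection {
                input: Box::new(push_down(*input, &needed)),
                columns,
            }
        }
    }
}

/// Returns the column list of the scan at the bottom of the plan.
fn scan_columns(plan: &Plan) -> &[String] {
    match plan {
        Plan::Scan { projection } => projection.as_slice(),
        Plan::Filter { input, .. } | Plan::Projection { input, .. } => {
            scan_columns(input)
        }
    }
}
```

Applied to a filter over the explicit projection over a wide scan, this is the result I expected from the optimizer: the scan ends up reading only `project_id, user_id, created_at, event, str_20`.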