[
https://issues.apache.org/jira/browse/SPARK-37369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Apache Spark reassigned SPARK-37369:
------------------------------------
Assignee: (was: Apache Spark)
> Avoid redundant ColumnarToRow transition on InMemoryTableScan
> -------------------------------------------------------------
>
> Key: SPARK-37369
> URL: https://issues.apache.org/jira/browse/SPARK-37369
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: L. C. Hsieh
> Priority: Major
>
> We have a rule that inserts columnar transitions between row-based and
> columnar query plans. InMemoryTableScanExec can produce columnar output, so
> if its parent plan isn't columnar, the rule adds a ColumnarToRow between them.
> But InMemoryTableScanExec is a special query plan because it can convert a
> cached batch into either a columnar batch or rows.
> In such a case, we ask InMemoryTableScanExec to convert the cached batch to a
> columnar batch, and the added ColumnarToRow then converts that batch to rows
> before the parent plan consumes it.
> So in such a case, we can simply ask InMemoryTableScanExec to produce row
> output directly and skip the redundant conversion.
> ```
> +- Union
>    :- ColumnarToRow
>    :  +- InMemoryTableScan [i#8, j#9]
>    :        +- InMemoryRelation [i#8, j#9], StorageLevel(disk, memory, deserialized, 1 replicas)
> ```
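To make the proposal concrete, here is a minimal sketch of the idea in plain Python. It is a toy model, not Spark code: the `Plan` class, both functions, and the `LocalTableScan` second branch of the Union are hypothetical names invented for illustration, and the real change would live in Spark's Scala planner rules.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Plan:
    """Toy physical plan node (hypothetical; not Spark's SparkPlan)."""
    name: str
    children: Tuple["Plan", ...] = ()
    columnar_output: bool = False  # True if the node emits columnar batches


def insert_transitions(plan: Plan) -> Plan:
    """Wrap each columnar child of a row-based parent in ColumnarToRow."""
    new_children = []
    for child in plan.children:
        child = insert_transitions(child)
        if child.columnar_output and not plan.columnar_output:
            child = Plan("ColumnarToRow", (child,))
        new_children.append(child)
    return replace(plan, children=tuple(new_children))


def drop_redundant_transitions(plan: Plan) -> Plan:
    """Rewrite ColumnarToRow(InMemoryTableScan) into a row-producing scan."""
    if (
        plan.name == "ColumnarToRow"
        and len(plan.children) == 1
        and plan.children[0].name == "InMemoryTableScan"
    ):
        # The scan can emit rows itself, so the transition node is dropped.
        return replace(plan.children[0], columnar_output=False)
    return replace(
        plan,
        children=tuple(drop_redundant_transitions(c) for c in plan.children),
    )


def names(plan: Plan) -> List[str]:
    """Pre-order list of node names, for easy inspection."""
    return [plan.name] + [n for c in plan.children for n in names(c)]


# The scan is modeled as a leaf; the second Union branch is an assumption.
scan = Plan("InMemoryTableScan", columnar_output=True)
union = Plan("Union", (scan, Plan("LocalTableScan")))

with_transitions = insert_transitions(union)
optimized = drop_redundant_transitions(with_transitions)

print(names(with_transitions))
# ['Union', 'ColumnarToRow', 'InMemoryTableScan', 'LocalTableScan']
print(names(optimized))
# ['Union', 'InMemoryTableScan', 'LocalTableScan']
```

The rewrite only fires when ColumnarToRow sits directly on top of the scan, which matches the case described in the issue; other columnar nodes under a row-based parent would keep their transition.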