[
https://issues.apache.org/jira/browse/ARROW-9678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andy Grove resolved ARROW-9678.
-------------------------------
Fix Version/s: 2.0.0
Resolution: Fixed
Issue resolved by pull request 7919
[https://github.com/apache/arrow/pull/7919]
> [Rust] [DataFusion] Improve projection push down to remove unused columns
> -------------------------------------------------------------------------
>
> Key: ARROW-9678
> URL: https://issues.apache.org/jira/browse/ARROW-9678
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Rust, Rust - DataFusion
> Reporter: Jorge
> Assignee: Jorge
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.0.0
>
> Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> Currently, the projection push-down only removes columns that are never
> referenced anywhere in the plan. However, a projection sometimes declares
> columns that are never used by the nodes above it.
> This issue is about improving the projection push-down to remove any column
> that is not logically required by the plan.
> A failing unit test that illustrates the idea:
> {code:java}
> #[test]
> fn table_unused_column() -> Result<()> {
>     let table_scan = test_table_scan()?;
>     assert_eq!(3, table_scan.schema().fields().len());
>     assert_fields_eq(&table_scan, vec!["a", "b", "c"]);
>     // we never use "b" in the first projection => remove it
>     let plan = LogicalPlanBuilder::from(&table_scan)
>         .project(vec![col("c"), col("a"), col("b")])?
>         .filter(col("c").gt(&lit(1)))?
>         .project(vec![col("c"), col("a")])?
>         .build()?;
>     assert_fields_eq(&plan, vec!["c", "a"]);
>     let expected = "\
>     Projection: #c, #a\
>     \n  Selection: #c Gt Int32(1)\
>     \n    Projection: #c, #a\
>     \n      TableScan: test projection=Some([0, 2])";
>     assert_optimized_plan_eq(&plan, expected);
>     Ok(())
> }
> {code}
> This issue was first identified by [~andygrove]
> [here|https://github.com/ballista-compute/ballista/issues/320].
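
The pruning described in the issue can be sketched outside DataFusion as a top-down pass that carries the set of columns required by the nodes above. This is only an illustrative sketch under stated assumptions: `Plan`, `prune`, `optimize`, `names` and `example` are hypothetical names invented for this example, and none of it is DataFusion's `LogicalPlan` API or the implementation in pull request 7919. The example plan mirrors the unit test above.

```rust
use std::collections::BTreeSet;

/// Toy logical plan (hypothetical; not DataFusion's `LogicalPlan`).
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Table scan; `projection` holds the indices of the columns to read.
    Scan { columns: Vec<String>, projection: Option<Vec<usize>> },
    /// Keeps rows where `column > value`.
    Filter { column: String, value: i32, input: Box<Plan> },
    /// Outputs the named columns, in order.
    Project { columns: Vec<String>, input: Box<Plan> },
}

/// Columns produced by a node (ignores any scan projection, for simplicity).
fn output_columns(plan: &Plan) -> BTreeSet<String> {
    match plan {
        Plan::Project { columns, .. } | Plan::Scan { columns, .. } => {
            columns.iter().cloned().collect()
        }
        Plan::Filter { input, .. } => output_columns(input),
    }
}

/// Rewrites `plan` so that each node only produces columns in `required`.
fn prune(plan: &Plan, required: &BTreeSet<String>) -> Plan {
    match plan {
        Plan::Project { columns, input } => {
            // Drop declared columns that no node above this one needs.
            let kept: Vec<String> =
                columns.iter().filter(|c| required.contains(*c)).cloned().collect();
            let child_required: BTreeSet<String> = kept.iter().cloned().collect();
            Plan::Project { columns: kept, input: Box::new(prune(input, &child_required)) }
        }
        Plan::Filter { column, value, input } => {
            // The filter needs its own column in addition to what its parents need.
            let mut child_required = required.clone();
            child_required.insert(column.clone());
            Plan::Filter {
                column: column.clone(),
                value: *value,
                input: Box::new(prune(input, &child_required)),
            }
        }
        Plan::Scan { columns, .. } => {
            // Read only the required columns, identified by index.
            let projection: Vec<usize> = columns
                .iter()
                .enumerate()
                .filter(|(_, c)| required.contains(*c))
                .map(|(i, _)| i)
                .collect();
            Plan::Scan { columns: columns.clone(), projection: Some(projection) }
        }
    }
}

/// Entry point: everything the root node outputs is required.
fn optimize(plan: &Plan) -> Plan {
    prune(plan, &output_columns(plan))
}

/// Helper that turns string literals into owned column names.
fn names(xs: &[&str]) -> Vec<String> {
    xs.iter().map(|s| s.to_string()).collect()
}

/// Same shape as the unit test: project(c, a, b) -> filter(c > 1) -> project(c, a).
fn example() -> Plan {
    let scan = Plan::Scan { columns: names(&["a", "b", "c"]), projection: None };
    let inner = Plan::Project { columns: names(&["c", "a", "b"]), input: Box::new(scan) };
    let filter = Plan::Filter { column: "c".to_string(), value: 1, input: Box::new(inner) };
    Plan::Project { columns: names(&["c", "a"]), input: Box::new(filter) }
}

fn main() {
    // Print the rewritten plan for inspection.
    println!("{:#?}", optimize(&example()));
}
```

In this sketch the requirement set flows from the root towards the scan, so the inner projection loses "b" because nothing above it asks for it, and the scan then reads only the columns that survive. A real optimizer would also have to handle arbitrary expressions, aggregates and joins, which this toy plan deliberately leaves out.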