[ https://issues.apache.org/jira/browse/ARROW-10243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jörn Horstmann reassigned ARROW-10243:
--------------------------------------

    Assignee: Jörn Horstmann

> [Rust] [Datafusion] Optimize literal expression evaluation
> ----------------------------------------------------------
>
>                 Key: ARROW-10243
>                 URL: https://issues.apache.org/jira/browse/ARROW-10243
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Rust, Rust - DataFusion
>            Reporter: Jörn Horstmann
>            Assignee: Jörn Horstmann
>            Priority: Major
>         Attachments: flamegraph.svg
>
>
> While benchmarking the tpch query I noticed that evaluating the physical
> literal expression takes up a sizable amount of time. I think the creation
> of the corresponding array for numeric literals can be sped up by creating
> Buffer and ArrayData directly without going through a builder. That also
> allows skipping the null bitmap for non-null literals.
> I'm also wondering whether it might be possible to cache the created array.
> For queries without a WHERE clause, I'd expect all batches except the last
> to have the same length. I'm not sure, though, where to store the cached
> value.
> Another possible optimization could be to cast literals on the logical plan
> side already. In the tpch query the literal `1` is of type `u64` in the
> logical plan and then needs to be processed by a cast kernel to convert it
> to `f64` for use in an arithmetic expression.
> The attached flamegraph covers 10 runs of tpch, with the data loaded into
> memory before running the queries (see ARROW-10240).
> {code}
> flamegraph ./target/release/tpch --iterations 10 --path ../tpch-dbgen --format tbl --query 1 --batch-size 4096 -c1 --load
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
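
To make the two ideas in the description concrete (building the literal array directly without a null bitmap, and caching it by batch length), here is a minimal std-only sketch. It does not use the Arrow or DataFusion APIs: `Float64Column`, `literal_via_builder`, `literal_direct` and `CachedLiteral` are hypothetical stand-ins invented for illustration, and storing the cache inside the expression itself is only one possible answer to the open question of where the cached value should live.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a primitive array: a values buffer plus an
/// optional validity bitmap (`None` means the column has no nulls).
#[derive(Debug, PartialEq)]
struct Float64Column {
    values: Vec<f64>,
    validity: Option<Vec<u8>>,
}

/// Builder-style path: append one value at a time and set a validity bit
/// for each of them, even though a non-null literal never needs the bitmap.
fn literal_via_builder(value: f64, len: usize) -> Float64Column {
    let mut values = Vec::new();
    let mut validity = vec![0u8; (len + 7) / 8];
    for i in 0..len {
        values.push(value);
        validity[i / 8] |= 1 << (i % 8);
    }
    Float64Column { values, validity: Some(validity) }
}

/// Direct path: fill the values buffer in one step and skip the bitmap.
fn literal_direct(value: f64, len: usize) -> Float64Column {
    Float64Column { values: vec![value; len], validity: None }
}

/// Hypothetical literal expression that remembers the last array it built.
struct CachedLiteral {
    value: f64,
    cached: Option<Arc<Float64Column>>,
}

impl CachedLiteral {
    fn new(value: f64) -> Self {
        Self { value, cached: None }
    }

    /// Reuse the cached array when the batch length matches; rebuild it
    /// otherwise (typically only for the final, shorter batch).
    fn evaluate(&mut self, batch_len: usize) -> Arc<Float64Column> {
        if let Some(col) = &self.cached {
            if col.values.len() == batch_len {
                return Arc::clone(col);
            }
        }
        let col = Arc::new(literal_direct(self.value, batch_len));
        self.cached = Some(Arc::clone(&col));
        col
    }
}

fn main() {
    // Both paths produce the same values; only the direct one omits the bitmap.
    let built = literal_via_builder(1.0, 8);
    let direct = literal_direct(1.0, 8);
    println!("same values: {}", built.values == direct.values);
    println!("direct has bitmap: {}", direct.validity.is_some());

    // Two full batches share one allocation; the short last batch does not.
    let mut lit = CachedLiteral::new(1.0);
    let first = lit.evaluate(4096);
    let second = lit.evaluate(4096);
    let last = lit.evaluate(100);
    println!("reused for equal lengths: {}", Arc::ptr_eq(&first, &second));
    println!("last batch length: {}", last.values.len());
}
```

In this sketch the cache holds a single entry keyed by length, which matches the expectation that all batches except the last have the same size. A query with a WHERE clause would produce varying batch lengths and get little benefit from it.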