[ 
https://issues.apache.org/jira/browse/ARROW-10243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jörn Horstmann updated ARROW-10243:
-----------------------------------
    Description: 
While benchmarking the tpch query I noticed that the physical literal
expression takes up a sizable amount of time. I think the creation of the
corresponding array for numeric literals can be sped up by creating the Buffer
and ArrayData directly without going through a builder. That also allows
skipping the null bitmap for non-null literals.
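
As a rough illustration of the difference, here is a minimal sketch using
plain std types. `PrimitiveColumn`, `literal_via_builder` and `literal_direct`
are hypothetical stand-ins, not the actual arrow Buffer/ArrayData API; the
builder path is only assumed to track one validity bit per appended value.

```rust
/// Hypothetical stand-in for a primitive array: a values buffer plus an
/// optional validity bitmap (`None` means the array has no nulls).
pub struct PrimitiveColumn {
    pub values: Vec<f64>,
    pub validity: Option<Vec<u8>>,
}

/// Builder-style path (assumed behaviour): append one value at a time and
/// set a validity bit for each of them.
pub fn literal_via_builder(value: f64, len: usize) -> PrimitiveColumn {
    let mut values = Vec::new();
    let mut validity = vec![0u8; (len + 7) / 8];
    for i in 0..len {
        values.push(value);
        validity[i / 8] |= 1 << (i % 8);
    }
    PrimitiveColumn { values, validity: Some(validity) }
}

/// Direct path: fill the values buffer in one allocation and skip the
/// bitmap entirely, since a non-null literal never produces nulls.
pub fn literal_direct(value: f64, len: usize) -> PrimitiveColumn {
    PrimitiveColumn { values: vec![value; len], validity: None }
}
```

Both functions produce the same values; the direct one does a single
allocation and no per-element bookkeeping.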

I'm also wondering whether it might be possible to cache the created array. For
queries without a WHERE clause, I'd expect all batches except the last to have
the same length. I'm not sure where to store the cached value, though.
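
One possible shape for such a cache, sketched with std types only. Storing the
cache inside the literal expression itself is an assumption made for this
example (the question of where it should live is still open), and
`CachedLiteral` is a hypothetical name, not an existing DataFusion type.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical literal expression that remembers the last array it built.
pub struct CachedLiteral {
    value: f64,
    cache: Mutex<Option<Arc<Vec<f64>>>>,
}

impl CachedLiteral {
    pub fn new(value: f64) -> Self {
        CachedLiteral { value, cache: Mutex::new(None) }
    }

    /// Returns `len` copies of the literal, reusing the cached array when
    /// the requested batch length matches the previous one.
    pub fn evaluate(&self, len: usize) -> Arc<Vec<f64>> {
        let mut cache = self.cache.lock().unwrap();
        if let Some(array) = &*cache {
            if array.len() == len {
                return Arc::clone(array);
            }
        }
        // Length changed (or first call): build a new array and remember it.
        let array = Arc::new(vec![self.value; len]);
        *cache = Some(Arc::clone(&array));
        array
    }
}
```

With equally sized batches only the first call allocates; a shorter final
batch simply replaces the cached entry.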

Another possible optimization would be to cast literals already in the logical
plan. In the tpch query the literal `1` has type `u64` in the logical plan and
then has to go through a cast kernel to be converted to `f64` for use in an
arithmetic expression.
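
The idea can be sketched as a small rewrite pass over a toy expression tree.
The `Expr`, `Scalar` and `DataType` enums below are simplified, hypothetical
types invented for this example and do not mirror the real DataFusion logical
plan; only the `u64` to `f64` case from above is folded.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Scalar {
    UInt64(u64),
    Float64(f64),
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DataType {
    UInt64,
    Float64,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Literal(Scalar),
    Cast { expr: Box<Expr>, to: DataType },
    Minus(Box<Expr>, Box<Expr>),
}

/// Replaces `Cast(Literal(u64), Float64)` with an `f64` literal at planning
/// time, so that no cast kernel has to run for every batch.
pub fn fold_literal_casts(expr: Expr) -> Expr {
    match expr {
        Expr::Cast { expr, to } => match (fold_literal_casts(*expr), to) {
            (Expr::Literal(Scalar::UInt64(v)), DataType::Float64) => {
                Expr::Literal(Scalar::Float64(v as f64))
            }
            // Anything else keeps its cast and is evaluated at runtime.
            (inner, to) => Expr::Cast { expr: Box::new(inner), to },
        },
        Expr::Minus(l, r) => Expr::Minus(
            Box::new(fold_literal_casts(*l)),
            Box::new(fold_literal_casts(*r)),
        ),
        other => other,
    }
}
```

Casts on columns are left untouched, since their values are only known per
batch.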

The attached flamegraph is of 10 runs of tpch, with the data being loaded into 
memory before running the queries (See ARROW-10240).

{code}
flamegraph ./target/release/tpch --iterations 10 --path ../tpch-dbgen --format tbl --query 1 --batch-size 4096 -c1 --load
{code}

> [Rust] [Datafusion] Optimize literal expression evaluation
> ----------------------------------------------------------
>
>                 Key: ARROW-10243
>                 URL: https://issues.apache.org/jira/browse/ARROW-10243
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Rust, Rust - DataFusion
>            Reporter: Jörn Horstmann
>            Priority: Major
>         Attachments: flamegraph.svg
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
