waynexia commented on pull request #792:
URL: https://github.com/apache/arrow-datafusion/pull/792#issuecomment-890549827
This is the result of Q1, run with this command:
> cargo run --release --bin tpch -- benchmark datafusion --iterations 5 --path ./data --format tbl --query 1 --batch-size 4096
I've run it several times and consistently observe a difference of about 100 ms.
I checked the optimized logical plan in the debug log to make sure this
optimization is applied:
>=== Optimized logical plan ===
Sort: #lineitem.l_returnflag ASC NULLS FIRST, #lineitem.l_linestatus ASC NULLS FIRST
  Projection: #lineitem.l_returnflag, #lineitem.l_linestatus, #SUM(lineitem.l_quantity) AS sum_qty, #SUM(lineitem.l_extendedprice) AS sum_base_price, #SUM(lineitem.l_extendedprice Multiply Int64(1) Minus lineitem.l_discount) AS sum_disc_price, #SUM(lineitem.l_extendedprice Multiply Int64(1) Minus lineitem.l_discount Multiply Int64(1) Plus lineitem.l_tax) AS sum_charge, #AVG(lineitem.l_quantity) AS avg_qty, #AVG(lineitem.l_extendedprice) AS avg_price, #AVG(lineitem.l_discount) AS avg_disc, #COUNT(UInt8(1)) AS count_order
    Aggregate: groupBy=[[#lineitem.l_returnflag, #lineitem.l_linestatus]], aggr=[[SUM(#lineitem.l_quantity), SUM(#lineitem.l_extendedprice), SUM(#BinaryExpr-*BinaryExpr--Column-lineitem.l_discountLiteral1Column-lineitem.l_extendedprice AS lineitem.l_extendedprice Multiply Int64(1) Minus lineitem.l_discount), SUM(#BinaryExpr-*BinaryExpr--Column-lineitem.l_discountLiteral1Column-lineitem.l_extendedprice AS lineitem.l_extendedprice Multiply Int64(1) Minus lineitem.l_discount Multiply Int64(1) Plus #lineitem.l_tax), AVG(#lineitem.l_quantity), AVG(#lineitem.l_extendedprice), AVG(#lineitem.l_discount), COUNT(UInt8(1))]]
      Projection: #lineitem.l_extendedprice Multiply Int64(1) Minus #lineitem.l_discount, #l_quantity, #l_extendedprice, #l_discount, #l_tax, #l_returnflag, #l_linestatus
        Filter: #lineitem.l_shipdate LtEq Date32("10471")
          TableScan: lineitem projection=Some([4, 5, 6, 7, 8, 9, 10])
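
For readers skimming the plan: the inner Projection evaluates `l_extendedprice * (1 - l_discount)` once, and both SUM aggregates then refer to that precomputed column. The following is only a rough illustration of that idea in plain Rust, not DataFusion code; the `Row` struct, the function names, and the sample values are all made up for this sketch.

```rust
/// One simplified `lineitem` row (hypothetical; only the columns the two sums need).
struct Row {
    extended_price: f64,
    discount: f64,
    tax: f64,
}

/// Naive form: each aggregate re-evaluates `extended_price * (1 - discount)`.
fn sums_naive(rows: &[Row]) -> (f64, f64) {
    let mut sum_disc_price = 0.0;
    let mut sum_charge = 0.0;
    for r in rows {
        sum_disc_price += r.extended_price * (1.0 - r.discount);
        sum_charge += r.extended_price * (1.0 - r.discount) * (1.0 + r.tax);
    }
    (sum_disc_price, sum_charge)
}

/// Shared form: the common subexpression is evaluated once per row and
/// reused by both aggregates, mirroring the extra Projection in the plan.
fn sums_shared(rows: &[Row]) -> (f64, f64) {
    let mut sum_disc_price = 0.0;
    let mut sum_charge = 0.0;
    for r in rows {
        let disc_price = r.extended_price * (1.0 - r.discount);
        sum_disc_price += disc_price;
        sum_charge += disc_price * (1.0 + r.tax);
    }
    (sum_disc_price, sum_charge)
}

/// Made-up sample data, chosen so the arithmetic is exact in f64.
fn sample_rows() -> Vec<Row> {
    vec![
        Row { extended_price: 100.0, discount: 0.25, tax: 0.5 },
        Row { extended_price: 200.0, discount: 0.5, tax: 0.25 },
    ]
}

fn main() {
    let rows = sample_rows();
    // Both forms produce the same sums; the shared form does fewer multiplications.
    assert_eq!(sums_naive(&rows), sums_shared(&rows));
    println!("{:?}", sums_shared(&rows));
}
```

The shared form saves one subtraction and one multiplication per row here. Whether that accounts for the whole ~100 ms difference on Q1 is not something this sketch can show.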