andygrove commented on PR #5866: URL: https://github.com/apache/arrow-datafusion/pull/5866#issuecomment-1499125555
> > I tried testing the changes in this PR and ran into some errors when running query 1 using the code in https://github.com/sql-benchmarks/sqlbench-runners/tree/main/datafusion
> > ```
> > thread 'tokio-runtime-worker' panicked at 'Unexpected accumulator state in hash aggregate: Internal("Arithmetic Overflow in AvgAccumulator")', /home/andy/.cargo/git/checkouts/arrow-datafusion-bfd9a8de51c58474/4e6eac5/datafusion/core/src/physical_plan/aggregates/row_hash.rs:642:81
> > thread 'tokio-runtime-worker' panicked at 'Unexpected accumulator state in hash aggregate: Internal("Arithmetic Overflow in AvgAccumulator")', /home/andy/.cargo/git/checkouts/arrow-datafusion-bfd9a8de51c58474/4e6eac5/datafusion/core/src/physical_plan/aggregates/row_hash.rs:642:81
> > thread 'tokio-runtime-worker' panicked at 'Unexpected accumulator state in hash aggregate: Internal("Arithmetic Overflow in AvgAccumulator")', /home/andy/.cargo/git/checkouts/arrow-datafusion-bfd9a8de51c58474/4e6eac5/datafusion/core/src/physical_plan/aggregates/row_hash.rs:642:81
> > ```
> >
> > I don't see these errors when running against the latest in the main branch.
>
> I cannot reproduce the issue using DataFusion's own benchmark data (sf=10), but I can reproduce it using Spark-generated benchmark data. I guess Spark's TPC-H data schema is different from DataFusion's.

Maybe decimals vs floats? Official TPC-H uses decimals.
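
To illustrate the decimals-vs-floats guess, here is a minimal standalone sketch of why a fixed-precision decimal sum can report an overflow where a float sum would not. It is not DataFusion's `AvgAccumulator` code; the function names (`max_unscaled`, `checked_decimal_sum`, `float_sum`) and the precision values are hypothetical and chosen only for the example.

```rust
/// Largest unscaled value that fits in a decimal with `precision` digits,
/// i.e. 10^precision - 1. (Hypothetical helper, not a DataFusion API.)
fn max_unscaled(precision: u32) -> i128 {
    10i128.pow(precision) - 1
}

/// Sum unscaled decimal values. Returns `None` as soon as the running sum
/// overflows i128 or no longer fits in `precision` digits.
fn checked_decimal_sum(values: &[i128], precision: u32) -> Option<i128> {
    let limit = max_unscaled(precision);
    let mut sum: i128 = 0;
    for &v in values {
        sum = sum.checked_add(v)?;
        if sum > limit || sum < -limit {
            return None;
        }
    }
    Some(sum)
}

/// A float sum never reports overflow for values like these;
/// it only loses precision.
fn float_sum(values: &[f64]) -> f64 {
    values.iter().sum()
}

fn main() {
    // Three prices of 999.99, stored with scale 2 as unscaled integers.
    let prices = [99_999i128, 99_999, 99_999];

    // If the sum keeps the input precision (5 digits), it overflows.
    println!("{:?}", checked_decimal_sum(&prices, 5)); // None

    // With a wider precision for the sum, the same data is fine.
    println!("{:?}", checked_decimal_sum(&prices, 15)); // Some(299997)

    // The float version of the same data just adds up.
    println!("{}", float_sum(&[999.99, 999.99, 999.99]));
}
```

If the Spark-generated files use decimal columns while DataFusion's own benchmark data uses floats, only the decimal path would go through a check like this, which would fit the overflow showing up with one dataset and not the other.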
