Ted-Jiang commented on PR #3828:
URL: https://github.com/apache/arrow-datafusion/pull/3828#issuecomment-1328062248
> Specifically made the parquet files like this:
>
> ```
> RUSTFLAGS="-C target-cpu=native" cargo run --release --bin tpch -- convert --input ~/tpch_data/data_SF1 --output ~/tpch_data/parquet_data_SF1 --format=parquet
> ```
>
> And then ran
>
> ```
> RUSTFLAGS="-C target-cpu=native" cargo run --release --bin tpch -- benchmark datafusion --iterations 3 --path ~/tpch_data/parquet_data_SF1 --format parquet --batch-size 4096
>
> Finished release [optimized] target(s) in 0.28s
> Running `target/release/tpch benchmark datafusion --iterations 3 --path /home/alamb/tpch_data/parquet_data_SF1 --format parquet --batch-size 4096`
> Running benchmarks with the following options: DataFusionBenchmarkOpt { query: None, debug: false, iterations: 3, partitions: 2, batch_size: 4096, path: "/home/alamb/tpch_data/parquet_data_SF1", file_format: "parquet", mem_table: false, output_path: None, disable_statistics: false, enable_scheduler: false }
> Query 1 iteration 0 took 1511.2 ms and returned 4 rows
> Query 1 iteration 1 took 1372.2 ms and returned 4 rows
> Query 1 iteration 2 took 1419.7 ms and returned 4 rows
> Query 1 avg time: 1434.38 ms
> thread 'tokio-runtime-worker' panicked at 'called `Option::unwrap()` on a `None` value', datafusion/core/src/physical_plan/file_format/parquet/page_filter.rs:129:27
> note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
> Error: ArrowError(ExternalError(ArrowError(ExternalError("Arrow error: External error: Execution error: Arrow error: External error: Arrow error: External error: Execution error: Arrow error: External error: Execution error: Join Error: task 218 panicked"))))
> alamb@aal-dev:~/arrow-datafusion$
> ```
>
> FYI @Ted-Jiang -- haven't had a chance to file this as a ticket or look more carefully into it

Thanks for testing this. I will try to figure it out tomorrow.