zhuqi-lucas commented on PR #13996: URL: https://github.com/apache/datafusion/pull/13996#issuecomment-2572223582
> Thank you, I have tried and there is an issue generating data, everything else looks good to me.
>
> When I run `./bench.sh data h2o_medum` with python 3.13
>
> ```
> ...
> error: the configured Python interpreter version (3.13) is newer than PyO3's maximum supported version (3.12)
> = help: please check if an updated version of PyO3 is available. Current version: 0.20.3
> = help: set PYO3_USE_ABI3_FORWARD_COMPATIBILITY=1 to suppress this check and build anyway using the stable ABI
> warning: build failed, waiting for other jobs to finish...
> 💥 maturin failed
> ...
> ```
>
> The error showed up, I think `falsa` does not support python 3.13. Perhaps we can enforce [email protected] to suppress this issue now? In the future maybe we can use a docker image to generate h2o dataset instead.

Thank you @2010YOUY01 for the review. I fixed the issue; Python 3.13 is now also supported, as confirmed by testing:

```rust
./benchmarks/bench.sh data h2o_small
***************************
DataFusion Benchmark Runner and Data Generator
COMMAND: data
BENCHMARK: h2o_small
DATA_DIR: /Users/zhuqi/arrow-datafusion/benchmarks/data
CARGO_COMMAND: cargo run --release
PREFER_HASH_JOIN: true
***************************
Found Python version 3.13, which is suitable.
Using Python command: /usr/local/bin/python3
Installing falsa...
Generating h2o test data in /Users/zhuqi/arrow-datafusion/benchmarks/data/h2o with size=SMALL and format=PARQUET
10000000 rows will be saved into: /Users/zhuqi/arrow-datafusion/benchmarks/data/h2o/G1_1e7_1e7_100_0.parquet

An output data schema is the following:
id1: string
id2: string
id3: string
id4: int64
id5: int64
id6: int64
v1: int64 not null
v2: int64 not null
v3: double not null

An output format is PARQUET
Batch mode is supported.
In case of memory problems you can try to reduce a batch_size.

Working... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:04
```

--
This is an automated message from the Apache Git Service.
