alamb commented on issue #6782: URL: https://github.com/apache/arrow-datafusion/issues/6782#issuecomment-1727583335
I took a pass through the paper's results section.

1. We are reporting results for TPCH SF 1. I think the results would be more compelling if we used a larger scale factor (like 10). Is there any chance you can try rerunning the numbers with SF10? `bench.sh` supports making the `tpch10` dataset, so I think it should be straightforward.
2. Overall I think the results are reasonable (obviously they could always be better), but they let us tell the basic story "DataFusion is within 2x of DuckDB" and thus conclude there is nothing about an open design that precludes fast performance.
