Re: [PR] chore: add TPC queries to be run by fuzzer correctness checker [datafusion-comet]

via GitHub Thu, 23 Oct 2025 09:33:59 -0700


comphead commented on PR #2632:
URL: https://github.com/apache/datafusion-comet/pull/2632#issuecomment-3437989564


   > > I don't think that we should have a combined fuzz-testing-and-tpc-benchmark tool. They serve quite different purposes. I think it would be better to move the DataFrame comparison logic into a shared class somewhere and then update our benchmarking tool to be able to use it.
   > > This probably means that we need to convert our benchmark script from Python to Scala.
   > 
   > Another option would be to update the existing Python benchmark script to save query results to Parquet, and then implement a command-line tool for comparing the Parquet files produced from the Spark and Comet runs.
   
   Right, this option looks better IMO, so we can have a command-line utility similar to the fuzzer and reuse the comparison logic. We still need this PR in some form, as it has some refactoring to reuse comparison 
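
   For illustration only, the Parquet-comparison option quoted above could look roughly like the sketch below. This is not the fuzzer's existing comparison logic, and nothing in it comes from this PR: the function names (`load_rows`, `values_match`, `compare_rows`), the order-insensitive matching, and the float tolerance are all assumptions, and reading Parquet assumes `pyarrow` is installed.

```python
import math
import sys


def load_rows(path):
    """Read a Parquet file into a list of row tuples (assumes pyarrow is installed)."""
    import pyarrow.parquet as pq  # third-party dependency, imported lazily

    table = pq.read_table(path)
    return [tuple(row.values()) for row in table.to_pylist()]


def _sort_key(row):
    # Order on (is_null, type name, string form) so that None and mixed
    # types can be sorted without raising TypeError.
    return tuple((v is None, type(v).__name__, str(v)) for v in row)


def values_match(a, b, rel_tol=1e-6):
    """Compare two cell values; floats use a relative tolerance (an assumption)."""
    if a is None or b is None:
        return a is None and b is None
    if isinstance(a, float) and isinstance(b, float):
        if math.isnan(a) and math.isnan(b):
            return True
        return math.isclose(a, b, rel_tol=rel_tol)
    return a == b


def compare_rows(expected, actual, rel_tol=1e-6):
    """Return a list of human-readable differences; an empty list means equal."""
    if len(expected) != len(actual):
        return [f"row count differs: {len(expected)} vs {len(actual)}"]
    diffs = []
    pairs = zip(sorted(expected, key=_sort_key), sorted(actual, key=_sort_key))
    for i, (e, a) in enumerate(pairs):
        same_width = len(e) == len(a)
        if not same_width or not all(values_match(x, y, rel_tol) for x, y in zip(e, a)):
            diffs.append(f"row {i} differs: {e} vs {a}")
    return diffs


def main(argv):
    # Hypothetical usage: compare.py <spark_result.parquet> <comet_result.parquet>
    problems = compare_rows(load_rows(argv[1]), load_rows(argv[2]))
    print("\n".join(problems) or "results match")
    return 1 if problems else 0


if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv))
```

   Sorting before comparing is a simplification for queries without a guaranteed output order; rows that differ only by a tiny floating-point error in a leading column could still sort differently, so a real tool would need a more careful strategy.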


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


