zhuqi-lucas commented on PR #13996:
URL: https://github.com/apache/datafusion/pull/13996#issuecomment-2579118558

   > Thank you @zhuqi-lucas and @2010YOUY01
   > 
   > I tried this out locally and it worked really nicely. Thank you
   > 
   > I think the following follow on tasks would be valuable:
   > 
   > 1. Document this benchmark in https://github.com/apache/datafusion/tree/main/benchmarks#benchmarks
   > 2. Remove the old copy of the h2o benchmark in https://github.com/apache/datafusion/blob/main/benchmarks/src/bin/h2o.rs
   > 
   > I can try and help over the next day or two
   
   
   > I also think we should maybe consider supporting fewer of these combinations (in follow on PRs) -- for example I am not sure how much value the parquet versions of the h2o tests are as the benchmark uses CSV (so that is what people care about). We already have pretty good coverage for parquet in clickbench
   
   
   Thank you for the review. I agree, and I can take on the following improvements as next steps:
   
   1. Document this benchmark in https://github.com/apache/datafusion/tree/main/benchmarks#benchmarks
   (I can help create a PR for the doc improvement.)
   2. Remove the old copy of the h2o benchmark in https://github.com/apache/datafusion/blob/main/benchmarks/src/bin/h2o.rs
   (This is already done in this PR.)
   3. Support fewer of these format combinations: perhaps change the default to the CSV format, and change the step that generates all data so that it produces only the CSV format.
   4. Add the join benchmark once join is supported.
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

