iChauster commented on PR #13426:
URL: https://github.com/apache/arrow/pull/13426#issuecomment-1185773649

   Hey all,
   
   I wrote an initial version of a C++ data generation function called 
`MakeRandomTable` in `test_util.cc`. It is definitely simpler than the Python 
scripts, but it covers everything we need to benchmark (frequency, width, key 
density, batch size variation). This also let us simplify our benchmarking code 
quite a bit, since we no longer have to fight with Google Benchmark over string 
parameters.
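
For readers who want a feel for the knobs listed above, here is a minimal, standalone sketch of that kind of generator. It is not the `MakeRandomTable` code from this PR and does not use Arrow types: the `TableProperties` and `Batch` structs, the `MakeRandomBatches` name, and the reading of "key density" as "number of distinct ids emitted per timestamp" are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins; not Arrow types and not the PR's signature.
struct TableProperties {
  int64_t time_frequency;   // gap between consecutive timestamps ("frequency")
  int num_payload_columns;  // number of random value columns ("width")
  int num_ids;              // distinct keys per timestamp (assumed "key density")
  int64_t start_time;       // first timestamp
  int64_t end_time;         // last timestamp, inclusive
  uint32_t seed;            // fixed seed so benchmark inputs are reproducible
};

struct Batch {
  std::vector<int64_t> time;
  std::vector<int32_t> id;
  std::vector<std::vector<double>> payload;  // one vector per payload column
};

// Emits rows ordered by time, then id, and cuts them into batches of at most
// `batch_size` rows so that batch size can be varied independently.
std::vector<Batch> MakeRandomBatches(const TableProperties& p, int64_t batch_size) {
  assert(p.time_frequency > 0 && batch_size > 0);
  std::mt19937 rng(p.seed);
  std::uniform_real_distribution<double> dist(0.0, 1.0);

  auto new_batch = [&p] {
    Batch b;
    b.payload.resize(p.num_payload_columns);
    return b;
  };

  std::vector<Batch> batches;
  Batch current = new_batch();
  for (int64_t t = p.start_time; t <= p.end_time; t += p.time_frequency) {
    for (int32_t id = 0; id < p.num_ids; ++id) {
      current.time.push_back(t);
      current.id.push_back(id);
      for (auto& column : current.payload) column.push_back(dist(rng));
      if (static_cast<int64_t>(current.time.size()) == batch_size) {
        batches.push_back(std::move(current));
        current = new_batch();
      }
    }
  }
  // Flush the final, possibly short, batch.
  if (!current.time.empty()) batches.push_back(std::move(current));
  return batches;
}
```

Because every knob in this sketch is a plain integer, such a generator could be driven directly from Google Benchmark's integer arguments (`Args` takes `int64_t` values), which is one way to avoid string parameters entirely.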
   
   I've removed the Python data generation scripts from this PR, and we will 
work out how to handle them for a more comprehensive end-to-end benchmark that 
showcases the streaming features of the node. I've also removed the hash join 
benchmarks to keep in line with our other C++ microbenchmarks.
   
   Let me know what you think!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

