xunxunmimi5577 commented on issue #1671: URL: https://github.com/apache/datafusion-ballista/issues/1671#issuecomment-4438228057
Hi @milenkovicm, I was curious whether you have done a performance comparison between Ballista and Spark on the TPC-DS benchmark. In my own tests (scale 100 GB), Ballista seems to be slower. Is this expected behavior? If not, are there any optimizations or best practices I can apply to speed up Ballista here? Ballista is deployed on a single node, but not in standalone mode: the scheduler and executor are separate processes, with 1 executor and 8 concurrent tasks. Spark is deployed on the same single node (standalone mode), with 1 executor and 4 cores.
