andygrove commented on PR #21508:
URL: https://github.com/apache/datafusion/pull/21508#issuecomment-4215456735

   > I would say a GitHub bot action on the PR, similar to `run benchmarks`, 
would help remove the local testing part. How is this script planned to be 
called?
   
   It would be nice to eventually add a GitHub workflow to run this, but for 
now it is probably best just to make the script available for people to run. 
Many of the tests are written in such a way that we cannot support them in 
PySpark, which makes this quite challenging.
   
   The Comet approach is much nicer, but there is no way in this repo to 
actually run the DataFusion expressions from within Spark, so we cannot use 
Spark SQL for the tests. I suppose we could update the SQL parser crate to 
support Spark SQL and update the planner to support Spark expressions, and 
then the tests could be written in Spark SQL. That sounds like a lot of work, 
though.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

