2010YOUY01 commented on issue #13470:
URL: https://github.com/apache/datafusion/issues/13470#issuecomment-2536442025

   > It would be great if there was a common infrastructure across databases 
for slt. High level shown below.
   > 
   > ```
   > ------------     --------------   ----------------
   > |  DF slt  |     | SQLite slt |   | whatever slt |
   > ------------     --------------   ----------------
   >     |                 |                 |
   >  ----------------------------------------------    --------------
   >  |     slt transpiler                         |<---| slt bank   |
   >  ----------------------------------------------    --------------
   >     |                 |                 |
   >     v                 v                 v
   > ------------     --------------   ----------------
   > |  DF slt  |     | SQLite slt |   | whatever slt |
   > ------------     --------------   ----------------
   > ```
   > 
   > Boxes on the top are existing slt tests available in various 
databases/repositories, e.g., SQLite (which is mostly discussed in this thread).
   > 
   > `slt bank` is a set of slt tests independent of any dialect (or even 
potentially dialect specific) that is contributed by third party groups, e.g., 
researchers.
   > 
   > `slt transpiler` takes existing slt tests and transpiles to the desired 
target (e.g., DF). Translation steps could be:
   > 
   > * Take one record at a time
   > * Transpile SQL to the target dialect (using one of the existing tools)
   > * Adjust output (likely most challenging)
   > * Output record: if it works, keep it; if it does not, comment it out in the 
output
   > 
   > The transpiler would also be a repository of links to existing slts.
   > 
   > Boxes at the bottom are the resulting slt tests.
   > 
   > Ideally, the transpilation would not be a one off process, but it would be 
done for example every time slt tests are run. Benefit of such an approach 
would be to ensure that any updates made to existing tests (e.g., new tests 
added to SQLite) are reflected in the target run (e.g., DF runs).
   
   This paper did this slt bank for SQLite, PostgreSQL, and DuckDB and found 
multiple bugs: https://arxiv.org/pdf/2410.21731. The authors say integrating 
them into a new system is easy:
   
   > Implementing them typically requires a low effort, as they are implemented 
in 33 LOC on average for the DBMSs in our experiments.
   
   The implementation is at https://github.com/suyZhong/SQuaLity
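
For illustration, the per-record loop from the quoted proposal could be sketched roughly as below. This is not taken from SQuaLity or from DataFusion's test runner. `transpile_sql`, `run_query`, and `adjust_output` are hypothetical callables supplied by the caller, and the parser only handles simple `query` records that have a `----` separator.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Record:
    header: str          # e.g. "query I nosort"
    sql: str             # SQL text in the source dialect
    expected: List[str]  # expected result lines after "----"


def parse_records(text: str) -> List[Record]:
    """Split slt text into query records; records are separated by blank lines."""
    records = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        # This sketch skips anything that is not a "query" record with "----".
        if not lines or not lines[0].startswith("query") or "----" not in lines:
            continue
        sep = lines.index("----")
        records.append(Record(lines[0], "\n".join(lines[1:sep]), lines[sep + 1:]))
    return records


def transpile_slt(
    text: str,
    transpile_sql: Callable[[str], str],       # hypothetical: source SQL -> target SQL
    run_query: Callable[[str], List[str]],     # hypothetical: run SQL on the target
    adjust_output: Callable[[List[str]], List[str]] = lambda rows: rows,
) -> str:
    """Take one record at a time, transpile it, and keep it or comment it out."""
    out = []
    for rec in parse_records(text):
        sql, expected = rec.sql, rec.expected
        try:
            sql = transpile_sql(rec.sql)
            expected = adjust_output(rec.expected)
            works = run_query(sql) == expected
        except Exception:
            works = False
        rendered = "\n".join([rec.header, sql, "----", *expected])
        if not works:
            # slt treats lines starting with "#" as comments.
            rendered = "\n".join("# " + line for line in rendered.splitlines())
        out.append(rendered)
    return "\n\n".join(out) + "\n"
```

Passing the dialect translation and the query runner in as callables keeps the loop independent of any particular transpiler tool or target engine; adjusting the expected output, which the proposal flags as the hard part, is left as a hook.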


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

