2010YOUY01 commented on issue #13470:
URL: https://github.com/apache/datafusion/issues/13470#issuecomment-2536442025
> It would be great if there was a common infrastructure across databases for slt. High level shown below.
>
> ```
> ------------  --------------  ----------------
> |  DF slt  |  | SQLite slt |  | whatever slt |
> ------------  --------------  ----------------
>      |              |                |
> ----------------------------------------------    --------------
> |               slt transpiler               |<---|  slt bank  |
> ----------------------------------------------    --------------
>      |              |                |
>      v              v                v
> ------------  --------------  ----------------
> |  DF slt  |  | SQLite slt |  | whatever slt |
> ------------  --------------  ----------------
> ```
>
> Boxes on the top are existing slt tests available in various databases/repositories, e.g., SQLite (which is mostly discussed in this thread).
>
> `slt bank` is a set of slt tests independent of any dialect (or even potentially dialect specific) that is contributed by third party groups, e.g., researchers.
>
> `slt transpiler` takes existing slt tests and transpiles to the desired target (e.g., DF). Translation steps could be:
>
> * Take one record at a time
> * Transpile SQL to the target dialect (using one of the existing tools)
> * Adjust output (likely most challenging)
> * Output record: if it works keep it if it does not comment out in the output
>
> The transpiler would also be a repository of links to existing slts.
>
> Boxes at the bottom are the resulting slt tests.
>
> Ideally, the transpilation would not be a one off process, but it would be done for example every time slt tests are run. Benefit of such an approach would be to ensure that any updates made to existing tests (e.g., new tests added to SQLite) are reflected in the target run (e.g., DF runs).

https://arxiv.org/pdf/2410.21731

They built this kind of slt bank for SQLite, PostgreSQL, and DuckDB and found multiple bugs. The authors say integrating them into a new system is easy:

> Implementing them typically requires a low effort, as they are implemented in 33 LOC on average for the DBMSs in our experiments.

The implementation is at https://github.com/suyZhong/SQuaLity
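
For anyone who wants to try the record-by-record loop from the quoted proposal, here is a rough Python sketch. It is not taken from SQuaLity or from DataFusion. `split_records`, `transpile_file`, `transpile_record`, and `passes_on_target` are hypothetical names, and the sketch assumes that records are separated by blank lines and that `#` starts a comment, as in the sqllogictest format. The dialect translation and the output adjustment are left to the caller-supplied callbacks.

```python
from typing import Callable, List


def split_records(slt_text: str) -> List[str]:
    """Split slt text into records, assuming blank lines separate records."""
    records: List[str] = []
    current: List[str] = []
    for line in slt_text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            records.append("\n".join(current))
            current = []
    if current:
        records.append("\n".join(current))
    return records


def transpile_file(
    slt_text: str,
    transpile_record: Callable[[str], str],   # hypothetical: rewrite SQL/output for the target
    passes_on_target: Callable[[str], bool],  # hypothetical: run the record on the target engine
) -> str:
    """Take one record at a time; keep it if it works, else comment it out."""
    out: List[str] = []
    for record in split_records(slt_text):
        try:
            candidate = transpile_record(record)
            ok = passes_on_target(candidate)
        except Exception:
            # Translation or execution failed: fall back to the original record.
            candidate, ok = record, False
        if ok:
            out.append(candidate)
        else:
            # Keep the failing record visible but disabled.
            out.append("\n".join("# " + line for line in candidate.splitlines()))
    return "\n\n".join(out) + "\n"
```

In practice `transpile_record` could wrap one of the existing SQL transpilers, and `passes_on_target` could hand the record to the target's slt runner. Rerunning this on every test run, rather than once, is what would pick up upstream test updates.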