+1!
šŸ„‚ šŸ¾

On Thu, 5 Jun 2025 at 11:50, Michael S. Molina <michael.s.mol...@gmail.com>
wrote:

> Huge congratulations on getting SIP-117 across the finish line! šŸŽ‰
>
> It’s clear this took a ton of effort—six months, 18 PRs, and a whole lot
> of careful thinking—but the result is a major achievement. Centralizing all
> SQL parsing into SQLScript and SQLStatement, with dialect-aware support and
> full test coverage, is a massive win for both maintainability and security.
> I really appreciate this improvement; it’s one of those foundational pieces
> that quietly enables so much to work safely and efficiently across
> Superset. Kudos to you and everyone who contributed!
>
> Best regards,
> Michael S. Molina
>
> > On 5 Jun 2025, at 15:27, Quentin Leroy <quentin.n.le...@gmail.com>
> wrote:
> >
> > Thank you !!
> >
> >> On Thu, Jun 5, 2025 at 5:43 PM <robe...@dealmeida.net.invalid> wrote:
> >>
> >> Hi, all!
> >>
> >> I just wanted to give a heads up that SIP-117
> >> (github.com/apache/superset/issues/26786), "Improve SQL parsing", has
> >> been fully implemented. We now have all the codebase using a single
> >> parser library (`sqlglot`) through two new classes: `SQLScript` and
> >> `SQLStatement` (a script is a sequence of statements).
> >>
> >> With this change, the SQL parsing in Superset is now dialect-dependent.
> >> Of the 60 engines we support, 41 have dedicated dialects. Adding new
> >> dialects is relatively easy, and during the work for SIP-117 I created a
> >> Druid dialect (contributed upstream to `sqlglot`) and two dialects for
> >> Firebolt (maintained in the Superset repo). Better yet, all SQL parsing
> >> functionality is now contained in these 2 classes, with 100% test
> >> coverage. If we ever need to change the parser in the future we only
> >> have to modify these classes and run the test suite to make sure
> >> everything still works as expected.
> >>
> >> The work for SIP-117 took almost 6 months, 18 PRs, and added
> >> approximately 600 lines of code and 800 lines of tests. While it's easy
> >> to forget that Superset even does SQL parsing, it's a critical part of
> >> our codebase. For example, parsing SQL is needed in order to set (or
> >> update) limits in queries, preventing too much data from being loaded
> >> into the UI. And while this might seem simple, keep in mind different
> >> databases have different syntaxes for it:
> >>
> >>     SELECT * FROM t LIMIT 10
> >>     SELECT TOP 10 * FROM t
> >>     SELECT * FROM t FETCH FIRST 10 ROWS ONLY
> >>
> >> More importantly, SQL parsing is critical for security. It's used to
> >> identify which tables are being accessed when a query runs, so that
> >> Superset can enforce data access roles (DAR). It's used to detect
> >> malicious use of functions that can expose data, as well as the
> >> malicious use of subqueries in ad-hoc expressions. And it's used to
> >> modify arbitrarily complex queries in place, injecting row-level
> >> security (RLS) filters.
> >>
> >> I'd like to thanks all the contributors who helped with this SIP,
> >> especially Vitor Ɓvila, Elizabeth Thompson, Antonio Rivero, and Max
> >> Beauchemin.
> >>
> >> --Beto
>

Reply via email to