[I] [SIP] Pre SQL expression block for virtual datasets [superset]

via GitHub Wed, 17 Jan 2024 04:50:26 -0800


TechAuditBI opened a new issue, #26646:
URL: https://github.com/apache/superset/issues/26646


   ## [SIP] Proposal for Pre SQL block for virtual datasets
   
   ### Motivation
   
   Currently huge piece of SQL functionality couldn't be used in virtual 
datasets due to the fact that dataset query is being put into from section in a 
pattern "from ("SQL query") as virtual_table. This limits plenty of SQL 
dialects from using non SELECT statements in a dataset query including for 
example temp tables, variables or WITH statements.
   
   ### Proposed Change
   
   Proposed change is to include a new "Pre SQL" field that would contain SQL 
that would be put before all SELECT statements for every query on that dataset.
   
   ### New or Changed Public Interfaces
   
   Interface changes:
   In a dataset edit modal a new text input field similar to existing "SQL" 
field should be added. Like on a screenshot below:
   
![image](https://github.com/apache/superset/assets/88540994/e9234931-2bf2-4788-85e5-ca6fb0da5a3c)
   
   Resulting SQL query to be sent to a corresponding database:
   Without PRE SQL:
   ```
   SELECT region AS region,
          sum("Браки (Город)") AS "SUM(Браки (Город))",
          sum("Разводы (Город)") AS "SUM(Разводы (Город))",
          sum("Браки(Село)") AS "SUM(Браки(Село))",
          sum("Разводы (Село)") AS "SUM(Разводы (Село))"
   FROM
     (select case
                 when metric_name='Брак'
                      and population = 'Городское население' then value
             end as "Браки (Город)",
             case
                 when metric_name='Разводы'
                      and population = 'Городское население' then value
             end as "Разводы (Город)",
             case
                 when metric_name='Брак'
                      and population = 'Сельское население' then value
             end as "Браки(Село)",
             case
                 when metric_name='Разводы'
                      and population = 'Сельское население' then value
             end as "Разводы (Село)" ,
             year_,
             region,
             population,
             year_,
             period,
             metric_name
      from public.population_data_
      where metric_name in ('Разводы',
                            'Брак') ) AS virtual_table
   GROUP BY region
   ORDER BY "SUM(Браки (Город))" DESC
   LIMIT 1000;
   ```
   
   With PRE SQL:
   ```
   WITH (
     SELECT 
       a, b, c
     FROM 
       table
     ) as #tmp
   
   SELECT region AS region,
          sum("Браки (Город)") AS "SUM(Браки (Город))",
          sum("Разводы (Город)") AS "SUM(Разводы (Город))",
          sum("Браки(Село)") AS "SUM(Браки(Село))",
          sum("Разводы (Село)") AS "SUM(Разводы (Село))"
   FROM
     (select case
                 when metric_name='Брак'
                      and population = 'Городское население' then value
             end as "Браки (Город)",
             case
                 when metric_name='Разводы'
                      and population = 'Городское население' then value
             end as "Разводы (Город)",
             case
                 when metric_name='Брак'
                      and population = 'Сельское население' then value
             end as "Браки(Село)",
             case
                 when metric_name='Разводы'
                      and population = 'Сельское население' then value
             end as "Разводы (Село)" ,
             year_,
             region,
             population,
             year_,
             period,
             metric_name
      from public.population_data_
      where metric_name in ('Разводы',
                            'Брак') ) AS virtual_table
   GROUP BY region
   ORDER BY "SUM(Браки (Город))" DESC
   LIMIT 1000;
   ```
   
   API changes:
   GET /api/v1/dataset/ 
   POST /api/v1/dataset/ 
   GET /api/v1/dataset/{pk}
   POST /api/v1/dataset/{pk} 
   should contain new "pre_sql": "string" field 
   
   ### New dependencies
   
   None
   
   ### Migration Plan and Compatibility
   
   A new column called for example "pre_expression" should be added to 
sl_datasets table. Data type similar to the expression "column" default value 
NULL. So every existing dataset would be compatible by default. Downgrade 
migrations are to be discussed on the other hand.
   
   ### Rejected Alternatives
   
   None
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[I] [SIP] Pre SQL expression block for virtual datasets [superset]

Reply via email to