Prokos opened a new pull request, #33296:
URL: https://github.com/apache/superset/pull/33296
### SUMMARY
The macro `get_filters` within the Jinja context can be used to implement
filtering in custom locations in your virtual dataset SQL. This works great,
except for when filtering with the IS_NULL or IS_NOT_NULL operator. These are
ignored, cannot be used, and will break certain implementations that rely on
remove_filter to be functional (like the one I've outlined below).
The reason the code breaks for these operators is because they are unique in
the sense that they don't have a comparator. Adjusting the code to not need a
comparator for these operator types fixes the issue.
### TESTING INSTRUCTIONS
1. Create a virtual dataset on BigQuery with the following SQL
- likely reproducible with less - but it's vital that both column
selection/grouping and filtering is handled)
```
{% set finalColumns = columns if columns is defined else ['__ALL__'] %}
SELECT
sum(measure) as measure,
{% for col in finalColumns %}
{% if col == 'col1' or col == '__ALL__' or (col is mapping and
'sqlExpression' in col and col.sqlExpression == 'col1') %}
col1
{%- if col == '__ALL__' -%},{%- endif -%}
{%- if not loop.last -%},{%- endif -%}
{% endif %}
{% if col == 'col2' or col == '__ALL__' or (col is mapping and
'sqlExpression' in col and col.sqlExpression == 'col2') %}
col2
{%- if col == '__ALL__' -%},{%- endif -%}
{%- if not loop.last -%},{%- endif -%}
{% endif %}
{% endfor %}
FROM UNNEST([
STRUCT(1 as measure, 'value1A' as col1, 'value2A' as col2),
STRUCT(2 as measure, 'value1A' as col1, 'value2B' as col2),
STRUCT(5 as measure, 'value1B' as col1, 'value2B' as col2),
STRUCT(15 as measure, NULL as col1, NULL as col2)
])
-- FILTERS
{%- for filter in get_filters('col1', remove_filter=True) -%}
{%- if loop.first %} WHERE 1 = 1 {% endif %}
AND
{% if filter.get('op') == 'IN' -%}
col1 IN {{ filter.get('val')|where_in }}
{% elif filter.get('op') == 'NOT IN' -%}
col1 NOT IN {{ filter.get('val')|where_in }}
{% elif filter.get('op') == 'ILIKE' -%}
LOWER(col1) LIKE LOWER({{ filter.get('val') }})
{% elif filter.get('op') == 'IS NULL' -%}
col1 IS NULL
{% elif filter.get('op') == 'IS NOT NULL' -%}
col1 IS NOT NULL
{% else -%}
col1 {{ filter.get('op') }} {{ filter.get('val') }}
{%- endif -%}
{%- endfor %}
{%- for filter in get_filters('col2', remove_filter=True) -%}
{%- if loop.first %} WHERE 1 = 1 {% endif %}
AND
{% if filter.get('op') == 'IN' -%}
col2 IN {{ filter.get('val')|where_in }}
{% elif filter.get('op') == 'NOT IN' -%}
col2 NOT IN {{ filter.get('val')|where_in }}
{% elif filter.get('op') == 'ILIKE' -%}
LOWER(col2) LIKE LOWER({{ filter.get('val') }})
{% elif filter.get('op') == 'IS NULL' -%}
col2 IS NULL
{% elif filter.get('op') == 'IS NOT NULL' -%}
col2 IS NOT NULL
{% else -%}
col2 {{ filter.get('op') }} {{ filter.get('val') }}
{%- endif -%}
{%- endfor %}
-- GROUP BY
{% set ns = namespace(groupByEmitted=false) %}
{% for col in finalColumns %}
{% if col == 'col1' or col == '__ALL__' or (col is mapping and
'sqlExpression' in col and col.sqlExpression == 'col1') %}
{%- if not ns.groupByEmitted %}
GROUP BY
{% set ns.groupByEmitted = true %}
{% endif -%}
col1
{%- if col == '__ALL__' -%},{%- endif -%}
{%- if not loop.last -%},{%- endif -%}
{% endif %}
{% if col == 'col2' or col == '__ALL__' or (col is mapping and
'sqlExpression' in col and col.sqlExpression == 'col2') %}
{%- if not ns.groupByEmitted %}
GROUP BY
{% set ns.groupByEmitted = true %}
{% endif -%}
col2
{%- if not loop.last -%},{%- endif -%}
{% endif %}
{% endfor %}
```
3. Click "Create Chart"
4. Set a filter on one of the columns but don't select it as a dimension:
<img width="325" alt="image"
src="https://github.com/user-attachments/assets/e9fe440b-6a31-4cd0-9a97-e1e1967a4e9b"
/>
5. On master, this breaks due to the following resulting SQL:
```SELECT `measure` AS `measure`, `col2` AS `col2`
FROM (
SELECT
sum(measure) as measure,
col2
FROM UNNEST([
STRUCT(1 as measure, 'value1A' as col1, 'value2A' as col2),
STRUCT(2 as measure, 'value1A' as col1, 'value2B' as col2),
STRUCT(5 as measure, 'value1B' as col1, 'value2B' as col2),
STRUCT(15 as measure, NULL as col1, NULL as col2)
])
GROUP BY
col2
) AS `virtual_table`
WHERE `col1` IS NULL
LIMIT 1000
```
6. With this fix implemented, this works and the resulting SQL looks like
this:
### ADDITIONAL INFORMATION
- [X] Fixes https://github.com/apache/superset/issues/33294
- [X] Required feature flags: ENABLE_TEMPLATE_PROCESSING = True
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]