cjameyson opened a new issue #9951:
URL: https://github.com/apache/incubator-superset/issues/9951


   `When applying a Filter to a count distinct metric (`COUNT(DISTINCT 
mycolumn)`), the HAVING clause generated in the Custom SQL tab of the Filter 
popover is invalid syntax for Impala.
   
   The generated query is:
   ```
   SELECT category AS category,
          count(DISTINCT id) AS `COUNT_DISTINCT(id)`
   FROM mytable
   WHERE to_timestamp(cast(created_dts as string), 'yyyyMMddHH') >= '2020-05-22 
00:00:00.000000'
     AND to_timestamp(cast(created_dts as string), 'yyyyMMddHH') < '2020-05-29 
00:00:00.000000'
   GROUP BY category
   HAVING ((COUNT_DISTINCT(id) > 0))
   ORDER BY `COUNT_DISTINCT(id)` DESC
   ```
   
   And the query fails on the HAVING clause because `COUNT_DISTINCT` is not 
valid for Impala.
   
   ### Expected results
   
   `HAVING COUNT(distinct id) > 0`
   or
   ```HAVING `COUNT_DISTINCT(id)` > 0```
   
   ### Actual results
   
   `HAVING ((COUNT_DISTINCT(id) > 0))`
   
   Which causes this Impala exception:
   ```
   impala error: IllegalStateException: Failed analysis after expr 
substitution. CAUSED BY: AnalysisException: mydatabase.count_distinct() unknown
   ```
   #### Screenshots
   [Screenshot](https://imgur.com/a/vnzEhPa)
   
   
   #### How to reproduce the bug
   
   1. Create a new chart using the Table viz
   2. Group by a column
   3. Add a metric for COUNT DISTINCT on some other column
   4. Click the Filter dropdown, add filter for the metric added in step 3
   5. Note the invalid SQL autopopulated in the Custom SQL tab
   6. Hit Run Query Button
   
   ### Environment
   ```
   $ cat /etc/os-release
   NAME="CentOS Linux"
   VERSION="7 (Core)"
   ID="centos"
   ID_LIKE="rhel fedora"
   VERSION_ID="7"
   PRETTY_NAME="CentOS Linux 7 (Core)"
   ```
   
   ## Version Info
   - superset version: `superset version`
   ```
   $ superset version
   Loaded your LOCAL configuration at 
[/home/superset/ssconfig/superset_config.py]
   logging was configured successfully
   INFO:superset.utils.logging_configurator:logging was configured successfully
   
/home/superset/super/lib64/python3.6/site-packages/flask_caching/__init__.py:189:
 UserWarning: Flask-Caching: CACHE_TYPE is set to null, caching is effectively 
disabled.
     "Flask-Caching: CACHE_TYPE is set to null, "
   -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
   Superset None
   -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
   ```
   
   Checking pip...
   ```
   apache-superset (0.36.0)                                           
     INSTALLED: 0.36.0 (latest)
   ```
   
   ```
   superset --version
   Python 3.6.8
   Flask 1.1.2
   Werkzeug 1.0.1
   ```
   
   - node.js version: `node -v`
   (not installed)
   - npm version: `npm -v`
   (not installed)
   
   - Impyla
   ```
   >>> import impala
   >>> impala.__version__
   'v0.16.2'
   ```
   
   ### Checklist
   
   Make sure these boxes are checked before submitting your issue - thank you!
   
   - [x] I have checked the superset logs for python stacktraces and included 
it here as text if there are any.
   ```
   ERROR:superset.connectors.sqla.models:Query SELECT category AS category,
          count(DISTINCT id) AS `COUNT_DISTINCT(id)`
   FROM mytable
   WHERE to_timestamp(cast(created_dts as string), 'yyyyMMddHH') >= '2020-05-22 
00:00:00.000000'
     AND to_timestamp(cast(created_dts as string), 'yyyyMMddHH') < '2020-05-29 
00:00:00.000000'
   GROUP BY category
   HAVING ((COUNT_DISTINCT(id) > 0))
   ORDER BY `COUNT_DISTINCT(id)` DESC
   LIMIT 10000 on schema None failed
   Traceback (most recent call last):
     File 
"/home/superset/super/lib64/python3.6/site-packages/superset/connectors/sqla/models.py",
 line 1059, in query
       df = self.database.get_df(sql, self.schema, mutator)
     File 
"/home/superset/super/lib64/python3.6/site-packages/superset/models/core.py", 
line 362, in get_df
       self.db_engine_spec.execute(cursor, sqls[-1])
     File 
"/home/superset/super/lib64/python3.6/site-packages/superset/db_engine_specs/base.py",
 line 841, in execute
       cursor.execute(query)
     File 
"/home/superset/super/lib64/python3.6/site-packages/impala/hiveserver2.py", 
line 329, in execute
       configuration=configuration)
     File 
"/home/superset/super/lib64/python3.6/site-packages/impala/hiveserver2.py", 
line 375, in execute_async
       self._execute_async(op)
     File 
"/home/superset/super/lib64/python3.6/site-packages/impala/hiveserver2.py", 
line 394, in _execute_async
       operation_fn()
     File 
"/home/superset/super/lib64/python3.6/site-packages/impala/hiveserver2.py", 
line 372, in op
       run_async=True)
     File 
"/home/superset/super/lib64/python3.6/site-packages/impala/hiveserver2.py", 
line 1096, in execute
       return self._operation('ExecuteStatement', req)
     File 
"/home/superset/super/lib64/python3.6/site-packages/impala/hiveserver2.py", 
line 1026, in _operation
       resp = self._rpc(kind, request)
     File 
"/home/superset/super/lib64/python3.6/site-packages/impala/hiveserver2.py", 
line 994, in _rpc
       err_if_rpc_not_ok(response)
     File 
"/home/superset/super/lib64/python3.6/site-packages/impala/hiveserver2.py", 
line 748, in err_if_rpc_not_ok
       raise HiveServer2Error(resp.status.errorMessage)
   impala.error.HiveServer2Error: IllegalStateException: Failed analysis after 
expr substitution.
   CAUSED BY: AnalysisException: mfglogs_ficore.count_distinct() unknown
   ```
   - [x] I have reproduced the issue with at least the latest released version 
of superset.
   - [x] I have checked the issue tracker for the same issue and I haven't 
found one similar.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to