aaron-manning opened a new issue, #28170:
URL: https://github.com/apache/superset/issues/28170

   ### Bug description
   
   There appears to be a bug in SuperSet in the way it generates the output 
used by 'Percentage Metrics' in Table charts when a Postgres 'Decimal' datatype 
is part of the Metric.
   
   SuperSet: Version: 3.1.2, 3.1.0
   Postgres: 16.1
   
   ### What causes the bug
   1. A Metric is defined that references a Decimal column in its definition 
(curiously, this happens even when that data will never be actually used)
   2. There is at least one other % metric column output in the chart that ends 
up being null for each row
   
   ### How to reproduce the bug
   
   Create the following dataset (note that there seems to be no easy way to 
directly create a virtual dataset without first creating a physical one and 
modifying it)
   
![image](https://github.com/apache/superset/assets/32957675/8d4a9878-ffa7-42ed-929a-aad78679e971)
   ```
   SELECT
       *
   FROM
       (
           VALUES
               (1, 10::decimal),
               (2, 20::decimal),
               (3, 30::decimal)
       ) AS t (id, val);
   ```
   
   Create the following chart
   
![image](https://github.com/apache/superset/assets/32957675/715d8545-111d-4975-af7a-240d0552d5fd)
   
   % Metric 1
   
![image](https://github.com/apache/superset/assets/32957675/6625402f-0a8f-4f45-80cb-ea565f9cfd9b)
   ```
   NULL
   ```
   
   % Metric 2
   
![image](https://github.com/apache/superset/assets/32957675/c8db5110-e1a3-4003-b356-d5688aff3313)
   ```
   SUM(CASE WHEN true THEN 1 ELSE val END)
   ```
   
   The resulting error:
   ```
   DB engine Error
   division by zero
   
   This may be triggered by:
   Issue 1011 - Superset encountered an unexpected error. 
   ```
   
   The query generated (as output via 'View Query' UI feature)
   ```
   SELECT NULL AS "NULL",
          SUM(CASE
                  WHEN true THEN 1
                  ELSE val
              END) AS "SUM(CASE WHEN true THEN 1 ELSE val END)"
   FROM
     (SELECT *
      FROM (
            VALUES (1,
                    10::decimal), (2,
                                   20::decimal), (3,
                                                  30::decimal)) AS t (id, val)) 
AS virtual_table
   LIMIT 10000;
   ```
   
   The query when run in SQL lab has no issues
   
![image](https://github.com/apache/superset/assets/32957675/c2951606-de6e-49b9-bf66-bc3199f3c4f6)
   
   Note the following
   1. Metric 1 and 2 can be re-ordered with no consequence.
   2. Metric 1 can be any query, but MUST output NULL for each row in the data 
(after filters) to demonstrate this bug.
   3. Metric 2 MUST reference a Decimal datatype column in the dataset to 
demonstrate this bug. Curiously it does not actually need to ever output this 
value, I imagine this is a key aspect of the bug.
   4. Whether the column referenced in Metric 2 is nullable, has null rows, or 
negative rows or has rows with the value 0, has no consequence.
   5. Whether the chart has other columns output (Metric, % Metric or other), 
or has different settings configured, appears to have no consequence.
   6. The SQL output in 'View Query' always executes without issue when run in 
the SQL lab. The problem is clearly limited to something related to % Metrics 
that lives outside of the generated queries displayed in the UI.
   7. This issue isn't limited to virtual datasets, the same happens with a 
regular physical dataset (querying over a view)
    
   ### Expected Outcome
   The chart should output the following, without error:
   
![image](https://github.com/apache/superset/assets/32957675/3c2ee0c3-9aed-4306-b453-f499fda39317)
   Note that to get this to output, the query used in % Metric 2 was changed to 
   ```
   SUM(CASE WHEN true THEN 1 ELSE NULL END)
   ```
   (aka the reference to a Decimal column was removed)
   
   ### Workaround
   Any Decimal data types that end up used in % Metrics can be casted to Double 
to avoid this error.
   ```
   SUM(CASE WHEN true THEN 1 ELSE val::DOUBLE PRECISION END)
   ```
   
   ### Screenshots/recordings
   
   _No response_
   
   ### Superset version
   
   3.1.2
   
   ### Python version
   
   Not applicable
   
   ### Node version
   
   Not applicable
   
   ### Browser
   
   Not applicable
   
   ### Additional context
   
   ```
   division by zero
   Traceback (most recent call last):
     File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1823, in 
full_dispatch_request
       rv = self.dispatch_request()
     File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1799, in 
dispatch_request
       return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
     File 
"/usr/local/lib/python3.10/site-packages/flask_appbuilder/security/decorators.py",
 line 95, in wraps
       return f(self, *args, **kwargs)
     File "/app/superset/views/base_api.py", line 127, in wraps
       raise ex
     File "/app/superset/views/base_api.py", line 121, in wraps
       duration, response = time_function(f, self, *args, **kwargs)
     File "/app/superset/utils/core.py", line 1463, in time_function
       response = func(*args, **kwargs)
     File "/app/superset/utils/log.py", line 255, in wrapper
       value = f(*args, **kwargs)
     File "/app/superset/charts/data/api.py", line 256, in data
       return self._get_data_response(
     File "/app/superset/charts/data/api.py", line 412, in _get_data_response
       result = command.run(force_cached=force_cached)
     File "/app/superset/commands/chart/data/get_data_command.py", line 45, in 
run
       payload = self._query_context.get_payload(
     File "/app/superset/common/query_context.py", line 100, in get_payload
       return self._processor.get_payload(cache_query_context, force_cached)
     File "/app/superset/common/query_context_processor.py", line 587, in 
get_payload
       query_results = [
     File "/app/superset/common/query_context_processor.py", line 588, in 
<listcomp>
       get_query_results(
     File "/app/superset/common/query_actions.py", line 225, in 
get_query_results
       return result_func(query_context, query_obj, force_cached)
     File "/app/superset/common/query_actions.py", line 103, in _get_full
       payload = query_context.get_df_payload(query_obj, 
force_cached=force_cached)
     File "/app/superset/common/query_context.py", line 121, in get_df_payload
       return self._processor.get_df_payload(
     File "/app/superset/common/query_context_processor.py", line 156, in 
get_df_payload
       query_result = self.get_query_result(query_obj)
     File "/app/superset/common/query_context_processor.py", line 256, in 
get_query_result
       df = query_object.exec_post_processing(df)
     File "/app/superset/common/query_object.py", line 445, in 
exec_post_processing
       df = getattr(pandas_postprocessing, operation)(df, **options)
     File "/app/superset/utils/pandas_postprocessing/utils.py", line 129, in 
wrapped
       return func(df, **options)
     File "/app/superset/utils/pandas_postprocessing/contribution.py", line 74, 
in contribution
       numeric_df = numeric_df / numeric_df.values.sum(axis=axis, keepdims=True)
     File "/usr/local/lib/python3.10/site-packages/pandas/core/ops/common.py", 
line 81, in new_method
       return method(self, other)
     File "/usr/local/lib/python3.10/site-packages/pandas/core/arraylike.py", 
line 210, in __truediv__
       return self._arith_method(other, operator.truediv)
     File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 
7457, in _arith_method
       new_data = self._dispatch_frame_op(other, op, axis=axis)
     File "/usr/local/lib/python3.10/site-packages/pandas/core/frame.py", line 
7496, in _dispatch_frame_op
       bm = self._mgr.operate_blockwise(
     File 
"/usr/local/lib/python3.10/site-packages/pandas/core/internals/managers.py", 
line 1545, in operate_blockwise
       return operate_blockwise(self, other, array_op)
     File 
"/usr/local/lib/python3.10/site-packages/pandas/core/internals/ops.py", line 
63, in operate_blockwise
       res_values = array_op(lvals, rvals)
     File 
"/usr/local/lib/python3.10/site-packages/pandas/core/ops/array_ops.py", line 
232, in arithmetic_op
       res_values = _na_arithmetic_op(left, right, op)  # type: ignore[arg-type]
     File 
"/usr/local/lib/python3.10/site-packages/pandas/core/ops/array_ops.py", line 
171, in _na_arithmetic_op
       result = func(left, right)
     File 
"/usr/local/lib/python3.10/site-packages/pandas/core/computation/expressions.py",
 line 239, in evaluate
       return _evaluate(op, op_str, a, b)  # type: ignore[misc]
     File 
"/usr/local/lib/python3.10/site-packages/pandas/core/computation/expressions.py",
 line 128, in _evaluate_numexpr
       result = _evaluate_standard(op, op_str, a, b)
     File 
"/usr/local/lib/python3.10/site-packages/pandas/core/computation/expressions.py",
 line 70, in _evaluate_standard
       return op(a, b)
   ```
   
   ### Checklist
   
   - [X] I have searched Superset docs and Slack and didn't find a solution to 
my problem.
   - [X] I have searched the GitHub issue tracker and didn't find a similar bug 
report.
   - [X] I have checked Superset's logs for errors and if I found a relevant 
Python stacktrace, I included it here as text in the "additional context" 
section.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to