villebro commented on a change in pull request #18782:
URL: https://github.com/apache/superset/pull/18782#discussion_r809763411



##########
File path: superset/utils/pandas_postprocessing/contribution.py
##########
@@ -71,5 +73,7 @@ def contribution(
     numeric_df = numeric_df[columns]
     axis = 0 if orientation == PostProcessingContributionOrientation.COLUMN else 1
     numeric_df = numeric_df / numeric_df.values.sum(axis=axis, keepdims=True)
+    # replace infinity and nan with 0 in dataframe
+    numeric_df.replace(to_replace=[np.Inf, -np.Inf, np.nan], value=0, inplace=True)

Review comment:
       Thanks for the explanation! This is mostly a nit, but please hear me out 
😆 I agree that we need to fill nulls with zeros before doing the contribution 
calculation. Strictly mathematically speaking, though, once the contribution 
calculation is done, I think it is more appropriate to leave the infinite 
values as null than to replace them with zeros. In the example, I feel this 
should be the correct result for ROW level contribution, since a row that sums 
to zero strictly speaking has nothing to contribute to:
   
   ```
              __timestamp     a     b    c
   0 2020-07-16 14:49:00  0.50  0.50  0.0
   1 2020-07-16 14:50:00  0.25  0.75  0.0
   2 2020-07-16 14:51:00   NaN   NaN  NaN
   ```
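
   A minimal sketch of the behaviour I have in mind, not the actual Superset 
implementation: `row_contribution` is a hypothetical helper, and the input 
values are assumed, chosen only so that the output matches the table above. 
Nulls are filled with zeros before dividing, and rows whose total is zero stay 
null afterwards:

```python
import numpy as np
import pandas as pd


def row_contribution(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: each cell's share of its row total."""
    # Fill missing inputs with 0 before dividing, so they count as "no contribution".
    filled = df.fillna(0)
    totals = filled.sum(axis=1)
    # Turn zero totals into NaN so those rows come out as NaN
    # instead of inf or a misleading 0.
    return filled.div(totals.where(totals != 0), axis=0)


# Assumed input data, for illustration only.
df = pd.DataFrame(
    {
        "a": [1.0, 1.0, 0.0],
        "b": [1.0, 3.0, 0.0],
        "c": [0.0, np.nan, 0.0],
    }
)
result = row_contribution(df)
print(result)
```

   Masking the zero totals up front means no post-hoc `replace` is needed, and 
a chart can still tell "contributed nothing" (0.0) apart from "nothing to 
contribute to" (NaN).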




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


