john-bodley opened a new pull request #4905: [wip][missing values] Removing replacing missing values
URL: https://github.com/apache/incubator-superset/pull/4905

Apologies for not having full context of this code, but from a numerical standpoint replacing [missing values](https://pandas.pydata.org/pandas-docs/stable/missing_data.html) with zero (or other values) is rarely a good idea: it leads to inaccuracies, which surely violates the core tenet of a data analysis tool. Note that Pandas (implicitly) and NumPy (explicitly) correctly handle missing values, e.g. [mean](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.mean.html) and [nanmean](https://docs.scipy.org/doc/numpy/reference/generated/numpy.nanmean.html) respectively.

This PR is still WIP, as I've only remedied a couple of the visualization types and still need to add a number of unit tests to ensure numerical correctness with missing values. I felt there was merit in sharing this now so that I can better understand the context of replacing missing values and the potential corner cases I need to be aware of.
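A minimal sketch of the numerical point about missing values, assuming a made-up three-point series (the values and variable names are illustrative only and are not taken from the Superset code):

```python
import numpy as np
import pandas as pd

# Hypothetical metric series with one missing observation.
values = pd.Series([10.0, np.nan, 30.0])

# pandas skips NaN by default (skipna=True), so only the two
# observed values contribute to the mean: (10 + 30) / 2 = 20.0
mean_skipping_nan = values.mean()

# NumPy needs the explicit NaN-aware variant; plain np.mean
# on the same array propagates the NaN instead.
nanmean_value = np.nanmean(values.to_numpy())
plain_mean = np.mean(values.to_numpy())

# Filling with 0 treats the gap as a real observation and
# pulls the mean down: (10 + 0 + 30) / 3, roughly 13.33
mean_after_fillna = values.fillna(0).mean()

print(mean_skipping_nan, nanmean_value, plain_mean, mean_after_fillna)
```

The zero-filled mean is lower than the mean over observed values even though no observed value changed, which is the kind of inaccuracy described above.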
For context, here are a few examples where replacing missing values with `0` leads to incorrect results:

**Time-series (current):**
<img width="992" alt="screen shot 2018-04-28 at 5 14 28 pm" src="https://user-images.githubusercontent.com/4567245/39402261-4555abc0-4b0f-11e8-85cc-0cfeae44a2f0.png">

**Time-series (proposed):**
<img width="1011" alt="screen shot 2018-04-28 at 5 14 56 pm" src="https://user-images.githubusercontent.com/4567245/39402260-4530e0ce-4b0f-11e8-86b0-f72024f19749.png">

**Box-plot (current):**
<img width="1017" alt="screen shot 2018-04-28 at 5 09 27 pm" src="https://user-images.githubusercontent.com/4567245/39402263-4589f97a-4b0f-11e8-9989-5a81ba5c723d.png">

**Box-plot (proposed):**
<img width="1009" alt="screen shot 2018-04-28 at 5 09 54 pm" src="https://user-images.githubusercontent.com/4567245/39402262-456f78ac-4b0f-11e8-8b97-4df2f71e8982.png">

Closes https://github.com/apache/incubator-superset/issues/3603

to: @jeffreythewang @mistercrunch @williaster @xrmx
