Luc <ouaga...@gmail.com> added the comment:

Just to make sure we are focused on the issue: the reported bug is in the statistics library (not in numpy). It occurs when the data contain at least one missing value (NaN) and affects the computation of median, median_low and median_high. The test was performed on Python 3.6.4.
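
As a rough illustration (not part of the original report), the sketch below shows one way to guard against this by dropping NaNs before taking the median. The helper name median_ignoring_nan is hypothetical and is not part of the statistics module. The comparison printed first hints at why a NaN can slip through unnoticed: ordering comparisons involving NaN are always False, so sorting cannot place it meaningfully.

```python
import math
import statistics as stats


def median_ignoring_nan(values):
    """Hypothetical helper: drop NaN entries, then compute the median."""
    cleaned = [
        v for v in values
        if not (isinstance(v, float) and math.isnan(v))
    ]
    return stats.median(cleaned)


nan = float("nan")

# Ordering comparisons with NaN are always False.
print(nan < 1.0, nan > 1.0)  # False False

data = [75, 90, 85, 92, 95, 80, nan]
print(median_ignoring_nan(data))  # 87.5
```

Whether NaNs should be dropped, imputed, or propagated as nan is a design choice; this sketch only covers the first option.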
When there are no missing values (NaNs) in the data, median, median_high and median_low from the statistics library work fine. So yes, removing the NaNs (or imputing them) before computing the medians resolves the issue. Also, just as statistics.mean(data) returns nan when the data have missing values, median, median_high and median_low should behave the same way.

>>> import numpy as np
>>> import statistics as stats
>>> data = [75, 90, 85, 92, 95, 80, np.nan]
>>> Median = stats.median(data)
>>> Median_high = stats.median_high(data)
>>> Median_low = stats.median_low(data)
>>> print("The incorrect Median is", Median)
The incorrect Median is 90
>>> print("The incorrect median high is", Median_high)
The incorrect median high is 90
>>> print("The incorrect median low is", Median_low)
The incorrect median low is 90
>>> # Mean returns nan
>>> Mean = stats.mean(data)
>>> print("The mean is", Mean)
The mean is nan

Now, when we drop the missing values, we have:

>>> data2 = [75, 90, 85, 92, 95, 80]
>>> stats.median(data2)
87.5
>>> stats.median_high(data2)
90
>>> stats.median_low(data2)
85

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33084>
_______________________________________