[issue26002] make statistics.median_grouped more efficient

Upendra Kumar Sun, 03 Jan 2016 06:22:45 -0800

New submission from Upendra Kumar:

statistics.median_grouped currently uses cf=data.index(x) to find the leftmost 
position of x in data ( line number 445 ). Here, data.index() has linear time 
complexity, which may not be good for long lists.


But, here since the list 'data' is sorted beforehand, we can use binary search 
( bisect_left() ) to find the leftmost occurrence of 'x' in 'data'. 

Similarly, for counting number of occurrence of 'x' in data after sorting, we 
can find the position of rightmost occurrence of 'x' in data[l1....len(data)], 
where l1 is position of leftmost occurrence of 'x' (line number 447 )

----------
components: Library (Lib)
files: median_grouped.patch
keywords: patch
messages: 257419
nosy: upendra-k14
priority: normal
severity: normal
status: open
title: make statistics.median_grouped more efficient
type: performance
versions: Python 3.4, Python 3.5, Python 3.6
Added file: http://bugs.python.org/file41484/median_grouped.patch

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue26002>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue26002] make statistics.median_grouped more efficient

Reply via email to