On Sun, May 5, 2019 at 1:08 PM Luca Baldini <luca.bald...@pi.infn.it> wrote:
>
> Hi here,
> I wonder if the idea of adding to the statistics module a class to
> calculate the running statistics (average and standard deviation) of a
> generic input data stream has ever come up in the past.
>
> The basic idea is to do the necessary book-keeping as the data are fed
> into the accumulator class and to be able to query the average and variance
> of the sequence at any point in time without having to loop over the
> thing again. The obvious way to do that is well known, and described,
> e.g., in Knuth TAOCP vol 2, 3rd edition, page 232. FWIW, it is something
> that through the years I have coded myself a myriad of times (e.g., for
> real-time data processing), and maybe worth considering for addition to
> the standard library.
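For reference, here is a minimal sketch of the kind of accumulator described above. It uses the single-pass recurrence usually attributed to Welford, which I take to be what the TAOCP reference points at. The class name RunningStats and its push/mean/variance/stdev interface are hypothetical; nothing like this exists in the statistics module today.

```python
import math


class RunningStats:
    """Hypothetical running mean/variance accumulator (Welford-style update)."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def push(self, x):
        """Fold one new value into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the already-updated mean, which keeps the update numerically stable.
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance (n - 1 denominator) of the values seen so far."""
        if self.n < 2:
            raise ValueError("variance requires at least two data points")
        return self._m2 / (self.n - 1)

    @property
    def stdev(self):
        """Sample standard deviation of the values seen so far."""
        return math.sqrt(self.variance)
```

Each push() is O(1) and nothing but three numbers is stored, so the statistics can be queried at any point without looping over the data again.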
Personally, I would definitely use this in a number of places in the
real-life code I contribute to.

The problem I have with this idea is that it's not clear how the accumulator
class should store its data. What about code that runs in different
contexts, such as asyncio and/or multithreaded code? I would say it could be
useful to let user code pass in a storage implementation, which would
address almost any possible scenario. In that case, the accumulator doesn't
need to be a class at all, nor bother with any intermediate storage. It
could be a number of module-level functions providing an effective
implementation of the algorithm for users to build on.

_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
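To make the module-level-function idea concrete, here is a rough sketch under my own assumptions: the names State, update and variance are hypothetical, and the state is an immutable tuple that the caller owns and stores wherever it likes. The update rule is the usual single-pass (Welford-style) recurrence.

```python
import contextvars
from typing import NamedTuple


class State(NamedTuple):
    """Hypothetical caller-owned state: count, mean, sum of squared deviations."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0


def update(state, x):
    """Return a new State with x folded in; the input state is not modified."""
    n = state.n + 1
    delta = x - state.mean
    mean = state.mean + delta / n
    m2 = state.m2 + delta * (x - mean)
    return State(n, mean, m2)


def variance(state):
    """Sample variance (n - 1 denominator) for the given state."""
    if state.n < 2:
        raise ValueError("variance requires at least two data points")
    return state.m2 / (state.n - 1)


# The caller decides where the state lives. Here it is kept in a ContextVar,
# so each asyncio task or thread context sees its own running statistics.
stats_var = contextvars.ContextVar("stats_var", default=State())

for sample in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats_var.set(update(stats_var.get(), sample))
```

Because the functions hold no state of their own, the same code works whether the State is kept in a local variable, a ContextVar, thread-local storage, or something external.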