Hi Kyo,

Thanks for pointing out the bug. Here is the benchmark script again with the shape order fixed.

The reason I chose not to use reshape is that it creates a deep copy of the input array, which is very expensive and actually makes the revised function perform worse than the original when the input array is large. Although the current approach is a bit awkward, it is also much more efficient, since it does not require constructing a copy of the original array.

Also, I have opened up a JIRA issue

Thanks,
Alex

On Thu, Jun 6, 2013 at 10:21 PM, Lee, Kyo (3246-Affiliate) <[email protected]> wrote:

> Alex,
>
> What a great job!
>
> I found a bug.
> It should be
>
> nMonth, nGrdY, nGrdX = data1.shape
>
> I would also suggest
>
> stds = dataset1.reshape([nMonth/12,12,nGrdY,nGrdX]).std(axis=0, ddof=1)
>
> Then we do not have to reset the shape of dataset1.
>
> Kyo
>
> On Jun 6, 2013, at 9:55 PM, "Mattmann, Chris A (398J)" <[email protected]> wrote:
>
> > Alex this is great!
> >
> > Please bring this conversation onto:
> >
> > [email protected]
> >
> > Please file a review here:
> >
> > http://reviews.apache.org/
> >
> > For your patch too and we can discuss on the
> > new thread you start up on [email protected].
> > Further feel free to file JIRA issues at:
> >
> > http://issues.apache.org/jira/browse/CLIMATE
> >
> > Cheers,
> > Chris
> >
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > Chris Mattmann, Ph.D.
> > Senior Computer Scientist
> > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > Office: 171-266B, Mailstop: 171-246
> > Email: [email protected]
> > WWW: http://sunset.usc.edu/~mattmann/
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > Adjunct Assistant Professor, Computer Science Department
> > University of Southern California, Los Angeles, CA 90089 USA
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >
> > -----Original Message-----
> > From: <Goodman>, "Alexander (398J-Affiliate)" <[email protected]>
> > Date: Thursday, June 6, 2013 9:51 PM
> > To: rcmes-dev <[email protected]>
> > Subject: Potential way to improve performance in metrics
> >
> >> Hi all,
> >>
> >> As I mentioned in our meeting this morning, while Kim and Kyo were
> >> cleaning up metrics.py, I noticed that some of the functions had room
> >> for performance improvements. Specifically, this pertains to the
> >> functions which were doing calculations by month, as they currently
> >> use loops for each month. These loops can be eliminated with some
> >> shape manipulation and nearly cut the running time in half in some
> >> cases. I have attached a python script that shows what the changes
> >> would look like for one of the functions as well as provide some
> >> benchmarks comparing the running times for the old and new
> >> methodology for multiple cases. After Kyo's latest changes are
> >> committed, I would like to revise the other functions in metrics.py
> >> in this manner. Let me know your thoughts.
> >>
> >> Thanks,
> >> Alex
> >> --
> >> Alex Goodman
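
For anyone reading this thread without the attached script, here is a minimal sketch of the idea under discussion. It is not the code from metrics.py or from the attachment. The function names (monthly_std_loop, monthly_std_reshape, reshape_shares_memory) and the random test data are made up for illustration, and the input is assumed to be a plain NumPy array of shape (nMonth, nGrdY, nGrdX) with nMonth a multiple of 12. The reshape variant follows Kyo's suggestion above. Since NumPy's reshape returns a view when the memory layout allows it and copies otherwise, the sketch also uses np.shares_memory to check which of the two happens for a given array.

```python
import numpy as np


def monthly_std_loop(data):
    """Loop-based version: one std calculation per calendar month."""
    n_month, n_grd_y, n_grd_x = data.shape
    stds = np.empty((12, n_grd_y, n_grd_x))
    for month in range(12):
        # Every 12th time step starting at `month` is the same calendar month.
        stds[month] = data[month::12].std(axis=0, ddof=1)
    return stds


def monthly_std_reshape(data):
    """Loop-free version: split the time axis into (year, month)."""
    n_month, n_grd_y, n_grd_x = data.shape
    by_year = data.reshape(n_month // 12, 12, n_grd_y, n_grd_x)
    # Reducing over axis 0 (years) leaves one std per calendar month.
    return by_year.std(axis=0, ddof=1)


def reshape_shares_memory(data):
    """Return True if the reshaped array shares memory with `data`."""
    n_month, n_grd_y, n_grd_x = data.shape
    by_year = data.reshape(n_month // 12, 12, n_grd_y, n_grd_x)
    return np.shares_memory(data, by_year)


# Hypothetical test data: 3 years of monthly values on a 4 x 5 grid.
rng = np.random.default_rng(0)
data = rng.standard_normal((36, 4, 5))

loop_result = monthly_std_loop(data)
reshape_result = monthly_std_reshape(data)
results_match = np.allclose(loop_result, reshape_result)
is_view = reshape_shares_memory(data)
```

Running reshape_shares_memory on the actual input used in the benchmark would show whether a copy is being made in that case. Masked arrays and non-contiguous inputs may behave differently from the contiguous array used here.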
