On Fri, Jul 1, 2011 at 11:14 AM, Hackett, John (Norcross, GA)
wrote:
> After some experimentation (and judicious peeking at the source code), I
> think I’ve got the hang of writing custom functions to pass into these
> modules – basically, anything that accepts a list of values sliced from a
> single column on the structured array and returns a single list seems to
> work well. In functional programming terms, rec_summarize appears similar to
> “map”, rec_groupby appears similar to “reduce”.
>
> Now – what if I want to derive a calculation from multiple statistics in the
> original dataset – e.g., create a new column on the array which is derived
> from 2 (or up to n) other fields in a custom function which I pass into the
> process?
>
> For example, conditional counts/summaries (count transactions and sum the
> sales on all orders that weighed > 5K lbs).
>
> Is there a way to do this within numpy or mlab without going all the way out
> to python and creating a list comprehension?
There are a couple of ways with the existing functions.
One is to use a logical mask::

    mask = r.weight > 5
    rg = mlab.rec_groupby(r[mask], groupby, stats)

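If it helps to see the masking step on its own, here is a minimal sketch in plain numpy, which also works where mlab's rec_* helpers are not available. The data and the field names 'customer', 'weight', and 'sales' are made up for illustration, and I am assuming weight is stored in lbs:

```python
import numpy as np

# Made-up example data; the field names 'customer', 'weight', and 'sales'
# are assumptions for illustration, not taken from the original dataset.
r = np.rec.fromrecords(
    [("acme", 6200.0, 100.0),
     ("acme", 1200.0, 40.0),
     ("bolt", 7500.0, 250.0),
     ("bolt", 9100.0, 300.0)],
    names=["customer", "weight", "sales"],
)

# Boolean mask selecting the orders heavier than 5000 lbs.
mask = r.weight > 5000
heavy_orders = r[mask]

# Conditional count and conditional sum over the masked rows.
n_heavy = int(mask.sum())
heavy_sales = float(heavy_orders.sales.sum())
print(n_heavy, heavy_sales)
```

The masked array r[mask] is still a record array, so it can be passed on to any function that expects one.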
You could also create a new categorical variable with one or more
values, attach it to your record array, and then use rec_groupby::

    heavy = np.where(r.weight > 5, 1, 0)

Add that to your record array::

    r = mlab.rec_append_fields(r, ['heavy'], [heavy])

and then call rec_groupby using 'heavy' as your group-by attribute.
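As a rough illustration of what grouping on such a flag computes, here is a sketch that does the group-by by hand with np.unique and np.bincount instead of rec_groupby. The arrays are made-up stand-ins for the 'weight' and 'sales' columns, and the 5000 lbs threshold is an assumption:

```python
import numpy as np

# Made-up stand-ins for two columns of the record array.
weight = np.array([6200.0, 1200.0, 7500.0, 9100.0])
sales = np.array([100.0, 40.0, 250.0, 300.0])

# Categorical flag: 1 for orders over 5000 lbs, 0 otherwise.
heavy = np.where(weight > 5000, 1, 0)

# Map each row to the index of its group label, then aggregate per group.
labels, inverse = np.unique(heavy, return_inverse=True)
counts = np.bincount(inverse)                # number of orders per group
sums = np.bincount(inverse, weights=sales)   # total sales per group

# summary maps flag value -> (order count, total sales)
summary = {int(k): (int(c), float(s)) for k, c, s in zip(labels, counts, sums)}
print(summary)
```

The condition inside np.where can combine several columns with & and |, which is one way to derive the flag from two or more fields.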
Brian Schwartz has a preliminary implementation of rec_query, which
lets you run a SQL query on a record array by converting it to a
SQLite table, running the query, and returning the results as a
new record array. That would solve your problem more cleanly and
generically. The code needs a little more polishing, but perhaps
Brian, you can send over what you have in case John wants to take a
look.
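To give a feel for the idea, here is a hedged sketch of the same round trip using the standard-library sqlite3 module. This is not Brian's rec_query code; the table name 'orders', the column names, and the data are all made up for illustration:

```python
import sqlite3

import numpy as np

# Made-up example data; field names are assumptions for illustration.
r = np.rec.fromrecords(
    [("acme", 6200.0, 100.0),
     ("acme", 1200.0, 40.0),
     ("bolt", 7500.0, 250.0),
     ("bolt", 9100.0, 300.0)],
    names=["customer", "weight", "sales"],
)

# Copy the record array into an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, weight REAL, sales REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", r.tolist())

# Conditional count and sum per customer, expressed in SQL.
rows = conn.execute(
    "SELECT customer, COUNT(*) AS n_orders, SUM(sales) AS total_sales "
    "FROM orders WHERE weight > 5000 "
    "GROUP BY customer ORDER BY customer"
).fetchall()
conn.close()

# Turn the query result back into a record array.
result = np.rec.fromrecords(rows, names=["customer", "n_orders", "total_sales"])
print(result)
```

The table schema is written out by hand here; a general version would have to derive it from the array's dtype.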
JDH
___
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users