Re: [Tutor] Help with iterators

Mitya Sirenef Thu, 21 Mar 2013 18:41:36 -0700

On 03/21/2013 08:39 PM, Matthew Johnson wrote:

> Dear list,
>
> I have been trying to work out how to use iterators and in
> particular groupby statements. I am, however, quite lost.
>
> I wish to subset the below list, selecting the observations that have
> an ID ('realtime_start') value that is no greater than some date (I've
> used the variable name maxDate), and in the case that there is more
> than one such record for a given 'date', returning only the one that
> has the largest ID ('realtime_start').
>
> The code below does the job, however I have the impression that it
> might be done in a more Pythonic way using iterators and groupby
> statements.
>
> Could someone please help me understand how to go from this code to
> the Pythonic idiom?
>
> Thanks in advance,
>
> Matt Johnson
>
> _________________
>
> ## Code example
>
> import pprint
>
> obs = [{'date': '2012-09-01',
> 'realtime_end': '2013-02-18',
> 'realtime_start': '2012-10-15',
> 'value': '231.951'},
> {'date': '2012-09-01',
> 'realtime_end': '2013-02-18',
> 'realtime_start': '2012-11-15',
> 'value': '231.881'},
> {'date': '2012-10-01',
> 'realtime_end': '2013-02-18',
> 'realtime_start': '2012-11-15',
> 'value': '231.751'},
> {'date': '2012-10-01',
> 'realtime_end': '9999-12-31',
> 'realtime_start': '2012-12-19',
> 'value': '231.623'},
> {'date': '2013-02-01',
> 'realtime_end': '9999-12-31',
> 'realtime_start': '2013-03-21',
> 'value': '231.157'},
> {'date': '2012-11-01',
> 'realtime_end': '2013-02-18',
> 'realtime_start': '2012-12-14',
> 'value': '231.025'},
> {'date': '2012-11-01',
> 'realtime_end': '9999-12-31',
> 'realtime_start': '2013-01-19',
> 'value': '231.071'},
> {'date': '2012-12-01',
> 'realtime_end': '2013-02-18',
> 'realtime_start': '2013-01-16',
> 'value': '230.979'},
> {'date': '2012-12-01',
> 'realtime_end': '9999-12-31',
> 'realtime_start': '2013-02-19',
> 'value': '231.137'},
> {'date': '2012-12-01',
> 'realtime_end': '9999-12-31',
> 'realtime_start': '2013-03-19',
> 'value': '231.197'},
> {'date': '2013-01-01',
> 'realtime_end': '9999-12-31',
> 'realtime_start': '2013-02-21',
> 'value': '231.198'},
> {'date': '2013-01-01',
> 'realtime_end': '9999-12-31',
> 'realtime_start': '2013-03-21',
> 'value': '231.222'}]
>
> maxDate = "2013-03-21"
>
> dobs = dict([(d, []) for d in set([e['date'] for e in obs])])
>
> for o in obs:
>     dobs[o['date']].append(o)
>
> dobs_subMax = dict([(k, [d for d in v if d['realtime_start'] <= maxDate])
> for k, v in dobs.items()])
>
> rts = lambda x: x['realtime_start']
>
> mmax = [sorted(e, key=rts)[-1] for e in dobs_subMax.values() if e]
>
> mmax.sort(key = lambda x: x['date'])
>
> pprint.pprint(mmax)
> _______________________________________________
> Tutor maillist - Tutor@python.org
> To unsubscribe or change subscription options:
> http://mail.python.org/mailman/listinfo/tutor
>



You can do it with groupby like so:


import pprint
from itertools import groupby
from operator import itemgetter


maxDate = "2013-03-21"
mmax    = list()

obs.sort(key=itemgetter('date'))

for k, group in groupby(obs, key=itemgetter('date')):
    group = [dob for dob in group if dob['realtime_start'] <= maxDate]
    if group:
        group.sort(key=itemgetter('realtime_start'))
        mmax.append(group[-1])

pprint.pprint(mmax)
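
One detail worth spelling out: groupby only merges consecutive items that
share a key, which is why obs is sorted by 'date' before grouping. Here is
a minimal sketch of that behaviour. The rows list and its 'n' field are
made up for illustration and are not the data from this thread.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical toy records: 'a' appears twice, but not next to each other.
rows = [{'date': 'a', 'n': 1}, {'date': 'b', 'n': 2}, {'date': 'a', 'n': 3}]

by_date = itemgetter('date')

# Without sorting, the two 'a' records end up in separate groups.
unsorted_keys = [k for k, _ in groupby(rows, key=by_date)]

# After sorting by the same key, each date forms exactly one group.
sorted_keys = [k for k, _ in groupby(sorted(rows, key=by_date), key=by_date)]

print(unsorted_keys)  # ['a', 'b', 'a']
print(sorted_keys)    # ['a', 'b']
```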


Note that deeply nested comprehensions like the ones you wrote make code
hard to read. Do you find this version more readable?
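
If it helps, the same idea could also be wrapped in a function, with max()
standing in for sort-and-take-last. This is only a sketch: the name
latest_per_date is my own invention, the sample is three records cut down
from your list, and max() returns the first of several equal maxima where
the sort version returns the last, so the two can differ on ties.

```python
from itertools import groupby
from operator import itemgetter

def latest_per_date(records, max_date):
    """For each 'date', keep the record with the largest
    'realtime_start' that is <= max_date (hypothetical helper)."""
    by_date = itemgetter('date')
    result = []
    # sorted() leaves the caller's list untouched, unlike list.sort().
    for _, group in groupby(sorted(records, key=by_date), key=by_date):
        # ISO 'YYYY-MM-DD' strings compare in chronological order.
        eligible = [r for r in group if r['realtime_start'] <= max_date]
        if eligible:
            result.append(max(eligible, key=itemgetter('realtime_start')))
    return result

# A cut-down sample taken from the list in the original message.
sample = [
    {'date': '2012-09-01', 'realtime_start': '2012-10-15', 'value': '231.951'},
    {'date': '2012-09-01', 'realtime_start': '2012-11-15', 'value': '231.881'},
    {'date': '2013-02-01', 'realtime_start': '2013-03-21', 'value': '231.157'},
]

picked = latest_per_date(sample, '2012-12-31')
print(picked)
```

With that cutoff, only the 2012-09-01 group has eligible records, so a
single record (the one with value '231.881') comes back.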

 -m


--
Lark's Tongue Guide to Python: http://lightbird.net/larks/

Many a man fails as an original thinker simply because his memory is too
good.  Friedrich Nietzsche

