Thank you very much! You certainly gave me plenty to think and read about. I had not known about functools or collections.
On Sun, Oct 25, 2009 at 7:09 AM, Rich Lovely <[email protected]> wrote: > 2009/10/25 Isaac <[email protected]>: >> Hello all, >> I wrote a function to organize a list of sorted objects ( django blog >> entries ); each object has a Python datetime attribute ( called >> pub_date ). I am posting a message to the tutor mailing list to get >> feedback regarding better ways to accomplish the same task. I have a >> feeling I could use recursion here, instead of looping through the >> entire _query list 3 times. Here is my code: >> >> def create_archive_list(_query): >> """ >> given a sorted queryset, return a list in the format: >> >> archive_list = [{'2009': [{'December': ['entry1', 'entry2']}, >> {'January': ['entry3', 'entry4']} >> ] >> }, >> >> {'2008': [{'January': ['entry5', 'entry6']}, >> ] >> } >> ] >> """ >> >> archive_list = [] >> tmp_year_list = [] >> >> # make a list of Python dictionaries; the keys are the year, as in >> '2009', and the value >> # is an empty list. >> for item in _query: >> if item.pub_date.year not in tmp_year_list: >> tmp_year_list.append(item.pub_date.year) >> tmp_dict = {} >> tmp_dict[item.pub_date.year] = [] >> archive_list.append(tmp_dict) >> else: >> pass >> >> # for every list in the archive_list dictionaries, append a >> dictionary with the >> # blog entry's month name as a key, and the value being an empty list >> for entry in _query: >> for item in archive_list: >> _year = entry.pub_date.year >> if _year in item: >> _tmp_month = entry.pub_date.strftime("%B") >> # make a dictionary with the month name as a key and >> an empty list as a value >> tmp_dict = {} >> tmp_dict[_tmp_month] = [] >> >> if tmp_dict not in item[_year]: >> item[_year].append(tmp_dict) >> >> # append the blog entry object to the list if the pub_date month >> and year match >> # the dictionary keys in the archive_list dictionary/list tree. >> for entry in _query: >> # dict in list >> for item in archive_list: >> # year of entry >> _year = entry.pub_date.year >> if _year in item: >> _year_list = item[_year] >> for _dict_month in _year_list: >> _tmp_month = entry.pub_date.strftime("%B") >> if _tmp_month in _dict_month: >> _dict_month[_tmp_month].append(entry) >> >> return archive_list >> _______________________________________________ >> Tutor maillist - [email protected] >> To unsubscribe or change subscription options: >> http://mail.python.org/mailman/listinfo/tutor >> > > Instant first comment, after just a glance at your code, Can't you use > an int for the year and months, rather than strings? (This is a > principle of the MVC (Model, View, Controller) structure: you keep > your model (The actual data) as easy to handle as possible, and using > ints as keys makes conversion between the query and the output and > sorting easier. Conversion from {2009:{12: ...}} to "December 2009: > ..." should be handled by the view: either the GUI, or immediatly > before it's displayed. (This may be a point of contention, it could > be argued that the view should only display it, and all manipulation > should be handled by the controller level.) > > This isn't a case for recursion: recursion is best used when you have > one thing containing a container, which contains a container, which > contains another, which contains another, etc, to (usually) unknown > depths. (Yes, there are numeric uses for recursion, but they are > usually best replaced with an iterative approach). > > It would be more pythonic to format your output as a dict of dicts (of > lists), rather than lists of single-item dicts of lists of single-item > dicts: > archive_list = \ > {2009: {12: ['entry1', 'entry2'], > 1: ['entry3', 'entry4']} > }, > '2008': {1: ['entry5', 'entry6']} > } > > You loose the ordering, but that isn't a problem when your keys have > implicit ordering. > > Look up collections.defaultdict. To generate this structure, you can > use dict as the factory function with defaultdict: > > d = collections.defaultdict(dict) > > or, more usefully, > d = collections.defaultdict(functools.partial(collections.defaultdict, list)) > This will give a defaultdict of defaultdicts of lists. (The > functools.partial just returns a callable, which is what defaultdict > needs to create its default value. > > Then, as you iterate through the report, you simply assign to the dict > as follows: > > for entry in month_record: > d[year][month].append(entry) > > It avoids all the messing around with tmp_year_list and so on. > > Another way, using only builtins, is to use dict.setdefault(): > > d = {} > d.setdefault(year, {}).setdefault(month, []) > > This is essentially what the defaultdict does for you. > > Doing it like this will give you the advantage of being easy to read: > I'm struggling to follow what your code does at the moment: you've > got far too many levels of nesting and temporary variables. > > I don't know the structure of your query result, but using pseudo-functions: > > import collections, functools > > def create_archive_list(query): > d = collections.defaultdict(functools.partial(collections.defaultdict, > list)) > for entry in query: #iterate over blog posts, rather than years, > months etc - therefore using a single pass > d[entry.year()][entry.month()].append(entry.data()) > return d > > And finally, a little utility function, which DOES use recursion: > > def mapToDict(d): > """"recursively convert defaultdicts into regular dicts""" > d = dict(d) > d.update((k, map_to_dict(v)) for k,v in d.items() if isinstance(v, > collections.defaultdict)) > return d > > > This function is more for personal comfort than regular use (although > it does clean up debugging output): it's poor programming if code > that works with a regular dict fails if given a defaultdict. If you > DO discover any, I highly recommend reporting it as a bug. > > Like I said, I've not seen your data structure, so I'm making some big > guesses here, but this should give you enough to look into towards > improving your code. > > -- > Rich "Roadie Rich" Lovely > > There are 10 types of people in the world: those who know binary, > those who do not, and those who are off by one. > _______________________________________________ Tutor maillist - [email protected] To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
