On Aug 1, 9:21 pm, goldtech <[EMAIL PROTECTED]> wrote: > In Python 2.1 are there any tools to take a column from a DB and do a > frequency analysis - a breakdown of the values for this column? > > Possibly a histogram or a table saying out of 500 records I have one > hundred and two "301" ninety-eight "212" values and three-hundred > "410"? > > Is SQL the way to for this? > > Of course there'd be 1000's of values.... > > I'm not asking about connecting to the DB, just tools to collate and > tally this sort of thing. > > Thanks, > Lee G.
If you use ORDER BY in your SQL to give you sorted data, you can easily feed it to the itertools.groupby method, following the one-line pattern in this example: from itertools import groupby from random import choice # create some random data and sort it cards = sorted([choice("ABCDEF") for i in range(100)]) # create histogram using groupby on sorted data # histogram is represented as a list of pair tuples, # containing (freq,value) cardGrps = [(len(list(g[1])),g[0]) for g in groupby(cards)] # sort histogram by descending freq to create Pareto cardGrps.sort(reverse=True) print cardGrps Prints: [(22, 'C'), (19, 'B'), (16, 'F'), (16, 'E'), (14, 'A'), (13, 'D')] -- Paul -- http://mail.python.org/mailman/listinfo/python-list