On Aug 2, 12:21 pm, goldtech <[EMAIL PROTECTED]> wrote: > In Python 2.1 are there any tools to take a column from a DB and do a > frequency analysis - a breakdown of the values for this column? > > Possibly a histogram or a table saying out of 500 records I have one > hundred and two "301" ninety-eight "212" values and three-hundred > "410"? > > Is SQL the way to for this?
Umm yes, screamingly obviously, I would have thought, given that you have to use SQL anyway to get the data out of the database .... something like this, with easy embellishment with where, having and order by clauses: select col_name, count(*) from table_name where yadda yadda etc if necessary/desired group by col_name having count(*) >= some_threshold order by count(*) desc Apart from ease of writing, this puts the load on the server, not the client, and saves network traffic -- why transmit thousands/millions of data items around the building or across town when you don't have to? All you have to do in Python is add up the frequencies -- better than doing a "select count(*) from table_name". If you want to do something smart in Python, package the above SQL up so that table_name and col_name are parameters ... you could then have a handy little command-line gadget ready for those moments when you need to find out things like how many different mis-spellings of "developer" there are in the HR database :-) > > Of course there'd be 1000's of values.... > > I'm not asking about connecting to the DB, just tools to collate and > tally this sort of thing. > > Thanks, > Lee G. -- http://mail.python.org/mailman/listinfo/python-list