On 14 November 2012 03:17, David Martins <awesome.me...@outlook.com> wrote:
> Hi All
>
> I'm trying to use python for analysing data from building energy simulations
> and was wondering whether there is way to do this without using anything sql
> like.

There are many ways to do this.

>
> The simulations are typically run for a full year, every hour, i.e. there
> are 8760 rows and about 100+ variables such as external air temperature,
> internal air temperature, humidity, heating load, ... making roughly a
> million data points. I've got the data in a csv file and also managed to
> write it in a sqlite db.

This dataset is not so big that you can't just load it all into memory.

>
> I would like to make requests like the following:
>
> Show the number of hours the aircon is running at 10%, 20%, ..., 100%
> Show me the average, min, max air temperature, humidity, solar gains,....
> when the aircon is running at 10%, 20%,...,100%
>
> Eventually I'd also like to generate an automated html or pdf report with
> graphs. Creating graphs is actually somewhat essential.

Do you mean graphs or plots? I would use matplotlib for plotting. It can
automatically generate image files of plots. There are also ways to generate
output for visualising graphs but I guess that's not what you mean. Probably I
would create a pdf report using latex and matplotlib but that's not the only
way.

http://en.wikipedia.org/wiki/Graph_(mathematics)
http://en.wikipedia.org/wiki/Plot_(graphics)

> I tried sql and find it horrible, error prone, too much to write, the logic
> somehow seems to work different than my brain and I couldn't find
> particulary good documentation (particulary the documentation of the api is
> terrible, in my humble opinion). I heard about zope db which might be an
> alternative. Would you mind pointing me towards an appropriate way to solve
> my problem? Is there a way for me to avoid having to learn sql or am I
> doomed?

There are many ways to avoid learning SQL. I'll suggest the simplest one: Can
you not just read all the data into memory and then perform the computations
you want?

For example:

$ cat tmp.csv
Temp,Humidity
23,85
25,87
26,89
23,90
24,81
24,80
$ cat tmp.py
#!/usr/bin/env python

import csv

# 'rb' is what the csv module expects on Python 2;
# on Python 3 use open('tmp.csv', newline='') instead.
with open('tmp.csv', 'rb') as f:
    reader = csv.DictReader(f)
    data = []
    for row in reader:
        # Convert every field from a string to a float
        row = dict((k, float(v)) for k, v in row.items())
        data.append(row)

maxtemp = max(row['Temp'] for row in data)
mintemp = min(row['Temp'] for row in data)
meanhumidity = sum(row['Humidity'] for row in data) / len(data)

print('max temp is: %d' % maxtemp)
print('min temp is: %d' % mintemp)
print('mean humidity is: %f' % meanhumidity)
$ ./tmp.py
max temp is: 26
min temp is: 23
mean humidity is: 85.333333

This approach can also be extended to the case where you don't read all the
data into memory.

Oscar
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
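
A rough sketch of the last point in the reply (not reading all the data into
memory), applied to the "hours at 10%, 20%, ..., 100%" request from the
original question. It assumes Python 3. The column names 'Aircon' and 'Temp',
the idea that aircon load is stored as a percentage, and the function name
summarise_by_load are all made up for illustration; they are not taken from
the real simulation output. Only running totals per load bin are kept, so each
row can be discarded as soon as it has been read:

```python
import csv
import io
import math
from collections import defaultdict


def summarise_by_load(lines, load_col='Aircon', value_col='Temp', step=10):
    """Read rows one at a time, keeping only running totals per load bin.

    load_col and value_col are hypothetical column names; change them to
    match the headers in the real csv file.
    """
    stats = defaultdict(lambda: {'hours': 0, 'total': 0.0,
                                 'min': float('inf'), 'max': float('-inf')})
    for row in csv.DictReader(lines):
        load = float(row[load_col])
        if load <= 0:
            continue  # aircon is off during this hour
        # Round up to the next bin edge: (0, 10] -> 10, ..., (90, 100] -> 100
        bin_edge = int(math.ceil(load / step)) * step
        value = float(row[value_col])
        s = stats[bin_edge]
        s['hours'] += 1
        s['total'] += value
        s['min'] = min(s['min'], value)
        s['max'] = max(s['max'], value)
    return dict(stats)


# Invented sample data standing in for the real file.
sample = io.StringIO("Aircon,Temp\n0,18\n5,23\n10,25\n35,26\n40,24\n100,30\n")
result = summarise_by_load(sample)
for bin_edge in sorted(result):
    s = result[bin_edge]
    print('%3d%%: %d hours, temp min %.1f, max %.1f, mean %.2f'
          % (bin_edge, s['hours'], s['min'], s['max'], s['total'] / s['hours']))
```

To run this on a real file, pass the open file object instead of the sample,
e.g. with open('data.csv', newline='') as f: summarise_by_load(f). The same
pattern should work for humidity or any other column by changing value_col.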