I'm looking for some advice on handling data collection/analysis in Python. I do a lot of big, time-consuming experiments: I run a long data collection (a day or a weekend) in which I sweep a bunch of variables, then come back offline and try to cut the data into something that makes sense.
For example, my last data collection looked (neglecting all the actual equipment control code in each loop) like:

    for t in temperatures:
        for r in voltage_ranges:
            for v in test_voltages[r]:
                for c in channels:
                    for n in range(100):
                        record_data()

I've been using SQLite (through peewee) as the data backend, setting up a couple of tables with a basically hierarchical relationship, and then handling analysis with a rough cut of SQL queries against the original data, NumPy/SciPy for further refinement, and Matplotlib to actually do the visualization. For example, one graph was "How does the slope of a straight-line fit between measured and applied voltage vary as a function of temperature on each channel?"

The whole process feels a bit grindy, like I keep having to do a lot of ad-hoc stitching to glue things together. And I keep hearing about pandas, PyTables, and HDF5. Would those make my life notably easier? If so, does anyone have any references on them that they've found particularly useful? The tutorials I've seen so far don't give much detail on the point of what they're doing; it's all "how you write the code" rather than "why you write the code". Paying money for books is acceptable; this is all on the company's time/dime.

Thanks,
Rob

-- 
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.
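P.S. For concreteness, here's roughly the kind of pandas workflow I'm imagining for that "slope vs. temperature per channel" graph. This is just a sketch: the column names, the synthetic data, and the use of np.polyfit are all my own stand-ins, not anything from my actual rig.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a sweep's worth of records; in practice each
# record_data() call would append one row like these.
rng = np.random.default_rng(0)
rows = []
for t in (20.0, 40.0):                      # temperatures (made up)
    for c in (0, 1):                        # channels (made up)
        for v in np.linspace(0.0, 5.0, 10): # applied test voltages
            # Pretend the gain drifts slightly with temperature.
            measured = v * (1.0 + 1e-4 * t) + rng.normal(0.0, 1e-3)
            rows.append({"temp": t, "channel": c,
                         "applied": v, "measured": measured})

df = pd.DataFrame(rows)

# "Slope of a straight-line fit between measured and applied voltage,
#  as a function of temperature, on each channel":
slopes = df.groupby(["channel", "temp"]).apply(
    lambda g: np.polyfit(g["applied"], g["measured"], 1)[0]
)
print(slopes)
```

The appeal, as I understand it, is that one flat table plus groupby replaces the hierarchical-tables-plus-SQL step entirely, and the result is already shaped for a Matplotlib plot of slope against temperature, one line per channel.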