On Thursday 14 May 2009 16:26:32 David Fokkema wrote:
> Hi list,
>
> For the last few weeks, I've been happily exploring (parts of) my data
> using pytables. Now, however, I've run into a problem.
>
> I have an experimental setup which reads out photomultiplier tubes and,
> after several trigger conditions and for the sake of data reduction,
> clips events to only include parts which went over a certain threshold
> (with pre- and post parts). Needless to say, these result in variable
> length arrays, which I'd like to store alongside my calculated pulse
> heights, integrals and whatever. Now, I'd like to do this:
>
> class Event(tables.IsDescription):
>     event_id = tables.UInt64Col(pos=0)
>     timestamp = tables.Time32Col(pos=1)
>     nanoseconds = tables.UInt32Col(pos=2)
>     ext_timestamp = tables.UInt64Col(pos=3)
>     pulseheights = tables.Int16Col(shape=4, dflt=-9999, pos=4)
>     integrals = tables.Int32Col(shape=4, dflt=-9999, pos=5)
>     baselines = tables.Float64Col(shape=4, dflt=-9999, pos=6)
>     baseline_std_devs = tables.Float64Col(shape=4, dflt=-9999, pos=7)
>     num_peaks = tables.Int16Col(shape=4, dflt=-9999, pos=8)
>     traces = tables.ArrayCol(shape=4)
>
> Obviously, the last line is not supported. My question is how to solve
> this as best as possible. Of course, I can create a stand-alone array,
> but then it would be hard to really make sure a certain event and trace
> go together. I could use a StringCol and pickle/unpickle my arrays, but
> that doesn't work because I need to specify an itemsize. Or not? Any
> thoughts appreciated!
This is typically handled by building a separate VLArray to hold your
variable-length arrays, and then declaring a field in the table that
stores references into it (i.e. the row number in the VLArray).

As an example, let's create a VLArray:

In [43]: vla = f.createVLArray(f.root, 'vla', tb.Int32Atom())

and let's populate it with several rows:

In [45]: vla.append([5, 6, 9, 8])
In [46]: vla.append([5, 6])
In [47]: vla.append([1, 2, 4])

Now, let's imagine that the row references are kept in the 'traces'
field of your table 't'. Then, you can do things like:

In [54]: [vla[r['traces']] for r in t.where('(nanoseconds < 3) & (timestamp > 3200000)')]
Out[54]:
[array([5, 6, 9, 8], dtype=int32),
 array([5, 6], dtype=int32),
 array([1, 2, 4], dtype=int32)]

That's a very simple example, but I hope you get the idea.

-- 
Francesc Alted
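To show how the table and the VLArray can be kept in step, here is a minimal end-to-end sketch of the approach described above. It is an illustration under stated assumptions, not a definitive implementation: it uses the PyTables 3.x method names (open_file, create_table, create_vlarray) rather than the createVLArray spelling in the session above, it trims the Event description to a few columns, and the helper names store_events and select_traces, the node names 'events' and 'traces', and the choice of Int64Col for the reference are all made up for this example.

```python
import tables as tb


class Event(tb.IsDescription):
    # Trimmed-down version of the description in the question.
    event_id = tb.UInt64Col(pos=0)
    timestamp = tb.Time32Col(pos=1)
    nanoseconds = tb.UInt32Col(pos=2)
    # Assumption: the reference is the row number in the companion VLArray.
    traces = tb.Int64Col(pos=3)


def store_events(path, events):
    """Write (event_id, timestamp, nanoseconds, trace) tuples to a new file."""
    with tb.open_file(path, mode="w") as f:
        table = f.create_table(f.root, "events", Event)
        vla = f.create_vlarray(f.root, "traces", tb.Int32Atom())
        row = table.row
        for event_id, timestamp, nanoseconds, trace in events:
            row["event_id"] = event_id
            row["timestamp"] = timestamp
            row["nanoseconds"] = nanoseconds
            # The trace appended next will land at index vla.nrows,
            # so record that index before appending.
            row["traces"] = vla.nrows
            vla.append(trace)
            row.append()
        table.flush()


def select_traces(path, condition):
    """Return the traces of all events matching an in-kernel condition."""
    with tb.open_file(path, mode="r") as f:
        table = f.root.events
        vla = f.root.traces
        # Follow each selected row's reference into the VLArray.
        return [vla[int(r["traces"])] for r in table.where(condition)]
```

Writing the table row and the VLArray row in the same loop iteration is what keeps an event and its trace together; calling select_traces(path, '(nanoseconds < 3) & (timestamp > 3200000)') then mirrors the query in the session above.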