On Thursday 14 May 2009 16:26:32 David Fokkema wrote:
> Hi list,
>
> For the last few weeks, I've been happily exploring (parts of) my data
> using pytables. Now, however, I've run into a problem.
>
> I have an experimental setup which reads out photomultiplier tubes and,
> after several trigger conditions and for the sake of data reduction,
> clips events to only include parts which went over a certain threshold
> (with pre- and post parts). Needless to say, these result in variable
> length arrays, which I'd like to store alongside my calculated pulse
> heights, integrals and whatever. Now, I'd like to do this:
>
> class Event(tables.IsDescription):
>     event_id = tables.UInt64Col(pos=0)
>     timestamp = tables.Time32Col(pos=1)
>     nanoseconds = tables.UInt32Col(pos=2)
>     ext_timestamp = tables.UInt64Col(pos=3)
>     pulseheights = tables.Int16Col(shape=4, dflt=-9999, pos=4)
>     integrals = tables.Int32Col(shape=4, dflt=-9999, pos=5)
>     baselines = tables.Float64Col(shape=4, dflt=-9999, pos=6)
>     baseline_std_devs = tables.Float64Col(shape=4, dflt=-9999, pos=7)
>     num_peaks = tables.Int16Col(shape=4, dflt=-9999, pos=8)
>     traces = tables.ArrayCol(shape=4)
>
> Obviously, the last line is not supported. My question is how to solve
> this as best as possible. Of course, I can create a stand-alone array,
> but then it would be hard to really make sure a certain event and trace
> go together. I could use a StringCol and pickle/unpickle my arrays, but
> that doesn't work because I need to specify an itemsize. Or not? Any
> thoughts appreciated!

This is typically handled by creating a separate VLArray to hold your 
variable-length arrays, and then declaring a field in the table that stores 
a reference to it (i.e. the row number of the corresponding VLArray entry).

As an example, let's create a VLArray:

In [43]: vla = f.createVLArray(f.root, 'vla', tb.Int32Atom())

and let's populate it with several rows:

In [45]: vla.append([5, 6, 9, 8])

In [46]: vla.append([5, 6])

In [47]: vla.append([1, 2, 4])

Now, let's imagine that the row references are kept in the 'traces' field of 
your table 't'.  Then, you can do things like:

In [54]: [vla[r['traces']] for r in t.where('(nanoseconds < 3) & (timestamp > 3200000)')]
Out[54]:
[array([5, 6, 9, 8], dtype=int32),
 array([5, 6], dtype=int32),
 array([1, 2, 4], dtype=int32)]

That's a very simple example, but I hope you've got the idea.
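
For completeness, here is a minimal end-to-end sketch of the same idea, showing how the table rows and the VLArray rows can be kept in step while writing. It is only an illustration under some assumptions: it uses the PyTables 3.x method names (open_file, create_table, create_vlarray) rather than the camelCase ones shown above, the file is kept in memory, the table is cut down to three columns, and the field name 'trace_idx' and the sample data are made up for the example.

```python
import tables as tb


class Event(tb.IsDescription):
    event_id = tb.UInt64Col(pos=0)
    nanoseconds = tb.UInt32Col(pos=1)
    # Hypothetical field: row number of this event's trace in the VLArray.
    trace_idx = tb.Int64Col(pos=2)


# Sample variable-length traces (made-up data, one per event).
data = [[5, 6, 9, 8], [5, 6], [1, 2, 4]]

# In-memory HDF5 file: nothing is written to disk.
with tb.open_file("events_demo.h5", mode="w", driver="H5FD_CORE",
                  driver_core_backing_store=0) as f:
    table = f.create_table(f.root, "events", Event)
    traces = f.create_vlarray(f.root, "traces", tb.Int32Atom())

    row = table.row
    for event_id, trace in enumerate(data):
        # Append the trace first, then record its row number in the table,
        # so an event and its trace are always written together.
        traces.append(trace)
        row["event_id"] = event_id
        row["nanoseconds"] = event_id  # dummy value used for the query below
        row["trace_idx"] = traces.nrows - 1
        row.append()
    table.flush()

    # Query the table, then follow the stored reference into the VLArray.
    selected = [traces[r["trace_idx"]].tolist()
                for r in table.where("nanoseconds < 2")]
```

Writing the trace and its reference in the same loop iteration is what keeps an event and its trace together; the table only ever stores an integer, so the fixed-size row layout is not affected by the trace length.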

-- 
Francesc Alted

"One would expect people to feel threatened by the 'giant
brains or machines that think'.  In fact, the frightening
computer becomes less frightening if it is used only to
simulate a familiar noncomputer."

-- Edsger W. Dijkstra

