I started off trying to find a way to use numpy with pytables and the
timeseries package in the scipy sandbox.
I have been collaborating with the guys who wrote the timeseries package to
figure out how we can do it.
Here is my latest email to them ...
After last night I have cleared away a little more of my pytables ignorance
...
Datetimes in pytables can be stored in the HDF5 time format, which means
they must be converted to floats (double-precision floats in our case).
Alternatively, we can store datetime objects as strings.
So we have a couple of options.
1) We can try to read/write/append from numpy arrays to pytables EArrays.
Since EArrays are homogeneous, every value of every element in the time
series will need to be a double-precision float (which is required for most
financial data anyway).
2) We can write/append the data as Tables and read it back as numpy arrays,
which gives more flexibility in the data model for time series elements.
This model would also allow storing datetimes as strings. My prototype that
uses this model is a bit more complex, because it requires declaring
pytables IsDescription classes to define the tables, as well as mapper
functions between the table fields and the in-memory objects that are the
elements of the timeseries.
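To make the two options concrete, here is a minimal sketch of both. It
assumes a recent PyTables release with the snake_case API (open_file,
create_earray, create_table); older releases spell these openFile,
createEArray and createTable. The Observation class, its price and volume
fields, the node names and the sample values are all hypothetical and are
not taken from my prototype.

```python
import os
import tempfile

import numpy as np
import tables


class Observation(tables.IsDescription):
    """Hypothetical data model for one observation in the series."""
    timestamp = tables.Time64Col(pos=0)  # seconds since the epoch, float64
    price = tables.Float64Col(pos=1)
    volume = tables.Int64Col(pos=2)


def write_and_read(path):
    """Write the same two observations both ways and read them back."""
    with tables.open_file(path, mode="w") as h5:
        # Option 1: a homogeneous EArray. Every column, including the
        # timestamp, has to be a float64. The first dimension is extendable.
        earray = h5.create_earray(h5.root, "series_earray",
                                  atom=tables.Float64Atom(), shape=(0, 2))
        earray.append(np.array([[1196000000.0, 101.5],
                                [1196000060.0, 101.7]]))

        # Option 2: a Table. Each row is one observation and the columns
        # may have different types.
        table = h5.create_table(h5.root, "series_table", Observation)
        table.append([(1196000000.0, 101.5, 300),
                      (1196000060.0, 101.7, 450)])
        table.flush()

        # read() returns a plain ndarray for the EArray and a structured
        # ndarray (one named field per column) for the Table.
        return earray.read(), table.read()


if __name__ == "__main__":
    h5_path = os.path.join(tempfile.mkdtemp(), "series.h5")
    as_earray, as_table = write_and_read(h5_path)
    print(as_earray.shape)
    print(as_table.dtype.names)
```

The Table version needs the extra class declaration, but it keeps the
volume as an integer instead of forcing it into a float.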
A couple of points with respect to the timeseries package, please give me
your thoughts:
-I'm not sure that either of these fits well with the model in the
timeseries package, which maintains the datetimes as a separate array.
-Dates need to be converted to floats for storage in pytables.
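
As a rough illustration of the date conversion, here is a sketch that uses
only the Python standard library. The helper names datetime_to_float and
float_to_datetime are hypothetical, and treating naive datetimes as UTC is
an assumption of this sketch, not something pytables or the timeseries
package requires.

```python
from datetime import datetime, timezone


def datetime_to_float(dt):
    """Return seconds since the Unix epoch as a double-precision float.

    Assumption: a naive datetime (no tzinfo) is interpreted as UTC.
    """
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.timestamp()


def float_to_datetime(seconds):
    """Inverse mapping: epoch seconds back to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


if __name__ == "__main__":
    stamp = datetime(2007, 11, 26, 3, 33, tzinfo=timezone.utc)
    as_float = datetime_to_float(stamp)
    print(as_float, float_to_datetime(as_float))
```

A float produced this way is the kind of value a Time64Col or a float64
EArray column would hold.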
/brad
On Nov 26, 2007 3:33 AM, David Worrall <[EMAIL PROTECTED]> wrote:
> Hi Brad,
>
> On 26/11/2007, at 12:41 PM, Bradford Cross wrote:
>
> > Greetings,
> >
> > I have been working on a prototype for storing large amounts of
> > timeseries data in pytables.
>
> .
> > Those experiments did not work out that great
>
> why not? Please describe the issues.
>
> > and led me to try storing timeseries data as Tables, with each row
> > representing an observation in the series; the first column is a
> > Time64Col and the rest represent the data model for each observation.
> >
> I do something similar. But I do break it into separate tables where
> possible. i.e. where I don't have to do a matrix multiply across
> different tables.
>
> > I am curious what others' experiences are and whether I am headed
> > down a reasonable path.
> >
> > /brad
> >
> >
> David
> >
> >
> >
> > ----------------------------------------------------------------------
> > ---
> > This SF.net email is sponsored by: Microsoft
> > Defy all challenges. Microsoft(R) Visual Studio 2005.
> > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> > _______________________________________________
> > Pytables-users mailing list
> > Pytables-users@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/pytables-users
>
> _________________________________________________
> experimental polymedia: www.avatar.com.au
> Sonic Communications Research Group,
> University of Canberra: creative.canberra.edu.au/scrg/