Hi Brad,
I'm working with financial high-frequency data (HFD) from a market with about 3500 securities.
I looked at date-time/numpy/pytables issues a while ago and I haven't
looked at scipy's timeseries package in a while so my comments might
be a bit off for you. My solution needs to account for the
speed:space trade-off. Numpy's requirement for homogeneous data means
developing another level of abstraction/conversion, which was more
fiddly. And for what gain?
I decided to use something like your strategy 2: read the relevant
data out of pytables into numpy arrays only when I need fast array
processing. I have no use for dates, but I store times in pytables as
ints and everything else in its most 'natural' format. Sometimes storing
date-time as a string still allows logical comparisons without conversion.
If the data was more homogeneous, perhaps I'd store it in numpy arrays.
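The string-comparison point can be illustrated with a small sketch. It
assumes the date-times are kept as fixed-width ISO 8601 strings (the
thread does not say which format is used, so that is an assumption), and
the sample values and the in_window helper are hypothetical. With that
layout, plain string comparison agrees with chronological order:

```python
from datetime import datetime

# Hypothetical sample timestamps; any fixed-width ISO 8601 strings would do.
stamps = ["2007-11-27T06:26:00", "2007-11-26T03:33:00", "2007-11-26T12:41:00"]


def in_window(stamp, start, end):
    """Return True if start <= stamp < end, comparing the strings directly."""
    return start <= stamp < end


# Sorting the raw strings gives the same order as sorting parsed datetimes.
by_string = sorted(stamps)
by_datetime = sorted(stamps, key=datetime.fromisoformat)

# Select one day's worth of records without converting anything.
selected = [
    s for s in stamps
    if in_window(s, "2007-11-26T00:00:00", "2007-11-27T00:00:00")
]
```

This only holds when every string uses the same zero-padded layout and
the same time zone; mixed formats would need parsing first.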
cheers,
David
On 27/11/2007, at 6:26 AM, Bradford Cross wrote:
I started off trying to find a way to use numpy with pytables and
the timeseries package in the scipy sandbox.
I have been collaborating with the guys who wrote the timeseries
package to figure out how we can do it.
Here is my latest email to them ...
After last night I have cleared away a little more of my pytables
ignorance ...
datetimes in pytables can be stored in hdf5 time format, which
means they must be converted to floats (double-precision floats in
our case.) Alternately, we can store datetime objects as strings.
So we have a couple options.
1) We can try to read/write/append from numpy arrays to pytables
EArrays - since they are homogeneous it means all values of all
elements in the time series will need to be double precision floats
(which is required for most financial data anyway.)
2) We can write/append the data as Tables, and read as numpy
arrays, which gives more flexibility in the data model for time
series elements. This model would also allow storing datetimes as
strings. My prototype that uses this model is a bit more complex,
because it requires declaring pytables IsDescription objects to
define tables, as well as mapper functions between the table fields
and the in memory objects that are elements of the timeseries.
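The mapper functions mentioned in option 2 might look roughly like the
sketch below. It is not the actual prototype: the Observation class, its
field names, and the to_row/from_row helpers are all hypothetical, and a
plain dict stands in for a table row so the example runs without
pytables. In a real table the "time" field would correspond to a column
declared in an IsDescription class.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Observation:
    """Hypothetical in-memory element of a time series."""
    time: datetime
    price: float
    volume: int


def to_row(obs):
    """Map an Observation to a flat dict of storable field values."""
    return {
        # Seconds since the epoch as a double-precision float.
        "time": obs.time.timestamp(),
        "price": float(obs.price),
        "volume": int(obs.volume),
    }


def from_row(row):
    """Map a stored row (dict of field values) back to an Observation."""
    return Observation(
        time=datetime.fromtimestamp(row["time"], tz=timezone.utc),
        price=row["price"],
        volume=row["volume"],
    )


obs = Observation(datetime(2007, 11, 26, 3, 33, tzinfo=timezone.utc), 101.25, 500)
row = to_row(obs)
restored = from_row(row)
```

Using timezone-aware UTC datetimes keeps the float conversion
unambiguous; naive datetimes would be interpreted in local time.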
A couple of points with respect to the timeseries package, please
give me your thoughts:
-I'm not sure that either of these fits great with the model in the
timeseries package that maintains the datetime as a separate array?
-Dates need to be converted to floats for storage in pytables.
/brad
On Nov 26, 2007 3:33 AM, David Worrall < [EMAIL PROTECTED]> wrote:
Hi Brad,
On 26/11/2007, at 12:41 PM, Bradford Cross wrote:
> Greetings,
>
> I have been working on a prototype for storing large amounts of
> timeseries data in pytables.
.
> Those experiments did not work out that great
why not? Please describe the issues.
> and led me to try storing timeseries data as Tables, with each row
> representing an observation in the series; the first column is a
> Time64Col and the rest represent the data model for each
> observation.
>
I do something similar, but I break it into separate tables where
possible, i.e. where I don't have to do a matrix multiply across
different tables.
> I am curious what others' experiences are and whether I am headed
> down a reasonable path.
>
> /brad
>
>
David
>
>
>
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2005.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Pytables-users mailing list
> Pytables-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/pytables-users
_________________________________________________
experimental polymedia: www.avatar.com.au
Sonic Communications Research Group,
University of Canberra: creative.canberra.edu.au/scrg/