for time-series and insane fast, nothing beats kdB, I believe
www.kx.com
Not trying to Quisling-out PG here, just hoping to respond to Mr. Olson
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane
Sent: Tuesday, May 10, 2005 2:54 PM
To: Greg Stark
Cc: [EMAIL PROTECTED]; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Prefetch
Greg Stark [EMAIL PROTECTED] writes:
Actually forcing things to use indexes is the wrong direction to go if
you're trying to process lots of data and want to stream it off disk
as rapidly as possible. I would think about whether you can structure
your data such that you can use sequential scans.
Agreed.
In your application that might be hard. It sounds like you would need
more or less one table per stock ticker which would really be hard to
manage.
Actually, in a previous lifetime I used to do pretty much the same stuff Matt
is working on. The reason I suggested parallelizing is that what you want is
usually not so much the 200day moving average of FOO, as the 200day moving
averages of a whole bunch of things. If your input table contains time-ordered
data for all those things, then a seqscan works out pretty well.
One thing you might look into is using the CLUSTER command. But
postgres doesn't maintain the cluster ordering so it would require
periodically rerunning it.
If the desired sort order is time-based, it falls out pretty much for free in
this application, because historical data doesn't change -- you are only
interested in appending at the right.
In said previous lifetime, we used Postgres for tracking our actual
transactions, but we built a custom file format for storing the individual tick
data. That's not stuff you need transactional semantics for; the historical
data is what it is. Besides, you need to compress it as much as you can
because there's always too much of it. Machines are faster and disk space
cheaper than they were at the time, but I'd still question the wisdom of using
a Postgres row for each daily bar, let alone finer-grain data.
regards, tom lane
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match