Re: [GENERAL] Best way to handle multi-billion row read-only table?

2010-02-10 Thread Asher
John R Pierce wrote: how do you plan on accessing this monster data? do you expect to be looking up single values or a small set of values at a specific time? Seems to me like this is the sort of data that's more often processed in the aggregate, like running a Fourier analysis of sliding

Re: [GENERAL] Best way to handle multi-billion row read-only table?

2010-02-10 Thread Greg Stark
On Wed, Feb 10, 2010 at 2:32 PM, Asher as...@piceur.co.uk wrote: The data will initially be accessed via a simple GUI which will allow browsing over a subset of the data (subsampled down to 1 sample/minute/hour, etc.). It sounds like you could use a tool like rrd that keeps various levels of
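
A minimal sketch of that rrd-style idea in plain SQL, assuming a base table samples(channel, sample_time, value) along the lines described in the original post; the rollup table and column names here are illustrative, not from the thread:

-- One averaged row per channel per minute, for the browsing GUI.
CREATE TABLE samples_1min (
    channel      integer          NOT NULL,
    bucket_start timestamp        NOT NULL,
    avg_value    double precision NOT NULL,
    min_value    double precision NOT NULL,
    max_value    double precision NOT NULL,
    PRIMARY KEY (channel, bucket_start)
);

-- Roll one day of raw 200Hz data up into per-minute buckets.
INSERT INTO samples_1min (channel, bucket_start, avg_value, min_value, max_value)
SELECT channel,
       date_trunc('minute', sample_time),
       avg(value), min(value), max(value)
FROM   samples
WHERE  sample_time >= '2010-02-01' AND sample_time < '2010-02-02'
GROUP  BY channel, date_trunc('minute', sample_time);

A coarser rollup (1 sample/hour) can be built the same way from samples_1min, so the GUI never has to touch the multi-billion-row raw table at all.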

Re: [GENERAL] Best way to handle multi-billion row read-only table?

2010-02-10 Thread Vincenzo Romano
2010/2/10 Greg Stark gsst...@mit.edu: On Wed, Feb 10, 2010 at 2:32 PM, Asher as...@piceur.co.uk wrote: The data will initially be accessed via a simple GUI which will allow browsing over a subset of the data (subsampled down to 1 sample/minute/hour, etc.). It sounds like you could use a tool

[GENERAL] Best way to handle multi-billion row read-only table?

2010-02-09 Thread Asher
Hello. I'm putting together a database to store the readings from various measurement devices for later processing. Since these things (water pressure monitors attached to very large water pipes) take readings at 200Hz and are typically deployed over multiple sites for several months at a
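
For scale, a minimal sketch of the kind of base table being described; only sample_time and channel are named later in the thread, so the value column and the data types are assumptions:

CREATE TABLE samples (
    channel     integer          NOT NULL,  -- which monitor/site the reading came from
    sample_time timestamp        NOT NULL,  -- 200Hz = 200 rows per channel per second
    value       double precision NOT NULL   -- the pressure reading itself
);

At 200Hz a single channel produces roughly 17 million rows per day, so a handful of channels deployed for several months lands squarely in the multi-billion-row range of the subject line.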

Re: [GENERAL] Best way to handle multi-billion row read-only table?

2010-02-09 Thread Justin Graf
On 2/9/2010 12:47 PM, Asher wrote: Hello. I'm putting together a database to store the readings from various measurement devices for later processing. Since these things (water pressure monitors attached to very large water pipes) take readings at 200Hz and are typically deployed over

Re: [GENERAL] Best way to handle multi-billion row read-only table?

2010-02-09 Thread Alex Thurlow
I've only gone up to about a billion rows, but table partitioning seems to be the way to go to me. I did per-day partitioning, and just had the job that inserts the daily data add the table automatically. With the partitioning, it only has to pull up the tables for the specific days, and is
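
In the PostgreSQL releases current at the time (8.x), this per-day partitioning is done with table inheritance plus CHECK constraints; a rough sketch, assuming the samples table outlined earlier and with child table and index names invented for illustration:

-- Child table for one day, created by the loading job before it inserts.
CREATE TABLE samples_2010_02_09 (
    CHECK (sample_time >= '2010-02-09' AND sample_time < '2010-02-10')
) INHERITS (samples);

-- Load the day's data straight into the child, then index it.
CREATE INDEX samples_2010_02_09_idx
    ON samples_2010_02_09 (channel, sample_time);

-- With constraint exclusion enabled, queries against the parent table only
-- scan the children whose CHECK constraints can match the WHERE clause.
SET constraint_exclusion = on;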

Re: [GENERAL] Best way to handle multi-billion row read-only table?

2010-02-09 Thread Asher Hoskins
Justin Graf wrote: Well, first: is that 200Hz meaning 200 samples per channel per second? That is very fast sampling for a pressure sensor; I would be surprised if the meters are actually giving real results at that rate. I would look at reducing that down to what the meter is actually capable

Re: [GENERAL] Best way to handle multi-billion row read-only table?

2010-02-09 Thread Scott Marlowe
On Tue, Feb 9, 2010 at 10:47 AM, Asher as...@piceur.co.uk wrote: Hello. I'm putting together a database to store the readings from various measurement devices for later processing. Since these things (water pressure monitors attached to very large water pipes) take readings at 200Hz and are

Re: [GENERAL] Best way to handle multi-billion row read-only table?

2010-02-09 Thread John R Pierce
Asher Hoskins wrote: If I partition so that each partition holds data for a single channel (and set a CHECK constraint for this) then I can presumably remove the channel from the index since constraint exclusion will mean that only partitions holding the channel I'm interested in will be
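
A sketch of that per-channel layout (names are illustrative, following the samples table assumed earlier): the CHECK constraint pins each partition to one channel, so the per-partition index only needs sample_time.

CREATE TABLE samples_ch42 (
    CHECK (channel = 42)   -- constraint exclusion uses this to skip other channels
) INHERITS (samples);

CREATE INDEX samples_ch42_time_idx ON samples_ch42 (sample_time);

-- With constraint_exclusion = on, a query such as
--   SELECT * FROM samples
--   WHERE channel = 42 AND sample_time BETWEEN '2010-01-01' AND '2010-01-02';
-- is planned against samples_ch42 alone, and the index scan is on time only.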

Re: [GENERAL] Best way to handle multi-billion row read-only table?

2010-02-09 Thread Justin Graf
On 2/9/2010 4:41 PM, Asher Hoskins wrote: Thanks for that, it looks like partitioning is the way to go. I'm assuming that I should try and keep my total_relation_sizes less than the memory size of the machine? This depends on what the queries look like. As others have stated, when
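
The sizes in question can be checked per partition with the built-in size functions; the LIKE pattern below assumes partitions named samples_..., as in the sketches above:

SELECT c.relname,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size  -- table + indexes + TOAST
FROM   pg_class c
WHERE  c.relname LIKE 'samples_%'
  AND  c.relkind = 'r'
ORDER  BY pg_total_relation_size(c.oid) DESC;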

Re: [GENERAL] Best way to handle multi-billion row read-only table?

2010-02-09 Thread BillR
Hello. I'm putting together a database to store the readings from various measurement devices for later processing. Since these things (water pressure monitors attached to very large water pipes) take readings at 200Hz and are typically

Re: [GENERAL] Best way to handle multi-billion row read-only table?

2010-02-09 Thread Scott Marlowe
On Tue, Feb 9, 2010 at 8:06 PM, BillR iamb...@williamrosmus.com wrote: Is it also possible to denormalize by putting the 'channel' data in the first table (especially if it isn't very much)? Maintaining a foreign key constraint can impact performance significantly in most RDBMS's, even when
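
For comparison, a sketch of the two variants being weighed; the table and column names beyond channel and sample_time are assumptions:

-- Normalized: a small channel table, with the big table's channel column
-- validated by a foreign key that is checked on every one of billions of inserts.
CREATE TABLE channel (
    channel_id integer PRIMARY KEY,
    site_name  text NOT NULL
);

CREATE TABLE samples_fk (
    channel     integer REFERENCES channel (channel_id),
    sample_time timestamp NOT NULL,
    value       double precision NOT NULL
);

-- Denormalized alternative: keep channel as a plain integer (or fold the few
-- channel attributes into each row) and validate in the loading job instead,
-- trading some storage and discipline for cheaper bulk loads on read-only data.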

Re: [GENERAL] Best way to handle multi-billion row read-only table?

2010-02-09 Thread Brent Wood
If you will be selecting sets of data within a time range, it should also improve performance if you can build a clustered index on the sample_time. It may also be worth looking at whether partitioning by timestamp and channel offers any advantages. Brent Wood, DBA/GIS consultant
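
In PostgreSQL the closest equivalent is the CLUSTER command, a one-time physical reorder by an index that is not maintained on later inserts, which suits data loaded once and then read-only. Assuming a per-partition time index like the one sketched above:

CLUSTER samples_ch42 USING samples_ch42_time_idx;  -- rewrite the partition in time order
ANALYZE samples_ch42;                              -- refresh planner statistics afterwards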

Re: [GENERAL] Best way to handle multi-billion row read-only table?

2010-02-09 Thread Greg Smith
Asher wrote: Once loaded into the database the data will never be deleted or modified and will typically be accessed over a particular date range for a particular channel (e.g. sample_time >= X AND sample_time <= Y AND channel = Z). A typical query won't return more than a few million rows and
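
Written out, that access pattern is a single contiguous time range per channel, which a composite index can serve with one range scan if the table is not partitioned by channel; the index name and constants below are illustrative:

CREATE INDEX samples_chan_time_idx ON samples (channel, sample_time);

EXPLAIN
SELECT sample_time, value
FROM   samples
WHERE  channel = 7
  AND  sample_time >= '2010-01-01'
  AND  sample_time <= '2010-01-31';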

Re: [GENERAL] Best way to handle multi-billion row read-only table?

2010-02-09 Thread Scott Marlowe
On Tue, Feb 9, 2010 at 11:51 PM, Greg Smith g...@2ndquadrant.com wrote: Asher wrote: Once loaded into the database the data will never be deleted or modified and will typically be accessed over a particular date range for a particular channel (e.g. sample_time >= X AND sample_time <= Y AND