Re: [GENERAL] Streaming large data into postgres [WORM like applications]

2007-05-14 Thread Alban Hertroys
John D. Burger wrote:
> Dhaval Shah wrote:
>> 2. Most of the streamed rows are very similar. Think syslog rows,
>> where for most cases only the timestamp changes. Of course, if the
>> data can be compressed, it will result in real savings in disk space.
>
> If it really is usually just the timestamp that changes, one ...

Re: [GENERAL] Streaming large data into postgres [WORM like applications]

2007-05-14 Thread John D. Burger
Dhaval Shah wrote:
> 2. Most of the streamed rows are very similar. Think syslog rows,
> where for most cases only the timestamp changes. Of course, if the
> data can be compressed, it will result in real savings in disk space.

If it really is usually just the timestamp that changes, one ...
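
The preview cuts off here, but the usual way to exploit rows that differ only in their timestamp is to normalize the repeated text out. A minimal sketch of that idea, with hypothetical table and column names:

    -- Store each distinct message body once; each logged row then costs
    -- only a timestamp plus a small reference to the shared body.
    CREATE TABLE messages (
        id    serial PRIMARY KEY,
        body  text NOT NULL UNIQUE
    );

    CREATE TABLE occurrences (
        msg_id  integer NOT NULL,      -- references messages.id (RI optional,
                                       -- per the requirements in this thread)
        logged  timestamptz NOT NULL
    );

Each syslog-style line then takes roughly an integer plus a timestamp on disk instead of the full message text.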

Re: [GENERAL] Streaming large data into postgres [WORM like applications]

2007-05-13 Thread Ron Johnson
On 05/12/07 19:49, Dhaval Shah wrote:
> Consolidating my responses in one email.
>
> 1. The total data that is expected is some 1 - 1.5 TB a day. 75% of
> the data comes in a period of 10 hours. The remaining 25% comes in
> the other 14 hours. Of course there are ways ...

Re: [GENERAL] Streaming large data into postgres [WORM like applications]

2007-05-12 Thread Kevin Hunter
At 8:49p on 12 May 2007, Dhaval Shah wrote:
> That leads to the question: can the data be compressed? Since the
> data is very similar, any compression should yield some 6x-10x
> reduction. Is there a way to identify which partitions are in which
> data files and compress them until they are a...
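
For the "which partitions are in which data files" part, pg_class exposes the mapping; a sketch (the partition naming pattern is hypothetical):

    -- relfilenode is the file name under $PGDATA/base/<database oid>/
    SELECT relname, relfilenode
    FROM pg_class
    WHERE relname LIKE 'syslog_2007_05_%';

    -- the <database oid> part of the path:
    SELECT oid FROM pg_database WHERE datname = current_database();

Note that PostgreSQL cannot read a heap file that has been compressed externally, so any partition compressed this way would have to be restored before the server touches it again.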

Re: [GENERAL] Streaming large data into postgres [WORM like applications]

2007-05-12 Thread Dhaval Shah
Consolidating my responses in one email.

1. The total data that is expected is some 1 - 1.5 TB a day. 75% of the data comes in a period of 10 hours; the remaining 25% comes in the other 14 hours. Of course there are ways to smooth the load patterns, however the current scenario is as explained.

2. I do expect t...
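
A quick back-of-the-envelope on those numbers: taking the upper figure, 75% of 1.5 TB is about 1.125 TB in 10 hours, i.e. 1.125 TB / 36,000 s ≈ 31 MB/s sustained (roughly 112 GB/hour) during the peak window; the remaining ~375 GB over 14 hours is only about 7.4 MB/s.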

Re: [GENERAL] Streaming large data into postgres [WORM like applications]

2007-05-12 Thread Lincoln Yeoh
At 04:43 AM 5/12/2007, Dhaval Shah wrote:
> 1. Large amount of streamed rows, in the order of 50-100k rows per
> second. I was thinking that the rows can be stored into a file, the
> file then copied into a temp table using COPY, and those rows then
> appended to the master table, and then dropping ...
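
The file-then-COPY pattern Dhaval describes looks roughly like this in SQL (table and file names are made up; note COPY FROM 'file' reads a server-side file and requires superuser rights):

    BEGIN;
    CREATE TEMP TABLE staging (LIKE syslog);            -- same columns as master
    COPY staging FROM '/var/spool/pg/batch_0001.dat';   -- bulk-load the spool file
    INSERT INTO syslog SELECT * FROM staging;           -- append to master table
    DROP TABLE staging;
    COMMIT;

COPY into an indexless temp table is far cheaper than row-by-row INSERTs, and the final INSERT ... SELECT appends the batch in a single pass.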

Re: [GENERAL] Streaming large data into postgres [WORM like applications]

2007-05-12 Thread Ron Johnson
On 05/11/07 21:35, Dhaval Shah wrote:
> I do care about the following:
>
> 1. Basic type checking
> 2. Knowing failed inserts.
> 3. Non-corruption
> 4. Macro transactions. That is, minimal read consistency.
>
> The following is not necessary:
>
> 1. Referential integrity ...

Re: [UNSURE] Re: [GENERAL] Streaming large data into postgres [WORM like applications]

2007-05-11 Thread Tom Allison
One approach would be to spool all the data to a flat file and then pull it into the database as you are able to. This would give you extremely high peak capability.

On May 11, 2007, at 10:35 PM, Dhaval Shah wrote:
> I do care about the following:
> 1. Basic type checking
> 2. Knowing failed inserts ...

Re: [GENERAL] Streaming large data into postgres [WORM like applications]

2007-05-11 Thread Dhaval Shah
I do care about the following:

1. Basic type checking
2. Knowing failed inserts.
3. Non-corruption
4. Macro transactions. That is, minimal read consistency.

The following is not necessary:

1. Referential integrity

In this particular scenario:

1. There is a sustained load and peak loads. As lo...

Re: [GENERAL] Streaming large data into postgres [WORM like applications]

2007-05-11 Thread Ben
Inserting 50,000 rows a second is, uh... difficult to do, no matter what database you're using. You'll probably have to spool the inserts and insert them as fast as you can, and just hope you don't fall too far behind. But I'm suspecting that you aren't going to be doing much, if any, ref...
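
Ben's point in concrete terms: what makes 50k rows/s plausible at all is batching, not faster single inserts. A hedged sketch (table and column values hypothetical):

    -- Worst case: 50,000 separate statements, one transaction each.
    INSERT INTO syslog VALUES ('2007-05-11 21:35:00', 'proc[123]: msg');

    -- Better: one COPY per spooled batch, with rows fed from the client
    -- (via psql or the client library's COPY interface).
    COPY syslog FROM STDIN;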

[GENERAL] Streaming large data into postgres [WORM like applications]

2007-05-11 Thread Dhaval Shah
Here is the straight dope: one of the internal teams at my customer site is looking into MySQL and replacing its storage engine so that they can store a large amount of streamed data. The key here is that the data they are getting is several thousand rows in an extremely short duration. They say tha...