Re: [HACKERS] Performance while loading data and indexing
On 26 Sep 2002 at 14:05, Shridhar Daithankar wrote:

> Some time back I posted a query about building a site with a 150GB database.
> In the last couple of weeks, lots of things were tested at my place, and there
> are some results and again some concerns.
>
> 2) Creating the index takes a huge amount of time.
>
> Load time: 14581 sec / ~8600 rows per sec / ~1 MB of data per sec.
> Create unique composite index on 2 char fields and a timestamp field: 25226 sec.
> Database size on disk: 26GB
> Select query: 1.5 sec for approx. 150 rows.

2) sort_mem = 12K, i.e. about 94MB, sounds good enough to me. Does this need a
further increase to improve create index performance? Just a thought.

If I sort the table before creating an index, would it be faster than creating
the index on the raw table? And/or, if at all, how do I sort the table without
duplicating it? Just a wild thought..

Bye
 Shridhar
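One way to experiment with this, as a minimal sketch: on 7.2-era PostgreSQL,
sort_mem is a per-backend setting (in kilobytes) that can be raised just for the
session doing the index build. The 131072 value below is an illustrative
assumption, not a setting from the thread; the table and column names are the
ones posted later in the thread.

    -- raise sort memory for this session only before building the index
    SET sort_mem = 131072;        -- ~128 MB, assumed value for illustration
    CREATE UNIQUE INDEX index1 ON tablename (esn, min, datetime);
    RESET sort_mem;               -- back to the server default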
Re: [HACKERS] Performance while loading data and indexing
On 26 Sep 2002 at 10:51, [EMAIL PROTECTED] wrote:

> Hi, it seems you have to cluster it, I don't think you have another choice.

Hmm.. That didn't occur to me... I guess some real-time clustering like usogres
would do, unless it turns out to be a performance hog. But this is just insert
and select; no updates, no deletes (unless the customer makes a 180-degree turn).
So I doubt clustering will help. At most I can replicate data across machines
and spread queries over them: replication overhead as a downside, and low query
load on each machine as an upside..

> I'm retrieving the configuration of our postgres servers (I'm out of the office
> now), so I can send it to you. I was quite desperate about performance, and I
> was thinking of migrating the data to an Oracle database. Then I found this
> configuration on the net, and I had a successful increase in performance.

In this case, we have settled on PostgreSQL because we/our customer want to keep
costs down.. :-) They are even asking now if it's possible to keep hardware
costs down as well. That's getting some funny responses here, but I digress..

> Maybe this can help you. Why do you use copy to insert records? I usually use
> perl scripts, and they work well.

Performance reasons. As I said in one of my earlier posts, putting up to 100K
records in one transaction, in steps of 10K, did not reach the performance of
copy. As Tom rightly said, it was a 4:1 ratio despite using transactions..

Thanks once again..

Bye
 Shridhar
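As a rough sketch of the two approaches being compared — the column values, the
batch size comment, and the file path are hypothetical, and the column list is
taken from the schema posted later in the thread (the nullable dummy fields are
omitted):

    -- batched INSERTs inside a transaction (the slower path in the tests above)
    BEGIN;
    INSERT INTO tablename (type, esn, min, datetime,
                           opc0, opc1, opc2, dpc0, dpc1, dpc2,
                           npa, nxx, rest)
    VALUES (1, '1234567890', '0987654321', '2002-09-26 14:05:00',
            '001', '002', '003', '004', '005', '006',
            '408', '555', '0000');
    -- ... repeated ~10,000 times per transaction ...
    COMMIT;

    -- single bulk load via COPY (roughly 4x faster per the numbers above);
    -- the path is hypothetical, and the file uses the default tab-delimited format
    COPY tablename FROM '/path/to/datafile';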
Re: [HACKERS] Performance while loading data and indexing
On 26 Sep 2002 at 14:05, Shridhar Daithankar wrote:

> Some time back I posted a query about building a site with a 150GB database.
> In the last couple of weeks, lots of things were tested at my place, and there
> are some results and again some concerns.
>
> 2) Creating the index takes a huge amount of time.
>
> Load time: 14581 sec / ~8600 rows per sec / ~1 MB of data per sec.
> Create unique composite index on 2 char fields and a timestamp field: 25226 sec.
> Database size on disk: 26GB
> Select query: 1.5 sec for approx. 150 rows.

I never tried 150GB of data, but 10GB of data worked fine for me. Maybe it will
help if you post your table schema, including which indexes you use, and the
average size of one tuple.
Re: [HACKERS] Performance while loading data and indexing
On 26 Sep 2002 at 11:17, Mario Weilguni wrote:

> I never tried 150GB of data, but 10GB of data worked fine for me. Maybe it will
> help if you post your table schema, including which indexes you use, and the
> average size of one tuple.

Well, the test runs were for 10GB of data. The schema is attached; read it in a
fixed font. The last nullable fields are dummies but may be used in the future,
and varchars are not acceptable (not my requirement). Tuple size is around 100
bytes.

The index creation query was

    CREATE INDEX index1 ON tablename (esn, min, datetime);

What if I put datetime ahead? It's likely that the datetime field will have a
high degree of locality, being log data..

Bye
 Shridhar

Field Name   Field Type   Nullable   Indexed
type         int          no         no
esn          char(10)     no         yes
min          char(10)     no         yes
datetime     timestamp    no         yes
opc0         char(3)      no         no
opc1         char(3)      no         no
opc2         char(3)      no         no
dpc0         char(3)      no         no
dpc1         char(3)      no         no
dpc2         char(3)      no         no
npa          char(3)      no         no
nxx          char(3)      no         no
rest         char(4)      no         no
field0       int          yes        no
field1       char(4)      yes        no
field2       int          yes        no
field3       char(4)      yes        no
field4       int          yes        no
field5       char(4)      yes        no
field6       int          yes        no
field7       char(4)      yes        no
field8       int          yes        no
field9       char(4)      yes        no
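For illustration, the two column orderings under discussion, plus a hypothetical
lookup of the kind the original (esn, min, datetime) ordering serves well; the
alternative index name and the query values are invented, and the reordered index
mainly helps queries that constrain datetime without pinning esn and min:

    -- as built in the tests above
    CREATE INDEX index1 ON tablename (esn, min, datetime);

    -- alternative with the timestamp leading (assumed name)
    CREATE INDEX index1_by_time ON tablename (datetime, esn, min);

    -- hypothetical query served well by the original column order
    SELECT * FROM tablename
     WHERE esn = '1234567890'
       AND min = '0987654321'
       AND datetime BETWEEN '2002-09-01' AND '2002-09-26';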
Re: [HACKERS] Performance while loading data and indexing
> Well, the test runs were for 10GB of data. The schema is attached; read it in a
> fixed font. The last nullable fields are dummies but may be used in the future,
> and varchars are not acceptable (not my requirement). Tuple size is around 100
> bytes.
>
> The index creation query was
>
>     CREATE INDEX index1 ON tablename (esn, min, datetime);
>
> What if I put datetime ahead? It's likely that the datetime field will have a
> high degree of locality, being log data..

Just an idea: I noticed you use char(10) for esn and min, and use this as an
index. Are these really fixed-length fields, all having 10 bytes? Otherwise
varchar(10) would be better, because your tables, and especially the indices,
will probably be much smaller. What average length do you have for min and esn?
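To make the suggestion concrete, a hypothetical side-by-side of the two column
definitions; the table names are invented for illustration only:

    -- char(10): values are always blank-padded to the declared length
    CREATE TABLE calls_char (
        esn char(10) NOT NULL,
        min char(10) NOT NULL
    );

    -- varchar(10): stores only the actual characters (plus a small length
    -- header), so shorter values shrink both the table and any index on them
    CREATE TABLE calls_varchar (
        esn varchar(10) NOT NULL,
        min varchar(10) NOT NULL
    );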
Re: [HACKERS] Performance while loading data and indexing
Shridhar Daithankar <[EMAIL PROTECTED]> writes:
> On 26 Sep 2002 at 11:50, Mario Weilguni wrote:
>> Just an idea: I noticed you use char(10) for esn and min, and use this as an
>> index. Are these really fixed-length fields, all having 10 bytes?
> 10 bytes. Those are id numbers.. like phone numbers, they always have all the
> digits filled in..

If they are numbers, can you store them as bigints instead of char(N)?

regards, tom lane
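A minimal sketch of what that could look like; the table and index names are
invented for illustration, and one caveat is that leading zeroes in the
identifiers would be lost once they are stored as integers:

    -- bigint is a fixed 8 bytes, versus 10 characters plus overhead for char(10)
    CREATE TABLE tablename_bigint (
        esn      bigint    NOT NULL,
        min      bigint    NOT NULL,
        datetime timestamp NOT NULL
    );
    CREATE INDEX index1_bigint ON tablename_bigint (esn, min, datetime);

    -- loading would store the digit strings as numbers, e.g. (hypothetical values)
    INSERT INTO tablename_bigint VALUES (1234567890, 987654321, '2002-09-26 14:05:00');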