Re: [PERFORM] problem with pg_statistics
On Thu, 26 Jun 2003 12:03:52 -0400 Tom Lane [EMAIL PROTECTED] wrote: Manfred Koizar [EMAIL PROTECTED] writes: On Thu, 26 Jun 2003 10:08:05 -0400, Tom Lane [EMAIL PROTECTED] wrote: Try reducing random_page_cost With index scan cost being more than 25 * seq scan cost, I guess that - all other things held equal - even random_page_cost = 1 wouldn't help. Oh, you're right, I was comparing the wrong estimated costs. Yeah, changing random_page_cost won't fix it. Or there's something wrong with correlation? That seems like a good bet. Andre, is this table likely to be physically ordered by time_stamp, or nearly so? If so, do you expect that condition to persist, or is it just an artifact of a test setup? Sorry forgot the pg_stat query... SELECT * FROM pg_stats where tablename = 'tbl_traffic' and attname = 'time_stamp'; tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation -++---+---++--- --- ++- --- ---+- tbl_traffic | time_stamp | 0 | 8 | 104009 | {2003-06-03 19:12:01.059625+02,2003-02-03 19:52:06.666296+01,2003-02-13 09:59:45.415763+01,2003 -02-28 18:10:28.536399+01,2003-04-11 18:09:42.30363+02,2003-04-26 20:35:50.110235+02,2003-05-03 11:09:32.991507+02,2003-05-20 09:53:51.271853+02,2003-05-21 2 0:55:59.155387+02,2003-06-02 02:38:28.823182+02} | {0.0013,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001,0.001} | {2002-07-01 00:00:00+02,2003-02-21 01:59: 46.107696+01,2003-03-11 15:00:37.418521+01,2003-03-26 18:14:50.028972+01,2003-04-10 13:43:20.75909+02,2003-04-27 09:03:19.592213+02,2003-05-08 22:35:41.99761 6+02,2003-05-22 15:34:42.932958+02,2003-06-03 00:53:05.870782+02,2003-06-15 08:45:41.154875+02,2003-06-27 07:18:30.265868+02} | -0.479749 (1 row) Thanks, as ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] problem with pg_statistics
On Fri, 27 Jun 2003 08:07:35 +0200, Andre Schubert [EMAIL PROTECTED] wrote: Traffic data are inserted every 5 minutes with the actual datetime of the transaction, thatswhy the table should be physically order by time_stamp. So I'd expect a correlation of nearly 1. Why do your statistics show a value of -0.479749? A negative correlation is a sign of descending sort order, and correlation values closer to 0 indicate poor correspondence between column values and tuple positions. Could this be the effect of initial data loading? Are there any updates or deletions in your traffic table? To answer Manfreds questions: Andre, what hardware is this running on? What are the values of shared_buffers, random_page_cost, effective_cache_size, ... ? Could you show us the result of SELECT * FROM pg_stats WHERE tablename = tbl_traffic AND attname = time_stamp; ^ ^ ^ ^ Oops, these should have been single quotes. It's too hot here these days :-) sort_mem = 32000 shared_buffers = 13000 Personally I would set them to lower values, but if you have good reasons ... #effective_cache_size = 1000 # default in 8k pages This is definitely too low. With 512MB or more I tend to set this to ca. 80% of available RAM. Use top and free to find hints for good values. Servus Manfred ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] problem with pg_statistics
On Fri, 27 Jun 2003 10:43:01 +0200 Manfred Koizar [EMAIL PROTECTED] wrote: On Fri, 27 Jun 2003 08:07:35 +0200, Andre Schubert [EMAIL PROTECTED] wrote: Traffic data are inserted every 5 minutes with the actual datetime of the transaction, thatswhy the table should be physically order by time_stamp. So I'd expect a correlation of nearly 1. Why do your statistics show a value of -0.479749? A negative correlation is a sign of descending sort order, and correlation values closer to 0 indicate poor correspondence between column values and tuple positions. Could this be the effect of initial data loading? Are there any updates or deletions in your traffic table? We dont make updates the traffic table. Once a month we delete the all data of the oldest month. And after that a vacuum full verbose analyze is performed. Could this cause reordering of the data ? And should i do a cluster idx_ts tbl_traffic ? To answer Manfreds questions: Andre, what hardware is this running on? What are the values of shared_buffers, random_page_cost, effective_cache_size, ... ? Could you show us the result of SELECT * FROM pg_stats WHERE tablename = tbl_traffic AND attname = time_stamp; ^ ^ ^ ^ Oops, these should have been single quotes. It's too hot here these days :-) You are so right ... :) #effective_cache_size = 1000 # default in 8k pages This is definitely too low. With 512MB or more I tend to set this to ca. 80% of available RAM. Use top and free to find hints for good values. Ok, i will talk with my coworker ( he is the sysadmin of our machine ) and look if can use such amount of RAM, because there are several other processes that are running on these machines. But i will test and report ... Thanks, as ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] problem with pg_statistics
Andre Schubert [EMAIL PROTECTED] writes: i think i need a little help with a problem with pg_statistic. Try reducing random_page_cost --- although you'd be foolish to set it on the basis of just a single test query. Experiment with a few different tables, and keep in mind that repeated tests will be affected by caching. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] problem with pg_statistics
Manfred Koizar [EMAIL PROTECTED] writes: On Thu, 26 Jun 2003 10:08:05 -0400, Tom Lane [EMAIL PROTECTED] wrote: Try reducing random_page_cost With index scan cost being more than 25 * seq scan cost, I guess that - all other things held equal - even random_page_cost = 1 wouldn't help. Oh, you're right, I was comparing the wrong estimated costs. Yeah, changing random_page_cost won't fix it. Or there's something wrong with correlation? That seems like a good bet. Andre, is this table likely to be physically ordered by time_stamp, or nearly so? If so, do you expect that condition to persist, or is it just an artifact of a test setup? regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend