Re: [PERFORM] Performance advice

2003-06-26 Thread Manfred Koizar
On Wed, 25 Jun 2003 11:47:48 +0200, Michael Mattox [EMAIL PROTECTED] wrote: |INFO: --Relation public.jdo_sequencex-- |INFO: Pages 28: Changed 1, Empty 0; Tup 1: Vac 5124, Keep 0, UnUsed 0. ^ This table could stand more frequent VACUUMs,

Re: [PERFORM] problem with pg_statistics

2003-06-27 Thread Manfred Koizar
On Fri, 27 Jun 2003 08:07:35 +0200, Andre Schubert [EMAIL PROTECTED] wrote: Traffic data are inserted every 5 minutes with the actual datetime of the transaction, thatswhy the table should be physically order by time_stamp. So I'd expect a correlation of nearly 1. Why do your statistics show a

Re: [PERFORM] Serious issues with CPU usage

2003-09-08 Thread Manfred Koizar
On Mon, 8 Sep 2003 13:50:23 +0300, Andri Saar [EMAIL PROTECTED] wrote: Basically I do this: 1) select about ~700 ID's I have to poll 2) poll them 3) update those 700 rows in that table I used (~2700 rows total). And I do this cycle once per minute, so yes, I've got a zillion updates. 700 of 2700

Re: [PERFORM] Inconsistent performance

2003-09-16 Thread Manfred Koizar
On Mon, 15 Sep 2003 22:26:45 -0400, Christopher Browne [EMAIL PROTECTED] wrote: select count (*) from table; The only possible plan for THAT query will involve a seq scan of the whole table. If the postmaster already has the data in cache, it makes sense for it to run in 1 second. If it has to

Re: [PERFORM] osdl-dbt3 run results - puzzled by the execution plans

2003-09-19 Thread Manfred Koizar
On Thu, 18 Sep 2003 15:36:50 -0700, Jenny Zhang [EMAIL PROTECTED] wrote: We thought the large effective_cache_size should lead us to better plans. But we found the opposite. The common structure of your query plans is: Sort Sort Key: sum((partsupp.ps_supplycost * partsupp.ps_availqty))

Re: [PERFORM] osdl-dbt3 run results - puzzled by the execution

2003-09-24 Thread Manfred Koizar
On Fri, 19 Sep 2003 11:35:35 -0700, Jenny Zhang [EMAIL PROTECTED] wrote: I posted more results as you requested: Unfortunately they only confirm what I suspected earlier: 2) - Index Scan using i_ps_suppkey on partsupp (cost=0.00..323.16 rows=80 width=34)

Re: [PERFORM] Optimizing = and = for numbers and dates

2003-10-01 Thread Manfred Koizar
On Wed, 1 Oct 2003 19:45:29 +0200 (MEST), Dimitri Nagiev [EMAIL PROTECTED] wrote: template1=# explain analyze select * from mytable where mydate='2003-09-01'; Seq Scan on mytable (cost=0.00..2209.11 rows=22274 width=562) (actual time=0.06..267.30 rows=22677 loops=1) Filter: (mydate =

Re: [PERFORM] planner doesn't use multicolumn index

2003-10-08 Thread Manfred Koizar
On Wed, 8 Oct 2003 09:08:59 -0500 (CDT), Adrian Demaestri [EMAIL PROTECTED] wrote: the type of the fields are int2 and int4, the where condition is v.g. partido=99 and partida=123). Write your search condition as WHERE partido=99::int2 and partida=123 Servus Manfred

Re: [PERFORM] vacuum locking

2003-10-17 Thread Manfred Koizar
On Fri, 17 Oct 2003 09:52:26 -0600, Rob Nagler [EMAIL PROTECTED] wrote: INFO: Removed 8368 tuples in 427 pages. CPU 0.06s/0.04u sec elapsed 1.54 sec. INFO: Pages 24675: Changed 195, Empty 0; Tup 1031519: Vac 8368, Keep 254, UnUsed 1739. Total CPU 2.92s/2.58u sec elapsed 65.35

Re: [PERFORM] Performance Concern

2003-10-27 Thread Manfred Koizar
On Sun, 26 Oct 2003 00:13:36 +0300, Hannu Krosing [EMAIL PROTECTED] wrote: UPDATE baz SET customer_id = '1234' WHERE baz_key IN ( SELECT baz_key FROM baz innerbaz WHERE customer_id IS NULL and innerbaz.baz_key = baz.baz_key LIMIT 1000 ); AFAICS this is not what the

Re: [PERFORM] Guesses on what this NestLoop is for?

2003-10-28 Thread Manfred Koizar
On Mon, 27 Oct 2003 15:32:41 -0800, Josh Berkus [EMAIL PROTECTED] wrote: FROM event_types, events LEFT OUTER JOIN ... WHERE events.status = 1 or events.status = 11 and events.event_date '2003-10-27' and events.etype_id = event_types.etype_id and ( ... );

Re: [PERFORM] Seeking help with a query that takes too long

2003-11-12 Thread Manfred Koizar
On Wed, 12 Nov 2003 08:34:50 -0500, Nick Fankhauser [EMAIL PROTECTED] wrote: - Index Scan using actor_full_name_uppercase on actor (cost=0.00..6.01 rows=1 width=42) ^^ (actual time=37.62..677.44

Re: [PERFORM] Seeking help with a query that takes too long

2003-11-12 Thread Manfred Koizar
On Wed, 12 Nov 2003 13:27:53 -0500, Nick Fankhauser [EMAIL PROTECTED] wrote: You might have to resort to brute force, like set enable_nestloop=false. - Seq Scan on actor_case_assignment (cost=0.00..209980.49 rows=8669349 width=34) (actual

Re: [PERFORM] TEXT column and indexing

2003-11-19 Thread Manfred Koizar
On Wed, 19 Nov 2003 10:18:18 +0100, Ian Barwick [EMAIL PROTECTED] wrote: Indexes: [...] opv_v_ix btree (substr(value, 1, 128)) SELECT obj_property_id FROM object_property_value opv WHERE opv.value = 'foo' Try ... WHERE substr(opv.value, 1, 128) = 'foo' HTH. Servus Manfred

Re: [PERFORM] index v. seqscan for certain values

2004-04-15 Thread Manfred Koizar
On Tue, 13 Apr 2004 13:55:49 -0400, Tom Lane [EMAIL PROTECTED] wrote: Possibly the nonuniform clumping of CID has something to do with the poor results. It shouldn't. The sampling algorithm is designed to give each tuple the same chance of ending up in the sample, and tuples are selected

Re: [PERFORM] query slows down with more accurate stats

2004-04-15 Thread Manfred Koizar
[Just a quick note here; a more thorough discussion of my test results will be posted to -hackers] On Tue, 13 Apr 2004 15:18:42 -0400, Tom Lane [EMAIL PROTECTED] wrote: Well, the first problem is why is ANALYZE's estimate of the total row count so bad :-( ? I suspect you are running into the

Re: [PERFORM] query slows down with more accurate stats

2004-04-16 Thread Manfred Koizar
On Thu, 15 Apr 2004 20:18:49 -0400, Tom Lane [EMAIL PROTECTED] wrote: getting several tuples from the same page is more likely than with the old method. Hm, are you sure? Almost sure. Let's look at a corner case: What is the probability of getting a sample with no two tuples from the same

Re: [PERFORM] query slows down with more accurate stats

2004-04-16 Thread Manfred Koizar
On Fri, 16 Apr 2004 10:34:49 -0400, Tom Lane [EMAIL PROTECTED] wrote: p = prod from{i = 0} to{n - 1} {{c(B - i)} over {cB - i}} So? You haven't proven that either sampling method fails to do the same. On the contrary, I believe that above formula is more or less valid for both methods.

Re: [PERFORM] Setting Shared Buffers , Effective Cache, Sort Mem Parameters

2004-04-22 Thread Manfred Koizar
On Thu, 22 Apr 2004 13:51:42 -0400, Pallav Kalva [EMAIL PROTECTED] wrote: I need some help with setting these parameters (shared buffers, effective cache, sort mem) in the pg_conf file. It really depends on the kind of queries you intend to run, the number of concurrent active connections, the

Re: [PERFORM] Setting Shared Buffers , Effective Cache, Sort Mem Parameters

2004-04-23 Thread Manfred Koizar
On Fri, 23 Apr 2004 10:20:10 -0400, Pallav Kalva [EMAIL PROTECTED] wrote: the database sizes is around 2- 4 gig and there are 5 of them. this machine is mainly for the databases and nothing is running on them. Did I understand correctly that you run (or plan to run) five postmasters? Is

Re: [PERFORM] Why will vacuum not end?

2004-04-24 Thread Manfred Koizar
On Sat, 24 Apr 2004 10:45:40 -0400, Shea,Dan [CIS] [EMAIL PROTECTED] wrote: [...] 87 GB table with a 39 GB index? The vacuum keeps redoing the index, but there is no indication as to why it is doing this. If VACUUM finds a dead tuple, if does not immediately remove index entries pointing to

Re: [PERFORM] Why will vacuum not end?

2004-04-24 Thread Manfred Koizar
On Sat, 24 Apr 2004 15:48:19 -0400, Shea,Dan [CIS] [EMAIL PROTECTED] wrote: Manfred is indicating the reason it is taking so long is due to the number of dead tuples in my index and the vacuum_mem setting. nitpicking Not dead tuples in the index, but dead tuples in the table. /nitpicking The

Re: [PERFORM] Why will vacuum not end?

2004-04-24 Thread Manfred Koizar
On Sat, 24 Apr 2004 15:58:08 -0400, Shea,Dan [CIS] [EMAIL PROTECTED] wrote: There were defintely 219,177,133 deletions. The deletions are most likely from the beginning, it was based on the reception_time of the data. I would rather not use re-index, unless it is faster then using vacuum. I

[PERFORM] Number of pages in a random sample (was: query slows down with more accurate stats)

2004-04-25 Thread Manfred Koizar
On Mon, 19 Apr 2004 12:00:10 -0400, Tom Lane [EMAIL PROTECTED] wrote: A possible compromise is to limit the number of pages sampled to something a bit larger than n, perhaps 2n or 3n. I don't have a feeling for the shape of the different-pages probability function; would this make a significant

Re: [PERFORM] Why will vacuum not end?

2004-04-25 Thread Manfred Koizar
On Sun, 25 Apr 2004 09:05:11 -0400, Shea,Dan [CIS] [EMAIL PROTECTED] wrote: It is set at max_fsm_pages = 150 . This might be too low. Your index has ca. 5 M pages, you are going to delete half of its entries, and what you delete is a contiguous range of values. So up to 2.5 M index pages

Re: [PERFORM] analyzer/planner and clustered rows

2004-04-30 Thread Manfred Koizar
On Thu, 29 Apr 2004 19:09:09 -0400, Joseph Shraibman [EMAIL PROTECTED] wrote: How does the analyzer/planner deal with rows clustered together? There's a correlation value per column. Just try SELECT attname, correlation FROM pg_stats WHERE tablename = '...'; if you

Re: [PERFORM] planner/optimizer question

2004-04-30 Thread Manfred Koizar
On Fri, 30 Apr 2004 19:46:24 +0200, Jochem van Dieten [EMAIL PROTECTED] wrote: While the storage overhead could be reduced to 1 bit (not a joke) You mean adding an isLossy bit and only where it is set the head tuple has to be checked for visibility, if it is not set the head tuple does not

Re: [PERFORM] planner/optimizer question

2004-05-02 Thread Manfred Koizar
On Sat, 01 May 2004 13:18:04 +0200, Jochem van Dieten [EMAIL PROTECTED] wrote: Tom Lane wrote: Oh really? I think you need to think harder about the transition conditions. Indeed. Dead-to-all is reasonably safe to treat as a hint bit because *it does not ever need to be undone*.

Re: [PERFORM] Shared buffers, Sort memory, Effective Cache Size

2004-04-27 Thread Manfred Koizar
On Wed, 21 Apr 2004 10:01:30 -0700, Qing Zhao [EMAIL PROTECTED] wrote: I have recently configured my PG7.3 on a G5 (8GB RAM) with shmmax set to 512MB and shared_buffer=5, sort_mem=4096 and effective cache size = 1. It seems working great so far but I am wondering if I should make effctive

Re: [PERFORM] Simply join in PostrgeSQL takes too long

2004-04-28 Thread Manfred Koizar
On Tue, 27 Apr 2004 18:01:34 -0400, Rod Taylor [EMAIL PROTECTED] wrote: On Tue, 2004-04-27 at 17:27, Vitaly Belman wrote: Hello pgsql-performance, I discussed the whole subject for some time in DevShed and didn't achieve much (as for results). I wonder if any of you guys can help out:

Re: [PERFORM] Simply join in PostrgeSQL takes too long

2004-04-29 Thread Manfred Koizar
On Thu, 29 Apr 2004 13:36:47 -0400, Rod Taylor [EMAIL PROTECTED] wrote: The reason for the function is that the sort routines (hash aggregation included) will not stop in mid-sort Good point. Servus Manfred ---(end of broadcast)--- TIP 9: the

Re: [PERFORM] Quad processor options

2004-05-12 Thread Manfred Koizar
On Tue, 11 May 2004 15:46:25 -0700, Paul Tuckfield [EMAIL PROTECTED] wrote: - the cache column shows that linux is using 2.3G for cache. (way too much) There is no such thing as way too much cache. you generally want to give memory to postgres to keep it close to the user, Yes, but only a

Re: [PERFORM] [HACKERS] Number of pages in a random sample

2004-04-29 Thread Manfred Koizar
On Mon, 26 Apr 2004 08:08:16 -0700, Sailesh Krishnamurthy [EMAIL PROTECTED] wrote: A Bi-Level Bernoulli Scheme for Database Sampling Peter Haas, Christian Koenig (SIGMOD 2004) Does this apply to our problem? AFAIK with Bernoulli sampling you don't know the sample size in advance. Anyway,

Re: [PERFORM] Use of Functional Indexs and Planner estimates

2004-06-08 Thread Manfred Koizar
On Tue, 8 Jun 2004 17:24:36 +1000, Russell Smith [EMAIL PROTECTED] wrote: Also I am interested in how functional indexes have statistics collected for them, if they do. Not in any released version. http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/backend/commands/analyze.c |

Re: [PERFORM] I could not get postgres to utilizy indexes

2004-08-20 Thread Manfred Koizar
On Thu, 19 Aug 2004 09:54:47 +0200, Leeuw van der, Tim [EMAIL PROTECTED] wrote: You asked the very same question yesterday, and I believe you got some useful answers. Why do you post the question again? Tim, no need to be rude here. We see this effect from time to time when a new user sends a

Re: [PERFORM] Large # of rows in query extremely slow, not using

2004-09-17 Thread Manfred Koizar
On Thu, 16 Sep 2004 20:51:11 -0500, Stephen Crowley [EMAIL PROTECTED] wrote: explain analyze select * from history where date='2004-09-07' and stock='ORCL' LIMIT 10; - Index Scan using island_history_date_stock_time on island_history (cost=0.00..183099.72 rows=102166 width=83) (actual

Re: [PERFORM] index scan on =, but not ?

2005-03-17 Thread Manfred Koizar
On Thu, 10 Mar 2005 10:24:46 +1000, David Brown [EMAIL PROTECTED] wrote: What concerns me is that this all depends on the correlation factor, and I suspect that the planner is not giving enough weight to this. The planner does the right thing for correlations very close to 1 (and -1) and for

Re: [PERFORM] cpu_tuple_cost

2005-03-17 Thread Manfred Koizar
On Mon, 14 Mar 2005 21:23:29 -0500, Tom Lane [EMAIL PROTECTED] wrote: I think that the reduce random_page_cost mantra is not an indication that that parameter is wrong, but that the cost models it feeds into need more work. One of these areas is the cost interpolation depending on correlation.

Re: [PERFORM] multi-column index

2005-03-17 Thread Manfred Koizar
On Wed, 16 Mar 2005 22:19:13 -0500, Tom Lane [EMAIL PROTECTED] wrote: calculate the correlation explicitly for each index May be it's time to revisit an old proposal that has failed to catch anybody's attention during the 7.4 beta period:

Re: [PERFORM] multi-column index

2005-03-18 Thread Manfred Koizar
On Thu, 17 Mar 2005 23:48:30 -0800, Ron Mayer [EMAIL PROTECTED] wrote: Would this also help estimates in the case where values in a table are tightly clustered, though not in strictly ascending or descending order? No, I was just expanding the existing notion of correlation from single columns to

Re: [PERFORM] multi-column index

2005-03-18 Thread Manfred Koizar
On Thu, 17 Mar 2005 13:15:32 -0500, Tom Lane [EMAIL PROTECTED] wrote: I am coming around to the view that we really do need to calculate index-specific correlation numbers, Correlation is a first step. We might also want distribution information like number of distinct index tuples and

Re: [PERFORM] Sort and index

2005-05-11 Thread Manfred Koizar
On Sun, 24 Apr 2005 17:01:46 -0500, Jim C. Nasby [EMAIL PROTECTED] wrote: Feel free to propose better cost equations. I did. More than once. estimated index scan cost for (project_id, id, date) is 0.00..100117429.34 while the estimate for work_units is 0.00..103168408.62; almost no

Re: [PERFORM] Sort and index

2005-05-12 Thread Manfred Koizar
On Wed, 11 May 2005 16:15:16 -0500, Jim C. Nasby [EMAIL PROTECTED] wrote: This is divided by the number of index columns, so the index correlation is estimated to be 0.219. That seems like a pretty bad assumption to make. Any assumption we make without looking at entire index tuples has to be

Re: [PERFORM] slow queries, possibly disk io

2005-05-31 Thread Manfred Koizar
On 5/31/05, Martin Fandel [EMAIL PROTECTED] wrote: In the documentation of http://www.powerpostgresql.com/Downloads/annotated_conf_80.html is the shared_buffers set to 1/3 of the availble RAM. Well, it says you should never use more than 1/3 of your available RAM which is not quite the same as

Re: [PERFORM] LEFT JOIN optimization

2005-09-12 Thread Manfred Koizar
On Mon, 12 Sep 2005 00:47:57 +0300, Ksenia Marasanova [EMAIL PROTECTED] wrote: - Seq Scan on user_ (cost=0.00..7430.63 rows=12763 width=245) (actual time=360.431..1120.012 rows=12763 loops=1) If 12000 rows of the given size are stored in more than 7000 pages, then there is a lot of free

Re: [PERFORM] Queries taking ages in PG 8.1, have been much faster in PG=8.0

2005-12-10 Thread Manfred Koizar
On Mon, 05 Dec 2005 10:11:41 -0500, Tom Lane [EMAIL PROTECTED] wrote: Correlation -0.0736492 Correlation -0.237136 That has considerable impact on the estimated cost of an indexscan The cost estimator uses correlationsquared. So all correlations between -0.3 and +0.3 can be

Re: [PERFORM] Joining 2 tables with 300 million rows

2005-12-12 Thread Manfred Koizar
On Thu, 8 Dec 2005 11:59:24 -0500 , Amit V Shah [EMAIL PROTECTED] wrote: CONSTRAINT pk_runresult_has_catalogtable PRIMARY KEY (runresult_id_runresult, catalogtable_id_catalogtable, value) ' - Index Scan using runresult_has_catalogtable_id_runresult on runresult_has_catalogtable