subject:"\[PERFORM\] planner\/optimizer question"

Re: [PERFORM] planner/optimizer question

2004-05-02 Thread Manfred Koizar

On Sat, 01 May 2004 13:18:04 +0200, Jochem van Dieten
[EMAIL PROTECTED] wrote:
Tom Lane wrote:
 Oh really?  I think you need to think harder about the transition
 conditions.

Indeed.

 
 Dead-to-all is reasonably safe to treat as a hint bit because *it does
 not ever need to be undone*.  Visible-to-all does not have that
 property.

Yes, really :-)

No, not really :-(

As Tom has explained in a nearby message his concern is that -- unlike
dead-to-all -- visible-to-all starts as false, is set to true at some
point in time, and is eventually set to false again.  Problems arise if
one backend wants to set visible-to-all to true while at the same time
another backend wants to set it to false.

This could be curable by using a second bit as a deleted flag (might be
even the same bit that's now used as dead-to-all, but I'm not sure).  An
index tuple having both the visible flag (formerly called
visible-to-all) and the deleted flag set would cause a heap tuple access
to check visibility.  But that leaves the question of what to do after
the deleting transaction has rolled back.  I see no clean way from the
visible-and-deleted state to visible-to-all.

This obviously needs another round of hard thinking ...


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

Re: [PERFORM] planner/optimizer question

2004-05-02 Thread Jochem van Dieten

Manfred Koizar said:

 As Tom has explained in a nearby message his concern is that --
 unlike dead-to-all -- visible-to-all starts as false, is set to true
 at some point in time, and is eventually set to false again.
 Problems arise if one backend wants to set visible-to-all to true
 while at the same time another backend wants to set it to false.

Got it, I misinterpreted his concern as visible-to-all should not be
set to true when the tuple is inserted.


 This could be curable by using a second bit as a deleted flag (might
 be even the same bit that's now used as dead-to-all, but I'm not
 sure).  An index tuple having both the visible flag (formerly called
 visible-to-all) and the deleted flag set would cause a heap tuple
 access to check visibility.

Or in a more generalized way: with 2 bits written at the same time you
can express 4 states. But only 3 actions need to be signalled:
dead-to-all, visible-to-all and check-heap. So we can have 2 states
that both signal check-heap.

The immediate solution to the race condition Tom presented would be to
have the transaction that invalidates the heap tuple switch the index
tuple from the one check-heap state to the other. The transaction that
wants to update to visible-to-all can now see that the state has
changed (but not the meaning) and aborts its change.


 But that leaves the question of what to
 do after the deleting transaction has rolled back.  I see no clean
 way from the visible-and-deleted state to visible-to-all.

I'm afraid I don't know enough about the inner workings of rollbacks
to determine how the scenario A determines visible-to-all should be
set, B invalidates tuple, B rolls back, C invalidates tuple, C
commits, A reaquires lock on index would work out. I guess I have
some more reading to do.

But if you don't roll back too often it wouldn't even be a huge
problem to just leave them in visible-and-deleted state until
eventually they go into the dead-to-all state.

Jochem





---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [PERFORM] planner/optimizer question

2004-05-01 Thread Jochem van Dieten

Tom Lane wrote:
Manfred Koizar [EMAIL PROTECTED] writes:
Yes, the visible-to-all flag would be set as a by-product of an index
scan, if the heap tuple is found to be visible to all active
transactions.  This update is non-critical
Oh really?  I think you need to think harder about the transition
conditions.
Dead-to-all is reasonably safe to treat as a hint bit because *it does
not ever need to be undone*.  Visible-to-all does not have that
property.
Yes, really :-)
When a tuple is inserted the visible-to-all flag is set to false. 
The effect of this is that every index scan that finds this tuple 
has to visit the heap to verify visibility. If it turns out the 
tuple is not only visible to the current transaction, but to all 
current transactions, the visible-to-all flag can be set to true.
This is non-critical, because if it is set to false scans will 
not miss the tuple, they will just visit the heap to verify 
visibility.

The moment the heap tuple is updated/deleted the visible-to-all 
flag needs to be set to false again in all indexes. This is 
critical, and the I/O and (dead)lock costs of unsetting the 
visible-to-all flag are unknown and might be big enough to ofset 
any advantage on the selects.

But I believe that for applications with a load, select, drop 
usage pattern (warehouses, archives etc.) having this 
visible-to-all flag would be a clear winner.

Jochem
--
I don't get it
immigrants don't work
and steal our jobs
- Loesje
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html

Re: [PERFORM] planner/optimizer question

2004-05-01 Thread Tom Lane

Jochem van Dieten [EMAIL PROTECTED] writes:
 The moment the heap tuple is updated/deleted the visible-to-all 
 flag needs to be set to false again in all indexes. This is 
 critical,

Exactly.  This gets you out of the hint-bit semantics and into a ton
of interesting problems, such as race conditions.  (Process A determines
that tuple X is visible-to-all, and goes to update the index tuple.
Before it can reacquire lock on the index page, process B updates the
heap tuple and visits the index to clear the flag bit.  Once A obtains
lock it will set the flag bit.  Oops.)

Basically what you are buying into with such a thing is multiple copies
of critical state.  It may be only one bit rather than several words,
but updating it is no less painful than if it were a full copy of the
tuple's commit status.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [PERFORM] planner/optimizer question

2004-04-30 Thread Gary Doades

On 29 Apr 2004 at 19:17, Tom Lane wrote:

 Josh Berkus [EMAIL PROTECTED] writes:
  Certainly the fact that MSSQL is essentially a single-user database makes 
  things easier for them.
 
 Our recent testing (cf the Xeon thread) says that the interlocking we
 do to make the world safe for multiple backends has a fairly high cost
 (at least on some hardware) compared to the rest of the work in
 scenarios where you are doing zero-I/O scans of data already in memory.
 Especially so for index scans.  I'm not sure this completely explains
 the differential that Gary is complaining about, but it could be part of
 it.  Is it really true that MSSQL doesn't support concurrent operations?
 
   regards, tom lane

As far as I am aware SQLSever supports concurrent operations. It 
certainly creates more threads for each connection. None of my 
observations of the system under load (50 ish concurrent users, 150 ish 
connections) suggest that it is serializing queries.

These tests are currentl on single processor Athlon XP 2000+ systems.

Regards,
Gary.

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [PERFORM] planner/optimizer question

2004-04-30 Thread Gary Doades

On 30 Apr 2004 at 7:26, Dennis Bjorklund wrote:

 On Fri, 30 Apr 2004, Gary Doades wrote:

  I should have also pointed out that MSSQL reported that same index scan
  as taking 65% of the overall query time. It was just faster. The
  overall query took 103ms in MSSQL.

 Are your results based on a single client accessing the database and no
 concurrent updates?

 Would adding more clients, and maybe having some client that
 updates/inserts into the tables, still make mssql faster then pg? Maybe
 it's so simple as pg being optimized for more concurrent users then mssql?

 I'm just asking, I don't know much about the inner workings of
 mssql.

 --
 /Dennis Björklund


At the moment it is difficult to set up many clients for testing concurrent
stuff. In the past I have had several SQLServer clients under test,
mainly select queries. MSSQL can certainly execute queries while other
queries are still running in the background.

Our production app is fairly well biased towards selects. Currently it is
about 70% selects, 20% inserts, 6% deletes and 4% updates. Very few
updates are more than one row based on the primary key. Over 90% of
the time spend running SQL is in select queries.

My limited concurrent testing on Postgres gives very good performance
on updates, inserts, deletes, but it is suffering on the selects in certain
areas which why I have been concentrating my efforts on that area.

Having got similar (or the same) access plans in both Postgres and
MSSQL I was getting down to the next level of checking what was going
on when executing the already planned query.

I do have another database system I could try. Sybase SQLAnywhere.
This is not the original Sybase Entrerprise which has the same roots as
MSSQL. In the past my testing suggested that SQLAnywhere
performance was as godd or better than MSSQL. I mey try to set it up
with the same data in these tests for a more detailed comparison.

Regards,
Gary.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [PERFORM] planner/optimizer question

2004-04-30 Thread Jochem van Dieten

Manfred Koizar wrote:
On Wed, 28 Apr 2004 09:05:04 -0400, Tom Lane [EMAIL PROTECTED] wrote:
[ ... visibility information in index tuples ... ]
Storing that information would at least double the overhead space used
for each index tuple.  The resulting index bloat would significantly
slow index operations by requiring more I/O.  So it's far from clear
that this would be a win, even for those who care only about select
speed.
While the storage overhead could be reduced to 1 bit (not a joke)
You mean adding an isLossy bit and only where it is set the head 
tuple has to be checked for visibility, if it is not set the head 
tuple does not have to be checked?


we'd
still have the I/O overhead of locating and updating index tuples for
every heap tuple deleted/updated.
Would there be additional I/O for the additional bit in the index 
tuple (I am unable to find the layout of index tuple headers in 
the docs)?

Jochem
--
I don't get it
immigrants don't work
and steal our jobs
- Loesje

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly

Re: [PERFORM] planner/optimizer question

2004-04-30 Thread Gary Doades

On 30 Apr 2004 at 8:32, Jeff wrote:
 
   A better comparision query may be a simple select a from mytable 
 where a between foo and bar  to get an index scan.  In that case its a 
 straight up, vanilla index scan.  Nothing else getting in the way.
 

Yes, you're right and I have done this just to prove to myself that it is the index 
scan that 
is the bottleneck. I have some complex SQL that executes very quickly with Postgres, 
similar to MSSQL, but the index scans in most of those only touch a few rows for a few 
loops. It seems to be a problem when the index scan is scanning very many rows and 
for each of these it has to go to the table just to find out if the index it just 
looked at is 
still valid.

Gary.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [PERFORM] planner/optimizer question

2004-04-30 Thread Gary Doades

On 30 Apr 2004 at 9:37, Kevin Barnard wrote:

 
 I was always under the impression that MSSQL used leaf and row level locking and 
 therefore 
 was not a concurrent, in the same sense that postgres is, database. It would still 
 allow for 
 concurrent connections and such but updates will get blocked/ delayed. I might be 
 wrong.
 

Ultimately you may be right. I don't know enough about SQLServer 
internals to say either way. Anyway, most of our system is in selects for 
70% of the time. I could try and set up a test for this when I get a bit 
more time.

Unfortunately I suspect that this topic won't get taken much further. In 
order to test this it would mean modifying quite a bit of code. Whether 
putting additional info in the index header and not visiting the data row 
if all the required data is in the index would be beneficial would require 
quite a bit of work by someone who knows more than I do. I reckon that 
no-one has the time to do this at the moment.

Regards,
Gary.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [PERFORM] planner/optimizer question

2004-04-30 Thread Kris Jurka



On Fri, 30 Apr 2004, Gary Doades wrote:

 Yes, you're right and I have done this just to prove to myself that it
 is the index scan that is the bottleneck. I have some complex SQL that
 executes very quickly with Postgres, similar to MSSQL, but the index
 scans in most of those only touch a few rows for a few loops. It seems
 to be a problem when the index scan is scanning very many rows and for
 each of these it has to go to the table just to find out if the index it
 just looked at is still valid.
 

Another way to speed this up is the TODO item: Use bitmaps to fetch 
heap pages in sequential order  For an indexscan that fetches a number 
of rows those rows may be anywhere in the base table so as each index 
entry is found it fetches the corresponding table row from the heap.  This 
is not ideal because you can be jumping around in the heap and end up 
fetching the same page multiple times because table rows are in the same 
page, but were found in different places in the index.

Kris Jurka

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [PERFORM] planner/optimizer question

2004-04-30 Thread Manfred Koizar

On Fri, 30 Apr 2004 19:46:24 +0200, Jochem van Dieten
[EMAIL PROTECTED] wrote:
 While the storage overhead could be reduced to 1 bit (not a joke)

You mean adding an isLossy bit and only where it is set the head 
tuple has to be checked for visibility, if it is not set the head 
tuple does not have to be checked?

Yes, something like this.  Actually I imagined it the other way round: a
visible-to-all flag similar to the existing dead-to-all flag (search for
LP_DELETE and ItemIdDeleted in nbtree.c).

 we'd
 still have the I/O overhead of locating and updating index tuples for
 every heap tuple deleted/updated.

Would there be additional I/O for the additional bit in the index 
tuple (I am unable to find the layout of index tuple headers in 
the docs)?

Yes, the visible-to-all flag would be set as a by-product of an index
scan, if the heap tuple is found to be visible to all active
transactions.  This update is non-critical and, I think, not very
expensive.

Deleting (and hence updating) a tuple is more critical, regarding both
consistency and performance.  We'd have to locate all index entries
pointing to the heap tuple and set their visible-to-all flags to false.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Dennis Bjorklund

On Fri, 30 Apr 2004, Gary Doades wrote:

 I should have also pointed out that MSSQL reported that same index scan
 as taking 65% of the overall query time. It was just faster. The
 overall query took 103ms in MSSQL.

Are your results based on a single client accessing the database and no 
concurrent updates?

Would adding more clients, and maybe having some client that
updates/inserts into the tables, still make mssql faster then pg? Maybe
it's so simple as pg being optimized for more concurrent users then mssql?

I'm just asking, I don't know much about the inner workings of 
mssql.

-- 
/Dennis Björklund


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Gary Doades

 
 I guess the real question is, why maintain index flags and not simply
 drop the index entry altogether?
 
 A more interesting case would be to have the backend process record
 index tuples that it would invalidate (if committed), then on commit
 send that list to a garbage collection process.
 
 It's still vacuum -- just the reaction time for it would be much
 quicker.
 
This was my original question.

I guess the problem is with MVCC. The row may have gone from your 
current view of the table but not from someone elses. I don't (yet) 
understand the way it works to say for sure, but I still think it is worth 
pursuing further for someone who does know the deep stuff. They seem 
to have concluded that it is not worth it however.

Cheers,
Gary.




---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Gary Doades


 It's also entirely possible your indices are using inaccurate
 statistical information.  Have you ANALYZEd recently?
 

In this example the statistics don't matter. The plans used were the same for 
MSSQL and Postgres. I was trying to eliminate the difference in plans 
between the two, which obviously does make a difference, sometimes in 
MSSQL favour and sometimes the other way round. Both systems, having 
decided to do the same index scan, took noticably different times. The 
Postgres database was fully vacuumed and analysed anyway.

I agree about MSSQL recovery time. it sucks. This is why they are making a 
big point about the improved recovery time in yukon. Although the recovery 
time is important, I see this as an exception, whereas at the moment I am 
interested in the everyday.

Cheers,
Gary.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Gary Doades

On 29 Apr 2004 at 13:54, Josh Berkus wrote:

 Gary,
 
 
 It's also quite possble the MSSQL simply has more efficient index scanning 
 implementation that we do.They've certainly had incentive; their storage 
 system sucks big time for random lookups and they need those fast indexes.  
 (just try to build a 1GB adjacency list tree on SQL Server.   I dare ya).
 
 Certainly the fact that MSSQL is essentially a single-user database makes 
 things easier for them.They don't have to maintain multiple copies of the 
 index tuples in memory.I think that may be our main performance loss.
 

Possibly, but MSSQL certainly uses data from indexes and cuts out the 
subsequent (possibly random seek) data fetch. This is also why the 
Index Tuning Wizard often recommends multi column compound 
indexes in some cases. I've tried these recommendations on occasions 
and they certainly speed up the selects significantly. If anyhing the index 
scan on the new compound index must be slower then the original single 
column index and yet it still gets the data faster.

This indicates to me that it is not the scan (or IO) performance that is 
making the difference, but not having to go get the data row.

Cheers,
Gary.


---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Tom Lane

Gary Doades [EMAIL PROTECTED] writes:
 In this example the statistics don't matter.

Don't they?

A prior poster mentioned that he thought MSSQL tries to keep all its
indexes in memory.  I wonder whether you are giving Postgres a fair
chance to do the same.  What postgresql.conf settings are you using?

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Gary Doades

 
 Having picked out an index scan as being the highest time user I 
 concentrated on that in  this case and compared the same index scan on 
 MSSQL. At least MSSQL reported it as  an index scan on the same index 
 for the same number of rows. 
 

I should have also pointed out that MSSQL reported that same index scan as taking 65% 
of the overall query time.
It was just faster. The overall query took 103ms in MSSQL.

Gary.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Tom Lane

Josh Berkus [EMAIL PROTECTED] writes:
 Certainly the fact that MSSQL is essentially a single-user database makes 
 things easier for them.

Our recent testing (cf the Xeon thread) says that the interlocking we
do to make the world safe for multiple backends has a fairly high cost
(at least on some hardware) compared to the rest of the work in
scenarios where you are doing zero-I/O scans of data already in memory.
Especially so for index scans.  I'm not sure this completely explains
the differential that Gary is complaining about, but it could be part of
it.  Is it really true that MSSQL doesn't support concurrent operations?

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [PERFORM] planner/optimizer question

2004-04-28 Thread Gary Doades

I know you will shoot me down, but...

Why is there an entry in the index for a row if the row is not valid? 
Wouldn't it be better for the index entry validity to track the row validity. 
If a particular data value for a query (join, where etc.) can be satisfied 
by the index entry itself this would be a big performance gain.

Cheers,
Gary.

On 28 Apr 2004 at 0:27, Tom Lane wrote:

 [EMAIL PROTECTED] writes:
  ... Wouldn't the most efficient plan be to scan the index regardless
  of crm_id because the only columns needed are in the index?
 
 No.  People coming from other databases often have the misconception
 that queries can be answered by looking only at an index.  That is never
 true in Postgres because row validity info is only stored in the table;
 so we must always visit the table entry to make sure the row is still
 valid/visible for the current query.
 
 Accordingly, columns added to the index that aren't constrained by the
 WHERE clause are not very useful ...
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 8: explain analyze is your friend



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [PERFORM] planner/optimizer question

2004-04-28 Thread Christopher Kings-Lynne

Why is there an entry in the index for a row if the row is not valid? 
Wouldn't it be better for the index entry validity to track the row validity. 
If a particular data value for a query (join, where etc.) can be satisfied 
by the index entry itself this would be a big performance gain.
For SELECTs, yes - but for INSERT, UPDATE and DELETE it would be a big 
performance loss.

Chris
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [PERFORM] planner/optimizer question

2004-04-28 Thread Gary Doades

I can understand the performance loss on non-selects for keeping the 
index validity state tracking the row validity, but would that outweigh the 
performance gains on selects? Depends on your mix of selects to non 
selects I guess, but other database systems seem to imply that keeping 
the index on track is worth it overall.

Cheers,
Gary.

On 28 Apr 2004 at 15:04, Christopher Kings-Lynne wrote:

  Why is there an entry in the index for a row if the row is not valid? 
  Wouldn't it be better for the index entry validity to track the row validity. 
  If a particular data value for a query (join, where etc.) can be satisfied 
  by the index entry itself this would be a big performance gain.
 
 For SELECTs, yes - but for INSERT, UPDATE and DELETE it would be a big 
 performance loss.
 
 Chris
 



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [PERFORM] planner/optimizer question

2004-04-28 Thread Tom Lane

Manfred Koizar [EMAIL PROTECTED] writes:
 On Wed, 28 Apr 2004 07:35:41 +0100, Gary Doades [EMAIL PROTECTED]
 wrote:
 Why is there an entry in the index for a row if the row is not valid? 

 Because whether a row is seen as valid or not lies in the eye of the
 transaction looking at it.  Full visibility information is stored in the
 heap tuple header.  The developers' consensus is that this overhead
 should not be in every index tuple.

Storing that information would at least double the overhead space used
for each index tuple.  The resulting index bloat would significantly
slow index operations by requiring more I/O.  So it's far from clear
that this would be a win, even for those who care only about select
speed.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [PERFORM] planner/optimizer question

2004-04-27 Thread Atesz

Hi,

You should try the next queries:

select support_person_id from ticket_crm_map where crm_id = 7 GROUP BY
support_person_id;
select support_person_id from ticket_crm_map where crm_id = 1 GROUP BY
support_person_id;

It can use the 'ticket_crm_map_crm_id_suppid' index. 

Generally the Postgres use an k-column index if columns of your
conditions are prefix of the index column.
For example:
CREATE INDEX test_idx on test(col1,col2,col3,col4);
SELECT * FROM test WHERE col1=3 AND col2=13;  -- This can use the index.

But the next queries cannot use the index:
SELECT * FROM test WHERE col1=3 AND col3=13;.
SELECT * FROM test WHERE col2=3;

If you have problem with seq_scan or sort, you can disable globally and
locally: 
SET enable_seqscan=0;
SET enable_sort = 0;

Regards, Antal Attila



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [PERFORM] planner/optimizer question

2004-04-27 Thread Tom Lane

[EMAIL PROTECTED] writes:
 ... Wouldn't the most efficient plan be to scan the index regardless
 of crm_id because the only columns needed are in the index?

No.  People coming from other databases often have the misconception
that queries can be answered by looking only at an index.  That is never
true in Postgres because row validity info is only stored in the table;
so we must always visit the table entry to make sure the row is still
valid/visible for the current query.

Accordingly, columns added to the index that aren't constrained by the
WHERE clause are not very useful ...

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend

[PERFORM] planner/optimizer question

2004-04-26 Thread brad-pgperf

Hi, 

I have a query which I think should be using an index all of the time but 
postgres only uses the index part of the time.  The index 
(ticket_crm_map_crm_id_suppid) has the where clause column (crm_id) listed 
first followed by the selected column (support_person_id).  Wouldn't the 
most efficient plan be to scan the index regardless of crm_id because the 
only columns needed are in the index?  Below is the table, 2 queries showing 
the difference in plans, followed by the record distribution of 
ticket_crm_map.  I first did a 'vacuum analyze' and am running postgres 
7.4.2. 

Thanks,
Brad 

athenapost= \d ticket_crm_map
   Table public.ticket_crm_map
 Column |Type | 
Modifiers
+-+- 
---
tcrm_map_id| integer | not null
ticket_id  | integer | not null
crm_id | integer | not null
support_person_id  | integer | not null
escalated_to_person_id | integer | not null
status | character varying(50)   | not null default 
'Open'::character varying
close_date | timestamp without time zone |
updated_date   | timestamp without time zone |
updated_by | character varying(255)  |
created_date   | timestamp without time zone |
created_by | character varying(255)  |
additional_info| text|
subject| character varying(255)  |
Indexes:
ticket_crm_map_pkey primary key, btree (tcrm_map_id)
ticket_crm_map_crm_id_key unique, btree (crm_id, ticket_id)
ticket_crm_map_crm_id_suppid btree (crm_id, support_person_id)
ticket_crm_map_status btree (status)
ticket_crm_map_ticket_id btree (ticket_id)
Foreign-key constraints:
$1 FOREIGN KEY (ticket_id) REFERENCES ticket(ticket_id)
$2 FOREIGN KEY (crm_id) REFERENCES company_crm(crm_id)
$3 FOREIGN KEY (support_person_id) REFERENCES person(person_id)
$4 FOREIGN KEY (escalated_to_person_id) REFERENCES person(person_id)
$5 FOREIGN KEY (status) REFERENCES ticket_status(status) 

athenapost= explain analyze select distinct support_person_id from 
ticket_crm_map where crm_id = 7;

QUERY PLAN
 
 
--
Unique  (cost=1262.99..1265.27 rows=1 width=4) (actual time=15.335..18.245 
rows=20 loops=1)
-  Sort  (cost=1262.99..1264.13 rows=456 width=4) (actual 
time=15.332..16.605 rows=2275 loops=1)
 Sort Key: support_person_id
 -  Index Scan using ticket_crm_map_crm_id_suppid on ticket_crm_map  
(cost=0.00..1242.85 rows=456 width=4) (actual time=0.055..11.281 rows=2275 
loops=1)
   Index Cond: (crm_id = 7)
Total runtime: 18.553 ms
(6 rows) 

Time: 20.598 ms
athenapost= explain analyze select distinct support_person_id from 
ticket_crm_map where crm_id = 1;
   QUERY PLAN
 
-
Unique  (cost=10911.12..11349.26 rows=32 width=4) (actual 
time=659.102..791.517 rows=24 loops=1)
-  Sort  (cost=10911.12..11130.19 rows=87628 width=4) (actual 
time=659.090..713.285 rows=93889 loops=1)
 Sort Key: support_person_id
 -  Seq Scan on ticket_crm_map  (cost=0.00..3717.25 rows=87628 
width=4) (actual time=0.027..359.299 rows=93889 loops=1)
   Filter: (crm_id = 1)
Total runtime: 814.601 ms
(6 rows) 

Time: 817.095 ms
athenapost= select count(*), crm_id from ticket_crm_map group by crm_id;
count | crm_id
---+
2554 | 63
129 | 25
17 | 24
110 | 23
74 | 22
69 | 21
 2 | 20
53 | 82
10 | 17
16 | 81
46637 | 16
14 | 80
 2 | 15
1062 | 79
87 | 78
93 | 77
60 | 44
363 | 76
225 | 10
 4 | 74
83 |  9
27 | 73
182 |  8
2275 |  7
15 | 71
554 |  6
44 | 70
631 |  5
37 |  4
190 |  3
112 |  2
93889 |  1
(32 rows) 

Time: 436.697 ms 

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

Re: [PERFORM] planner/optimizer question

[PERFORM] planner/optimizer question

25 matches

Site Navigation

Mail list logo

Footer information