Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-30 Thread Zeugswetter Andreas ADI SD

  The plural seems better to me; there's no such thing as a solitary
  synchronized scan, no?  The whole point of the feature is to affect
  the behavior of multiple scans.
 
 +1. The plural is important IMHO.

ok, good.

 As I stated earlier, I don't really like this argument (we already
 broke badly designed applications a few times in the past) but we
 really need a way to guarantee that the execution of a query is stable
 and doesn't depend on external factors. And the original problem was
 to guarantee that pg_dump builds a dump as identical as possible to
 the existing data by ignoring external factors. It's now the case with
 your patch.
 The fact that it allows us not to break existing applications relying
 too much on physical ordering is a nice side effect though :).

One more question. It would be possible that a session that turned off
the synchronized_seqscans still be a pack leader for other later
sessions.
Do/should we consider that ?

The procedure would be:
start from page 0
iff no other pack is present fill the current scan position for others

Andreas

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-30 Thread Kenneth Marshall
On Wed, Jan 30, 2008 at 10:56:47AM +0100, Zeugswetter Andreas ADI SD wrote:
 
   The plural seems better to me; there's no such thing as a solitary
   synchronized scan, no?  The whole point of the feature is to affect
   the behavior of multiple scans.
  
  +1. The plural is important IMHO.
 
 ok, good.
 
  As I stated earlier, I don't really like this argument (we already
  broke badly designed applications a few times in the past) but we
  really need a way to guarantee that the execution of a query is stable
  and doesn't depend on external factors. And the original problem was
  to guarantee that pg_dump builds a dump as identical as possible to
  the existing data by ignoring external factors. It's now the case with
  your patch.
  The fact that it allows us not to break existing applications relying
  too much on physical ordering is a nice side effect though :).
 
 One more question. It would be possible that a session that turned off
 the synchronized_seqscans still be a pack leader for other later
 sessions.
 Do/should we consider that ?
 
 The procedure would be:
 start from page 0
 iff no other pack is present fill the current scan position for others
 

I think that allowing other scans to use the scan started by a query that
disabled the sync scans would have value. It would prevent these types
of queries from completely tanking the I/O.

+1

Ken

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-30 Thread Alvaro Herrera
Zeugswetter Andreas ADI SD escribió:
 
   The plural seems better to me; there's no such thing as a solitary
   synchronized scan, no?  The whole point of the feature is to affect
   the behavior of multiple scans.
  
  +1. The plural is important IMHO.
 
 ok, good.

Hmm, if you guys are going to add another GUC variable, please hurry
because we have to translate the description text.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-30 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Hmm, if you guys are going to add another GUC variable, please hurry
 because we have to translate the description text.

Yeah, I'm going to put it in today --- just the on/off switch.
Any discussions of exposing threshold parameters will have to wait
for 8.4.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-30 Thread Tom Lane
Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes:
 One more question. It would be possible that a session that turned off
 the synchronized_seqscans still be a pack leader for other later
 sessions.
 Do/should we consider that ?

Seems like a reasonable thing to consider ... for 8.4.  I'm not willing
to go poking the syncscan code that much at this late point in the 8.3
cycle.  (I'm not sure if it's been mentioned yet on -hackers, but the
current plan is to freeze 8.3.0 tomorrow evening.)

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-30 Thread Simon Riggs
On Wed, 2008-01-30 at 13:07 -0500, Tom Lane wrote:
 Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes:
  One more question. It would be possible that a session that turned off
  the synchronized_seqscans still be a pack leader for other later
  sessions.
  Do/should we consider that ?
 
 Seems like a reasonable thing to consider ... for 8.4. 

Definitely. I thought about this the other day and decided it had some
strange behaviour in some circumstances, so wouldn't be desirable
overall.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-30 Thread Heikki Linnakangas

Simon Riggs wrote:

On Wed, 2008-01-30 at 18:42 +, Heikki Linnakangas wrote:
It's even worse than that. Elsewhere in this thread Simon mentioned a 
partitioned table, where each partition on its own is smaller than the 
threshold, but you're seq scanning several partitions and the total size 
of the seq scans is larger than memory size. In that scenario, you would 
want BAS and synchronized scans, but even a per-table setting wouldn't 
cut it.


For synchronized scans to help in the partitioned situation, I guess 
you'd want to synchronize across partitions. If someone is already 
scanning partition 5, you'd want to start from that partition and join 
the pack, instead of starting from partition 1.


You're right, but in practice its not quite that bad with the
multi-table route. When you have partitions you generally exclude most
of them, with typically 1-2 per query, usually different ones.


Yep. And in that case, you *don't'* want BAS or sync scans to kick in, 
because you're only accessing a relatively small chunk of data, and it's 
worthwhile to cache it.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-29 Thread Zeugswetter Andreas ADI SD

 It's a good point that we don't want pg_dump to screw up the cluster 
 order, but that's the only use case I've seen this far for disabling 
 sync scans. Even that wouldn't matter much if our estimate for 
 clusteredness didn't get screwed up by a table that looks 
 like this: 
 5 6 7 8 9 1 2 3 4

I do think the guc to turn it off is useful, only I don't understand the
reasoning that pg_dump needs it to maintain the basic clustered
property.

Sorry, but I don't grok this at all.
Why the heck would we care if we have 2 parts of the table perfectly
clustered,
because we started in the middle ? Surely our stats collector should
recognize
such a table as perfectly clustered. Does it not ? We are talking about
one
breakage in the readahead logic here, this should only bring the
clustered property
from 100% to some 99.99% depending on table size vs readahead window.

Andreas

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-29 Thread Gregory Stark

Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes:

 Sorry, but I don't grok this at all. Why the heck would we care if we have 2
 parts of the table perfectly clustered, because we started in the middle ?
 Surely our stats collector should recognize such a table as perfectly
 clustered. Does it not ? We are talking about one breakage in the readahead
 logic here, this should only bring the clustered property from 100% to some
 99.99% depending on table size vs readahead window.

Well clusteredness is used or could be used for a few different heuristics,
not all of which this would be quite as well satisfied as readahead. But for
the most common application, namely trying to figure out whether index probes
for sequential ids will be sequential i/o or random i/o you're right.

Currently the statistic we use to estimate this is the correlation of the
column value with the physical location on disk. That's not a perfect metric
for estimating how much random i/o would be needed to scan the table in index
order though.

It would be great if Postgres picked up a serious statistics geek who could
pipe up in discussions like this with how about using the Euler-Jacobian
Centroid or some such thing. If you have any suggestions of what metric to
use and how to calculate the info we need from it that would be great.

One suggestion from a long way back was scanning the index and counting how
many times the item pointer moves backward to an earlier block. That would
still require a full index scan though. And it doesn't help for columns which
aren't indexed though I'm not sure we need this info for columns which aren't
indexed. It's also not clear how to interpolate from that the amount of random
access a given query would perform.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-29 Thread Zeugswetter Andreas ADI SD

  +1. If we go with 'enable_sync_seqcans' for 8.3, and in a future
release 
  cycle we do test the cases Simon described above and we agree we
need to 
  do a fine tune to benefit from this feature, we will need to
deprecate 
  'enable_sync_seqscans' and invent another one
(sync_seqscans_threshold). 
  Looking at this perpective, IMHO we should go with the number (0.25)

  instead of the boolean.
 
 Surely the risk-of-needing-to-deprecate argument applies ten times
more
 strongly to a number than a boolean.

Yes, I would expect the tuning to be more system than user specific.
So imho a boolean userset would couple well with a tuning guc, that
may usefully not be userset (if we later discover a need for tuning at
all).

so +1 for the bool. 

synchronize[d]_seqscan sounds a bit better in my ears than the plural
synchronize_seqscans.
To me the latter somehow suggests influece on the whole cluster,
probably not 
worth further discussion though, so if someone says no, ok.

Andreas


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-29 Thread Kenneth Marshall
On Tue, Jan 29, 2008 at 10:40:40AM +0100, Zeugswetter Andreas ADI SD wrote:
 
  It's a good point that we don't want pg_dump to screw up the cluster 
  order, but that's the only use case I've seen this far for disabling 
  sync scans. Even that wouldn't matter much if our estimate for 
  clusteredness didn't get screwed up by a table that looks 
  like this: 
  5 6 7 8 9 1 2 3 4
 
 I do think the guc to turn it off is useful, only I don't understand the
 reasoning that pg_dump needs it to maintain the basic clustered
 property.
 
 Sorry, but I don't grok this at all.
 Why the heck would we care if we have 2 parts of the table perfectly
 clustered,
 because we started in the middle ? Surely our stats collector should
 recognize
 such a table as perfectly clustered. Does it not ? We are talking about
 one
 breakage in the readahead logic here, this should only bring the
 clustered property
 from 100% to some 99.99% depending on table size vs readahead window.
 
 Andreas
 

Andreas,

I agree with your logic. If the process that PostgreSQL uses to determine
how clustered a table is that breaks with such a layout, we may need to
see what should be changed to make it work. Having had pg_dump cause a
database to grind to a halt, I would definitely like the option of using
the synchronized scans even for clustered tables.

Ken

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-29 Thread Tom Lane
Euler Taveira de Oliveira [EMAIL PROTECTED] writes:
 +1. If we go with 'enable_sync_seqcans' for 8.3, and in a future release 
 cycle we do test the cases Simon described above and we agree we need to 
 do a fine tune to benefit from this feature, we will need to deprecate 
 'enable_sync_seqscans' and invent another one (sync_seqscans_threshold). 
 Looking at this perpective, IMHO we should go with the number (0.25) 
 instead of the boolean.

Surely the risk-of-needing-to-deprecate argument applies ten times more
strongly to a number than a boolean.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-29 Thread Jeff Davis
On Tue, 2008-01-29 at 10:55 +, Gregory Stark wrote:
 Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes:
 
  Sorry, but I don't grok this at all. Why the heck would we care if we have 2
  parts of the table perfectly clustered, because we started in the middle ?
  Surely our stats collector should recognize such a table as perfectly
  clustered. Does it not ? We are talking about one breakage in the readahead
  logic here, this should only bring the clustered property from 100% to some
  99.99% depending on table size vs readahead window.
 
 Well clusteredness is used or could be used for a few different heuristics,
 not all of which this would be quite as well satisfied as readahead. But for

Can you give an example? Treating a file as a circular structure does
not impose any significant cost that I can see.

 It would be great if Postgres picked up a serious statistics geek who could
 pipe up in discussions like this with how about using the Euler-Jacobian
 Centroid or some such thing. If you have any suggestions of what metric to
 use and how to calculate the info we need from it that would be great.

Agreed.

 One suggestion from a long way back was scanning the index and counting how
 many times the item pointer moves backward to an earlier block. That would

An interesting metric. As you say, we really need a statistician to
definitively say what the correct metrics are, and what kind of sampling
we need to make good estimates.

 still require a full index scan though. And it doesn't help for columns which
 aren't indexed though I'm not sure we need this info for columns which aren't
 indexed. It's also not clear how to interpolate from that the amount of random
 access a given query would perform.

I don't think clusteredness has any meaning at all in postgres for an
unindexed column. I suppose a table could be clustered without an index,
but currently there's no way to do that in postgresql.

Regards,
Jeff Davis




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-29 Thread Euler Taveira de Oliveira

Simon Riggs wrote:


And if you have a partitioned table with partitions inconveniently
sized? You'd need to *reduce* shared_buffers specifically to get synch
scans and BAS to kick in. Or increase partition size. Both of which
reduce the impact of the benefits we've added.

I don't think the argument that a table is smaller than shared buffers
therefore it is already in shared buffers holds true in all cases. I/O
does matter.

+1. If we go with 'enable_sync_seqcans' for 8.3, and in a future release 
cycle we do test the cases Simon described above and we agree we need to 
do a fine tune to benefit from this feature, we will need to deprecate 
'enable_sync_seqscans' and invent another one (sync_seqscans_threshold). 
Looking at this perpective, IMHO we should go with the number (0.25) 
instead of the boolean.



--
  Euler Taveira de Oliveira
  http://www.timbira.com/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-29 Thread Kevin Grittner
 On Tue, Jan 29, 2008 at  1:09 PM, in message [EMAIL PROTECTED],
Tom Lane [EMAIL PROTECTED] wrote: 
 Or is someone prepared to argue that there are no applications out
 there that will be broken if the same query, against the same unchanging
 table, yields different results from one trial to the next?
 
If geqo kicks in, we're already there, aren't we?
 
Isn't an application which counts on the order of result rows
without specifying ORDER BY fundamentally broken?
 
-Kevin
 



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-29 Thread Tom Lane
Kevin Grittner [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] wrote: 
 Or is someone prepared to argue that there are no applications out
 there that will be broken if the same query, against the same unchanging
 table, yields different results from one trial to the next?
 
 If geqo kicks in, we're already there, aren't we?

Yup, and that's one of the reasons we have a way to turn geqo off.
(geqo is actually a good precedent for this --- notice that it has
an on/off switch that's separate from its tuning knobs.)

 Isn't an application which counts on the order of result rows
 without specifying ORDER BY fundamentally broken?

No doubt, but if it's always worked before, people are going to be
unhappy anyway.

Also, it's not just ordering that's at stake.  Try

regression=# create table foo as select x from generate_series(1,100) x;
SELECT
regression=# select * from foo limit 1;
   x   
---
 1
 2
 3
 4
 
regression=# select * from foo limit 1;
   x   
---
  7233
  7234
  7235
  7236
  
regression=# select * from foo limit 1;
   x   
---
 14465
 14466
 14467
 14468
 

Now admittedly we've never promised LIMIT without ORDER BY to be
well-defined either, but not everybody reads the fine print.
This case is particularly nasty because at smaller LIMIT values
the result *is* consistent, so you might never notice the problem
while testing.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-29 Thread Tom Lane
Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes:
 synchronize[d]_seqscan sounds a bit better in my ears than the plural
 synchronize_seqscans.

The plural seems better to me; there's no such thing as a solitary
synchronized scan, no?  The whole point of the feature is to affect
the behavior of multiple scans.

BTW, so far as the rest of the thread goes, I'm not necessarily opposed
to exposing the switchover threshold as a tunable.  But I think it needs
more thought to design than we can give it in time for 8.3 (because of
the interaction with the buffer access strategy stuff).  Also I don't
like having pg_dump manipulating a tuning parameter.  I don't see
anything wrong with having both an on/off feature switch and a tunable
in future releases.  The feature switch can be justified on grounds
of backwards compatibility quite independently of whether pg_dump uses
it.  Or is someone prepared to argue that there are no applications out
there that will be broken if the same query, against the same unchanging
table, yields different results from one trial to the next?

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-29 Thread Tom Lane
Ron Mayer [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Or is someone prepared to argue that there are no applications out
 there that will be broken if the same query, against the same unchanging
 table, yields different results from one trial to the next? 
 
 Won't even autovacuum analyze cause this too if the
 new stats changes the plan?

Given that the table is unchanging, that's moderately unlikely to happen
(especially for select * from foo ;-))

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-29 Thread Ron Mayer

Tom Lane wrote:

Kevin Grittner [EMAIL PROTECTED] writes:
  
Tom Lane [EMAIL PROTECTED] wrote: 


Or is someone prepared to argue that there are no applications out
there that will be broken if the same query, against the same unchanging
table, yields different results from one trial to the next? 
  

Won't even autovacuum analyze cause this too if the
new stats changes the plan?


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-29 Thread Guillaume Smet
On Jan 29, 2008 8:09 PM, Tom Lane [EMAIL PROTECTED] wrote:
 Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes:
  synchronize[d]_seqscan sounds a bit better in my ears than the plural
  synchronize_seqscans.

 The plural seems better to me; there's no such thing as a solitary
 synchronized scan, no?  The whole point of the feature is to affect
 the behavior of multiple scans.

+1. The plural is important IMHO.

 BTW, so far as the rest of the thread goes, I'm not necessarily opposed
 to exposing the switchover threshold as a tunable.  But I think it needs
 more thought to design than we can give it in time for 8.3 (because of
 the interaction with the buffer access strategy stuff).

+1. The current patch is simple and so far in the cycle, I really
think we should keep it that way.

 The feature switch can be justified on grounds
 of backwards compatibility quite independently of whether pg_dump uses
 it.  Or is someone prepared to argue that there are no applications out
 there that will be broken if the same query, against the same unchanging
 table, yields different results from one trial to the next?

As I stated earlier, I don't really like this argument (we already
broke badly designed applications a few times in the past) but we
really need a way to guarantee that the execution of a query is stable
and doesn't depend on external factors. And the original problem was
to guarantee that pg_dump builds a dump as identical as possible to
the existing data by ignoring external factors. It's now the case with
your patch.
The fact that it allows us not to break existing applications relying
too much on physical ordering is a nice side effect though :).

--
Guillaume

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-28 Thread Heikki Linnakangas

Simon Riggs wrote:

On Mon, 2008-01-28 at 16:21 -0500, Tom Lane wrote:

Simon Riggs [EMAIL PROTECTED] writes:

Rather than having a boolean GUC, we should have a number and make the
parameter synchronised_scan_threshold.

This would open up a can of worms I'd prefer not to touch, having to do
with whether the buffer-access-strategy behavior should track that or
not.  As the note in heapam.c says,

 * If the table is large relative to NBuffers, use a bulk-read access
 * strategy and enable synchronized scanning (see syncscan.c).  Although
 * the thresholds for these features could be different, we make them the
 * same so that there are only two behaviors to tune rather than four.

It's a bit late in the cycle to be revisiting that choice.  Now we do
already have three behaviors to worry about (BAS on and syncscan off)
but throwing in a randomly settable knob will take it back to four,
and we have no idea how that fourth case will behave.  The other tack we
could take (having the one GUC variable control both thresholds) is
not good since it will result in pg_dump trashing the buffer cache.


OK, good points. 


I'm still concerned that the thresholds gets higher as we increase
shared_buffers. We may be removing performance features as fast as we
gain performance when we set shared_buffers higher.

Might we agree that the threshold should be fixed at 8MB, rather than
varying upwards as we try to tune? 


Synchronized scans, and the bulk-read strategy, don't help if the table 
fits in cache. If it fits in shared buffers, you're better off keeping 
it there, than swap pages between the OS cache and shared buffers, or 
spend any effort synchronizing scans. That's why we agreed back then 
that the threshold should be X% of shared_buffers.


It's a good point that we don't want pg_dump to screw up the cluster 
order, but that's the only use case I've seen this far for disabling 
sync scans. Even that wouldn't matter much if our estimate for 
clusteredness didn't get screwed up by a table that looks like this: 
5 6 7 8 9 1 2 3 4


Now, maybe there's more use cases where you'd want to tune the 
threshold, but I'd like to see some before we add more knobs.


To benefit from a lower threshold, you'd need to have a table large 
enough that its cache footprint matters, but is still smaller than 25% 
of shared_buffers, and have seq scans on it. In that scenario, you might 
benefit from a lower threshold, because that would leave some 
shared_buffers free for other use. Even that is quite hand-wavey; the 
buffer cache LRU algorithm handles that kind of scenarios reasonably 
well already, and whether or not


To benefit from a larger threshold, you'd need to have a table larger 
than 25% of shared_buffers, but still smaller than shared_buffers, and 
seq scan it often enough that you want to keep it in shared buffers. If 
you're frequently seq scanning a table of that size, you're most likely 
suffering from a bad plan. Even then, the performance difference 
shouldn't be that great, the table surely fits in OS cache anyway, with 
typical shared_buffers settings.


Tables that are seq scanned are typically very small, like a summary 
table with just a few rows, or huge tables in a data warehousing 
system. Between the extremes, I don't think the threshold actually has a 
very big impact.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-28 Thread Simon Riggs
On Mon, 2008-01-28 at 23:13 +, Heikki Linnakangas wrote:
 Tables that are seq scanned are typically very small, like a summary 
 table with just a few rows, or huge tables in a data warehousing 
 system. Between the extremes, I don't think the threshold actually has
 a very big impact.

And if you have a partitioned table with partitions inconveniently
sized? You'd need to *reduce* shared_buffers specifically to get synch
scans and BAS to kick in. Or increase partition size. Both of which
reduce the impact of the benefits we've added.

I don't think the argument that a table is smaller than shared buffers
therefore it is already in shared buffers holds true in all cases. I/O
does matter.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-28 Thread Jeff Davis
On Mon, 2008-01-28 at 23:13 +, Heikki Linnakangas wrote:
 It's a good point that we don't want pg_dump to screw up the cluster 
 order, but that's the only use case I've seen this far for disabling 
 sync scans. Even that wouldn't matter much if our estimate for 
 clusteredness didn't get screwed up by a table that looks like this: 
 5 6 7 8 9 1 2 3 4

It doesn't seem like there is any reason for the estimate to get
confused, but it apparently does. I loaded a test table with a similar
distribution to your example, and it shows a correlation of about -0.5,
but it should be as good as something near -1 or +1.

I am not a statistics expert, but it seems like a better measurement
would be: what is the chance that, if the tuples are close together in
index order, the corresponding heap tuples are close together?.

The answer to that question in your example is very likely, so there
would be no problem.

Is there a reason we don't do this?

Regards,
Jeff Davis


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-28 Thread Ron Mayer
Jeff Davis wrote:
 On Mon, 2008-01-28 at 23:13 +, Heikki Linnakangas wrote:
   
 clusteredness didn't get screwed up by a table that looks like this: 
 5 6 7 8 9 1 2 3 4
 
 ...test table with a similar
 distribution to your example, and it shows a correlation of about -0.5,
 but it should be as good as something near -1 or +1.

 I am not a statistics expert, but it seems like a better measurement
 would be: what is the chance that, if the tuples are close together in
 index order, the corresponding heap tuples are close together?.
   
Same applies for data clustered by zip-code.

All rows for any State or City or County or SchoolZone
are close together on the same pages; yet postgres's
stats think they're totally unclustered.
 The answer to that question in your example is very likely, so there
 would be no problem.
 Is there a reason we don't do this?
   
I've been tempted to do things like

   update pg_statistic set stanumbers3='{1.0}' where starelid=2617 and
staattnum=7;

after every analyze when I have data like this from tables clustered
by zip.  Seems it'd help more plans than it hurts, but haven't been
brave enough to try in production.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster