Re: [PATCHES] Synchronized scans

2007-06-10 Thread Jeff Davis
On Sat, 2007-06-09 at 09:58 -0400, Tom Lane wrote: Jeff Davis [EMAIL PROTECTED] writes: * For a large table, do lazy_scan_heap, scan_heap, and a sequential scan usually progress at approximately the same rate? scan_heap would probably be faster than a regular seqscan, since it isn't

Re: [PATCHES] Synchronized scans

2007-06-10 Thread Heikki Linnakangas
Tom Lane wrote: Jeff Davis [EMAIL PROTECTED] writes: * Just adding in the syncscan to scan_heap and lazy_scan_heap seems very easy at first thought. Are there any complications that I'm missing? I believe there are assumptions buried in both full and lazy vacuum that blocks are scanned in

Re: [PATCHES] Synchronized scans

2007-06-10 Thread Gregory Stark
Heikki Linnakangas [EMAIL PROTECTED] writes: I don't think sync-scanning vacuum is worth pursuing, though, because of the other issues: index scans, vacuum cost accounting, and the fact that the 2nd pass would be harder to synchronize. There's a lot of other interesting ideas for vacuum that

Re: [PATCHES] Synchronized scans

2007-06-10 Thread Tom Lane
Jeff Davis [EMAIL PROTECTED] writes: I'm sure this has been brought up before, does someone have a pointer to a discussion about doing VACUUM-like work in a sequential scan? Yeah, it's been discussed before; try looking for incremental vacuum and such phrases. The main stumbling block is

Re: [PATCHES] Synchronized scans

2007-06-10 Thread Alvaro Herrera
Tom Lane wrote: Jeff Davis [EMAIL PROTECTED] writes: I'm sure this has been brought up before, does someone have a pointer to a discussion about doing VACUUM-like work in a sequential scan? Yeah, it's been discussed before; try looking for incremental vacuum and such phrases. The main

Re: [PATCHES] Synchronized scans

2007-06-09 Thread Tom Lane
Jeff Davis [EMAIL PROTECTED] writes: * For a large table, do lazy_scan_heap, scan_heap, and a sequential scan usually progress at approximately the same rate? scan_heap would probably be faster than a regular seqscan, since it isn't doing any where-clause-checking or data output. Except if

Re: [PATCHES] Synchronized scans

2007-06-09 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes: The vacuum-cost-limit issue may be sufficient reason to kill this idea; not sure. We already have a much higher cost for blocks that cause i/o than blocks which don't. I think if we had zero cost for blocks which don't cause i/o it would basically work unless

Re: [PATCHES] Synchronized scans

2007-06-09 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: The vacuum-cost-limit issue may be sufficient reason to kill this idea; not sure. We already have a much higher cost for blocks that cause i/o than blocks which don't. I think if we had zero cost for blocks which

Re: [PATCHES] Synchronized scans

2007-06-08 Thread Heikki Linnakangas
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: I fixed a little off-by-one in backward scan, not inited branch, but I was unable to test it. It seems that code is actually never used because that case is optimized to a rewind in the executor. I marked those seemingly unreachable

Re: [PATCHES] Synchronized scans

2007-06-08 Thread Heikki Linnakangas
Tom Lane wrote: Jeff Davis [EMAIL PROTECTED] writes: Just to be sure: a backwards-started scan is currently unreachable code, correct? [ yawn... ] I think so, but I wouldn't swear to it right at the moment. In any case it doesn't seem like a path that we need to optimize. Agreed, let's

Re: [PATCHES] Synchronized scans

2007-06-08 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: Tom Lane wrote: It occurs to me that there's an actual bug here for catalog access. The code assumes that it can measure rs_nblocks only once and not worry about tuples added beyond that endpoint. But this is only true when using an MVCC-safe

Re: [PATCHES] Synchronized scans

2007-06-08 Thread Heikki Linnakangas
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: BTW: Should we do the synchronization in the non-page-at-a-time mode? It's not many lines of code to do so, but IIRC that codepath is only used for catalog access. System tables really shouldn't grow that big, and if they do we

Re: [PATCHES] Synchronized scans

2007-06-08 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: Tom Lane wrote: Jeff Davis [EMAIL PROTECTED] writes: Just to be sure: a backwards-started scan is currently unreachable code, correct? [ yawn... ] I think so, but I wouldn't swear to it right at the moment. In any case it doesn't seem like a

Re: [PATCHES] Synchronized scans

2007-06-08 Thread Jeff Davis
On Fri, 2007-06-08 at 11:05 +0100, Heikki Linnakangas wrote: BTW: Should we do the synchronization in the non-page-at-a-time mode? It's not many lines of code to do so, but IIRC that codepath is only used for catalog access. System tables really shouldn't grow that big, and if they do we

Re: [PATCHES] Synchronized scans

2007-06-08 Thread Tom Lane
Jeff Davis [EMAIL PROTECTED] writes: On Fri, 2007-06-08 at 11:05 +0100, Heikki Linnakangas wrote: BTW: Should we do the synchronization in the non-page-at-a-time mode? http://archives.postgresql.org/pgsql-hackers/2006-09/msg01199.php There is a very minor assumption there that scans on

Re: [PATCHES] Synchronized scans

2007-06-08 Thread Jeff Davis
On Fri, 2007-06-08 at 12:22 -0400, Tom Lane wrote: Now that I'm awake, it is reachable code, per this comment: * Note: when we fall off the end of the scan in either direction, we * reset rs_inited. This means that a further request with the same * scan direction will restart the scan,

Re: [PATCHES] Synchronized scans

2007-06-08 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: Here's an update of the patch. I reverted the behavior at end of scan back to the way it was in Jeff's original patch, and disabled reporting the position when moving backwards. Applied with minor editorializations --- notably, I got rid of the

Re: [PATCHES] Synchronized scans

2007-06-08 Thread Tom Lane
Jeff Davis [EMAIL PROTECTED] writes: On Fri, 2007-06-08 at 12:22 -0400, Tom Lane wrote: Now that I'm awake, it is reachable code, per this comment: * Note: when we fall off the end of the scan in either direction, we * reset rs_inited. This means that a further request with the same * scan

Re: [PATCHES] Synchronized scans

2007-06-08 Thread Jeff Davis
On Fri, 2007-06-08 at 11:57 -0700, Jeff Davis wrote: On Fri, 2007-06-08 at 14:36 -0400, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: Here's an update of the patch. I reverted the behavior at end of scan back to the way it was in Jeff's original patch, and disabled

Re: [PATCHES] Synchronized scans

2007-06-07 Thread Jeff Davis
On Thu, 2007-06-07 at 22:52 -0400, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: I fixed a little off-by-one in backward scan, not inited branch, but I was unable to test it. It seems that code is actually never used because that case is optimized to a rewind in the

Re: [PATCHES] Synchronized scans

2007-06-07 Thread Tom Lane
Jeff Davis [EMAIL PROTECTED] writes: Just to be sure: a backwards-started scan is currently unreachable code, correct? [ yawn... ] I think so, but I wouldn't swear to it right at the moment. In any case it doesn't seem like a path that we need to optimize. regards,

Re: [PATCHES] Synchronized scans

2007-06-05 Thread Heikki Linnakangas
Tom Lane wrote: But note that barring backend crash, once all the scans are done it is guaranteed that the hint will be removed --- somebody will be last to update the hint, and therefore will remove it when they do heap_endscan, even if others are not quite done. This is good in the sense that

Re: [PATCHES] Synchronized scans

2007-06-05 Thread Jeff Davis
On Mon, 2007-06-04 at 21:39 -0400, Tom Lane wrote: idea of deleting the hint. But if we could change the hint behavior to say start reading here, successive short LIMITed reads would all start reading from the same point, which fixes both my reproducibility concern and Heikki's original point

Re: [PATCHES] Synchronized scans

2007-06-05 Thread Tom Lane
Jeff Davis [EMAIL PROTECTED] writes: That's how it works now. Small limit queries don't change the location in the hint, so if you repeat them, the queries keep starting from the same place, and fetching the same tuples. OK, maybe the problem's not as severe as I thought then.
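
A minimal sketch of one way the behaviour described above could come about. It assumes a scan publishes its position only after every REPORT_INTERVAL pages, so a short LIMITed read finishes before it ever touches the hint. The excerpt does not say how the patch achieves this; the interval and every name below (DemoScan, shared_hint, demo_begin, demo_next_page) are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define REPORT_INTERVAL 16      /* hypothetical: pages read between reports */

/* Hypothetical per-scan state; not PostgreSQL's HeapScanDesc. */
typedef struct DemoScan {
    uint32_t start_block;       /* block where this scan began */
    uint32_t nblocks;           /* table size in blocks, must be > 0 */
    uint32_t pages_read;        /* pages this scan has read so far */
} DemoScan;

/* Stands in for the shared-memory hint for a single table. */
static uint32_t shared_hint = 0;

/* Start a scan at the currently hinted block. */
static void demo_begin(DemoScan *s, uint32_t nblocks)
{
    assert(nblocks > 0);
    s->nblocks = nblocks;
    s->start_block = shared_hint % nblocks;
    s->pages_read = 0;
}

/* Read the next page, wrapping at the end of the table, and return its
 * block number. The hint is updated only on every REPORT_INTERVAL-th page. */
static uint32_t demo_next_page(DemoScan *s)
{
    uint32_t block = (s->start_block + s->pages_read) % s->nblocks;

    s->pages_read++;
    if (s->pages_read % REPORT_INTERVAL == 0)
        shared_hint = (block + 1) % s->nblocks;
    return block;
}
```

Under these assumptions, a scan that stops after five pages leaves shared_hint untouched, so the next scan of the same table starts at the same block and fetches the same tuples.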

Re: [PATCHES] Synchronized scans

2007-06-05 Thread Jeff Davis
On Mon, 2007-06-04 at 10:53 +0100, Heikki Linnakangas wrote: I'm now done with this patch and testing it. One difference between our patches is that, in my patch, the ending condition of the scan is after the hint is set back to the starting position. That means, in my patch, if you do:

[PATCHES] Synchronized scans

2007-06-04 Thread Heikki Linnakangas
I'm now done with this patch and testing it. I fixed a little off-by-one in backward scan, not inited branch, but I was unable to test it. It seems that code is actually never used because that case is optimized to a rewind in the executor. I marked those seemingly unreachable places in the

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: For the record, this patch has a small negative impact on scans like SELECT * FROM foo LIMIT 1000. If such a scan is run repeatedly, in CVS HEAD the first 1000 rows will stay in buffer cache, but with the patch each scan will start from roughly

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Heikki Linnakangas
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: For the record, this patch has a small negative impact on scans like SELECT * FROM foo LIMIT 1000. If such a scan is run repeatedly, in CVS HEAD the first 1000 rows will stay in buffer cache, but with the patch each scan will start

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Bruce Momjian
Heikki Linnakangas wrote: Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: For the record, this patch has a small negative impact on scans like SELECT * FROM foo LIMIT 1000. If such a scan is run repeatedly, in CVS HEAD the first 1000 rows will stay in buffer cache, but with

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes: As I understand it, the problem is that while currently LIMIT without ORDER BY always starts at the beginning of the table, it will not with this patch. I consider that acceptable. It's definitely going to require stronger warnings than we have now

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Heikki Linnakangas
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: As I understand it, the problem is that while currently LIMIT without ORDER BY always starts at the beginning of the table, it will not with this patch. I consider that acceptable. It's definitely going to require stronger warnings than

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Jeff Davis
On Mon, 2007-06-04 at 10:53 +0100, Heikki Linnakangas wrote: I'm now done with this patch and testing it. Great! For the record, this patch has a small negative impact on scans like SELECT * FROM foo LIMIT 1000. If such a scan is run repeatedly, in CVS HEAD the first 1000 rows will stay

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Alvaro Herrera
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: For the record, this patch has a small negative impact on scans like SELECT * FROM foo LIMIT 1000. If such a scan is run repeatedly, in CVS HEAD the first 1000 rows will stay in buffer cache, but with the patch each scan will

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Heikki Linnakangas
Alvaro Herrera wrote: Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: For the record, this patch has a small negative impact on scans like SELECT * FROM foo LIMIT 1000. If such a scan is run repeatedly, in CVS HEAD the first 1000 rows will stay in buffer cache, but with the patch

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Heikki Linnakangas
Jeff Davis wrote: No surprise here, as you and Bruce have already pointed out. If we wanted to reduce the occurrence of this phenomenon, we could perhaps time out the hints so that it's impossible to pick up a hint from a scan that finished 5 minutes ago. It doesn't seem helpful to further

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: I don't think anyone can reasonably expect to get the same ordering when the same query is issued twice in general, but within the same transaction it wouldn't be that unreasonable. If we care about that, we could keep track of starting locations

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Heikki Linnakangas
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: I don't think anyone can reasonably expect to get the same ordering when the same query is issued twice in general, but within the same transaction it wouldn't be that unreasonable. If we care about that, we could keep track of

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Michael Glaesemann
On Jun 4, 2007, at 15:24 , Heikki Linnakangas wrote: I don't think anyone can reasonably expect to get the same ordering when the same query is issued twice in general, but within the same transaction it wouldn't be that unreasonable. The order rows are returned without an ORDER BY clause

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Jeff Davis
On Mon, 2007-06-04 at 16:42 -0400, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: I don't think anyone can reasonably expect to get the same ordering when the same query is issued twice in general, but within the same transaction it wouldn't be that unreasonable. If we care

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Jeff Davis
On Mon, 2007-06-04 at 22:09 +0100, Heikki Linnakangas wrote: I think the real problem here is that the first scan is leaving state behind that changes the behavior of the next scan. Which can have no positive benefit, since obviously the first scan is not still proceeding; the best you

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Michael Glaesemann
On Jun 4, 2007, at 16:34 , Heikki Linnakangas wrote: LIMIT without ORDER BY is worse because it not only returns tuples in different order, but it can return different tuples altogether when you run it multiple times. Wouldn't DISTINCT ON suffer from the same issue without ORDER BY?

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Heikki Linnakangas
Jeff Davis wrote: On Mon, 2007-06-04 at 22:09 +0100, Heikki Linnakangas wrote: I think the real problem here is that the first scan is leaving state behind that changes the behavior of the next scan. Which can have no positive benefit, since obviously the first scan is not still proceeding;

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Tom Lane
Jeff Davis [EMAIL PROTECTED] writes: My thought was that every time the location was reported by a backend, it would store 3 pieces of information, not 2:
* relfilenode
* the PID of the backend that created or updated this particular hint last
* the location
Then, on heap_endscan() (if
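
The three-field hint proposed above can be sketched roughly as follows. This is an illustration only, assuming a small fixed-size table with no locking and assuming that end of scan removes the hint only when the ending backend was the last one to update it. The names (ScanHint, hint_report, hint_start_block, hint_end_scan) are hypothetical and are not PostgreSQL's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_HINTS 4             /* hypothetical table size */

/* One hint per relation: the three pieces of information from the proposal. */
typedef struct ScanHint {
    bool     in_use;
    uint32_t relfilenode;       /* relation the hint belongs to */
    int      pid;               /* backend that created or last updated it */
    uint32_t location;          /* last reported block number */
} ScanHint;

static ScanHint hints[MAX_HINTS];

static ScanHint *hint_find(uint32_t relfilenode)
{
    for (int i = 0; i < MAX_HINTS; i++)
        if (hints[i].in_use && hints[i].relfilenode == relfilenode)
            return &hints[i];
    return NULL;
}

/* Create or overwrite the hint, recording the reporting backend's PID. */
static void hint_report(uint32_t relfilenode, int pid, uint32_t block)
{
    ScanHint *h = hint_find(relfilenode);

    assert(pid > 0);
    for (int i = 0; h == NULL && i < MAX_HINTS; i++)
        if (!hints[i].in_use)
            h = &hints[i];
    if (h == NULL)
        return;                 /* table full: drop the report */

    h->in_use = true;
    h->relfilenode = relfilenode;
    h->pid = pid;
    h->location = block;
}

/* A new scan starts at the hinted block, or at block 0 without a hint. */
static uint32_t hint_start_block(uint32_t relfilenode)
{
    ScanHint *h = hint_find(relfilenode);
    return h ? h->location : 0;
}

/* At end of scan, remove the hint only if this backend updated it last. */
static void hint_end_scan(uint32_t relfilenode, int pid)
{
    ScanHint *h = hint_find(relfilenode);
    if (h != NULL && h->pid == pid)
        h->in_use = false;
}
```

With two backends scanning the same relation, the one that finishes first leaves the hint in place if the other has reported since. The hint disappears only when the last reporter ends its scan, which matches the point made elsewhere in the thread that somebody will be last to update the hint and will therefore remove it.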

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Gregory Stark
Heikki Linnakangas [EMAIL PROTECTED] writes: Were you thinking of storing the PID of the backend that originally created the hint, or updating the PID every time the hint is updated? In any case, we still wouldn't know if there are other scanners still running. My reaction was if you

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Jeff Davis
On Mon, 2007-06-04 at 18:25 -0400, Tom Lane wrote: But note that barring backend crash, once all the scans are done it is guaranteed that the hint will be removed --- somebody will be last to update the hint, and therefore will remove it when they do heap_endscan, even if others are not quite

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Tom Lane
Jeff Davis [EMAIL PROTECTED] writes: The problem is, I think people would be more frustrated by 1 in 1000 queries starting the scan in the wrong place because a hint was deleted, Yeah --- various people have been complaining recently about how we have good average performance and bad worst