Re: [PATCHES] Synchronized scans

2007-06-04 Thread Tom Lane
Jeff Davis <[EMAIL PROTECTED]> writes: > The problem is, I think people would be more frustrated by 1 in 1000 > queries starting the scan in the wrong place because a hint was deleted, Yeah --- various people have been complaining recently about how we have good average performance and bad worst c

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Jeff Davis
On Mon, 2007-06-04 at 18:25 -0400, Tom Lane wrote: > But note that barring backend crash, once all the scans are done it is > guaranteed that the hint will be removed --- somebody will be last to > update the hint, and therefore will remove it when they do heap_endscan, > even if others are not qui

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Gregory Stark
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes: > LIMIT without ORDER BY is worse because it not only returns tuples in > different > order, but it can return different tuples altogether when you run it multiple > times. I don't think printing a notice is feasible. For regular DML a notice or

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Gregory Stark
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes: > Were you thinking of storing the PID of the backend that originally created > the > hint, or updating the PID every time the hint is updated? In any case, we > still > wouldn't know if there's other scanners still running. My reaction was if y

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Tom Lane
Jeff Davis <[EMAIL PROTECTED]> writes: > My thought was that every time the location was reported by a backend, > it would store 3 pieces of information, not 2: > * relfilenode > * the PID of the backend that created or updated this particular hint > last > * the location > Then, on heap_endsca
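A minimal sketch of the three-field hint described above may help make the removal rule concrete. It is not the patch's code: the names `ScanHint`, `report_location`, `end_scan`, `get_start_location` and the fixed-size table are all hypothetical, and the locking a real shared-memory structure would need is left out. It assumes the rule discussed elsewhere in this thread: a backend removes a hint at end of scan only if it was the last one to update it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is an illustration, not the patch. */
#define MAX_HINTS 8

typedef struct ScanHint {
    bool     in_use;
    uint32_t relfilenode;  /* relation being scanned */
    int      pid;          /* backend that created or last updated the hint */
    uint32_t location;     /* block number most recently reported */
} ScanHint;

static ScanHint hints[MAX_HINTS];

/* Return the hint for this relation, or NULL if none exists. */
static ScanHint *find_hint(uint32_t relfilenode)
{
    for (size_t i = 0; i < MAX_HINTS; i++) {
        if (hints[i].in_use && hints[i].relfilenode == relfilenode)
            return &hints[i];
    }
    return NULL;
}

/* Create or overwrite the hint, recording the caller as the last updater.
 * Returns false only if the table is full. */
bool report_location(uint32_t relfilenode, int pid, uint32_t location)
{
    ScanHint *h = find_hint(relfilenode);
    if (h == NULL) {
        for (size_t i = 0; i < MAX_HINTS && h == NULL; i++) {
            if (!hints[i].in_use)
                h = &hints[i];
        }
        if (h == NULL)
            return false;
    }
    h->in_use = true;
    h->relfilenode = relfilenode;
    h->pid = pid;
    h->location = location;
    return true;
}

/* Called at end of scan: remove the hint only if this backend updated it
 * last. Returns true if a hint was removed. */
bool end_scan(uint32_t relfilenode, int pid)
{
    ScanHint *h = find_hint(relfilenode);
    if (h != NULL && h->pid == pid) {
        h->in_use = false;
        return true;
    }
    return false;
}

/* Look up a starting block. Returns false when there is no hint, in which
 * case the caller would start from the beginning of the table. */
bool get_start_location(uint32_t relfilenode, uint32_t *location)
{
    assert(location != NULL);
    ScanHint *h = find_hint(relfilenode);
    if (h == NULL)
        return false;
    *location = h->location;
    return true;
}
```

With two backends scanning the same relation, the one that finishes first leaves the hint alone as long as the other has reported a location since; only the last updater's `end_scan` clears it.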

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Michael Glaesemann
On Jun 4, 2007, at 16:34 , Heikki Linnakangas wrote: LIMIT without ORDER BY is worse because it not only returns tuples in different order, but it can return different tuples altogether when you run it multiple times. Wouldn't DISTINCT ON suffer from the same issue without ORDER BY? Micha

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Jeff Davis
On Mon, 2007-06-04 at 22:57 +0100, Heikki Linnakangas wrote: > > That's what I thought at first, and why I didn't do it. Right now I'm > > thinking we could just add the PID to the hint, so that it would only > > remove its own hint. Would that work? > > Were you thinking of storing the PID of the

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Heikki Linnakangas
Jeff Davis wrote: On Mon, 2007-06-04 at 22:09 +0100, Heikki Linnakangas wrote: I think the real problem here is that the first scan is leaving state behind that changes the behavior of the next scan. Which can have no positive benefit, since obviously the first scan is not still proceeding; the

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Jeff Davis
On Mon, 2007-06-04 at 22:09 +0100, Heikki Linnakangas wrote: > > I think the real problem here is that the first scan is leaving state > > behind that changes the behavior of the next scan. Which can have no > > positive benefit, since obviously the first scan is not still > > proceeding; the best

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Heikki Linnakangas
Michael Glaesemann wrote: I think the warning on LIMIT without ORDER BY is a good idea, regardless of the synchronized scans patch. I'm not saying this isn't a good idea, but are there other places where there might be gotchas for the unwary, such as DISTINCT without ORDER BY or (for an unrel

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Jeff Davis
On Mon, 2007-06-04 at 16:42 -0400, Tom Lane wrote: > Heikki Linnakangas <[EMAIL PROTECTED]> writes: > > I don't think anyone can reasonably expect to get the same ordering when > > the same query issued twice in general, but within the same transaction > > it wouldn't be that unreasonable. If we

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Michael Glaesemann
On Jun 4, 2007, at 15:24 , Heikki Linnakangas wrote: I don't think anyone can reasonably expect to get the same ordering when the same query issued twice in general, but within the same transaction it wouldn't be that unreasonable. The order rows are returned without an ORDER BY clause *is

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Heikki Linnakangas
Tom Lane wrote: Heikki Linnakangas <[EMAIL PROTECTED]> writes: I don't think anyone can reasonably expect to get the same ordering when the same query issued twice in general, but within the same transaction it wouldn't be that unreasonable. If we care about that, we could keep track of starti

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes: > I don't think anyone can reasonably expect to get the same ordering when > the same query issued twice in general, but within the same transaction > it wouldn't be that unreasonable. If we care about that, we could keep > track of starting locatio

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Heikki Linnakangas
Jeff Davis wrote: No surprise here, as you and Bruce have already pointed out. If we wanted to reduce the occurrence of this phenomenon, we could perhaps "time out" the hints so that it's impossible to pick up a hint from a scan that finished 5 minutes ago. It doesn't seem helpful to further obs

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Heikki Linnakangas
Alvaro Herrera wrote: Tom Lane wrote: Heikki Linnakangas <[EMAIL PROTECTED]> writes: For the record, this patch has a small negative impact on scans like "SELECT * FROM foo LIMIT 1000". If such a scan is run repeatedly, in CVS HEAD the first 1000 rows will stay in buffer cache, but with the pa

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Alvaro Herrera
Tom Lane wrote: > Heikki Linnakangas <[EMAIL PROTECTED]> writes: > > For the record, this patch has a small negative impact on scans like > > "SELECT * FROM foo LIMIT 1000". If such a scan is run repeatedly, in CVS > > HEAD the first 1000 rows will stay in buffer cache, but with the patch > > ea

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Jeff Davis
On Mon, 2007-06-04 at 10:53 +0100, Heikki Linnakangas wrote: > I'm now done with this patch and testing it. > Great! > For the record, this patch has a small negative impact on scans like > "SELECT * FROM foo LIMIT 1000". If such a scan is run repeatedly, in CVS > HEAD the first 1000 rows will

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Heikki Linnakangas
Tom Lane wrote: Bruce Momjian <[EMAIL PROTECTED]> writes: As I understand it, the problem is that while currently LIMIT without ORDER BY always starts at the beginning of the table, it will not with this patch. I consider that acceptable. It's definitely going to require stronger warnings tha

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes: > As I understand it, the problem is that while currently LIMIT without > ORDER BY always starts at the beginning of the table, it will not with > this patch. I consider that acceptable. It's definitely going to require stronger warnings than we have now

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Bruce Momjian
Heikki Linnakangas wrote: > Tom Lane wrote: > > Heikki Linnakangas <[EMAIL PROTECTED]> writes: > >> For the record, this patch has a small negative impact on scans like > >> "SELECT * FROM foo LIMIT 1000". If such a scan is run repeatedly, in CVS > >> HEAD the first 1000 rows will stay in buffer

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Heikki Linnakangas
Tom Lane wrote: Heikki Linnakangas <[EMAIL PROTECTED]> writes: For the record, this patch has a small negative impact on scans like "SELECT * FROM foo LIMIT 1000". If such a scan is run repeatedly, in CVS HEAD the first 1000 rows will stay in buffer cache, but with the patch each scan will sta

Re: [PATCHES] Synchronized scans

2007-06-04 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes: > For the record, this patch has a small negative impact on scans like > "SELECT * FROM foo LIMIT 1000". If such a scan is run repeatedly, in CVS > HEAD the first 1000 rows will stay in buffer cache, but with the patch > each scan will start from ro

[PATCHES] Synchronized scans

2007-06-04 Thread Heikki Linnakangas
I'm now done with this patch and testing it. I fixed a little off-by-one in "backward scan, not inited" branch, but I was unable to test it. It seems that code is actually never used because that case is optimized to a rewind in the executor. I marked those seemingly unreachable places in the

Re: [PATCHES] Synchronized Scan WIP patch

2007-06-04 Thread Heikki Linnakangas
Jeff Davis wrote: On Thu, 2007-05-31 at 09:08 +0100, Heikki Linnakangas wrote: * moved the sync scan stuff to a new file access/heapam/syncscan.c. heapam.c is long enough already, and in theory the same mechanism could be used for large bitmap heap scans in the future. Good idea, I hadn't tho