Re: [HACKERS] cheaper snapshots redux

2011-09-13 Thread Amit Kapila
as [mailto:robertmh...@gmail.com] Sent: Monday, September 12, 2011 9:31 PM To: Amit Kapila Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] cheaper snapshots redux On Mon, Sep 12, 2011 at 11:07 AM, Amit Kapila wrote: >>If you know what transactions were running the last time a snap

Re: [HACKERS] cheaper snapshots redux

2011-09-13 Thread Robert Haas
On Tue, Sep 13, 2011 at 7:49 AM, Amit Kapila wrote: >>Yep, that's pretty much what it does, although xmax is actually >>defined as the XID *following* the last one that ended, and I think >>xmin needs to also be in xip, so in this case you'd actually end up >>with xmin = 15, xmax = 22, xip = { 15,
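For readers following the xmin/xmax/xip discussion, here is a minimal sketch of how such a triple could be consulted. It assumes xmax is the XID following the last one that ended and that xmin itself appears in xip, as the message describes. The `Snapshot` class, the `in_progress` helper, and the sample XIDs are hypothetical (they are not the values from the truncated message), and XID wraparound is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    xmin: int        # oldest XID still treated as running (also listed in xip)
    xmax: int        # XID following the last one that ended
    xip: frozenset   # XIDs in [xmin, xmax) that are still in progress


def in_progress(snap: Snapshot, xid: int) -> bool:
    """Return True if xid must be treated as still running under snap."""
    if xid >= snap.xmax:
        return True      # had not ended when the snapshot was taken
    if xid < snap.xmin:
        return False     # ended before the oldest running transaction
    return xid in snap.xip


# Hypothetical sample values, chosen only for illustration.
snap = Snapshot(xmin=100, xmax=106, xip=frozenset({100, 103}))
print([x for x in range(98, 108) if in_progress(snap, x)])
```

Note that "not in progress" only means the transaction has ended; whether it committed or aborted still has to be looked up separately.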

Re: [HACKERS] cheaper snapshots redux

2011-09-12 Thread Amit Kapila
>> 4. Won't it have an effect if we don't update xmin every time and just note the >> committed XIDs? The reason I am asking is that it is used in the tuple >> visibility check, so with the new idea in some cases, instead of just returning >> >> from the beginning by checking xmin, it has to go through the committ

Re: [HACKERS] cheaper snapshots redux

2011-09-12 Thread Amit Kapila
-Original Message- From: Robert Haas [mailto:robertmh...@gmail.com] Sent: Thursday, September 08, 2011 7:50 PM To: Amit Kapila Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] cheaper snapshots redux On Tue, Sep

Re: [HACKERS] cheaper snapshots redux

2011-09-12 Thread Amit Kapila
-Original Message- From: Robert Haas [mailto:robertmh...@gmail.com] Sent: Monday, September 12, 2011 7:39 PM To: Amit Kapila Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] cheaper snapshots redux On Sun, Sep 11, 2011 at 11:08 PM, Amit

Re: [HACKERS] cheaper snapshots redux

2011-09-12 Thread Robert Haas
On Mon, Sep 12, 2011 at 11:07 AM, Amit Kapila wrote: >>If you know what transactions were running the last time a snapshot summary >> was written and what >transactions have ended since then, you can work out >> the new xmin on the fly.  I have working >code for this and it's actually >> quite sim
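The idea of working out the new xmin on the fly can be sketched as follows. This is not the working code mentioned in the message, only an illustration under stated assumptions: the summary is taken to consist of an xmax and a set of running XIDs, and any XID at or above the summary's xmax that has not been reported as ended is treated as still running. The function name `derive_snapshot` and all sample values are hypothetical, and XID wraparound is ignored.

```python
def derive_snapshot(base_xmax, base_xip, ended):
    """Combine a snapshot summary with the XIDs that ended after it.

    base_xmax: xmax recorded in the summary
    base_xip:  XIDs that were running when the summary was written
    ended:     XIDs reported as ended since then
    Returns (xmin, xmax, xip).
    """
    ended = set(ended)
    # xmax is the XID following the last one that ended.
    xmax = max([base_xmax] + [x + 1 for x in ended])
    # XIDs assigned after the summary but below the new xmax are assumed
    # to be running unless they were reported as ended.
    candidates = set(base_xip) | set(range(base_xmax, xmax))
    xip = sorted(candidates - ended)
    # With nothing left running, xmin collapses to xmax.
    xmin = xip[0] if xip else xmax
    return xmin, xmax, xip


print(derive_snapshot(106, {100, 103}, [100, 107]))
```

With these sample inputs, XID 100 drops out of the running set, so xmin advances to 103 without rewriting the summary.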

Re: [HACKERS] cheaper snapshots redux

2011-09-12 Thread Robert Haas
On Sun, Sep 11, 2011 at 11:08 PM, Amit Kapila wrote: >   In the approach mentioned in your idea, it mentioned that once after > taking snapshot, only committed XIDs will be updated and sometimes snapshot > itself. > >   So when the xmin will be updated according to your idea as snapshot will > not

Re: [HACKERS] cheaper snapshots redux

2011-09-08 Thread Robert Haas
On Tue, Sep 6, 2011 at 11:06 PM, Amit Kapila wrote: > 1. With the above, you want to reduce/remove the concurrency issue between > the GetSnapshotData() [used at the beginning of SQL command execution] and > ProcArrayEndTransaction() [used at end of transaction]. The concurrency issue > is mainly ProcArra

Re: [HACKERS] cheaper snapshots redux

2011-09-07 Thread Amit Kapila
to:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas Sent: Sunday, August 28, 2011 7:17 AM To: Gokulakannan Somasundaram Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] cheaper snapshots redux On Sat, Aug 27, 2011 at 1:38 AM, Gokulakannan Somasundaram wrote: > First I respectfu

Re: [HACKERS] cheaper snapshots redux

2011-08-28 Thread Robert Haas
On Sun, Aug 28, 2011 at 4:33 AM, Gokulakannan Somasundaram wrote: >> No, I don't think it will all be in memory - but that's part of the >> performance calculation.  If you need to check on the status of an XID >> and find that you need to read a page of data in from disk, that's >> going to be ma

Re: [HACKERS] cheaper snapshots redux

2011-08-28 Thread Gokulakannan Somasundaram
> No, I don't think it will all be in memory - but that's part of the > performance calculation. If you need to check on the status of an XID > and find that you need to read a page of data in from disk, that's > going to be many orders of magnitude slower than anything we do with a > snapshot now

Re: [HACKERS] cheaper snapshots redux

2011-08-27 Thread Robert Haas
On Sat, Aug 27, 2011 at 1:38 AM, Gokulakannan Somasundaram wrote: > First I respectfully disagree with you on the point of 80MB. I would say > that it's very rare that a small system (with <1 GB RAM) might have a long > running transaction sitting idle, while 10 million transactions are sitting >

Re: [HACKERS] cheaper snapshots redux

2011-08-26 Thread Gokulakannan Somasundaram
On Tue, Aug 23, 2011 at 5:25 AM, Robert Haas wrote: > I've been giving this quite a bit more thought, and have decided to > abandon the scheme described above, at least for now. It has the > advantage of avoiding virtually all locking, but it's extremely > inefficient in its use of memory in the

Re: [HACKERS] cheaper snapshots redux

2011-08-26 Thread Robert Haas
On Thu, Aug 25, 2011 at 6:29 PM, Jim Nasby wrote: > Actually, I wasn't thinking about the system dynamically sizing shared memory > on its own... I was only thinking of providing the ability for a user to > change something like shared_buffers and allow that change to take effect > with a SIGH

Re: [HACKERS] cheaper snapshots redux

2011-08-26 Thread Robert Haas
On Thu, Aug 25, 2011 at 6:24 PM, Jim Nasby wrote: > On Aug 25, 2011, at 8:24 AM, Robert Haas wrote: >> My hope (and it might turn out that I'm an optimist) is that even with >> a reasonably small buffer it will be very rare for a backend to >> experience a wraparound condition.  For example, consi

Re: [HACKERS] cheaper snapshots redux

2011-08-25 Thread Jim Nasby
On Aug 22, 2011, at 6:22 PM, Robert Haas wrote: > With respect to a general-purpose shared memory allocator, I think > that there are cases where that would be useful to have, but I don't > think there are as many of them as many people seem to think. I > wouldn't choose to implement this using a

Re: [HACKERS] cheaper snapshots redux

2011-08-25 Thread Jim Nasby
On Aug 25, 2011, at 8:24 AM, Robert Haas wrote: > My hope (and it might turn out that I'm an optimist) is that even with > a reasonably small buffer it will be very rare for a backend to > experience a wraparound condition. For example, consider a buffer > with ~6500 entries, approximately 64 * Ma

Re: [HACKERS] cheaper snapshots redux

2011-08-25 Thread Robert Haas
On Thu, Aug 25, 2011 at 11:15 AM, Markus Wanner wrote: > On 08/25/2011 04:59 PM, Tom Lane wrote: >> That's a good point.  If the ring buffer size creates a constraint on >> the maximum number of sub-XIDs per transaction, you're going to need a >> fallback path of some sort. > > I think Robert envi

Re: [HACKERS] cheaper snapshots redux

2011-08-25 Thread Markus Wanner
Tom, On 08/25/2011 04:59 PM, Tom Lane wrote: > That's a good point. If the ring buffer size creates a constraint on > the maximum number of sub-XIDs per transaction, you're going to need a > fallback path of some sort. I think Robert envisions the same fallback path we already have: subxids.over

Re: [HACKERS] cheaper snapshots redux

2011-08-25 Thread Markus Wanner
Robert, On 08/25/2011 04:48 PM, Robert Haas wrote: > What's a typical message size for imessages? Most message types in Postgres-R are just a couple bytes in size. Others, especially change sets, can be up to 8k. However, I think you'll have an easier job guaranteeing that backends "consume" the

Re: [HACKERS] cheaper snapshots redux

2011-08-25 Thread Tom Lane
Robert Haas writes: > Well, one long-running transaction that only has a single XID is not > really a problem: the snapshot is still small. But one very old > transaction that also happens to have a large number of > subtransactions all of which have XIDs assigned might be a good way to > stress

Re: [HACKERS] cheaper snapshots redux

2011-08-25 Thread Robert Haas
On Thu, Aug 25, 2011 at 10:19 AM, Markus Wanner wrote: > Note, however, that for imessages, I've also had the policy in place > that a backend *must* consume its message before sending any.  And that > I took great care for all receivers to consume their messages as early > as possible.  None the

Re: [HACKERS] cheaper snapshots redux

2011-08-25 Thread Markus Wanner
Robert, On 08/25/2011 03:24 PM, Robert Haas wrote: > My hope (and it might turn out that I'm an optimist) is that even with > a reasonably small buffer it will be very rare for a backend to > experience a wraparound condition. It certainly seems less likely than with the ring-buffer for imessages

Re: [HACKERS] cheaper snapshots redux

2011-08-25 Thread Robert Haas
On Thu, Aug 25, 2011 at 1:55 AM, Markus Wanner wrote: >> One difference with snapshots is that only the latest snapshot is of >> any interest. > > Theoretically, yes.  But as far as I understood, you proposed the > backends copy that snapshot to local memory.  And copying takes some > amount of ti

Re: [HACKERS] cheaper snapshots redux

2011-08-24 Thread Markus Wanner
Robert, On 08/25/2011 04:59 AM, Robert Haas wrote: > True; although there are some other complications. With a > sufficiently sophisticated allocator you can avoid mutex contention > when allocating chunks, but then you have to store a pointer to the > chunk somewhere or other, and that then requ

Re: [HACKERS] cheaper snapshots redux

2011-08-24 Thread Robert Haas
On Wed, Aug 24, 2011 at 4:30 AM, Markus Wanner wrote: > I'm in respectful disagreement regarding the ring-buffer approach and > think that dynamic allocation can actually be more efficient if done > properly, because there doesn't need to be head and tail pointers, which > might turn into a point

Re: [HACKERS] cheaper snapshots redux

2011-08-24 Thread Markus Wanner
Robert, Jim, thanks for thinking out loud about dynamic allocation of shared memory. Very much appreciated. On 08/23/2011 01:22 AM, Robert Haas wrote: > With respect to a general-purpose shared memory allocator, I think > that there are cases where that would be useful to have, but I don't > thi

Re: [HACKERS] cheaper snapshots redux

2011-08-24 Thread Markus Wanner
Hello Dimitri, On 08/23/2011 06:39 PM, Dimitri Fontaine wrote: > I'm far from familiar with the detailed concepts here, but allow me to > comment. I have two open questions: > > - is it possible to use a distributed algorithm to produce XIDs, >something like Vector Clocks? > >Then each

Re: [HACKERS] cheaper snapshots redux

2011-08-23 Thread Tom Lane
Robert Haas writes: > That's certainly a fair concern, and it might even be worse than > O(n^2). On the other hand, the current approach involves scanning the > entire ProcArray for every snapshot, even if nothing has changed and > 90% of the backends are sitting around playing tiddlywinks, so I

Re: [HACKERS] cheaper snapshots redux

2011-08-23 Thread Dimitri Fontaine
Robert Haas writes: > I think the real trick is figuring out a design that can improve > concurrency. I'm far from familiar with the detailed concepts here, but allow me to comment. I have two open questions: - is it possible to use a distributed algorithm to produce XIDs, something like Ve

Re: [HACKERS] cheaper snapshots redux

2011-08-23 Thread Robert Haas
On Tue, Aug 23, 2011 at 12:13 PM, Tom Lane wrote: > I'm a bit concerned that this approach is trying to optimize the heavy > contention situation at the cost of actually making things worse anytime > that you're not bottlenecked by contention for access to this shared > data structure.  In particu

Re: [HACKERS] cheaper snapshots redux

2011-08-23 Thread Tom Lane
Robert Haas writes: > With respect to the first problem, what I'm imagining is that we not > do a complete rewrite of the snapshot in shared memory on every > commit. Instead, when a transaction ends, we'll decide whether to (a) > write a new snapshot or (b) just record the XIDs that ended. If w

Re: [HACKERS] cheaper snapshots redux

2011-08-23 Thread Simon Riggs
On Mon, Aug 22, 2011 at 10:25 PM, Robert Haas wrote: > I've been giving this quite a bit more thought, and have decided to > abandon the scheme described above, at least for now. I liked your goal of O(1) snapshots and think you should go for that. I didn't realise you were still working on thi

Re: [HACKERS] cheaper snapshots redux

2011-08-22 Thread Robert Haas
On Mon, Aug 22, 2011 at 6:45 PM, Jim Nasby wrote: > Something that would be really nice to fix is our reliance on a fixed size of > shared memory, and I'm wondering if this could be an opportunity to start in > a new direction. My thought is that we could maintain two distinct shared > memory s

Re: [HACKERS] cheaper snapshots redux

2011-08-22 Thread Jim Nasby
On Aug 22, 2011, at 4:25 PM, Robert Haas wrote: > What I'm thinking about > instead is using a ring buffer with three pointers: a start pointer, a > stop pointer, and a write pointer. When a transaction ends, we > advance the write pointer, write the XIDs or a whole new snapshot into > the buffer,
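A rough, single-threaded sketch of a ring buffer with start, stop, and write pointers may help picture the proposal. The quoted text is cut off after the write step, so everything past "write the XIDs or a whole new snapshot into the buffer" is an assumption here: publishing by advancing the stop pointer, moving the start pointer when a full snapshot is written, and detecting wraparound by comparing pointers. The class `SnapshotRing` and its methods are hypothetical, and real shared-memory code would also need atomic updates and memory barriers.

```python
class SnapshotRing:
    """Toy model of a snapshot ring buffer (no concurrency control)."""

    def __init__(self, size):
        self.size = size
        self.buf = [None] * size
        # Pointers are ever-increasing counters; slot index = counter % size.
        self.start = 0   # where the newest full snapshot begins
        self.stop = 0    # end of the data readers may consume
        self.write = 0   # end of the space a writer has claimed

    def append(self, xids, full_snapshot=False):
        """Write ended XIDs, or a whole new snapshot (at most size entries)."""
        begin = self.write
        self.write = begin + len(xids)          # advance the write pointer first
        for i, xid in enumerate(xids):
            self.buf[(begin + i) % self.size] = xid
        if full_snapshot:
            self.start = begin                  # assumed: readers start here now
        self.stop = self.write                  # assumed: publish the new data

    def read(self, start=None, stop=None):
        """Copy entries in [start, stop); return None if a writer lapped them."""
        start = self.start if start is None else start
        stop = self.stop if stop is None else stop
        data = [self.buf[i % self.size] for i in range(start, stop)]
        if self.write - start > self.size:
            return None                         # wraparound: the copy is unreliable
        return data


ring = SnapshotRing(8)
ring.append([10, 11, 12], full_snapshot=True)   # whole snapshot
ring.append([11])                               # XID 11 ended
print(ring.read())
```

A reader that sampled the pointers and was then delayed while writers filled the buffer would get `None` from `read` and would have to retry with fresh pointers.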

Re: [HACKERS] cheaper snapshots

2011-07-30 Thread Simon Riggs
On Thu, Jul 28, 2011 at 8:32 PM, Hannu Krosing wrote: > Maybe this is why other databases don't offer per backend async commit ? Oracle has async commit but very few people know about it. --  Simon Riggs   http://www.2ndQuadrant.com/  PostgreSQL Development, 24x7 Support, Train

Re: [HACKERS] cheaper snapshots

2011-07-29 Thread Hannu Krosing
On Fri, 2011-07-29 at 10:23 -0400, Robert Haas wrote: > On Fri, Jul 29, 2011 at 10:20 AM, Hannu Krosing wrote: > >> An additional point to think about: if we were willing to insist on > >> streaming replication, we could send the commit sequence numbers via a > >> side channel rather than writing

Re: [HACKERS] cheaper snapshots

2011-07-29 Thread Robert Haas
On Fri, Jul 29, 2011 at 10:20 AM, Hannu Krosing wrote: >> An additional point to think about: if we were willing to insist on >> streaming replication, we could send the commit sequence numbers via a >> side channel rather than writing them to WAL, which would be a lot >> cheaper. > > Why do you t

Re: [HACKERS] cheaper snapshots

2011-07-29 Thread Hannu Krosing
On Thu, 2011-07-28 at 20:14 -0400, Robert Haas wrote: > On Thu, Jul 28, 2011 at 7:54 PM, Ants Aasma wrote: > > On Thu, Jul 28, 2011 at 11:54 PM, Kevin Grittner > > wrote: > >> (4) We communicate acceptable snapshots to the replica to make the > >> order of visibility match the master

Re: [HACKERS] cheaper snapshots

2011-07-29 Thread Kevin Grittner
Robert Haas wrote: >> (4) We communicate acceptable snapshots to the replica to make >> the order of visibility match the master even when >> that doesn't match the order that transactions returned from >> commit. >> I (predictably) like (4) -- even though it's a lot of work >

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Robert Haas
On Thu, Jul 28, 2011 at 8:12 PM, Ants Aasma wrote: > On Fri, Jul 29, 2011 at 2:20 AM, Robert Haas wrote: >> Well, again, there are three levels: >> >> (A) synchronous_commit=off.  No waiting! >> (B) synchronous_commit=local transactions, and synchronous_commit=on >> transactions when sync rep is

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Robert Haas
On Thu, Jul 28, 2011 at 7:54 PM, Ants Aasma wrote: > On Thu, Jul 28, 2011 at 11:54 PM, Kevin Grittner > wrote: >> (4)  We communicate acceptable snapshots to the replica to make the >> order of visibility match the master even when that >> doesn't match the order that transactions retu

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Ants Aasma
On Fri, Jul 29, 2011 at 2:20 AM, Robert Haas wrote: > Well, again, there are three levels: > > (A) synchronous_commit=off.  No waiting! > (B) synchronous_commit=local transactions, and synchronous_commit=on > transactions when sync rep is not in use.  Wait for xlog flush. > (C) synchronous_commit=

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Ants Aasma
On Thu, Jul 28, 2011 at 11:54 PM, Kevin Grittner wrote: > (4)  We communicate acceptable snapshots to the replica to make the > order of visibility match the master even when that > doesn't match the order that transactions returned from commit. I wonder if some interpretation of 2 pha

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Hannu Krosing
On Thu, 2011-07-28 at 16:42 -0400, Robert Haas wrote: > On Thu, Jul 28, 2011 at 4:36 PM, Hannu Krosing wrote: > > so in case of stuck slave the syncrep transaction is committed after > > crash, but is not committed before the crash happens ? > > Yep. > > > ow will the replay process get stuck again

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Robert Haas
On Thu, Jul 28, 2011 at 4:54 PM, Kevin Grittner wrote: > Robert Haas wrote: > >> Having transactions become visible in the same order on the master >> and the standby is very appealing, but I'm pretty well convinced >> that allowing commits to become visible before they've been >> durably committ

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread karavelov
- Quote from Hannu Krosing (ha...@2ndquadrant.com), on 28.07.2011 at 22:40 - >> >> Maybe this is why other databases don't offer per backend async commit ? >> > Isn't Oracle's COMMIT WRITE NOWAIT; basically the same - ad hoc async commit? Though their idea of backend does not map exac

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Kevin Grittner
"Kevin Grittner" wrote: > to make visibility atomic with commit I meant: to make visibility atomic with WAL-write of the commit record -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgs

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Kevin Grittner
Jeff Davis wrote: > Wouldn't the same issue exist if one transaction is waiting for > sync rep (synchronous_commit=on), and another is waiting for just > a WAL flush (synchronous_commit=local)? I don't think that a > synchronous_commit=off is required. I think you're right -- basically, to mak

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Jeff Davis
On Thu, 2011-07-28 at 14:27 -0400, Robert Haas wrote: > > Right, but if the visibility order were *defined* as the order in which > > commit records appear in WAL, that problem neatly goes away. It's only > > because we have the implementation artifact that "set my xid to 0 in the > > ProcArray" i

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Kevin Grittner
Robert Haas wrote: > Having transactions become visible in the same order on the master > and the standby is very appealing, but I'm pretty well convinced > that allowing commits to become visible before they've been > durably committed is throwing the "D" in ACID out the window. If > synchrono

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Robert Haas
On Thu, Jul 28, 2011 at 4:36 PM, Hannu Krosing wrote: > so in case of stuck slave the syncrep transaction is committed after > crash, but is not committed before the crash happens ? Yep. > ow will the replay process get stuck again during recovery ? Nope. -- Robert Haas EnterpriseDB: http://ww

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Hannu Krosing
On Thu, 2011-07-28 at 16:20 -0400, Robert Haas wrote: > On Thu, Jul 28, 2011 at 3:40 PM, Hannu Krosing wrote: > > On Thu, 2011-07-28 at 21:32 +0200, Hannu Krosing wrote: > >> On Thu, 2011-07-28 at 14:27 -0400, Robert Haas wrote: > >> > >> > Hmm, interesting idea. However, consider the scenario wh

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Robert Haas
On Thu, Jul 28, 2011 at 4:12 PM, Kevin Grittner wrote: > Hannu Krosing wrote: >> but I still think that it is right semantics to make your commit >> visible to others, even before you have gotten back the >> confirmation yourself. > > Possibly. That combined with building snapshots based on the o

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Robert Haas
On Thu, Jul 28, 2011 at 3:40 PM, Hannu Krosing wrote: > On Thu, 2011-07-28 at 21:32 +0200, Hannu Krosing wrote: >> On Thu, 2011-07-28 at 14:27 -0400, Robert Haas wrote: >> >> > Hmm, interesting idea.  However, consider the scenario where some >> > transactions are using synchronous_commit or synch

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Robert Haas
On Thu, Jul 28, 2011 at 3:32 PM, Hannu Krosing wrote: >> Hmm, interesting idea.  However, consider the scenario where some >> transactions are using synchronous_commit or synchronous replication, >> and others are not.  If a transaction that needs to wait (either just >> for WAL flush, or for WAL

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Kevin Grittner
Hannu Krosing wrote: > but I still think that it is right semantics to make your commit > visible to others, even before you have gotten back the > confirmation yourself. Possibly. That combined with building snapshots based on the order of WAL entries of commit records certainly has several

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Hannu Krosing
On Thu, 2011-07-28 at 15:38 -0400, Tom Lane wrote: > Hannu Krosing writes: > > So the basic design could be "a sparse snapshot", consisting of 'xmin, > > xmax, running_txids[numbackends]' where each backend manages its own slot > > in running_txids - sets a txid when acquiring one and nulls it at co

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Hannu Krosing
On Thu, 2011-07-28 at 15:42 -0400, Tom Lane wrote: > Hannu Krosing writes: > > On Thu, 2011-07-28 at 14:27 -0400, Robert Haas wrote: > >> We can't make either transaction visible without making > >> both visible, and we certainly can't acknowledge the second > >> transaction to the client until we

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Tom Lane
Hannu Krosing writes: > On Thu, 2011-07-28 at 14:27 -0400, Robert Haas wrote: >> We can't make either transaction visible without making >> both visible, and we certainly can't acknowledge the second >> transaction to the client until we've made it visible. I'm not going >> to say that's so horri

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Hannu Krosing
On Thu, 2011-07-28 at 21:32 +0200, Hannu Krosing wrote: > On Thu, 2011-07-28 at 14:27 -0400, Robert Haas wrote: > > > Hmm, interesting idea. However, consider the scenario where some > > transactions are using synchronous_commit or synchronous replication, > > and others are not. If a transacti

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Tom Lane
Hannu Krosing writes: > So the basic design could be "a sparse snapshot", consisting of 'xmin, > xmax, running_txids[numbackends]' where each backend manages its own slot > in running_txids - sets a txid when acquiring one and nulls it at commit, > possibly advancing xmin if xmin==mytxid. How is th

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Hannu Krosing
On Thu, 2011-07-28 at 14:27 -0400, Robert Haas wrote: > On Thu, Jul 28, 2011 at 11:57 AM, Tom Lane wrote: > > Robert Haas writes: > >> On Thu, Jul 28, 2011 at 10:33 AM, Tom Lane wrote: > >>> But should we rethink that? Your point that hot standby transactions on > >>> a slave could see snapshot

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Robert Haas
On Thu, Jul 28, 2011 at 11:36 AM, Hannu Krosing wrote: > On Thu, 2011-07-28 at 11:15 -0400, Robert Haas wrote: >> On Thu, Jul 28, 2011 at 11:10 AM, Hannu Krosing >> wrote: >> > My main point was that we already do synchronization when writing WAL, so >> > why not piggyback on this to also update l

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Robert Haas
On Thu, Jul 28, 2011 at 11:57 AM, Tom Lane wrote: > Robert Haas writes: >> On Thu, Jul 28, 2011 at 10:33 AM, Tom Lane wrote: >>> But should we rethink that?  Your point that hot standby transactions on >>> a slave could see snapshots that were impossible on the parent was >>> disturbing.  Should

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Hannu Krosing
On Thu, 2011-07-28 at 18:48 +0200, Hannu Krosing wrote: > On Thu, 2011-07-28 at 18:05 +0200, Hannu Krosing wrote: > > > But it is also possible, that you can get logically consistent snapshots > > by protecting only some ops. for example, if you protect only insert and > > get snapshot, then the w

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Hannu Krosing
On Thu, 2011-07-28 at 18:05 +0200, Hannu Krosing wrote: > But it is also possible, that you can get logically consistent snapshots > by protecting only some ops. for example, if you protect only insert and > get snapshot, then the worst that can happen is that you get a snapshot > that is a few co

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Hannu Krosing
On Thu, 2011-07-28 at 11:57 -0400, Tom Lane wrote: > Robert Haas writes: > > On Thu, Jul 28, 2011 at 10:33 AM, Tom Lane wrote: > >> But should we rethink that? Your point that hot standby transactions on > >> a slave could see snapshots that were impossible on the parent was > >> disturbing. Sh

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Hannu Krosing
On Wed, 2011-07-27 at 22:51 -0400, Robert Haas wrote: > On Wed, Oct 20, 2010 at 10:07 PM, Tom Lane wrote: > > I wonder whether we could do something involving WAL properties --- the > > current tuple visibility logic was designed before WAL existed, so it's > > not exploiting that resource at all.

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Tom Lane
Robert Haas writes: > On Thu, Jul 28, 2011 at 10:33 AM, Tom Lane wrote: >> But should we rethink that? Your point that hot standby transactions on >> a slave could see snapshots that were impossible on the parent was >> disturbing. Should we look for a way to tie "transaction becomes >> visible

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Hannu Krosing
On Thu, 2011-07-28 at 11:15 -0400, Robert Haas wrote: > On Thu, Jul 28, 2011 at 11:10 AM, Hannu Krosing wrote: > > My main point was that we already do synchronization when writing WAL, so > > why not piggyback on this to also update the latest snapshot? > > Well, one problem is that it would break sy

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Hannu Krosing
On Thu, 2011-07-28 at 17:10 +0200, Hannu Krosing wrote: > On Thu, 2011-07-28 at 10:45 -0400, Tom Lane wrote: > > Hannu Krosing writes: > > > On Thu, 2011-07-28 at 10:23 -0400, Robert Haas wrote: > > >> I'm confused by this, because I don't think any of this can be done > > >> when we insert the co

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Robert Haas
On Thu, Jul 28, 2011 at 11:10 AM, Hannu Krosing wrote: > My main point was that we already do synchronization when writing WAL, so > why not piggyback on this to also update the latest snapshot? Well, one problem is that it would break sync rep. Another problem is that pretty much the last thing I wa

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Hannu Krosing
On Thu, 2011-07-28 at 10:45 -0400, Tom Lane wrote: > Hannu Krosing writes: > > On Thu, 2011-07-28 at 10:23 -0400, Robert Haas wrote: > >> I'm confused by this, because I don't think any of this can be done > >> when we insert the commit record into the WAL stream. > > > The update to stored snaps

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Robert Haas
On Thu, Jul 28, 2011 at 10:33 AM, Tom Lane wrote: > Robert Haas writes: >> On Thu, Jul 28, 2011 at 10:17 AM, Hannu Krosing >> wrote: >>> My hope was that this contention would be the same as simply writing >>> the WAL buffers currently, and thus largely hidden by the current WAL >>> writing

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Tom Lane
Hannu Krosing writes: > On Thu, 2011-07-28 at 10:23 -0400, Robert Haas wrote: >> I'm confused by this, because I don't think any of this can be done >> when we insert the commit record into the WAL stream. > The update to stored snapshot needs to happen at the moment when the WAL > record is cons

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Hannu Krosing
On Thu, 2011-07-28 at 10:23 -0400, Robert Haas wrote: > On Thu, Jul 28, 2011 at 10:17 AM, Hannu Krosing wrote: > > My hope was that this contention would be the same as simply writing > > the WAL buffers currently, and thus largely hidden by the current WAL > > writing sync mechanism. > > > >

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Tom Lane
Robert Haas writes: > On Thu, Jul 28, 2011 at 10:17 AM, Hannu Krosing wrote: >> My hope was that this contention would be the same as simply writing >> the WAL buffers currently, and thus largely hidden by the current WAL >> writing sync mechanism. >> >> It really covers just the part which

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Robert Haas
On Thu, Jul 28, 2011 at 10:17 AM, Hannu Krosing wrote: > My hope was that this contention would be the same as simply writing > the WAL buffers currently, and thus largely hidden by the current WAL > writing sync mechanism. > > It really covers just the part which writes commit records to WAL,

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Hannu Krosing
On Thu, 2011-07-28 at 09:38 -0400, Robert Haas wrote: > On Thu, Jul 28, 2011 at 6:50 AM, Hannu Krosing wrote: > > On Wed, Oct 20, 2010 at 10:07 PM, Tom Lane wrote: > >> > I wonder whether we could do something involving WAL properties --- the > >> > current tuple visibility logic was designed bef

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Robert Haas
On Thu, Jul 28, 2011 at 6:50 AM, Hannu Krosing wrote: > On Wed, Oct 20, 2010 at 10:07 PM, Tom Lane wrote: >> > I wonder whether we could do something involving WAL properties --- the >> > current tuple visibility logic was designed before WAL existed, so it's >> > not exploiting that resource at

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Robert Haas
On Thu, Jul 28, 2011 at 4:16 AM, Florian Pflug wrote: > On Jul28, 2011, at 04:51 , Robert Haas wrote: >> One fly in the ointment is that 8-byte >> stores are apparently done as two 4-byte stores on some platforms. >> But if the counter runs backward, I think even that is OK.  If you >> happen to r

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Robert Haas
On Thu, Jul 28, 2011 at 3:46 AM, Simon Riggs wrote: > Sounds like the right set of thoughts to be having. Thanks. > If you do this, you must cover subtransactions and Hot Standby. Work > in this area takes longer than you think when you take the > complexities into account, as you must. Right.

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Hannu Krosing
On Wed, Oct 20, 2010 at 10:07 PM, Tom Lane wrote: > > I wonder whether we could do something involving WAL properties --- the > > current tuple visibility logic was designed before WAL existed, so it's > > not exploiting that resource at all. I'm imagining that the kernel of a > > snapshot is jus

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Florian Pflug
On Jul28, 2011, at 04:51 , Robert Haas wrote: > One fly in the ointment is that 8-byte > stores are apparently done as two 4-byte stores on some platforms. > But if the counter runs backward, I think even that is OK. If you > happen to read an 8 byte value as it's being written, you'll get 4 > byt
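To make the torn-read concern concrete, the sketch below enumerates what a reader could observe if an 8-byte store is carried out as two independent 4-byte stores. It assumes the two halves are the high and low 32 bits and that the reader may see either half old or new. The helper `possible_reads` and the sample counter values are hypothetical, and the sketch takes no position on whether a backward-running counter makes such reads harmless.

```python
MASK32 = 0xFFFFFFFF


def possible_reads(old, new):
    """Values a reader might see while `old` is being overwritten by `new`,
    if the 64-bit store is split into two independent 32-bit stores."""
    old_hi, old_lo = old >> 32, old & MASK32
    new_hi, new_lo = new >> 32, new & MASK32
    return {
        old,                          # read before either half was stored
        new,                          # read after both halves were stored
        (new_hi << 32) | old_lo,      # high half new, low half still old
        (old_hi << 32) | new_lo,      # low half new, high half still old
    }


# A counter stepping backward across a 32-bit boundary (illustrative values).
for value in sorted(possible_reads(0x1_0000_0000, 0x0_FFFF_FFFF)):
    print(hex(value))
```

When only the low half changes between the old and new values, the two torn combinations coincide with the old and new values themselves, so no new value can be observed.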

Re: [HACKERS] cheaper snapshots

2011-07-28 Thread Simon Riggs
On Thu, Jul 28, 2011 at 3:51 AM, Robert Haas wrote: > All that having been said, even if I haven't made any severe > conceptual errors in the above, I'm not sure how well it will work in > practice.  On the plus side, taking a snapshot becomes O(1) rather > than O(MaxBackends) - that's good.  On