Re: [HACKERS] UNDO and in-place update

2017-01-13 Thread Robert Haas
On Fri, Jan 13, 2017 at 5:57 AM, Amit Kapila wrote: > Sure, we can do that way and I agree that it is worth considering. > Few cases where it can be really costly is when undo chain overflows > to multiple pages and those pages doesn't exist in cache. I think the >

Re: [HACKERS] UNDO and in-place update

2017-01-13 Thread Amit Kapila
On Thu, Jan 12, 2017 at 8:56 PM, Robert Haas wrote: > On Wed, Jan 11, 2017 at 5:18 AM, Amit Kapila wrote: >> Moving further, I have thought about consistent reads and different >> formats for storing undo with pros and cons of each format. >> >>

Re: [HACKERS] UNDO and in-place update

2017-01-12 Thread Robert Haas
On Wed, Jan 11, 2017 at 5:18 AM, Amit Kapila wrote: > Moving further, I have thought about consistent reads and different > formats for storing undo with pros and cons of each format. > > Consistent Reads > > If the page is modified by any

Re: [HACKERS] UNDO and in-place update

2017-01-11 Thread Amit Kapila
On Tue, Jan 10, 2017 at 7:28 PM, Amit Kapila wrote: > On Mon, Jan 9, 2017 at 11:47 PM, Robert Haas wrote: >> >> Yes, something like this can be done. You don't really need any new >> page-level header data, because you can get the XIDs from the

Re: [HACKERS] UNDO and in-place update

2017-01-10 Thread Amit Kapila
On Mon, Jan 9, 2017 at 11:47 PM, Robert Haas wrote: > On Mon, Jan 9, 2017 at 7:50 AM, Amit Kapila wrote: >> One idea could be that we have some fixed number of >> slots (i think we can make it variable as well, but for simplicity, >> lets consider

Re: [HACKERS] UNDO and in-place update

2017-01-09 Thread Robert Haas
On Mon, Jan 9, 2017 at 7:50 AM, Amit Kapila wrote: > Okay, I see the point. I think here UNDO pointer can be marked > invalid either during page-pruning or vacuum. I explicitly want to avoid that, because it means re-dirtying the page. The UNDO pointer becomes invalid

Re: [HACKERS] UNDO and in-place update

2017-01-09 Thread Amit Kapila
On Fri, Jan 6, 2017 at 7:41 PM, Robert Haas wrote: > On Fri, Jan 6, 2017 at 6:28 AM, Amit Kapila wrote: >>> Also, I'm thinking the bit could be stored in the line pointer rather >>> than the tuple, because with this design we don't need >>>

Re: [HACKERS] UNDO and in-place update

2017-01-06 Thread Robert Haas
On Fri, Jan 6, 2017 at 6:28 AM, Amit Kapila wrote: >> Also, I'm thinking the bit could be stored in the line pointer rather >> than the tuple, because with this design we don't need >> LP_UNUSED/LP_NORMAL/LP_REDIRECT/LP_DEAD any more. We could use one >> bit to indicate

Re: [HACKERS] UNDO and in-place update

2017-01-06 Thread Amit Kapila
On Thu, Jan 5, 2017 at 9:25 PM, Robert Haas wrote: > On Wed, Jan 4, 2017 at 6:05 AM, Amit Kapila wrote: >> Okay, so this optimization can work only after all the active >> transactions operating on a page are finished. If that is true, in >> some

Re: [HACKERS] UNDO and in-place update

2017-01-05 Thread Robert Haas
On Thu, Jan 5, 2017 at 6:51 AM, Amit Kapila wrote: > UNDO has to be kept till heap page is marked as all visible. This is > required to check the visibility of index. Now, I think the page can > be marked as all visible when we removed corresponding dead entries in >

Re: [HACKERS] UNDO and in-place update

2017-01-05 Thread Robert Haas
On Wed, Jan 4, 2017 at 6:05 AM, Amit Kapila wrote: > Okay, so this optimization can work only after all the active > transactions operating on a page are finished. If that is true, in > some cases such a design can consume a lot of CPU traversing all the > tuples in a

Re: [HACKERS] UNDO and in-place update

2017-01-05 Thread Amit Kapila
On Wed, Jan 4, 2017 at 8:35 PM, Dilip Kumar wrote: > On Wed, Jan 4, 2017 at 4:35 PM, Amit Kapila wrote: >> In this new system, I >> think we can't remove undo entries of heap page till we clear >> corresponding index entries. I think we need to

Re: [HACKERS] UNDO and in-place update

2017-01-04 Thread Dilip Kumar
On Wed, Jan 4, 2017 at 4:35 PM, Amit Kapila wrote: > In this new system, I > think we can't remove undo entries of heap page till we clear > corresponding index entries. I think we need to somehow collect the > old values from undo corresponding to index and then scan the

Re: [HACKERS] UNDO and in-place update

2017-01-04 Thread Amit Kapila
On Tue, Dec 6, 2016 at 9:35 PM, Robert Haas wrote: > On Mon, Dec 5, 2016 at 4:49 AM, Amit Kapila wrote: >> I can very well understand the reason for not doing so (IIUC, it is >> about complexity and time to get it right when we are already trying

Re: [HACKERS] UNDO and in-place update

2016-12-06 Thread Robert Haas
On Mon, Dec 5, 2016 at 4:49 AM, Amit Kapila wrote: > I can very well understand the reason for not doing so (IIUC, it is > about complexity and time to get it right when we are already trying > to solve one big and complex problem of the system), but saying most > of the

Re: [HACKERS] UNDO and in-place update

2016-12-05 Thread Amit Kapila
On Thu, Dec 1, 2016 at 8:55 PM, Robert Haas wrote: > On Tue, Nov 29, 2016 at 12:21 AM, Amit Kapila wrote: > > I see what you're going for, but I'm not sure it's worth it. I mean, > say you just have one bit per index tuple. If it's set, the heap

Re: [HACKERS] UNDO and in-place update

2016-12-02 Thread Robert Haas
On Fri, Dec 2, 2016 at 5:01 AM, Alexander Korotkov wrote: > Idea of storing just one visibility bit in index tuple is a subject of > serious doubts for me. > > 1. When definitely-all-visible isn't set then we have to recheck during > scanning heap, right? > But our

Re: [HACKERS] UNDO and in-place update

2016-12-02 Thread Alexander Korotkov
On Thu, Dec 1, 2016 at 6:25 PM, Robert Haas wrote: > There's a couple of possible designs here, but there is the > possibility for extra hops in some cases. But there are things we can > do to mitigate that. > > 1. If each tuple has a definitely-all-visible bit, you can

Re: [HACKERS] UNDO and in-place update

2016-12-01 Thread Robert Haas
On Tue, Nov 29, 2016 at 12:21 AM, Amit Kapila wrote: >> I think we need to avoid putting the visibility information in the >> index because that will make the index much bigger. > > I agree that there will be an increase in index size, but it shouldn't > be much if we

Re: [HACKERS] UNDO and in-place update

2016-11-29 Thread Alexander Korotkov
On Tue, Nov 29, 2016 at 8:21 AM, Amit Kapila wrote: > On Mon, Nov 28, 2016 at 11:01 PM, Robert Haas > wrote: > > On Sun, Nov 27, 2016 at 10:44 PM, Amit Kapila > wrote: > >> On Mon, Nov 28, 2016 at 4:50 AM, Robert Haas

Re: [HACKERS] UNDO and in-place update

2016-11-28 Thread Amit Kapila
On Mon, Nov 28, 2016 at 11:01 PM, Robert Haas wrote: > On Sun, Nov 27, 2016 at 10:44 PM, Amit Kapila wrote: >> On Mon, Nov 28, 2016 at 4:50 AM, Robert Haas wrote: >>> Well, my original email did contain a discussion of the

Re: [HACKERS] UNDO and in-place update

2016-11-28 Thread Robert Haas
On Sun, Nov 27, 2016 at 10:44 PM, Amit Kapila wrote: > On Mon, Nov 28, 2016 at 4:50 AM, Robert Haas wrote: >> Well, my original email did contain a discussion of the need for >> delete-marking. I said that you couldn't do in-place updates when >>

Re: [HACKERS] UNDO and in-place update

2016-11-28 Thread Bruce Momjian
On Sun, Nov 27, 2016 at 09:19:06AM +0530, Amit Kapila wrote: > At this point, index scan for value 2 will find index tuple of step-1 > (2) and will conclude 2,def as a right tuple, but actually, that is > wrong as the step-1 (2) index tuple should not be visible to the user. > Do you also this as

Re: [HACKERS] UNDO and in-place update

2016-11-27 Thread Amit Kapila
On Mon, Nov 28, 2016 at 4:50 AM, Robert Haas wrote: > > Well, my original email did contain a discussion of the need for > delete-marking. I said that you couldn't do in-place updates when > indexed columns were modified unless the index AMs had support for >

Re: [HACKERS] UNDO and in-place update

2016-11-27 Thread Robert Haas
On Sun, Nov 27, 2016 at 8:26 PM, Tsunakawa, Takayuki wrote: > I see. autovacuum is certainly almost unpredictable, at least for those who > are not aware of its existence and tuning. Recently, one of our customers > faced the inability to perform INSERTs

Re: [HACKERS] UNDO and in-place update

2016-11-27 Thread Tsunakawa, Takayuki
From: Robert Haas [mailto:robertmh...@gmail.com] > On Thu, Nov 24, 2016 at 2:32 AM, Tsunakawa, Takayuki > wrote: > > IMHO, overall, there should be pros and cons of the current approach and > the new UNDo one (like Oracle?), depending on the workload. Under >

Re: [HACKERS] UNDO and in-place update

2016-11-27 Thread Robert Haas
On Sat, Nov 26, 2016 at 10:49 PM, Amit Kapila wrote: > On Fri, Nov 25, 2016 at 11:23 PM, Bruce Momjian wrote: >> On Thu, Nov 24, 2016 at 12:23:28PM -0500, Robert Haas wrote: >>> I agree up to a point. I think we need to design our own system as >>>

Re: [HACKERS] UNDO and in-place update

2016-11-26 Thread Amit Kapila
On Fri, Nov 25, 2016 at 11:23 PM, Bruce Momjian wrote: > On Thu, Nov 24, 2016 at 12:23:28PM -0500, Robert Haas wrote: >> I agree up to a point. I think we need to design our own system as >> well as we can, not just copy what others have done. For example, the >> design I

Re: [HACKERS] UNDO and in-place update

2016-11-25 Thread Greg Stark
On 24 November 2016 at 23:03, Robert Haas wrote: >> For snapshot isolation Oracle has yet a *third* copy of the data in a >> space called the "rollback segment(s)". > > My understanding is that this isn't correct. I think the rollback > segments are what they call the

Re: [HACKERS] UNDO and in-place update

2016-11-25 Thread Bruce Momjian
On Thu, Nov 24, 2016 at 12:23:28PM -0500, Robert Haas wrote: > I agree up to a point. I think we need to design our own system as > well as we can, not just copy what others have done. For example, the > design I sketched will work with all of PostgreSQL's existing index > types. You need to

Re: [HACKERS] UNDO and in-place update

2016-11-25 Thread Amit Kapila
On Fri, Nov 25, 2016 at 8:03 PM, Amit Kapila wrote: >> > > Another way to do is to write UNDO log for split operation (with exact > record locations on pages that are split) so that you don't need to > perform this expensive search operation. I can understand writing >

Re: [HACKERS] UNDO and in-place update

2016-11-25 Thread Amit Kapila
On Thu, Nov 24, 2016 at 10:09 PM, Robert Haas wrote: > > I don't really understand how a system of this type copes with page > splits. Suppose there are entries 1000, 2000, ..., 1e6 in the btree. > I now start two overlapping transactions, one inserting all the > positive

Re: [HACKERS] UNDO and in-place update

2016-11-24 Thread Pavel Stehule
2016-11-25 1:44 GMT+01:00 Robert Haas : > On Thu, Nov 24, 2016 at 6:20 PM, Pavel Stehule > wrote: > >> I think that the whole emphasis on whether and to what degree this is > >> like Oracle is somewhat misplaced. I would look at it a different >

Re: [HACKERS] UNDO and in-place update

2016-11-24 Thread Robert Haas
On Thu, Nov 24, 2016 at 6:20 PM, Pavel Stehule wrote: >> I think that the whole emphasis on whether and to what degree this is >> like Oracle is somewhat misplaced. I would look at it a different >> way. We've talked many times over the years about how PostgreSQL is >>

Re: [HACKERS] UNDO and in-place update

2016-11-24 Thread Pavel Stehule
> I think that the whole emphasis on whether and to what degree this is > like Oracle is somewhat misplaced. I would look at it a different > way. We've talked many times over the years about how PostgreSQL is > optimized for aborts. Everybody that I've heard comment on that issue > thinks that

Re: [HACKERS] UNDO and in-place update

2016-11-24 Thread Robert Haas
On Thu, Nov 24, 2016 at 11:06 AM, Greg Stark wrote: > Fwiw, Oracle does not use the undo log for snapshot fetches. It's used > only for transaction rollback and recovery. > > For snapshot isolation Oracle has yet a *third* copy of the data in a > space called the "rollback

Re: [HACKERS] UNDO and in-place update

2016-11-24 Thread Robert Haas
On Thu, Nov 24, 2016 at 2:32 AM, Tsunakawa, Takayuki wrote: > IMHO, overall, there should be pros and cons of the current approach and the > new UNDo one (like Oracle?), depending on the workload. Under update-heavy > workload, the UNDO method may be better.

Re: [HACKERS] UNDO and in-place update

2016-11-24 Thread Thomas Kellerer
> FWIW, while this is basically true, the idea of repurposing UNDO to be > usable for MVCC is definitely an Oracleism. Mohan's ARIES paper says > nothing about MVCC. > For snapshot isolation Oracle has yet a *third* copy of the data in a > space called the "rollback segment(s)". UNDO and

Re: [HACKERS] UNDO and in-place update

2016-11-24 Thread Bruce Momjian
On Thu, Nov 24, 2016 at 04:06:14PM +, Greg Stark wrote: > For snapshot isolation Oracle has yet a *third* copy of the data in a > space called the "rollback segment(s)". When you update a row in a > block you save the whole block in the rollback segment. When you try > to access a block you

Re: [HACKERS] UNDO and in-place update

2016-11-24 Thread Robert Haas
On Wed, Nov 23, 2016 at 5:18 PM, Thomas Munro wrote: > On Wed, Nov 23, 2016 at 6:01 PM, Peter Geoghegan wrote: >> * Our behavior with many duplicates in secondary indexes is pretty bad >> in general, I suspect. > > From the pie-in-the-sky

Re: [HACKERS] UNDO and in-place update

2016-11-24 Thread Greg Stark
On 23 November 2016 at 04:28, Peter Geoghegan wrote: > On Tue, Nov 22, 2016 at 7:01 PM, Robert Haas wrote: >> This basic DO-UNDO-REDO protocol has been well-understood for >> decades. > > FWIW, while this is basically true, the idea of repurposing UNDO to

Re: [HACKERS] UNDO and in-place update

2016-11-24 Thread Bruce Momjian
On Wed, Nov 23, 2016 at 11:35:38PM -0800, Peter Geoghegan wrote: > On Wed, Nov 23, 2016 at 11:32 PM, Tsunakawa, Takayuki > wrote: > > IMHO, overall, there should be pros and cons of the current approach and > > the new UNDo one (like Oracle?), depending on the

Re: [HACKERS] UNDO and in-place update

2016-11-23 Thread Peter Geoghegan
On Wed, Nov 23, 2016 at 11:32 PM, Tsunakawa, Takayuki wrote: > IMHO, overall, there should be pros and cons of the current approach and the > new UNDo one (like Oracle?), depending on the workload. Under update-heavy > workload, the UNDO method may be better.

Re: [HACKERS] UNDO and in-place update

2016-11-23 Thread Tsunakawa, Takayuki
IMHO, overall, there should be pros and cons of the current approach and the new UNDo one (like Oracle?), depending on the workload. Under update-heavy workload, the UNDO method may be better. OTOH, under the mostly-INSERT workload (like data warehouse?), the current method will be better

Re: [HACKERS] UNDO and in-place update

2016-11-23 Thread Thomas Munro
On Wed, Nov 23, 2016 at 6:01 PM, Peter Geoghegan wrote: > * Our behavior with many duplicates in secondary indexes is pretty bad > in general, I suspect. >From the pie-in-the-sky department: I believe there are snapshot-based systems that don't ever have more than one entry for

Re: [HACKERS] UNDO and in-place update

2016-11-23 Thread Robert Haas
On Tue, Nov 22, 2016 at 11:18 PM, Tom Lane wrote: > Peter Geoghegan writes: >> On Tue, Nov 22, 2016 at 7:31 PM, Tom Lane wrote: >>> Oracle spends a lot of time on this, and it's really cache-inefficient >>> because the data is spread all

Re: [HACKERS] UNDO and in-place update

2016-11-23 Thread Albe Laurenz
Robert Haas wrote: > To implement this in PostgreSQL, I believe we would need support for > UNDO. > - Reading a page that has been recently modified gets significantly > more expensive; it is necessary to read the associated UNDO entries > and do a bunch of calculation that is significantly more

Re: [HACKERS] UNDO and in-place update

2016-11-22 Thread Peter Geoghegan
On Tue, Nov 22, 2016 at 8:45 PM, Tom Lane wrote: > Peter Geoghegan writes: >> The best thing by far about an alternative design like this is that it >> performs *consistently*. > > Really? I think it just moves the issues somewhere else. Definitely, yes. *

Re: [HACKERS] UNDO and in-place update

2016-11-22 Thread Pavan Deolasee
On Wed, Nov 23, 2016 at 10:07 AM, Peter Geoghegan wrote: > On Tue, Nov 22, 2016 at 8:18 PM, Tom Lane wrote: > > Ultimately, I doubt that update-in-place buys much that we don't already > > have with HOT updates (which postdate this old conversation, btw). >

Re: [HACKERS] UNDO and in-place update

2016-11-22 Thread Mark Kirkwood
On 23/11/16 16:31, Tom Lane wrote: Robert Haas writes: [ Let's invent Oracle-style UNDO logs ] I dunno. I remember being told years ago, by an ex-Oracle engineer, that he thought our approach was better. I don't recall all the details of the conversation but I think

Re: [HACKERS] UNDO and in-place update

2016-11-22 Thread Tom Lane
Peter Geoghegan writes: > The best thing by far about an alternative design like this is that it > performs *consistently*. Really? I think it just moves the issues somewhere else. regards, tom lane -- Sent via pgsql-hackers mailing list

Re: [HACKERS] UNDO and in-place update

2016-11-22 Thread Peter Geoghegan
On Tue, Nov 22, 2016 at 8:18 PM, Tom Lane wrote: > Ultimately, I doubt that update-in-place buys much that we don't already > have with HOT updates (which postdate this old conversation, btw). > If you want MVCC semantics, you need to hold both versions of the tuple >

Re: [HACKERS] UNDO and in-place update

2016-11-22 Thread Amit Kapila
On Wed, Nov 23, 2016 at 9:48 AM, Tom Lane wrote: > Peter Geoghegan writes: >> On Tue, Nov 22, 2016 at 7:31 PM, Tom Lane wrote: >>> Oracle spends a lot of time on this, and it's really cache-inefficient >>> because the data is spread all

Re: [HACKERS] UNDO and in-place update

2016-11-22 Thread Peter Geoghegan
On Tue, Nov 22, 2016 at 7:01 PM, Robert Haas wrote: > This basic DO-UNDO-REDO protocol has been well-understood for > decades. FWIW, while this is basically true, the idea of repurposing UNDO to be usable for MVCC is definitely an Oracleism. Mohan's ARIES paper says

Re: [HACKERS] UNDO and in-place update

2016-11-22 Thread Amit Kapila
On Wed, Nov 23, 2016 at 9:32 AM, Peter Geoghegan wrote: > On Tue, Nov 22, 2016 at 7:31 PM, Tom Lane wrote: >>> - Reading a page that has been recently modified gets significantly >>> more expensive; it is necessary to read the associated UNDO entries >>> and

Re: [HACKERS] UNDO and in-place update

2016-11-22 Thread Tom Lane
Peter Geoghegan writes: > On Tue, Nov 22, 2016 at 7:31 PM, Tom Lane wrote: >> Oracle spends a lot of time on this, and it's really cache-inefficient >> because the data is spread all over. This was what this guy felt in >> circa 2001; I'd have to think that

Re: [HACKERS] UNDO and in-place update

2016-11-22 Thread Peter Geoghegan
On Tue, Nov 22, 2016 at 7:31 PM, Tom Lane wrote: >> - Reading a page that has been recently modified gets significantly >> more expensive; it is necessary to read the associated UNDO entries >> and do a bunch of calculation that is significantly more complex than >> what is

Re: [HACKERS] UNDO and in-place update

2016-11-22 Thread Robert Haas
On Tue, Nov 22, 2016 at 10:41 PM, Peter Geoghegan wrote: > On Tue, Nov 22, 2016 at 7:39 PM, Robert Haas wrote: >>> Heikki's been fooling with some ideas that I think have more promise. >>> I wish he'd get to the point of presenting them publicly rather

Re: [HACKERS] UNDO and in-place update

2016-11-22 Thread Peter Geoghegan
On Tue, Nov 22, 2016 at 7:39 PM, Robert Haas wrote: >> Heikki's been fooling with some ideas that I think have more promise. >> I wish he'd get to the point of presenting them publicly rather than >> just over beers at conferences. > > That would be good, too! I was told

Re: [HACKERS] UNDO and in-place update

2016-11-22 Thread Robert Haas
On Tue, Nov 22, 2016 at 10:31 PM, Tom Lane wrote: > Robert Haas writes: >> [ Let's invent Oracle-style UNDO logs ] > > I dunno. I remember being told years ago, by an ex-Oracle engineer, > that he thought our approach was better. I don't recall all

Re: [HACKERS] UNDO and in-place update

2016-11-22 Thread Tom Lane
Robert Haas writes: > [ Let's invent Oracle-style UNDO logs ] I dunno. I remember being told years ago, by an ex-Oracle engineer, that he thought our approach was better. I don't recall all the details of the conversation but I think his key point was basically this: >