Re: [HACKERS] Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
Kevin Grittner wrote: > This now compiles and passes regression tests. I still need to > re-run all the other tests which Florian and I previously used to > test the patch. I don't have any reason to expect that they will > now fail, but one need to be thorough. Once that is confirmed, I > think this will be ready for committer unless someone can think of > something else to throw at it first. I reran the tests at http://github.com/fgp/fk_concurrency and, unsurprisingly, it still works. This patch addresses concerns I heard expressed by a couple guys from an Oracle shop who wanted to convert to PostgreSQL but were much put out by the behavior of SELECT FOR UPDATE under snapshot isolation in PostgreSQL. This patch should do much to ease the migration of some Oracle shops to PostgreSQL. A complete review was done in the last CF, but I held off marking it as Ready for Committer then because there were some more tests coming, which all looked good. http://archives.postgresql.org/message-id/4c419320022500033...@gw.wicourts.gov I am marking this patch "Ready for Committer" now. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
On Aug3, 2010, at 00:43 , Florian Pflug wrote: > Sounds good. That'll also give me some time to test the RI trigger > infrastructure now that I've removed the crosscheck code. Ok, I've found some time do run some additional tests. I've created a small test suite to compare the behaviour of native (cascading) FK constraints to an PLPGSQL implementation. There is a dedicated child table (native_child respectively plpgsql_child) for each of the two FK implementation. Into both child rows for random parents are inserted, creating the parent if it does not already exists. Concurrently, random parent rows are deleted. The number of successful inserts into native_child respectively plpgsql_child and the number of successfull deletes are counted, as well as the number of transactions failed due to either a serialization failure or a race condition during the insert (unique_violation or foreign_key_violation). To verify that things behave as expected, the test suite reports a histogram of the number of parents over the child count after the test completes. The expected distribution is output alongside that histogram, assuming that the number of parents with N children follows an exponential distribution. I've pushed the test suite to g...@github.com:fgp/fk_concurrency.git and put a summary of the results of my test runs on http://wiki.github.com/fgp/fk_concurrency/results. The results look good. There is no significant change in the behaviour of FKs between HEAD and HEAD+serializable_row_locks, and no significant difference between the native and plpgsql implementation. As expected, the plpgsql implementation fails on an unpatched HEAD, aborting with the error "database inconsistent" pretty quickly. I'd be interesting to run the plpgsql implementation with the SHARE locking removed against HEAD+true_serializability to see if that changes the number of serialization_failures that occur. I'll do that, but I'll have to wait a bit for another spare time window to open. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
Hi I've updated mvcc.sgml to explain the new serialization conflict rules for row-level locks, and added a paragraph to backend/executor/README that explains the implementation of those. I've chosen backend/executor/README because it already contains a description of UPDATE handling in READ COMMITTED mode, which seemed at least somewhat related. The patch now removes all code dealing with the crosscheck_snapshot, since the RI triggers should now be safe without that. I've also fixed the formatting of multi-line comments to match the coding style. I kept the braces around single-line "if" or "else" statements wherever both "if" and "else" blocks were present and one of them wasn't a single-line block. I think, in addition to the documentation changes this patch contains, that a section on how to write RI triggers in pl/pgsql would be nice to have. It's not strictly part of the documentation of this feature though, more a potential use-case, so I didn't tackle it for the moment. I do hope to find the time to write such a section, though, and if that happens I'll post it as a separate documentation-only patch. I've pushed the changes to the branch serializable_row_locks on git://github.com/fgp/postgres.git and also attached a patch. best regards, Florian Pflug serializable_row_locks.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
Joe Conway wrote: > "make dcheck" is running now (although seems rather slow). Yeah, most of those tests completely reset the environment for each permutation. I thought about changing it to update back to the same "visible" initial state each time, but it struck me that since this would accumulate more and more dead tuples as the tests advanced, it would exercise different code paths, so I've kinda got it in mind to add the faster tests as *additional* tests rather than eliminating the existing ones. I know they're way to slow to consider including in the normal "make check" suite, but when (if?) we get a "test farm" set up, this sort of thing seems like it would be in scope. On the other hand, maybe we should have a "quick" set of dtester tests and a more comprehensive one? I'm open to ideas. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
On 07/18/2010 07:02 PM, Joe Conway wrote: > On 07/18/2010 11:41 AM, Kevin Grittner wrote: >> To run the tests included in the main patch (if you have python, >> twisted, etc., installed), after the make check, run make dcheck. > > Question about dcheck. After install of twisted, I get: > > 8<- > bash-4.1$ make dcheck > make -C src/test dcheck > make[1]: Entering directory `/opt/src/pgsql/src/test' > make -C regress dcheck > make[2]: Entering directory `/opt/src/pgsql/src/test/regress' > ./pg_dtester.py --temp-install --top-builddir=../../.. \ > --multibyte=SQL_ASCII > Traceback (most recent call last): > File "./pg_dtester.py", line 18, in > from dtester.events import EventMatcher, EventSource, Event, \ > ImportError: No module named dtester.events > 8<- > > Another python package I'm missing? Sorry for the noise -- I see the dependency listed on the wiki to Markus Wanner's dtester. Looks like "make dcheck" is running now (although seems rather slow). Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, & 24x7 Support signature.asc Description: OpenPGP digital signature
Re: [HACKERS] Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
On 07/18/2010 11:41 AM, Kevin Grittner wrote: > To run the tests included in the main patch (if you have python, > twisted, etc., installed), after the make check, run make dcheck. Question about dcheck. After install of twisted, I get: 8<- bash-4.1$ make dcheck make -C src/test dcheck make[1]: Entering directory `/opt/src/pgsql/src/test' make -C regress dcheck make[2]: Entering directory `/opt/src/pgsql/src/test/regress' ./pg_dtester.py --temp-install --top-builddir=../../.. \ --multibyte=SQL_ASCII Traceback (most recent call last): File "./pg_dtester.py", line 18, in from dtester.events import EventMatcher, EventSource, Event, \ ImportError: No module named dtester.events 8<- Another python package I'm missing? Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, & 24x7 Support signature.asc Description: OpenPGP digital signature
Re: [HACKERS] Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
On 07/18/2010 11:41 AM, Kevin Grittner wrote: > > I'm attaching a fresh patch, but I think the only differences are: Thanks for the detailed info. I managed to make my way through much of the background info in the papers and wiki yesterday, so I will start reviewing shortly. > If you spot anything on the Serializable Wiki page which is unclear, > please feel free to fix it or let me know. I'm hoping to ultimately > draw from that for a README file. Sounds good -- exactly what I was thinking as I reviewed it. Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, & 24x7 Support signature.asc Description: OpenPGP digital signature
Re: [HACKERS] Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
Kevin Grittner wrote: Comment style seems to be defined here: http://developer.postgresql.org/pgdocs/postgres/source-format.html as being: /* * comment text begins here * and continues here */ You have these formats in your patch: /* comment text begins here * and continues here */ /* comment text begins here and continues here */ /* One line comment like this. */ That last one is actually pretty common in PostgreSQL source, so I'm not sure that its omission from the style page isn't accidental. The style doc talks about a standard for multi-line comments - it doesn't forbid single line comments. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
Florian Pflug wrote: > On Jul17, 2010, at 18:25 , Kevin Grittner wrote: >> * Does it follow the project coding guidelines? >> >> Comments are not all in standard style. > Does that refer to the language used, or to the formatting? Formatting. Comment style seems to be defined here: http://developer.postgresql.org/pgdocs/postgres/source-format.html as being: /* * comment text begins here * and continues here */ You have these formats in your patch: /* comment text begins here * and continues here */ /* comment text begins here and continues here */ /* One line comment like this. */ That last one is actually pretty common in PostgreSQL source, so I'm not sure that its omission from the style page isn't accidental. > Btw, while the patch obsoletes the crosscheck snapshot, it > currently doesn't remove its traces of it throughout the executor > and the ri triggers. Mainly because I felt doing so would make > forward-porting and reviewing harder without any gain. But > ultimately, those traces should probably all go, unless someone > feels that for some #ifdef NOT_USED is preferable. My view is that we have a revision control system which makes the code easy to restore, should it be found to be useful again. If it has no use at the point of applying this patch, my inclination would be to delete it. If you're particularly concerned that it could later be useful, you might want to do that deletion in as a separate patch to facilitate later resurrection of the code. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
On Jul17, 2010, at 18:25 , Kevin Grittner wrote: > * Does it include reasonable tests, necessary doc patches, etc? > > Documentation changes are needed in the "Concurrency Control" > chapter. > > <...> > > * Do we want that? > > Yes. We seem to have reached consensus on the -hackers list to that > effect. I've kinda waited for the latter before putting work into the former, so I guess I should get going. > * Does the patch slow down simple tests? The part that I'm slightly nervous about regarding performance regressions is the call of MultiXactIdSetOldestVisible() I had to add to GetTransactionSnapshot() (Though only for serializable transactions). That could increase the pressure on the multi xact machinery since for some workloads since it will increase the time a multi xact stays alive. I don't see a way around that, though, since the patch depends on being able to find multi xact members which are invisible to a serializable transaction's snapshot. > * Does it follow the project coding guidelines? > > Comments are not all in standard style. Does that refer to the language used, or to the formatting? > In some cases there are unnecessary braces around a single statement > for an *if* or *else*. Will fix. > There are function where the last two parameters were changed from: > > Snapshot crosscheck, bool wait > > to: > > bool wait, Snapshot lockcheck_snapshot > > It appears to be so that the semantic change to the use of the > snapshot doesn't break code at runtime without forcing the > programmer to notice the change based on compile errors, which seems > like a Good Thing. Yeah, that was the idea. Btw, while the patch obsoletes the crosscheck snapshot, it currently doesn't remove its traces of it throughout the executor and the ri triggers. Mainly because I felt doing so would make forward-porting and reviewing harder without any gain. But ultimately, those traces should probably all go, unless someone feels that for some #ifdef NOT_USED is preferable. > * Are the comments sufficient and accurate? > > Given that there is a significant behavioral change, it seems as > though there could be a sentence or two somewhere concisely stating > the how things behave, but I'm not sure quite where it would go. > Perhaps something in the README file in the access/transam > directory? Hm, I'll try to come up with something. > * Are there interdependencies that can cause problems? > > Also, I'm attempting to adapt the dcheck tests for the other patch > to work with this patch. If successful, I'll post with the results > of that additional testing. Cool! And thanks for the work you put into reviewing this! I'll update the documentation and comments ASAP. Probably not within the next few days though, since I'm unfortunately quite busy at the moment, trying to finally complete my master's thesis. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
On 7/17/10 12:09 PM, Kevin Grittner wrote: > Joe Conway wrote: > >> Should I be installing Florian's patch in addition to yours when I >> start testing? > > There's some manual fix-up needed, primarily because we need to > differentiate between SERIALIZABLE and REPEATABLE READ isolation > levels, and therefore replaced the IsXactIsoLevelSerializable macro > with IsXactIsoLevelXactSnapshotBased and > IsXactIsoLevelFullySerializable. If you can wait until tomorrow, > I'll create a merged patch for you and confirm that it passes the > modified Florian's pgbench test (with the FOR SHARED clause > removed). I'll include a crude hack to pgbench I had to use to test > this, with an explanation of why it was needed. I'm still trying to > put together better testing techniques for the long term. > >> Also, where can I get the latest and greatest version of your >> patch? > > There's always the git repository, but I'll post a new patch > tomorrow, based on what I've recently found. That all sounds great. I'll concentrate today on understanding the theory and high level design. Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, & 24x7 Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
Joe Conway wrote: > Should I be installing Florian's patch in addition to yours when I > start testing? There's some manual fix-up needed, primarily because we need to differentiate between SERIALIZABLE and REPEATABLE READ isolation levels, and therefore replaced the IsXactIsoLevelSerializable macro with IsXactIsoLevelXactSnapshotBased and IsXactIsoLevelFullySerializable. If you can wait until tomorrow, I'll create a merged patch for you and confirm that it passes the modified Florian's pgbench test (with the FOR SHARED clause removed). I'll include a crude hack to pgbench I had to use to test this, with an explanation of why it was needed. I'm still trying to put together better testing techniques for the long term. > Also, where can I get the latest and greatest version of your > patch? There's always the git repository, but I'll post a new patch tomorrow, based on what I've recently found. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
On 07/17/2010 09:25 AM, Kevin Grittner wrote: > I was concerned about its interaction with the other serializable > patch (by myself and Dan Ports), so I also combined the patches and > tested. Florian's pgbench test did expose bugs in the *other* > patch, which I then fixed in the combined setting. There was still > some breakage in the other patch when Florian's patch was backed > out, so at the moment, this patch would appear to be a > *prerequisite* for the other. (That can hopefully be corrected > by changes to the other patch.) I am currently reading the Cahill paper [1] in preparation for reviewing your patch [2]. Should I be installing Florian's patch in addition to yours when I start testing? Also, where can I get the latest and greatest version of your patch? Thanks, Joe [1] http://ses.library.usyd.edu.au/bitstream/2123/5353/1/michael-cahill-2009-thesis.pdf [2] https://commitfest.postgresql.org/action/patch_view?id=310 -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, & Support signature.asc Description: OpenPGP digital signature
Re: [HACKERS] Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle
= Submission review = * Is the patch in context diff format? Yes. * Does it apply cleanly to the current CVS HEAD? Yes. * Does it include reasonable tests, necessary doc patches, etc? There is one pgbench test which shows incorrect behavior without the patch and correct behavior with the patch for a significant use case. Documentation changes are needed in the "Concurrency Control" chapter. Usability review * Does the patch actually implement that? Yes. * Do we want that? Yes. We seem to have reached consensus on the -hackers list to that effect. On a personal note, I heard some current Oracle users who were considering a switch to PostgreSQL grumbling after Robert's trigger presentation at PostgreSQL Conference U.S. East about how they didn't need to use such complex coding techniques to ensure data integrity in Oracle as is required in PostgreSQL. I was surprised, since I know that they also get snapshot isolation when they request serializable, but they explained that SELECT FOR UPDATE provides stronger guarantees in Oracle. This patch should provide equivalent behavior, which should ease migration from Oracle and allow simpler coding techniques in snapshot isolation to protect data. * Do we already have it? No. * Does it follow SQL spec, or the community-agreed behavior? It's not in the spec, but it moves to safer behavior which is consistent with the current Oracle implementation. * Does it include pg_dump support (if applicable)? Not applicable. * Are there dangers? Some code which continues after blocking will now get a serialization failure. It's possible that this could cause problems for some existing code, although that code was likely either using SELECT FOR UPDATE unnecessarily or was unsafe without this patch. * Have all the bases been covered? As far as I can see. Feature test * Does the feature work as advertised? Yes. * Are there corner cases the author has failed to consider? Not that I found. * Are there any assertion failures or crashes? No. == Performance review == * Does the patch slow down simple tests? No. * If it claims to improve performance, does it? It makes no such claim. * Does it slow down other things? No. = Coding review = * Does it follow the project coding guidelines? Comments are not all in standard style. In some cases there are unnecessary braces around a single statement for an *if* or *else*. There are function where the last two parameters were changed from: Snapshot crosscheck, bool wait to: bool wait, Snapshot lockcheck_snapshot It appears to be so that the semantic change to the use of the snapshot doesn't break code at runtime without forcing the programmer to notice the change based on compile errors, which seems like a Good Thing. * Are there portability issues? No. * Will it work on Windows/BSD etc? Yes. * Are the comments sufficient and accurate? Given that there is a significant behavioral change, it seems as though there could be a sentence or two somewhere concisely stating the how things behave, but I'm not sure quite where it would go. Perhaps something in the README file in the access/transam directory? * Does it do what it says, correctly? Yes. * Does it produce compiler warnings? No. * Can you make it crash? No. === Architecture review === * Is everything done in a way that fits together coherently with other features/modules? I think so. * Are there interdependencies that can cause problems? Not that I could identify with committed code. I was concerned about its interaction with the other serializable patch (by myself and Dan Ports), so I also combined the patches and tested. Florian's pgbench test did expose bugs in the *other* patch, which I then fixed in the combined setting. There was still some breakage in the other patch when Florian's patch was backed out, so at the moment, this patch would appear to be a *prerequisite* for the other. (That can hopefully be corrected by changes to the other patch.) Also, I'm attempting to adapt the dcheck tests for the other patch to work with this patch. If successful, I'll post with the results of that additional testing. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers