Re: [HACKERS] A third lock method
Bruce Momjian wrote: I must be missing something but I thought the only problem with our existing snapshot system was that you could see a row updated after your snapshot was created, and that the solution to that was to abort the transaction that would see the new row. Can you tell me what I am missing? But with snapshot isolation (what our serializable corresponds to) you cannot see rows updated after snapshot creation, right? So phantom reads cannot occur, but we still are not truly serializable. See the example I concocted in http://archives.postgresql.org/pgsql-hackers/2009-05/msg00316.php for illustration. Yours, Laurenz Albe PS: Different from what Kevin claimed, Oracle also cannot grant you strictly serializable transactions, because they also use snapshot isolation. Seems that they get away with it. My feeling is that the cases where this would be a problem are pretty rare; my example referenced above feels artificial for a good reason. If we can do it better than Oracle, I'm not against it :^) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] add xml support function
Hi - Does anyone know how to add a xml support function into postgresql? I want to add a function named xmlquery into postgresql. I modify gram.y by adding xmlquery relatedcode wherever other xml support functions appear. but the parser can not find xmlquery, the makeXmlExpr is not called at all. Thanks!
Re: [HACKERS] add xml support function
On Thu, Dec 31, 2009 at 3:33 AM, fangfang liu yisuoyanyu...@gmail.com wrote: Hi - Does anyone know how to add a xml support function into postgresql? I want to add a function named xmlquery into postgresql. I modify gram.y by adding xmlquery relatedcode wherever other xml support functions appear. but the parser can not find xmlquery, the makeXmlExpr is not called at all. How about CREATE OR REPLACE FUNCTION xmlquery(...) ? If that doesn't meet your needs, you'll need to explain what you're trying to do - and probably provide a patch showing what you've tried so far. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
On ons, 2009-12-30 at 12:53 -0500, Robert Haas wrote: It looks like they are all very permissive, though I wonder what the legal effect of a license clause that the software be used for Good and not Evil might be. It's not without issues, apparently: http://grep.be/blog/en/computer/legal/good_not_evil -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
On ons, 2009-12-30 at 13:23 -0500, Andrew Dunstan wrote: I'd like to see at least the outline of an API before we go any further. JSON is, shall we say, lightly specified, and doesn't appear to have any equivalent to XPath and friends, for example. How will we extract values from a JSON object? How will we be able to set values inside them? I think the primary use will be to load a JSON value into Perl or Python and process it there. So a json type that doesn't have any interesting operators doesn't sound useless to me. The features I would like to get out of it are input validation and encoding handling and smooth integration with said languages. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thread safety and libxml2
On ons, 2009-12-30 at 12:55 -0500, Greg Smith wrote: Basically, configure failed on their OpenBSD system because thread safety is on but the libxml2 wasn't compiled with threaded support: http://xmlsoft.org/threads.html Disabling either feature (no --with-libxml or --disable-thread-safety) gives a working build. This could perhaps be fixed by excluding libxml when running the thread test. The thread result is only used in the client libraries and libxml is only used in the backend, so those two shouldn't meet each other in practice. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] More frame options in window functions
Attached is the fix pointed out in the previous CommitFest plus RANGE offset support. *fix - move window regression test to another parallel group, but regarding the limitation of 20 per group the union test goes to the group the window test belonged to. - allow NULL iswindowagg as an argument of AggGetMemoryContext - change view name to longer one *RANGE offset - allow all of RANGE BETWEEN value PRECEDING/FOLLOWING AND value PRECEDING/FOLLOWING for any data types that support ORDER BY and additions/subtractions, which is extended design to the spec. The spec says data types allowed in RANGE offset are only numeric and temporal ones but we don't have such limitation. - add +/- operator search code in parsing, which is used to calculate frame bound, but I'm not sure if this is right approach. - add more regression tests Regards, -- Hitoshi Harada more_frame_options.20091231.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Windows x64 [repost]
2009/12/31 Magnus Hagander mag...@hagander.net: 2009/12/4 Tsutomu Yamada tsut...@sraoss.co.jp: A bit further down, it has: + /* The size of `void *', as computed by sizeof. */ + #define SIZEOF_VOID_P 4 + shouldn't that be 8 for win64 platforms? Nevermind this second comment. Now that it's no longer 1AM, I see that this is included in the *second* patch... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cancelling idle in transaction state
On Thu, 2009-12-24 at 21:38 +0100, Joachim Wieland wrote: On Sun, Dec 6, 2009 at 4:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: We are using NOTICE, not NOTIFY, assuming that we use anything at all (which I still regard as unnecessary). Please stop injecting confusion into the discussion. Attached is a minimal POC patch that allows to cancel an idle transaction with SIGINT. The HS patch also allows this in its current form but as Simon points out the client gets out of sync with it. The proposal is to send an additional NOTICE to the client and abort all open transactions and subtransactions (this is what I got from the previous discussion). This all works and I'm looking to post a reviewed patch soon. I had to write an additional function AbortAnyTransaction() which aborts all transactions and subtransactions and leaves the transaction in the aborted state, is there an existing function to do this? My use of AbortOutOfAnyTransaction() was what caused the problem-I-couldn't-solve. It aborted too far, confusing clients. Joachim's function does the right thing and leaves the transaction state correctly, so that clients don't get confused. Problem solved, thanks Joachim. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] add xml support function
Actually, I expect the function looks like xmlquery(xmlcontent,xquery) and return the query result. So xmlconcat(xml1,xml2) become a good example to study. starting from gram.y, I try to make parser to call xmlquery,the following is the result of diff: diff gram.y c:\new_gram.y 449c449 XMLPI XMLQUERY XMLROOT XMLSERIALIZE --- XMLPI XMLROOT XMLSERIALIZE 8270,8273d8269 | XMLQUERY '(' expr_list ')' { $$ = makeXmlExpr(IS_XMLQUERY, NULL, NIL,$3); } 8287d8282 9296d9290 | XMLQUERY IS_XMLQUERY is added in primenodes.h but the parser can not find xmlquery, the makeXmlExpr is not called at all. thanks 2009/12/31 Robert Haas robertmh...@gmail.com On Thu, Dec 31, 2009 at 3:33 AM, fangfang liu yisuoyanyu...@gmail.com wrote: Hi - Does anyone know how to add a xml support function into postgresql? I want to add a function named xmlquery into postgresql. I modify gram.y by adding xmlquery relatedcode wherever other xml support functions appear. but the parser can not find xmlquery, the makeXmlExpr is not called at all. How about CREATE OR REPLACE FUNCTION xmlquery(...) ? If that doesn't meet your needs, you'll need to explain what you're trying to do - and probably provide a patch showing what you've tried so far. ...Robert
Re: [HACKERS] Hot Standy introduced problem with query cancel behavior
On Wed, 2009-12-30 at 20:06 +0100, Andres Freund wrote: Hm. I just read a bit of that multiplexing facility (out of a different reason) and I have some doubt about it being used unmodified for canceling backends: procsignal.c: /* * Note: Since there's no locking, it's possible that the target * process detaches from shared memory and exits right after this * test, before we set the flag and send signal. And the signal slot * might even be recycled by a new process, so it's remotely possible * that we set a flag for a wrong process. That's OK, all the signals * are such that no harm is done if they're mistakenly fired. */ procsignal.h: ... * Also, because of race conditions, it's important that all the signals be * defined so that no harm is done if a process mistakenly receives one. */ When cancelling a backend that behaviour could be a bit annoying ;-) Reading comments alone doesn't show the full situation here. Before we send signal we check pid also, so the chances of this happening are fairly small. i.e. we would need to have a backend slot reused by a new backend with exactly same pid as the last slot holder. I'm proposing to use this for killing transactions and connections, so I don't think there's too much harm done there. If the backend is leaving anyway, that's what we wanted. If its a new guy that is wearing the same boots then a little collateral damage doesn't leave the server in a bad place. HS cancellations aren't yet so precise that this matters. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] add xml support function
On tor, 2009-12-31 at 19:22 +0800, fangfang liu wrote: Actually, I expect the function looks like xmlquery(xmlcontent,xquery) and return the query result. You don't need to modify the parser for that at all. You can implement that in user-space as a user-defined function. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] add xml support function
you mean create and replace funtion xmlquery()? maybe you are right, but I still want to know why parser does not find xmlquery. thanks. 2009/12/31 Peter Eisentraut pete...@gmx.net On tor, 2009-12-31 at 19:22 +0800, fangfang liu wrote: Actually, I expect the function looks like xmlquery(xmlcontent,xquery) and return the query result. You don't need to modify the parser for that at all. You can implement that in user-space as a user-defined function.
[HACKERS] problem with realizing gist index
Hello all, I'm creating gist index for testing purpuses :) I created index element structure: typedef struct moving_object { double x_high; double y_high; double x_low; double y_low; time_t mov_time; double x_plus; double y_plus; double x_minus; double y_minus; } moving_object; And defined macros that return pointer to this structure: #define DatumGetMovP(x) ((moving_object*)DatumGetPointer(x)) but index interface function gives me error: Incompatible type in assignment Error line I style - bold Datum gist_mov_union(PG_FUNCTION_ARGS) { GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0); int *sizep = (int *) PG_GETARG_POINTER(1); int num_obj; moving_object *pageunion, curr; num_obj = entryvec-n; pageunion = (moving_object*) palloc(sizeof(moving_object)); // THIS IS THE ERROR LINE * curr = DatumGetMovP(entryvec-vector[0].key);* make_now(curr); memcpy((void *) pageunion, (void *) curr, sizeof(moving_object)); for(int i = 1; i num_obj; i++) { curr = DatumGetMovP(entryvec-vector[i].key); make_now(curr); if (pageunion-x_high curr-x_high) pageunion-x_high = curr-x_high; if (pageunion-y_high curr-y_high) pageunion-y_high = curr-y_high; if (pageunion-x_low curr-x_low) pageunion-x_low = curr-x_low; if (pageunion-y_low curr-y_low) pageunion-y_low = curr-y_low; } *sizep = sizeof(moving_object); PG_RETURN_POINTER(pageunion); }* * Can anybody know what the problem ? I imagine that my defined function returns not pointer, but stucture (??) ? Best regards, Sergej Galkin
Re: [HACKERS] Hot Standy introduced problem with query cancel behavior
On Thursday 31 December 2009 12:25:19 Simon Riggs wrote: On Wed, 2009-12-30 at 20:06 +0100, Andres Freund wrote: Hm. I just read a bit of that multiplexing facility (out of a different reason) and I have some doubt about it being used unmodified for canceling backends: procsignal.c: /* * Note: Since there's no locking, it's possible that the target * process detaches from shared memory and exits right after this * test, before we set the flag and send signal. And the signal slot * might even be recycled by a new process, so it's remotely possible * that we set a flag for a wrong process. That's OK, all the signals * are such that no harm is done if they're mistakenly fired. */ procsignal.h: ... * Also, because of race conditions, it's important that all the signals be * defined so that no harm is done if a process mistakenly receives one. */ When cancelling a backend that behaviour could be a bit annoying ;-) Reading comments alone doesn't show the full situation here. Before we send signal we check pid also, so the chances of this happening are fairly small. i.e. we would need to have a backend slot reused by a new backend with exactly same pid as the last slot holder. Well. The problem does not occur for critical errors (i.e. session death) only. As signal delivery is asynchronous it can very well happen for transaction cancellation as well. I'm proposing to use this for killing transactions and connections, so I don't think there's too much harm done there. If the backend is leaving anyway, that's what we wanted. If its a new guy that is wearing the same boots then a little collateral damage doesn't leave the server in a bad place. HS cancellations aren't yet so precise that this matters. Building racy infrastructure when it can be avoided with a little care still seems not to be the best path to me. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A third lock method
2009/12/31 Bruce Momjian br...@momjian.us: I must be missing something but I thought the only problem with our existing snapshot system was that you could see a row updated after your snapshot was created, and that the solution to that was to abort the transaction that would see the new row. Can you tell me what I am missing? The problem is rather the opposite. A minimal example of a situation that the current implementation allows, and which the new proposal tries to fix is: 1. The database contains rows X and Y having one column, and having different values for that column (i.e., X != Y). 2. Serializable (in the current PG sense) transactions A and B run concurrently (i.e., both take their snapshot before the other commits, so they don't see each other's changes). 3. Y := X; A reads X and updates Y to become the same as X. 4. X := Y; B reads Y and updates X to become the same as Y. Result: Sequentially executing A and B in either order leads to a result where X = Y. Still, after the above steps 1-4, the values of X and Y are switched around (and thus X != Y). Therefore, the execution was (by definition) not serializable. This is caused by the fact that in a serializable execution either A would have seen the update performed by B, or B would have seen the update performed by A. This problem is called write skew in the paper (their example is less theoretical, but also more complex because of the use of COUNT(..).) So instead of aborting transactions because otherwise they would see too many changes, the goal is rather to abort transactions because otherwise they wouldn't have seen enough changes. The SIREAD locks are used to mark the versions that have been read by whom (for all transactions that were concurrent with any of the active transactions), so that potentially problematic writes that occur after reads can be detected: I wrote a new version of something that was already read by a concurrent transaction, so in any serialization, I must come after that other transaction. The other direction (I read something that has a newer version than what I just read, so in any serialization, I must come before that other transaction) can be detected straightforwardly. Nicolas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] KNNGiST for knn-search (WIP)
On Wed, Dec 30, 2009 at 4:56 PM, Robert Haas robertmh...@gmail.com wrote: From my point of view, what makes a patch invasive is the likelihood that it might break something other than itself. For example, your patch touches the core planner code and the core GIST code, so it seems possible that adding support for this feature might break something else in one of those areas. It doesn't seem obvious to me that this is a high-risk patch. It's touching the planner which is tricky but it's not the kind of massive overhaul that touches every module that HOT or HS were. I'm really glad HS got in before the end because lots of people with different areas of expertise and different use cases are going to get to exercise it in the time remaining. This patch I would expect relatively few people to need to try it out before any oversights are caught. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Serializable Isolation without blocking
[ Reviving this old thread because a recent one referred to it. ] 2009/5/7 Albe Laurenz laurenz.a...@wien.gv.at: Kevin Grittner wrote: maybe I misunderstood something. Consider a function makehighlander(personid integer) RETURNS void defined like this: SELECT ishighlander INTO b FROM scots WHERE id=personid; IF b THEN RETURN; /* no need to do anything */ END IF; UPDATE scots SET ishighlander=TRUE WHERE id=personid; SELECT count(*) INTO n FROM scots WHERE ishighlander; IF (n 1) THEN RAISE EXCEPTION 'There can be only one'; END IF; If we assume that ishighlander is false for all records in the beginning, and there are two calls to the function with two personid's of records *in different pages*, then there cannot be any conflicts since all (write and intention) locks taken by each of these calls should only affect the one page that contains the one record that is updated and then found in the subsequent SELECT. Yet if the two execute concurrently and the two first SELECTs are executed before the two UPDATEs, then both functions have a snapshot so that the final SELECT statements will return 1 and both functions will succeed, leaving the table with two highlanders. I do think you misunderstood. If there are two concurrent executions and each reads one row, there will be an SIREAD lock for each of those rows. As an example, let's say that one of them (T0) updates its row and does its count, finds everything looks fine, and commits. In reading the row the other transaction (T1) modified it sets the T0.outConflict flag to true and the T1.inConflict flag to true. Where does T0 read the row that T1 modified? * Typically, concurrency theory doesn't care about the specifics of relational databases: it works on a (possibly countably infinite) number of data items (sometimes called variables). * If a certain concurrency control technique works for such data items (i.e., can only result in serializable executions or whatever), then it must necessarily also work for relational databases which map their data in pages, if those pages are treated the same way the data items are. Indexes and any other structures that can be used to *find out* which other pages to read/write must then also be treated this way. * To answer your specific question: T0 might not read that specific row, but the COUNT(..) definitely must read *something* that must be modified by T1 when it updates the ishighlander field: either the row itself (which I would expect if no index on ishighlander exists), or some page in an index that it used to find out that it didn't need to inspect the row itself. Otherwise, the update wasn't effective because re-executing the COUNT(..) later on would not result in any change in the result (which leads to a contradiction: changing the ishighlander field of one row must result in a change in the number of highlanders). Nicolas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] problem with realizing gist index
2009/12/31 Sergej Galkin sergej.gal...@gmail.com: typedef struct moving_object { double x_high; double y_high; double x_low; double y_low; time_t mov_time; double x_plus; double y_plus; double x_minus; double y_minus; } moving_object; [..] #define DatumGetMovP(x) ((moving_object*)DatumGetPointer(x)) [..] but index interface function gives me error: Incompatible type in assignment [..] moving_object *pageunion, curr; Note that curr is not defined as a pointer. // THIS IS THE ERROR LINE curr = DatumGetMovP(entryvec-vector[0].key); But here you want to assign a pointer to it. Nicolas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
-BEGIN PGP SIGNED MESSAGE- Hash: RIPEMD160 Doesn't seem insurmountable, though, just one more thing to think about as we're having this conversation. Someone else will need to weigh in on this point though, as I don't use JSON in a way that would make anything beyond validation particularly relevant. I don't use JSON, but I do use YAML. Attached, please find a patch that implements hstore_to_yaml(). just kidding. :) I think we are getting the cart way before the horse. +1. Smells like a solution in search of a problem, as they say. - -- Greg Sabino Mullane g...@turnstep.com PGP Key: 0x14964AC8 200912310759 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -BEGIN PGP SIGNATURE- iEYEAREDAAYFAks8oC4ACgkQvJuQZxSWSsgHfQCgznfnazYgVDz9ak5xfQZj6Fsk b6UAoMH/v3Lu0R+wkoN024GcZtxqpEI2 =ELcu -END PGP SIGNATURE- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re-enabling SET ROLE in security definer functions
Hi All, We are seeking to enable SET ROLE in security-definer functions, since @ D.E Shaw there are scripts from the past that used this feature, and I think you'd also agree that SET ROLE is security definer functions has it's uses. As the code stands right now, I see that the only concern against enabling SET ROLE in SecDef functions is that, that when the local-user-id change context is exited, the GUC might be left out of sync. We currently have two bits indicating separately whether we are in context where (i) the CurrentUserId has changed, or (ii) Security concerns do not allow certain operation. But we have only one flag for GUC that stops us from performing any of SET ROLE or SET SESSION AUTHORIZATION while any of the above two flags are set. I propose that we have a separate GUC flag to indicate whether we are in UserId-changed context. So, we disallow SET ROLE only when we are in Security-Restricted context, and disallow SET SESSION AUTHORIZATION when we are either in Security-Restricted context or in UserId-changed context. So SET ROLE would be prohibited in maintenance operations, but allowed in SecDef functions (only if they are not invoked on a stack where maintenance operation was initiated earlier). And SET SESSION AUTHORIZATION will be disallowed when we are in either of the UserId-changed or Security-Restricted contexts. To address the problem of GUC getting out of sync when a SecDef function is exited, we can perform a check at the end, just before reverting to the calling userid, that if the called function stack has used SET ROLE to change the CurrentUserId, then we keep that user id to be in sync with GUC, rather than sync the GUC with current settings. This keeps the current semantics of GUC where if the called function (whether SecDef or not) used SET to change a GUC parameter, then that setting prevails even after the function has exited successfully. Attaching patch to implement the above proposals. I have given some thought to nesting of such call scenarios, and haven't found one which could cause an issue with this approach. Hope I haven't overlooked something. NB: In the patch, the block surrounded with if(InSecurityRestrictedOperation()) in guc.c will never be called, since the GUC parameter that it applies to (session_auth) is also marked as not allowed while in UserId-changed context, and that condition is cecked in previous block of code. This can be remedied by swapping the two relevant if blocks. I did not do it to keep the patch simple and small. Best regards, PS: For some context, this started with an aim to enable SET ROLE command in security definer functions, which D.E Shaw needed. This command is still not enabled in SecDef functions, but it led to a security exposure followed by the security fix; commit id: 31d0bddf77b9e2b5581816aa96d3a3 92ab7d8543. See also: http://gurjeet-tech.blogspot.com/2009/12/conversation-on-fixing-security-issue.html On Sat, Dec 12, 2009 at 9:00 PM, Tom Lane t...@sss.pgh.pa.us wrote: Gurjeet Singh gurjeet.si...@enterprisedb.com writes: SET ROLE is safe in any context since it can be used to switch to only to those roles that the Session User is a member of, whereas SET SESSION AUTHORIZATION is unsafe since it can be used to switch to any role in the cluster iff the Authenticated User is a superuser. Maybe you had better read that statement again, and remember that the session user is typically a superuser in exactly the cases we are concerned about. regards, tom lane -- gurjeet.singh @ EnterpriseDB - The Enterprise Postgres Company http://www.enterprisedb.com singh.gurj...@{ gmail | yahoo }.com Twitter/Skype: singh_gurjeet Mail sent from my BlackLaptop device allow_set_role.2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] IntArray in c.h
On ons, 2009-12-30 at 11:46 +0900, Hitoshi Harada wrote: 2009/12/30 Tom Lane t...@sss.pgh.pa.us: Hitoshi Harada umi.tan...@gmail.com writes: I found the struct IntArray defined in c.h is actually used only in execQual.c. ISTM the definition should be at least moved to the right place. It's a general-purpose datatype that might be used anywhere that array indexing happens. I think the fact that it's currently used only in execQual is mere happenstance, and should not be enforced by moving or removing the declaration. I would be convinced if the struct or the logic was complex, but actually it is so simple that it can be replaced by primitive int array. Also, it seems to me that c.h is too general place to declare it for such purpose. Does nobody else think so? The definition of c.h is bogus anyway. You might think it contains includes and defines to set up a portable C environment, which is what the first half indeed does. But then things like regproc, transaction ID types, IntArray, varlena, bytea, oidvector, NameData, etc. do not belong there and should be moved to postgres.h. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Security vulnerability regarding SET ROLE and REINDEX
Hi All, We are seeking to enable SET ROLE in security-definer functions, since D.E Shaw has a usage for this, and I think you'd also agree that SET ROLE is security definer functions has it's uses. As the code stands right now, I see that the only concern against enabling SET ROLE in SecDef functions is that, that when the local-user-id change context is exited, the GUC might be left out of sync. We currently have two bits indicating separately whether we are in context where (i) the CurrentUserId has changed, or (ii) Security concerns do not allow certain operation. But we have only one flag for GUC that stops us from performing any of SET ROLE or SET SESSION AUTHORIZATION while any of the above two flags are set. I propose that we have a separate GUC flag to indicate whether we are in UserId-changed context. So, we disallow SET ROLE only when we are in Security-Restricted context, and disallow SET SESSION AUTHORIZATION when we are either in Security-Restricted context or in UserId-changed context. So SET ROLE would be prohibited in maintenance operations, but allowed in SecDef functions (only if they are not invoked on a stack where maintenance operation was initiated earlier). And SET SESSION AUTHORIZATION will be disallowed when we are in either of the UserId-changed or Security-Restricted contexts. To address the problem of GUC getting out of sync when a SecDef function is exited, we can perform a check at the end, just before reverting to the calling userid, that if the called function stack has used SET ROLE to change the CurrentUserId, then we keep that user id to be in sync with GUC, rather than sync the GUC with current settings. This keeps the current semantics of GUC where if the called function (whether SecDef or not) used SET to change a GUC parameter, then that setting prevails even after the function has exited successfully. Attaching patch to implement the above proposals. I have given some thought to nesting of such call scenarios, and haven't found one which could cause an issue with this approach. Hope I haven't overlooked something. NB: In the patch, the block surrounded with if(InSecurityRestrictedOperation()) in guc.c will never be called, since the GUC parameter that it applies to (session_auth) is also marked as not allowed while in UserId-changed context, and that condition is cecked in previous block of code. This can be remedied by swapping the two relevant if blocks. I did not do it to keep the patch simple and small. Best regards, PS: For some context, this started with an aim to enable SET ROLE command in security definer functions, which D.E Shaw needed. This command is still not enabled in SecDef functions, but it led to a security exposure followed by the security fix; commit id: 31d0bddf77b9e2b5581816aa96d3a392ab7d8543. See also: http://gurjeet-tech.blogspot.com/2009/12/conversation-on-fixing-security-issue.html On Sat, Dec 12, 2009 at 9:00 PM, Tom Lane t...@sss.pgh.pa.us wrote: Gurjeet Singh gurjeet.si...@enterprisedb.com writes: SET ROLE is safe in any context since it can be used to switch to only to those roles that the Session User is a member of, whereas SET SESSION AUTHORIZATION is unsafe since it can be used to switch to any role in the cluster iff the Authenticated User is a superuser. Maybe you had better read that statement again, and remember that the session user is typically a superuser in exactly the cases we are concerned about. regards, tom lane -- gurjeet[.sin...@enterprisedb.com EnterpriseDB http://www.enterprisedb.com singh.gurj...@{ gmail | hotmail | indiatimes | yahoo }.com -- gurjeet[.sin...@enterprisedb.com EnterpriseDB http://www.enterprisedb.com singh.gurj...@{ gmail | hotmail | indiatimes | yahoo }.com allow_set_role.2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cancelling idle in transaction state
On Thu, 2009-12-31 at 11:10 +, Simon Riggs wrote: On Thu, 2009-12-24 at 21:38 +0100, Joachim Wieland wrote: On Sun, Dec 6, 2009 at 4:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: We are using NOTICE, not NOTIFY, assuming that we use anything at all (which I still regard as unnecessary). Please stop injecting confusion into the discussion. Attached is a minimal POC patch that allows to cancel an idle transaction with SIGINT. The HS patch also allows this in its current form but as Simon points out the client gets out of sync with it. The proposal is to send an additional NOTICE to the client and abort all open transactions and subtransactions (this is what I got from the previous discussion). This all works and I'm looking to post a reviewed patch soon. Attached is the patch I intend to commit, barring objections. This patch extends SIGINT to allow cancellation of transactions while idle in both HS and normal mode. It also changes the standard message reported on an idle transaction in aborted state to 'IDLE in transaction (aborted)', so that once aborted we keep the message even if the user tries to issue further statements other than ROLLBACK or COMMIT. This also solves the bug reported by Kris Jurka. Joachim, credit will be to you, so please re-check. (Further changes pending on HS side, so not all issues resolved by this. I intend to use this mechanism for HS cancellations when CONFLICT_MODE_ERROR, and another mechanism for CONFLICT_MODE_FATAL.) -- Simon Riggs www.2ndQuadrant.com *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 12899,12905 SELECT set_config('log_statement_stats', 'off', false); literalfunctionpg_cancel_backend/function(parameterpid/parameter typeint/)/literal /entry entrytypeboolean/type/entry !entryCancel a backend's current query/entry /row row entry --- 12899,12905 literalfunctionpg_cancel_backend/function(parameterpid/parameter typeint/)/literal /entry entrytypeboolean/type/entry !entryCancel a backend's current query or abort an idle transaction/entry /row row entry *** a/doc/src/sgml/monitoring.sgml --- b/doc/src/sgml/monitoring.sgml *** *** 55,60 postgres 965 0.0 1.1 6152 1512 pts/1SN 13:17 0:00 postgres: stats c --- 55,61 postgres 998 0.0 2.3 6532 2992 pts/1SN 13:18 0:00 postgres: tgl runbug 127.0.0.1 idle postgres 1003 0.0 2.4 6532 3128 pts/1SN 13:19 0:00 postgres: tgl regression [local] SELECT waiting postgres 1016 0.1 2.4 6532 3080 pts/1SN 13:19 0:00 postgres: tgl regression [local] idle in transaction + postgres 1066 0.1 2.4 6532 3080 pts/1SN 13:19 0:00 postgres: tgl regression [local] idle in transaction (aborted) /screen (The appropriate invocation of commandps/ varies across different *** *** 77,82 postgres: replaceableuser/ replaceabledatabase/ replaceablehost/ re --- 78,84 the life of the client connection, but the activity indicator changes. The activity can be literalidle/ (i.e., waiting for a client command), literalidle in transaction/ (waiting for client inside a commandBEGIN/ block), + literalidle in transaction (aborted)/ (waiting for client to commandROLLBACK/), or a command type name such as literalSELECT/. Also, literalwaiting/ is attached if the server process is presently waiting on a lock held by another server process. In the above example we can infer *** a/src/backend/access/transam/xact.c --- b/src/backend/access/transam/xact.c *** *** 313,320 IsTransactionState(void) /* * IsAbortedTransactionBlockState * ! * This returns true if we are currently running a query ! * within an aborted transaction block. */ bool IsAbortedTransactionBlockState(void) --- 313,319 /* * IsAbortedTransactionBlockState * ! * This returns true if we are within an aborted transaction block. */ bool IsAbortedTransactionBlockState(void) *** *** 2692,2697 AbortCurrentTransaction(void) --- 2691,2737 } /* + * AbortTransactionAndAnySubtransactions + * + * Similar to AbortCurrentTransaction but if any subtransactions + * in progress we abort them and all of their parents. So this is + * used when the caller wishes to make the abort untrappable by the user. + */ + void + AbortTransactionAndAnySubtransactions(void) + { + TransactionState s = CurrentTransactionState; + + switch (s-blockState) + { + case TBLOCK_DEFAULT: + case TBLOCK_STARTED: + case TBLOCK_BEGIN: + case TBLOCK_INPROGRESS: + case TBLOCK_END: + case TBLOCK_ABORT: + case TBLOCK_SUBABORT: + case TBLOCK_ABORT_END: + case TBLOCK_ABORT_PENDING: + case TBLOCK_PREPARE: + case TBLOCK_SUBABORT_END: + case TBLOCK_SUBABORT_RESTART: + AbortCurrentTransaction();
Re: [HACKERS] A third lock method
Albe Laurenz wrote: See the example I concocted in http://archives.postgresql.org/pgsql-hackers/2009-05/msg00316.php Sure, let's look at that example. Of course, *any* transaction run by itself won't show differences from true serializable behavior *regardless* of the mode in which it runs -- because it actually was serialized. Let's see how your example might work if the function was being run on two different backends at the same time with different personid values. Connection 1: == [Currently no highlander; the function does this for personid = 1] SELECT ishighlander INTO b FROM scots WHERE id=personid; IF b THEN RETURN; /* no need to do anything */ END IF; UPDATE scots SET ishighlander=TRUE WHERE id=personid; SELECT count(*) INTO n FROM scots WHERE ishighlander; IF (n 1) THEN RAISE EXCEPTION 'There can be only one'; END IF; [Connection 1 now sees a highlander; not yet committed] Connection 2: === [Currently no highlander according to this snapshot] [the function does exactly the same thing as on Connection 1, but for personid 2] [It doesn't see the work of Connection 1, so it's count shows the update is OK] Now they commit, in either order. You now have two highlanders in the database. You have just demonstrated another case of write skew, where snapshot isolation does not behave in a truly serializable fashion, allowing constraints enforced in application software or functions (including triggers) to be violated. With the changes I'm working on, one of these would be rolled back with a serialization error. PS: Different from what Kevin claimed, Oracle also cannot grant you strictly serializable transactions, because they also use snapshot isolation. Apologies if that is still true. I don't use Oracle and one of the recent articles I recently read seemed to indicate otherwise. Thanks for the correction. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Serializable Isolation without blocking
On Thu, Dec 31, 2009 at 12:45 PM, Nicolas Barbier nicolas.barb...@gmail.com wrote: * To answer your specific question: T0 might not read that specific row, but the COUNT(..) definitely must read *something* that must be modified by T1 when it updates the ishighlander field: The problem occurs when the update happens. It doesn't have any way to know currently that a SELECT has already looked at that record and that the same transaction has performed an update which this transaction has already ignored when performing the count(*). The unsolved problems that have been raised are: - How and where do we have SELECTs note all the rows they read -- and all the rows they *would* have read that don't exist already. Note that something like select count(*) where id=? needs to be able to detect new rows from being inserted with the specified value, not merely lock the existing rows. - Can we do it without requiring major code changes in every index am and destroying modularity between the transaction management and the index api. - How do we do that without causing SELECTS to perform tons of write i/o they don't have to do now. People already complain about the hint bit updates the first time you do selects, doing i/o on every select would be disastrous. - Can we do that without destroying concurrency with course locks a la MySQL ISAM tables. - Can we do it without introducing unexpected serialization failure between transactions updating unrelated rows. Ideally, can we do it in a way that serialization errors are predictable rather than depending on access paths the planner chooses so they don't just randomly start happening when plans change. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] add xml support function
fangfang liu wrote: you mean create and replace funtion xmlquery()? maybe you are right, but I still want to know why parser does not find xmlquery. We would have to see your code to know why it didn't do what you expect. What exactly are you intending that this function would do anyway? If you're looking at XQuery support, there are serious issues regarding what library to use, see http://archives.postgresql.org/pgsql-hackers/2009-11/msg01445.php cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] add xml support function
fangfang liu escribió: IS_XMLQUERY is added in primenodes.h but the parser can not find xmlquery, the makeXmlExpr is not called at all. Did you add it to keywords.c etc? -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A third lock method
Albe Laurenz wrote: But with snapshot isolation (what our serializable corresponds to) you cannot see rows updated after snapshot creation, right? So phantom reads cannot occur, but we still are not truly serializable. My previous reply missed your point entirely, didn't it? Let me try again. You are absolutely right that the phantom phantom rows can't pop up during a transaction running at snapshot isolation level. So the phantom read problem, per se, is not an issue. The problem with phantoms rows in snapshot isolation is not that they pop up within a concurrent transaction, but that a concurrent transaction does not block on trying to read them (as they would with an S2PL serializable implementation) but will just miss them, even though they might later be (or might already be) committed ahead of the current transaction. They need to be considered in the SSI read-write dependency checks. Perhaps the affect of such inserts (or updates into a selected range) manifest is a different enough way that a new term is merited, but I'm inclined to think not. The conditions in which they become an issue are the same. The techniques for detecting them are the same. These phantoms just appear to the current connection upon commit of its transaction rather than in the middle of it, but either way they cause problems if the current transaction is modifying data based on its view of them. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cancelling idle in transaction state
On Thu, Dec 31, 2009 at 3:03 PM, Simon Riggs si...@2ndquadrant.com wrote: This patch extends SIGINT to allow cancellation of transactions while idle in both HS and normal mode. It also changes the standard message reported on an idle transaction in aborted state to 'IDLE in transaction (aborted)', so that once aborted we keep the message even if the user tries to issue further statements other than ROLLBACK or COMMIT. This also solves the bug reported by Kris Jurka. Was the bug reported by Kris really only about lost synchronization or was it about SIGINT now cancelling idle transactions which it did not do previously? I still think that we should have three transaction cancel modes, one to cancel an idle transaction, another one to cancel a running query and a third one that just cancels the transaction regardless of it being idle or not. This last one is what you are implementing now, and it is what HS wants to do. However I think that Kris only wants to cancel a running query but not an idle transaction. And an administrator who wants to cancel an idle transaction can never be sure that the transaction that he checked which has just been idle is still idle... (Further changes pending on HS side, so not all issues resolved by this. I intend to use this mechanism for HS cancellations when CONFLICT_MODE_ERROR, and another mechanism for CONFLICT_MODE_FATAL.) CONFLICT_MODE_FATAL is what you are planning to implement via SIGUSR1 multiplexing then? Joachim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] add xml support function
yes,whereever xmlconcat appears. 2009/12/31 Alvaro Herrera alvhe...@commandprompt.com fangfang liu escribió: IS_XMLQUERY is added in primenodes.h but the parser can not find xmlquery, the makeXmlExpr is not called at all. Did you add it to keywords.c etc? -- Alvaro Herrera http://www.CommandPrompt.com/ http://www.commandprompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] add xml support function
thanks for Xquery info, I do not start to implenment the xmlquery() itself , maybe call xqilla lib or sth else in the future. I think xmltable is almost equal to xmlquery , except that it retuns talbeset insead of xml. I take xmlconcat as an example , add xmlquery whereever xmlconcat appers and nothing else. the first problem is , I have to make xmlquery accepted by the parser. 2009/12/31 Andrew Dunstan and...@dunslane.net fangfang liu wrote: you mean create and replace funtion xmlquery()? maybe you are right, but I still want to know why parser does not find xmlquery. We would have to see your code to know why it didn't do what you expect. What exactly are you intending that this function would do anyway? If you're looking at XQuery support, there are serious issues regarding what library to use, see http://archives.postgresql.org/pgsql-hackers/2009-11/msg01445.php cheers andrew
Re: [HACKERS] Serializable Isolation without blocking
Greg Stark wrote: The unsolved problems that have been raised are: [legion] Yeah, that's why this is a two to four year project. And I would point out that if there *wasn't* a performance hit in serializable mode, none of the other isolation levels would exist. These less rigorous modes exist precisely because people are often willing to give up some data integrity guarantees, or solve them with more labor-intensive techniques, to gain performance. I certainly wouldn't consider removing any of the existing transaction isolation levels or attempt to coerce anyone into using them against their will. ;-) I am keeping track of the lists you're putting out there; they should be quite useful in the optimization phase. I do intend to first get a patch which is correct in the sense of never allowing non- serializable behavior, but which contains some of the problems you list (although destroying modularity is obviously off the table even for that point), and then refine the granularity and performance to try to get within bounds which are acceptable for our use, and hopefully (eventually) the PostgreSQL community. One of the things I'm currently working on is what would make a good set of tests to run during development to track progress. I welcome any particular use-cases you want to ensure are covered. If you could provide a detailed description or (even better) a self- contained test case for something you would like to ensure is covered, that would be most welcome. Thanks, -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cancelling idle in transaction state
On Thu, 2009-12-31 at 15:41 +0100, Joachim Wieland wrote: I still think that we should have three transaction cancel modes, one to cancel an idle transaction, another one to cancel a running query and a third one that just cancels the transaction regardless of it being idle or not. This last one is what you are implementing now, and it is what HS wants to do. pg_cancel_backend() is currently conditional on whether a statement is active or not, so should really be called pg_cancel_if_active(). What people want is an unconditional way to stop a transaction. I don't see the need for 3 modes (and that has nothing to do with HS). -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
On Dec 31, 2009, at 1:04 AM, Peter Eisentraut wrote: I think the primary use will be to load a JSON value into Perl or Python and process it there. So a json type that doesn't have any interesting operators doesn't sound useless to me. The features I would like to get out of it are input validation and encoding handling and smooth integration with said languages. What about access to various parts of a JSON data structure? Or is that just asking for too much trouble up-front? Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] uintptr_t for Datum
Attached patch is the part of the win64 patch that changes Datum to be uintptr_t, and associated changes, with only very minor changes from me. It also includes autoconf tests that I tricked Bruce into fixing for me :-) Comments? Unless there are objections, I'll go ahead and apply this one for broader buildfarm testing. It's working on all the platforms I could test, and also on Bruces (where the original patch broke). -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ uintptr.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] add xml support function
Sorry,guys, my mistakes, the keywords list should be sorted , and I just append xmlquery at the end of keyword list , that is why parser does not call makexmlexpr.Instead, the parser try to find a function to match xmlquery and faild. 2009/12/31 fangfang liu yisuoyanyu...@gmail.com thanks for Xquery info, I do not start to implenment the xmlquery() itself , maybe call xqilla lib or sth else in the future. I think xmltable is almost equal to xmlquery , except that it retuns talbeset insead of xml. I take xmlconcat as an example , add xmlquery whereever xmlconcat appers and nothing else. the first problem is , I have to make xmlquery accepted by the parser. 2009/12/31 Andrew Dunstan and...@dunslane.net fangfang liu wrote: you mean create and replace funtion xmlquery()? maybe you are right, but I still want to know why parser does not find xmlquery. We would have to see your code to know why it didn't do what you expect. What exactly are you intending that this function would do anyway? If you're looking at XQuery support, there are serious issues regarding what library to use, see http://archives.postgresql.org/pgsql-hackers/2009-11/msg01445.php cheers andrew
Re: [HACKERS] PATCH: Add hstore_to_json()
David E. Wheeler wrote: On Dec 31, 2009, at 1:04 AM, Peter Eisentraut wrote: I think the primary use will be to load a JSON value into Perl or Python and process it there. So a json type that doesn't have any interesting operators doesn't sound useless to me. The features I would like to get out of it are input validation and encoding handling and smooth integration with said languages. What about access to various parts of a JSON data structure? Or is that just asking for too much trouble up-front? IMNSHO it's essential. I think Peter's approach of ignoring this requirement is extremely shortsighted. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Serializable Isolation without blocking
On Thu, Dec 31, 2009 at 3:11 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Yeah, that's why this is a two to four year project. And I would point out that if there *wasn't* a performance hit in serializable mode, none of the other isolation levels would exist. These less rigorous modes exist precisely because people are often willing to give up some data integrity guarantees, or solve them with more labor-intensive techniques, to gain performance. I certainly wouldn't consider removing any of the existing transaction isolation levels or attempt to coerce anyone into using them against their will. ;-) Hm, this raises the issue that you'll have to figure out what should happen if two different transactions are using different isolation modes. Currently our two isolation modes only control behaviour within your transaction so they co-exist perfectly fine. ISTM you would need to have transactions in read-comitted and snapshot isolation modes recording what sets of records they read in order to be able to guarantee serializable transactions to make any guarantees as well. Separately even if nobody on the system is using true serializable isolation the on-disk data structures would need space to store the additional information just in case anyone uses it. If that's a substantial amount of space it might impact performance even if you never use it. I am keeping track of the lists you're putting out there; they should be quite useful in the optimization phase. I do intend to first get a patch which is correct in the sense of never allowing non- serializable behavior, but which contains some of the problems you list (although destroying modularity is obviously off the table even for that point), and then refine the granularity and performance to try to get within bounds which are acceptable for our use, and hopefully (eventually) the PostgreSQL community. Yeah, I'm mostly concerned about the initial design question of where to store all this extra information. It seems like once you've made that decision most of the consequences will be pretty much set. The two proposals on the table -- neither of which seem acceptable to me -- are: 1) Store information in indexes used in a scans indicating what scans are in progress or have been done by the transaction. This means you need some way to store the xid and the scan keys such that you can guarantee any index maintenance will see it. You also have to be able to handle arbitrary sets of xids similar to how we handle multi-xids currently. Also you'll need some way to handle deletions which currently don't do any index maintenance. This would have to be done for every index type and would impose design constraints on the index behaviours since in many cases it's not so easy to arrange things to provide these guarantees. It also creates a big dependency on the planner behaviour such that a change in plans will create user-visible changes in the serialization failures that are possible. I fear it would have terrible false-positive rates for some types of plans possibly even being unable to ever complete some queries, notably sequential table scans which would make people try to arrange for less efficient plans because they're trying to avoid serialization failures. It would also impose space costs on every index on disk. 2) Have some kind of in-memory data structure which stores the filter conditions for different scans that are in progress or have been executed by live transactions. This would have to be consulted by updates to find conflicting filter conditions. This has the problem that there's no efficient way to search for conflicting filter conditions -- you would have to do a linear search across all filter conditions on the same table and do fairly complex theorem-proving for each one to prove there's no conflict. It would probably perform terribly. It has the advantage of working regardless of the access method and not touching index am code at all. Both of these seem unacceptable to me but perhaps there are solutions I'm not seeing. I wonder if some kind of hybrid approach is possible but it seems like it would have the worst of both worlds in that case. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] IntArray in c.h
Peter Eisentraut pete...@gmx.net writes: The definition of c.h is bogus anyway. You might think it contains includes and defines to set up a portable C environment, which is what the first half indeed does. But then things like regproc, transaction ID types, IntArray, varlena, bytea, oidvector, NameData, etc. do not belong there and should be moved to postgres.h. Actually, what c.h does is to provide definitions that are needed in both frontend and backend code. And we do NOT want to start including postgres.h in frontend code. It might be that some of the declarations there are useless to frontend code and could be moved, but trying to be as strict as you suggest is only going to create problems. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] uintptr_t for Datum
Magnus Hagander mag...@hagander.net writes: Attached patch is the part of the win64 patch that changes Datum to be uintptr_t, and associated changes, with only very minor changes from me. It also includes autoconf tests that I tricked Bruce into fixing for me :-) Comments? This is a joke no? Where's the logic to provide a definition of intptr_t if the platform fails to? The lack of attention to updating the comments about Datum doesn't give me a warm feeling either. BTW, it looks like the patch is showing a manual change to pg_config.h.in. Don't do that. Run autoheader. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] uintptr_t for Datum
2009/12/31 Tom Lane t...@sss.pgh.pa.us: Magnus Hagander mag...@hagander.net writes: Attached patch is the part of the win64 patch that changes Datum to be uintptr_t, and associated changes, with only very minor changes from me. It also includes autoconf tests that I tricked Bruce into fixing for me :-) Comments? This is a joke no? Hey, it got your attention ;) Where's the logic to provide a definition of intptr_t if the platform fails to? The lack of attention to updating autoconf does that. This is exactly what broke on Bruce's platform, and autoconf fixed it in the way that is included in the patch. the comments about Datum doesn't give me a warm feeling either. Will look over that. BTW, it looks like the patch is showing a manual change to pg_config.h.in. Don't do that. Run autoheader. That also came out of Bruce's patch. Bruce, can you look at doing that? I don't have a machine easily accessible with the right autoconf version ATM :( -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] point_ops for GiST
2009/12/30 Teodor Sigaev teo...@sigaev.ru: Sync with current CVS I have reviewed this patch and it looks good to me. The only substantive question I have is why gist_point_consistent() uses a different coding pattern for the box case than it does for the polygon and circle cases? It's not obvious to me on the face of it why these aren't consistent. Beyond that, I have a variety of minor whitespace and commenting suggestions, so I am attaching an updated version of the patch as well as an incremental diff between your version and mine, for your consideration. The changes are: (1) comment reuse of gist_box functions for point_ops, (2) format point ops function analogously to existing sections in same file, (3) uncuddle opening braces, (4) adjust indentation and spacing in a few places, (5) rename StrategyNumberOffsetRange to GeoStrategyNumberOffset, and (6) use a plain block instead of do {} while (0) - the latter construct is really only needed in certain types of macros. ...Robert point_ops-0.5-rmh Description: Binary data point_ops-0.5-rmh-incremental Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re-enabling SET ROLE in security definer functions
Gurjeet Singh singh.gurj...@gmail.com writes: We are seeking to enable SET ROLE in security-definer functions, since @ D.E Shaw there are scripts from the past that used this feature, and I think you'd also agree that SET ROLE is security definer functions has it's uses. Actually, I don't find that to be a given. Exactly what use-cases have you got that aren't solved as well or better by calling a SECURITY DEFINER function owned by the target role? As the code stands right now, I see that the only concern against enabling SET ROLE in SecDef functions is that, that when the local-user-id change context is exited, the GUC might be left out of sync. This statement represents a complete lack of understanding of the actual security problem. The actual security problem is that SET ROLE allows you to become any role that the *session* user is allowed to become. The reason for locking it down in security-restricted contexts is that we don't want that to happen: we need to confine the available privileges to only those that, say, the owner of the table being vacuumed would have. While it's possible that we could design some mechanism that would enforce this properly, I fear that it would be tricky and a likely source of future new security problems. In any case the net result would be that SET ROLE would behave differently from spec, so it would still be non-standard-compliant, just differently from before. So IMHO you really need to offer a convincing reason why we should even try to solve this, as opposed to telling people to use security definer functions. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Serializable Isolation without blocking
Greg Stark wrote: Hm, this raises the issue that you'll have to figure out what should happen if two different transactions are using different isolation modes. Currently our two isolation modes only control behaviour within your transaction so they co-exist perfectly fine. ISTM you would need to have transactions in read-comitted and snapshot isolation modes recording what sets of records they read in order to be able to guarantee serializable transactions to make any guarantees as well. No, there is no requirement that serializable transactions serialize with weaker modes. The Cahill thesis addresses this point directly. Unless you can point out some flaw in his proofs, this is not an issue. [criticisms of hypothetical implementation techniques] There are no such proposals on the table, and the hypothetical techniques you mention seem unlikely to be ones I would use. The one and only issue I have on the table at the moment is to create a new lock method for SIREAD locks. I'd welcome any comments on that. If I get to the point of feeling comfortable with that, I'll move forward to other issues. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] uintptr_t for Datum
Magnus Hagander mag...@hagander.net writes: 2009/12/31 Tom Lane t...@sss.pgh.pa.us: Where's the logic to provide a definition of intptr_t if the platform fails to? autoconf does that. Oh, that's what I get for trying to review a patch before absorbing any caffeine :-( ... I missed that you were relying on a built-in autoconf macro. That also came out of Bruce's patch. Bruce, can you look at doing that? I don't have a machine easily accessible with the right autoconf version ATM :( It's a really bad idea to be committing configure changes without having personally run the patch through autoconf. As penance for being too quick to complain, I'll review and commit this myself. If it works on my old HPUX box, it'll probably work everywhere ;-) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re-enabling SET ROLE in security definer functions
On Thu, Dec 31, 2009 at 10:36 PM, Tom Lane t...@sss.pgh.pa.us wrote: Gurjeet Singh singh.gurj...@gmail.com writes: We are seeking to enable SET ROLE in security-definer functions, since @ D.E Shaw there are scripts from the past that used this feature, and I think you'd also agree that SET ROLE is security definer functions has it's uses. Actually, I don't find that to be a given. Exactly what use-cases have you got that aren't solved as well or better by calling a SECURITY DEFINER function owned by the target role? As the code stands right now, I see that the only concern against enabling SET ROLE in SecDef functions is that, that when the local-user-id change context is exited, the GUC might be left out of sync. This statement represents a complete lack of understanding of the actual security problem. The actual security problem is that SET ROLE allows you to become any role that the *session* user is allowed to become. I understand that reasoning very well, its just that I forgot to cover that in the statement above. The reason for locking it down in security-restricted contexts is that we don't want that to happen: we need to confine the available privileges to only those that, say, the owner of the table being vacuumed would have. The patch submitted still prohibits SET ROLE in security restricted contexts, and yet allows it in security definer functions iff the function is not executed while security restrictions are enabled. I think I covered that here: quote So SET ROLE would be prohibited in maintenance operations, but allowed in SecDef functions (only if they are not invoked on a stack where maintenance operation was initiated earlier). /quote While it's possible that we could design some mechanism that would enforce this properly, I fear that it would be tricky and a likely source of future new security problems. In any case the net result would be that SET ROLE would behave differently from spec, so it would still be non-standard-compliant, just differently from before. So IMHO you really need to offer a convincing reason why we should even try to solve this, as opposed to telling people to use security definer functions. Ian would be in a better position to provide a use-case. Best regards, -- gurjeet.singh @ EnterpriseDB - The Enterprise Postgres Company http://www.enterprisedb.com singh.gurj...@{ gmail | yahoo }.com Twitter/Skype: singh_gurjeet Mail sent from my BlackLaptop device
Re: [HACKERS] uintptr_t for Datum
2009/12/31 Tom Lane t...@sss.pgh.pa.us: Magnus Hagander mag...@hagander.net writes: 2009/12/31 Tom Lane t...@sss.pgh.pa.us: Where's the logic to provide a definition of intptr_t if the platform fails to? autoconf does that. Oh, that's what I get for trying to review a patch before absorbing any caffeine :-( ... I missed that you were relying on a built-in autoconf macro. :-) That also came out of Bruce's patch. Bruce, can you look at doing that? I don't have a machine easily accessible with the right autoconf version ATM :( It's a really bad idea to be committing configure changes without having personally run the patch through autoconf. Right, this is why I had Bruce do that part, and send it to me separately. I figured one committer is as good as another. As penance for being too quick to complain, I'll review and commit this myself. If it works on my old HPUX box, it'll probably work everywhere ;-) Ok, deal :-) That's probably the one other platform beside Bruce's that gets reasonably-regular-testing and still doesn't have intptr_t. I'll be off to my newyears party now, enjoy the patch! Happy new year to you and other PostgreSQL hackers! -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standy introduced problem with query cancel behavior
On Thu, 2009-12-31 at 13:14 +0100, Andres Freund wrote: When cancelling a backend that behaviour could be a bit annoying ;-) Reading comments alone doesn't show the full situation here. Before we send signal we check pid also, so the chances of this happening are fairly small. i.e. we would need to have a backend slot reused by a new backend with exactly same pid as the last slot holder. Well. The problem does not occur for critical errors (i.e. session death) only. As signal delivery is asynchronous it can very well happen for transaction cancellation as well. I'm proposing to use this for killing transactions and connections, so I don't think there's too much harm done there. If the backend is leaving anyway, that's what we wanted. If its a new guy that is wearing the same boots then a little collateral damage doesn't leave the server in a bad place. HS cancellations aren't yet so precise that this matters. Building racy infrastructure when it can be avoided with a little care still seems not to be the best path to me. Doing that will add more complexity in an area that is hard to test effectively. I think the risk of introducing further bugs while trying to fix this rare condition is high. Right now the conflict processing needs more work and is often much less precise than this, so improving this aspect of it would not be a priority. I've added it to the TODO though. Thank you for your research. I enclose the patch I am currently testing, as a patch-on-patch on top of Joachim's changes, recently published on this thread. POC only as yet. Patch implements recovery conflict signalling using SIGUSR1 multiplexing, then uses a SessionCancelPending mode similar to Joachim's TransactionCancelPending. -- Simon Riggs www.2ndQuadrant.com *** a/src/backend/storage/ipc/procarray.c --- b/src/backend/storage/ipc/procarray.c *** *** 1704,1710 GetConflictingVirtualXIDs(TransactionId limitXmin, Oid dbOid, * Returns pid of the process signaled, or 0 if not found. */ pid_t ! CancelVirtualTransaction(VirtualTransactionId vxid, int cancel_mode) { ProcArrayStruct *arrayP = procArray; int index; --- 1704,1710 * Returns pid of the process signaled, or 0 if not found. */ pid_t ! CancelVirtualTransaction(VirtualTransactionId vxid, ProcSignalReason sigmode) { ProcArrayStruct *arrayP = procArray; int index; *** *** 1722,1733 CancelVirtualTransaction(VirtualTransactionId vxid, int cancel_mode) if (procvxid.backendId == vxid.backendId procvxid.localTransactionId == vxid.localTransactionId) { - /* - * Issue orders for the proc to read next time it receives SIGINT - */ - if (proc-recoveryConflictMode cancel_mode) - proc-recoveryConflictMode = cancel_mode; - pid = proc-pid; break; } --- 1722,1727 *** *** 1741,1747 CancelVirtualTransaction(VirtualTransactionId vxid, int cancel_mode) * Kill the pid if it's still here. If not, that's what we wanted * so ignore any errors. */ ! kill(pid, SIGINT); } return pid; --- 1735,1741 * Kill the pid if it's still here. If not, that's what we wanted * so ignore any errors. */ ! (void) SendProcSignal(pid, sigmode, vxid.backendId); } return pid; *** a/src/backend/storage/ipc/procsignal.c --- b/src/backend/storage/ipc/procsignal.c *** *** 24,29 --- 24,30 #include storage/procsignal.h #include storage/shmem.h #include storage/sinval.h + #include storage/standby.h /* *** *** 258,262 procsignal_sigusr1_handler(SIGNAL_ARGS) --- 259,269 if (CheckProcSignal(PROCSIG_NOTIFY_INTERRUPT)) HandleNotifyInterrupt(); + if (CheckProcSignal(PROCSIG_CONFLICT_ERROR_INTERRUPT)) + HandleConflictInterrupt(ERROR); + + if (CheckProcSignal(PROCSIG_CONFLICT_FATAL_INTERRUPT)) + HandleConflictInterrupt(FATAL); + errno = save_errno; } *** a/src/backend/storage/ipc/standby.c --- b/src/backend/storage/ipc/standby.c *** *** 218,229 ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist, if (WaitExceedsMaxStandbyDelay()) { pid_t pid; /* * Now find out who to throw out of the balloon. */ Assert(VirtualTransactionIdIsValid(*waitlist)); ! pid = CancelVirtualTransaction(*waitlist, cancel_mode); if (pid != 0) { --- 218,235 if (WaitExceedsMaxStandbyDelay()) { pid_t pid; + ProcSignalReason sigmode; + + if (cancel_mode == CONFLICT_MODE_ERROR) + sigmode = PROCSIG_CONFLICT_ERROR_INTERRUPT; + else + sigmode = PROCSIG_CONFLICT_FATAL_INTERRUPT; /* * Now find out who to throw out of the balloon. */ Assert(VirtualTransactionIdIsValid(*waitlist)); ! pid = CancelVirtualTransaction(*waitlist, sigmode); if (pid != 0) {
Re: [HACKERS] uintptr_t for Datum
Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: Attached patch is the part of the win64 patch that changes Datum to be uintptr_t, and associated changes, with only very minor changes from me. It also includes autoconf tests that I tricked Bruce into fixing for me :-) Comments? This is a joke no? Where's the logic to provide a definition of intptr_t if the platform fails to? The lack of attention to updating the comments about Datum doesn't give me a warm feeling either. BTW, it looks like the patch is showing a manual change to pg_config.h.in. Don't do that. Run autoheader. I wasn't aware autoheader existed. Is that new or has it alwasy been part of autoconf? Attached is the diff for pg_config.h.in generated by autoheader. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: pg_config.h.in === RCS file: /cvsroot/pgsql/src/include/pg_config.h.in,v retrieving revision 1.143 diff -c -r1.143 pg_config.h.in *** pg_config.h.in 1 Oct 2009 01:58:58 - 1.143 --- pg_config.h.in 31 Dec 2009 17:35:33 - *** *** 236,241 --- 236,244 /* Define to 1 if the system has the type `int8'. */ #undef HAVE_INT8 + /* Define to 1 if the system has the type `intptr_t'. */ + #undef HAVE_INTPTR_T + /* Define to 1 if you have the inttypes.h header file. */ #undef HAVE_INTTYPES_H *** *** 599,604 --- 602,610 /* Define to 1 if the system has the type `uint8'. */ #undef HAVE_UINT8 + /* Define to 1 if the system has the type `uintptr_t'. */ + #undef HAVE_UINTPTR_T + /* Define to 1 if the system has the type `union semun'. */ #undef HAVE_UNION_SEMUN *** *** 827,835 --- 833,849 #undef inline #endif + /* Define to the type of a signed integer type wide enough to hold a pointer, +if such a type exists, and if the system does not define it. */ + #undef intptr_t + /* Define to empty if the C compiler does not understand signed types. */ #undef signed + /* Define to the type of an unsigned integer type wide enough to hold a +pointer, if such a type exists, and if the system does not define it. */ + #undef uintptr_t + /* Define to empty if the keyword `volatile' does not work. Warning: valid code using `volatile' can become incorrect without. Disable with care. */ #undef volatile -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Status of plperl inter-sp calling
On Dec 30, 2009, at 2:54 PM, Tim Bunce wrote: That handles the arity of the calls and invokes the right SP, bypassing SQL if the SP is already loaded. Nice. That much works currently. Behind the scenes, when a stored procedure is loaded into plperl the code ref for the perl sub is stored in a cache. Effectively just $cache{$name}[$nargs] = $coderef; An SP::AUTOLOAD sub intercepts any SP::* call and effectively does lookup_sp($name, \...@_)-(@_); For SPs that are already loaded lookup_sp returns $cache{$name}[$nargs] so the overhead of the call is very small. Definite benefit, there. How does it handle the difference between IMMUTABLE | STABLE | VOLATILE, as well as STRICT functions? And what does it do if the function called is not actually a Perl function? For SPs that are not cached, lookup_sp returns a code ref of a closure that will invoke $name with the args in @_ via spi_exec_query(select * from $name($encoded_args)); The fallback-to-SQL behaviour neatly handles non-cached SPs (forcing them to be loaded and thus cached), and inter-language calling (both plperl-plperl and other PLs). Is there a way for such a function to be cached? If not, I'm not sure where cached functions come from. Limitations: * It's not meant to handle type polymorphism, only the number of args. Well, spi_exec_query() handles the type polymorphism. So might it be possible to call SP::function() and have it not use a cached query? That way, one gets the benefit of polymorphism. Maybe there's a SP package that does caching, and an SPI package that does not? (Better named, though.) * When invoked via SQL, because the SP isn't cached, all non-ref args are all expressed as strings via quote_nullable(). Any array refs are encoded as ARRAY[...] via encode_array_constructor(). Hrm. Why not use spi_prepare() and let spi_exec_prepared() handle the quoting? I don't see either of those as significant issues: If you need more control for a particular SP then don't use SP::* to call that SP. If there was a non-cached version that was essentially just sugar for the SPI stuff, I think that would be more predicable, no? I'm not saying there shouldn't be a cached interface, just that it should not be the first choice when using polymorphic functions and non-PL/Perl functions. Open issues: * What should SP::foo(...) return? The plain as-if-called-by-perl return value, or something closer to what spi_exec_query() returns? The former. * If the called SP::foo(...) calls return_next those rows are returned directly to the client. That can be construed as a feature. As a list? Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Serializable Isolation without blocking
On Thu, Dec 31, 2009 at 12:10 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Greg Stark wrote: Hm, this raises the issue that you'll have to figure out what should happen if two different transactions are using different isolation modes. Currently our two isolation modes only control behaviour within your transaction so they co-exist perfectly fine. ISTM you would need to have transactions in read-comitted and snapshot isolation modes recording what sets of records they read in order to be able to guarantee serializable transactions to make any guarantees as well. No, there is no requirement that serializable transactions serialize with weaker modes. The Cahill thesis addresses this point directly. Unless you can point out some flaw in his proofs, this is not an issue. [criticisms of hypothetical implementation techniques] There are no such proposals on the table, and the hypothetical techniques you mention seem unlikely to be ones I would use. The one and only issue I have on the table at the moment is to create a new lock method for SIREAD locks. I'd welcome any comments on that. If I get to the point of feeling comfortable with that, I'll move forward to other issues. Kevin, I think I understand why you're trying to break this down into manageable pieces, but I don't think it's really possible to have this conversation in isolation. If your question is Could it ever be acceptable to add a new lock mode? then I answer yes. And I am pretty confident that this will also be the consensus view. But if your question is Does it make sense to handle SIREAD locks as a new lock mode? then I answer I don't know, because I haven't seen the whole design yet. You just can't answer a question like this in isolation. It seems to me that the hard part of this problem is to describe the general mechanism by which conflicts will be detected, with specific references to the types of data structures that will be used to hold that information. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] uintptr_t for Datum
Bruce Momjian br...@momjian.us writes: Tom Lane wrote: BTW, it looks like the patch is showing a manual change to pg_config.h.in. Don't do that. Run autoheader. I wasn't aware autoheader existed. Is that new or has it alwasy been part of autoconf? It's always been there, or at least for many years. pg_config.h.in really ought to be thought of the same as configure: you don't edit it, you just generate it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re-enabling SET ROLE in security definer functions
-Original Message- From: Tom Lane [mailto:t...@sss.pgh.pa.us] Actually, I don't find that to be a given. Exactly what use-cases have you got that aren't solved as well or better by calling a SECURITY DEFINER function owned by the target role? Oh, that's easy: If you want to do the equivalent of setreuid(geteuid(), getuid()); that is, if you want to drop privileges for a particular operation. Our particular use case is that we want to evaluate an expression provided by the caller but with the caller's privileges. Cheers, --Ian -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Status of plperl inter-sp calling
On Wed, Dec 30, 2009 at 7:41 PM, David E. Wheeler da...@kineticode.com wrote: On Dec 30, 2009, at 4:17 PM, Robert Haas wrote: That much works currently. Behind the scenes, when a stored procedure is loaded into plperl the code ref for the perl sub is stored in a cache. Effectively just $cache{$name}[$nargs] = $coderef; That doesn't seem like enough to guarantee that you've got the right function. What if you have two functions with the same number of arguments but different argument types? And what about optional arguments, variable arguments, etc.? As Tim said elsewhere: I don't see either of those as significant issues: If you need more control for a particular SP then don't use SP::* to call that SP. Sorry, I missed that. I guess it seems weird to me to handle overloading, but only partially. If we're OK with punting, why not punt the whole thing and just have $cache{$name} = $coderef? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Serializable Isolation without blocking
Robert Haas wrote: It seems to me that the hard part of this problem is to describe the general mechanism by which conflicts will be detected, with specific references to the types of data structures that will be used to hold that information. Well, the general approach to tracking SIREAD locks I had in mind is to keep them in the existing lock data structures. I have in mind to use multiple granularities, with automatic escalation to coarser granularities at thresholds, to keep RAM usage reasonable. There are clearly some tough problems with the pluggable indexes, types, operators, and such that will take time to sort out an acceptable implementation at any fine-grained level, so my intent it to punt those to very coarse granularity in the first pass, with XXX SIREAD optimization opportunity comments where that's not a production- quality solution or it just seems likely that we can do better with some work. I didn't want to get too detailed before I checked that creating a new lock method for this seemed sane, since the details of the implementation depend on that choice. Lack of detail tends to draw accusations of hand-waving, so I was trying to stay away from those details until my intuition on the lock method was confirmed or shot down, so I could solidify those details before presenting them. There is a bit of a chicken and egg problem with moving this forward -- I guess I was overly conservative on what I presented. I do understand that this does mean that more RAM will need to be allocated to the lock structures to support serializable mode. I don't think that any other option is likely to provide acceptable performance. I also realize that this means that in the final form, optimized to where my shop considers it usable, there will still be coarser granularity than theoretically possible and resulting false positives causing serialization failures for which the cause is obscure. We don't care, and anyone who does will probably not want to use this isolation level. Traditional S2PL doesn't have that fault, but it blocks so badly that performance is worse; I'll take the transaction restarts over that any day. I know there are others who won't. Basically, the reasons given for having separate lock methods for DEFAULT (normal) locks and USER locks seem to apply with almost as much force to SIREAD locks (no blocking between them, different source of setting, different lifespans), so I was pretty sure this was a sane choice, but I just wanted a quick reality check before developing the level of detail that would move this past hand-waving. Other than the SIREAD locks to cover predicate locking for serializable transactions, there is no change to what locks are acquired. There is no change to blocking or deadlock detection and recovery. Other transaction isolation levels do not need to change, except perhaps to fast-path a skip over blocking and deadlock checking against SIREAD locks (one of those details I'm looking at). Let me know if you need more information to firm up an opinion on the sanity of my intuition regarding the new lock method; I'm eager to move on to the next level of detail. And thanks for the feedback. :-) -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] uintptr_t for Datum
Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Tom Lane wrote: BTW, it looks like the patch is showing a manual change to pg_config.h.in. Don't do that. Run autoheader. I wasn't aware autoheader existed. Is that new or has it alwasy been part of autoconf? It's always been there, or at least for many years. pg_config.h.in really ought to be thought of the same as configure: you don't edit it, you just generate it. Well, that's pretty confusing considering it has a .in suffix, just like configure.in, which we do edit, but I get your point. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on statistics for continuously advancing columns
Tom Lane t...@sss.pgh.pa.us writes: Actually, in the problematic cases, it's interesting to consider the following strategy: when scalarineqsel notices that it's being asked for a range estimate that's outside the current histogram bounds, first try to obtain the actual current max() or min() of the column value --- this is something we can get fairly cheaply if there's a btree index on the column. If we can get it, plug it into the histogram, replacing the high or low bin boundary. Then estimate as we currently do. This would work reasonably well as long as re-analyzes happen at a time scale such that the histogram doesn't move much overall, ie, the number of insertions between analyzes isn't a lot compared to the number of rows per bin. We'd have some linear-in-the-bin-size estimation error because the modified last or first bin actually contains more rows than other bins, but it would certainly work a lot better than it does now. I know very little about statistics in general, but your proposal seems straigth enough for me to understand it, and looks good: +1. Regards, -- dim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on statistics for continuously advancing columns
On Wed, 2009-12-30 at 14:55 -0500, Tom Lane wrote: Actually, in the problematic cases, it's interesting to consider the following strategy: when scalarineqsel notices that it's being asked for a range estimate that's outside the current histogram bounds, first try to obtain the actual current max() or min() of the column value --- this is something we can get fairly cheaply if there's a btree index on the column. If we can get it, plug it into the histogram, replacing the high or low bin boundary. Then estimate as we currently do. This would work reasonably well as long as re-analyzes happen at a time scale such that the histogram doesn't move much overall, ie, the number of insertions between analyzes isn't a lot compared to the number of rows per bin. We'd have some linear-in-the-bin-size estimation error because the modified last or first bin actually contains more rows than other bins, but it would certainly work a lot better than it does now. Histograms often move quickly, but they seldom change shape. Why not get both max() and min(), then rebase the histogram according to those values. That way the histogram can still move significantly and the technique will still work. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re-enabling SET ROLE in security definer functions
Turner, Ian wrote: -Original Message- From: Tom Lane [mailto:t...@sss.pgh.pa.us] Actually, I don't find that to be a given. Exactly what use-cases have you got that aren't solved as well or better by calling a SECURITY DEFINER function owned by the target role? Oh, that's easy: If you want to do the equivalent of setreuid(geteuid(), getuid()); that is, if you want to drop privileges for a particular operation. Our particular use case is that we want to evaluate an expression provided by the caller but with the caller's privileges. Now *that's* what we should focus on. That's a reasonable use case, but it doesn't seem like SET ROLE quite cuts it. For starters, wouldn't it be possible for the caller's expression to call SET ROLE or RESET ROLE to regain the privileges? You could write a user-defined C function that does the same that VACUUM/ANALYZE etc. do (now that we've fixed the vulnerabilities). Ie. something like: GetUserIdAndSecContext(save_userid, save_sec_context); SetUserIdAndSecContext(userid with less privileges, save_sec_context | SECURITY_RESTRICTED_OPERATION); call function /* Restore userid and security context */ SetUserIdAndSecContext(save_userid, save_sec_context); No modifications to the server code required. Another question is, could we provide some built-in support for dropping privileges like this? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re-enabling SET ROLE in security definer functions
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: Turner, Ian wrote: From: Tom Lane [mailto:t...@sss.pgh.pa.us] Actually, I don't find that to be a given. Exactly what use-cases have you got that aren't solved as well or better by calling a SECURITY DEFINER function owned by the target role? Oh, that's easy: If you want to do the equivalent of setreuid(geteuid(), getuid()); that is, if you want to drop privileges for a particular operation. Our particular use case is that we want to evaluate an expression provided by the caller but with the caller's privileges. Now *that's* what we should focus on. That's a reasonable use case, but it doesn't seem like SET ROLE quite cuts it. Exactly. If that's what you want, we can talk about it, but *SET ROLE doesn't solve that problem*. In fact, a security definer function is a lot closer to solving that problem than SET ROLE is. The premise of SET ROLE is that you can always get to any role that the session user could get to, so it doesn't give up permissions in any non-subvertible fashion. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re-enabling SET ROLE in security definer functions
-Original Message- From: Tom Lane [mailto:t...@sss.pgh.pa.us] Exactly. If that's what you want, we can talk about it, but *SET ROLE doesn't solve that problem*. In fact, a security definer function is a lot closer to solving that problem than SET ROLE is. The premise of SET ROLE is that you can always get to any role that the session user could get to, so it doesn't give up permissions in any non-subvertible fashion. For our purposes, SET ROLE is adequate, because the expression can't contain function calls. But there are alternative: We could create an in-transaction SECURITY DEFINER procedure which executes the expression, then drop the procedure before committing. A built-in feature for doing something like what Heikki suggests could be even more useful. Cheers, --Ian -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Serializable Isolation without blocking
On Thu, Dec 31, 2009 at 1:43 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Haas wrote: It seems to me that the hard part of this problem is to describe the general mechanism by which conflicts will be detected, with specific references to the types of data structures that will be used to hold that information. Well, the general approach to tracking SIREAD locks I had in mind is to keep them in the existing lock data structures. I have in mind to use multiple granularities, with automatic escalation to coarser granularities at thresholds, to keep RAM usage reasonable. OK. I think it will become more clear whether the existing lock data structures are adequate as you move into detailed design. It doesn't seem critical to make a final decision about that right now. One bad thing about using the existing lock structures is that they are entirely in shared memory, which limits how large they can be. If you should find out that you're going to need more work space than can be conveniently accommodated in shared memory, you will have to think about other options. But I don't know for sure whether that will be the case. The fact that the locks need to be kept around until transactions other than the owner commit is certainly going to drive the size up. There are clearly some tough problems with the pluggable indexes, types, operators, and such that will take time to sort out an acceptable implementation at any fine-grained level, so my intent it to punt those to very coarse granularity in the first pass, with XXX SIREAD optimization opportunity comments where that's not a production- quality solution or it just seems likely that we can do better with some work. It seems to me that if you lock the heap (either individual rows, or the whole thing) none of that stuff really matters. It might defeat future optimizations such as index-only scans in some cases, and it might create full-table locks in situations where a more intelligent implementation might use less than a full-table lock, but those may be (probably are) prices you are willing to pay. As an overall design comment, I sometimes find that it helps to create a working implementation of something, even if I know that the performance will suck or that the result will not be committable for other reasons. There is often value to that just in terms of getting your head around the parts of the code that need to be modified. I wonder if you couldn't start with something ridiculously poor, like maybe an S2PL implementation with only table-level granularity - just make any operation that reads or writes a table grab an ACCESS EXCLUSIVE lock until transaction commit. Convince yourself that it is CORRECT - forget performance. Then either change the locks to SIREAD, or try to weaken the locks to row-level in certain cases. Then do the other one. It'll take you a while before you have something that can seriously be considered for commit, but that's not the point. The point is you'll have working code that you can fool with. And use git so you can keep merging up to CVS HEAD easily. And thanks for the feedback. :-) Sure thing. :-) ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on statistics for continuously advancing columns
Simon Riggs si...@2ndquadrant.com writes: Why not get both max() and min(), then rebase the histogram according to those values. That way the histogram can still move significantly and the technique will still work. Define rebase, keeping in mind that this has to work on datatypes that we don't have a distance metric for. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re-enabling SET ROLE in security definer functions
Turner, Ian ian.tur...@deshaw.com writes: From: Tom Lane [mailto:t...@sss.pgh.pa.us] Exactly. If that's what you want, we can talk about it, but *SET ROLE doesn't solve that problem*. In fact, a security definer function is a lot closer to solving that problem than SET ROLE is. The premise of SET ROLE is that you can always get to any role that the session user could get to, so it doesn't give up permissions in any non-subvertible fashion. For our purposes, SET ROLE is adequate, because the expression can't contain function calls. Really? What can it contain, and how are you enforcing that? Even more to the point, if you have managed to restrict it to the point where there's no possibility of someone executing a SET ROLE, why do you need any permissions switch at all? That's isomorphic to claiming that it won't execute any SQL command at all, in which case you needn't worry about changing permissions. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re-enabling SET ROLE in security definer functions
-Original Message- From: Tom Lane [mailto:t...@sss.pgh.pa.us] Really? What can it contain, and how are you enforcing that? Anything except a function call. We look for non-keyword identifier followed by open parenthesis, which is probably excessively restrictive. I'd rather have something less kludgey, of course. This will also become a real nightmare if new identifier quoting approaches are introduced. Even more to the point, if you have managed to restrict it to the point where there's no possibility of someone executing a SET ROLE, why do you need any permissions switch at all? We don't want to have to check access privileges on every object referenced by the statement, which (as I'm sure you're aware) can get real nasty real quick. That's isomorphic to claiming that it won't execute any SQL command at all, in which case you needn't worry about changing permissions. Not sure what you mean here, can you elaborate? --Ian -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re-enabling SET ROLE in security definer functions
Turner, Ian ian.tur...@deshaw.com writes: From: Tom Lane [mailto:t...@sss.pgh.pa.us] Really? What can it contain, and how are you enforcing that? Anything except a function call. We look for non-keyword identifier followed by open parenthesis, which is probably excessively restrictive. I'm afraid this is security by wishful thinking. Operators and casts, to name two obvious examples, can invoke user-defined code. Postgres is sufficiently extensible that preventing any execution of non-core code is nearly impossible, at least not without limiting things to the point of uselessness. Even more to the point, if you have managed to restrict it to the point where there's no possibility of someone executing a SET ROLE, why do you need any permissions switch at all? That's isomorphic to claiming that it won't execute any SQL command at all, in which case you needn't worry about changing permissions. Not sure what you mean here, can you elaborate? If you had a mechanism that ensured that the untrusted user couldn't execute SET ROLE (which you don't), they couldn't execute anything else either, and therefore the question of what permissions they're running with isn't really important. I agree that you have a problem to solve, but defining the problem as please can we have SET ROLE back is not going to lead you to a secure solution. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re-enabling SET ROLE in security definer functions
-Original Message- From: Tom Lane [mailto:t...@sss.pgh.pa.us] I agree that you have a problem to solve, but defining the problem as please can we have SET ROLE back is not going to lead you to a secure solution. Fair enough. Thanks for the analysis. --Ian -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Red-black tree for GIN
2009/11/23 Teodor Sigaev teo...@sigaev.ru: Hi there, attached is a patch, which contains implementation of a red-black tree, a self-balanced implementation of binary tree. The main goal of this patch is to improve creation time of GIN index in the corner cases. While creation, GIN collects data in memory in binary tree until it reach some limit and then it flush tree to disk. Some data could produces unbalanced binary tree, for example, sorted data, so the tree degenerates to the list with O(N^2) processing time (see thread http://archives.postgresql.org/pgsql-performance/2009-03/msg00340.php) ), which cause very slow index creation. Tom has fixed that by limiting depth of tree (committed to 8.3 and 8.4), but we found it's not enough and propose to use red-black tree, which is very good for skewed data and has almost the same performance for unsorted data, see http://www.sai.msu.su/~megera/wiki/2009-07-27 and http://www.sai.msu.su/~megera/wiki/2009-04-03 for more information. Implementation of red-black tree has several currently unused methods, but they will be used in next patches. I did a quick read-through of this, and one question that immediately occurred to me is that rbtree.c says that it is adopted from http://algolist.manual.ru/ds/rbtree.php. But I'm not sure what license that code is under, so I'm not sure whether it's OK for us to use it. My other question is as related to performance. Can you provide a test case that shows the performance improvement with this patch? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Serializable Isolation without blocking
On Thu, 2009-12-31 at 09:11 -0600, Kevin Grittner wrote: Yeah, that's why this is a two to four year project. I started a wiki page here: http://wiki.postgresql.org/wiki/Serializable I didn't add much content yet, but can participants in this discussion please try to organize the various issues as they progress? If there was an existing wiki page I couldn't find it. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
On Thu, Dec 31, 2009 at 11:12 AM, Andrew Dunstan and...@dunslane.net wrote: David E. Wheeler wrote: On Dec 31, 2009, at 1:04 AM, Peter Eisentraut wrote: I think the primary use will be to load a JSON value into Perl or Python and process it there. So a json type that doesn't have any interesting operators doesn't sound useless to me. The features I would like to get out of it are input validation and encoding handling and smooth integration with said languages. What about access to various parts of a JSON data structure? Or is that just asking for too much trouble up-front? IMNSHO it's essential. I think Peter's approach of ignoring this requirement is extremely shortsighted. I could go either way on this. As a practical matter, we probably shouldn't pick a library that is only a validator without any ability to manipulate the data structure. And as a further practical matter, that done, it's probably not that much work to expose whatever other functionality that library provides. But I would not go to the extent of saying that we should try to figure out from first principles what functionality we want to include and then make it a requirement that the chosen library must support all of those things. That seems like a recipe for failure... Anyhow, that brings me back to the question I asked upthread, which is Can/should we suck one of these libraries into our code base (and if so, which?) or do we need to add an analogue of --with-libxml so that we can link against an external library if present and omit the feature otherwise?. Does anyone have any real-world experience with any of the JSON C libraries? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on statistics for continuously advancing columns
On Thu, 2009-12-31 at 15:18 -0500, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: Why not get both max() and min(), then rebase the histogram according to those values. That way the histogram can still move significantly and the technique will still work. Define rebase, keeping in mind that this has to work on datatypes that we don't have a distance metric for. Make it work differently according to whether we have, or not, just as we do elsewhere with stats. No point in limiting ourselves to the lowest common denominator, especially when the common case is integer keys and time datatypes. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Serializable Isolation without blocking
Robert Haas wrote: OK. I think it will become more clear whether the existing lock data structures are adequate as you move into detailed design. I've gotten far enough in reviewing it to be pretty convinced that they'll cover all the granularities I'm likely to want unless I get to the point of wanting to try to lock individual columns within individual rows. It probably won't come to that, so I figure I'll cross that bridge if and when I come to it. As an overall design comment, I sometimes find that it helps to create a working implementation of something, even if I know that the performance will suck or that the result will not be committable for other reasons. There is often value to that just in terms of getting your head around the parts of the code that need to be modified. That's exactly where I've been trying to go at this point. I wonder if you couldn't start with something ridiculously poor, like maybe an S2PL implementation with only table-level granularity - just make any operation that reads or writes a table grab an ACCESS EXCLUSIVE lock until transaction commit. There's an idea I hadn't thought of -- doing S2PL but with ACCESS EXCLUSIVE locks for the read locks to test the predicate locking. It would let me play around with testing that phase before I moved to the next with minimal wasted effort. Convince yourself that it is CORRECT - forget performance. Then either change the locks to SIREAD, or try to weaken the locks to row-level in certain cases. Then do the other one. It'll take you a while before you have something that can seriously be considered for commit, but that's not the point. The point is you'll have working code that you can fool with. We're very much on the same page. My goal was to get predicate locking that didn't miss anything, even though it was ridiculously coarse, then implement the simplest possible SSI on top of it, without worrying about optimizations, then incrementally move toward production quality. I clearly didn't communicate that as well as I'd hoped. :-( Anyway, I'll think about the S2PL with ACCESS EXCLUSIVE locks for reads; if I can't punch any holes in that as a valid environment to test the predicate locking logic, I'll do that first, then switch them to SIREAD locks and work on the SSI logic. And use git so you can keep merging up to CVS HEAD easily. I know. It's on my list to do soon. Thanks again, -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A third lock method
I wrote: It's just a little bit of a stretch to call SILOCKs locks, because they don't actually block anything. They are used at various points to see where a transaction is reading data which has been modified by another transaction or vice versa. And they do need to be kept until all concurrent transactions have completed. Other than those quirks, they behave pretty much like read locks, though, so it seems to make sense to use the locking system for them. The differences are such that I thought a new lock method might be appropriate. This thread is to try to solicit opinions on whether that makes sense to anyone but me. :-) Once I sort out the subject issue, I'm about ready to try to start generating a very rough prototype of predicate locking. I don't want to start a discussion of those details on this thread, because it seems to me that a decision on the subject issue affects significant details about how I go about that. Based on feedback from Robert Haas on another thread, I think this thread should be considered wrapped. It seems to me like SIREAD locks should be handled by a different lock method, but before I go there I will probably initially develop and test the predicate locking logic by using ACCESS EXCLUSIVE locks for all reads, just to confirm correct coverage of the predicates. Thanks, all. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Serializable Isolation without blocking
Jeff Davis wrote: I started a wiki page here: http://wiki.postgresql.org/wiki/Serializable I'll try to get that filled in with something useful over the weekend. I'm heading to a party soon, and may not be in shape to work on it tomorrow ;-) Happy New Year, all! -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
Robert Haas wrote: Anyhow, that brings me back to the question I asked upthread, which is Can/should we suck one of these libraries into our code base (and if so, which?) or do we need to add an analogue of --with-libxml so that we can link against an external library if present and omit the feature otherwise?. Does anyone have any real-world experience with any of the JSON C libraries? I do not, but I see that YAJL http://lloyd.github.com/yajl/ is now in Fedora, and has a BSDish license, so maybe that's a good place to start. Maybe someone would like to try designing an API which could sit atop that. Then we would not need to speculate based on principle. I'd rather we use a library we can pull in like libxml than have to import the source and have to keep in sync with the upstream. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
Robert Haas robertmh...@gmail.com writes: Anyhow, that brings me back to the question I asked upthread, which is Can/should we suck one of these libraries into our code base (and if so, which?) or do we need to add an analogue of --with-libxml so that we can link against an external library if present and omit the feature otherwise?. Count me as -1 for sucking in any sizable amount of code for this. I do not wish to be on the hook to maintain something like that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Serializable Isolation without blocking
aaah... I think I see where we've gone off track in previous discussions...you think postgres keeps row level locks in a shared memory data structure. It doesn't it stores all row level locks *in* the tuple itself. It only stores the lock in memory briefly while actually acquiring the lock. Once it acquires it the only record of the lock is the xid in the tuple itself. This means there are no memory limits on the number of records locked by a transaction. storing the lock data in the tuples won't work for you at all because you need to lock rows that don't exist yet at all.that's why where to store the lock is a critical blocking issue to figure out to know whether the plan is feasible at all. On Thursday, December 31, 2009, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Haas wrote: It seems to me that the hard part of this problem is to describe the general mechanism by which conflicts will be detected, with specific references to the types of data structures that will be used to hold that information. Well, the general approach to tracking SIREAD locks I had in mind is to keep them in the existing lock data structures. I have in mind to use multiple granularities, with automatic escalation to coarser granularities at thresholds, to keep RAM usage reasonable. There are clearly some tough problems with the pluggable indexes, types, operators, and such that will take time to sort out an acceptable implementation at any fine-grained level, so my intent it to punt those to very coarse granularity in the first pass, with XXX SIREAD optimization opportunity comments where that's not a production- quality solution or it just seems likely that we can do better with some work. I didn't want to get too detailed before I checked that creating a new lock method for this seemed sane, since the details of the implementation depend on that choice. Lack of detail tends to draw accusations of hand-waving, so I was trying to stay away from those details until my intuition on the lock method was confirmed or shot down, so I could solidify those details before presenting them. There is a bit of a chicken and egg problem with moving this forward -- I guess I was overly conservative on what I presented. I do understand that this does mean that more RAM will need to be allocated to the lock structures to support serializable mode. I don't think that any other option is likely to provide acceptable performance. I also realize that this means that in the final form, optimized to where my shop considers it usable, there will still be coarser granularity than theoretically possible and resulting false positives causing serialization failures for which the cause is obscure. We don't care, and anyone who does will probably not want to use this isolation level. Traditional S2PL doesn't have that fault, but it blocks so badly that performance is worse; I'll take the transaction restarts over that any day. I know there are others who won't. Basically, the reasons given for having separate lock methods for DEFAULT (normal) locks and USER locks seem to apply with almost as much force to SIREAD locks (no blocking between them, different source of setting, different lifespans), so I was pretty sure this was a sane choice, but I just wanted a quick reality check before developing the level of detail that would move this past hand-waving. Other than the SIREAD locks to cover predicate locking for serializable transactions, there is no change to what locks are acquired. There is no change to blocking or deadlock detection and recovery. Other transaction isolation levels do not need to change, except perhaps to fast-path a skip over blocking and deadlock checking against SIREAD locks (one of those details I'm looking at). Let me know if you need more information to firm up an opinion on the sanity of my intuition regarding the new lock method; I'm eager to move on to the next level of detail. And thanks for the feedback. :-) -Kevin -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Group and Role Disagreement
Fellow Hackers, Given this SQL: BEGIN; CREATE ROLE foo WITH NOLOGIN; CREATE ROLE foo_bar WITH LOGIN PASSWORD '***' INHERIT IN ROLE foo; CREATE ROLE foo_baz WITH LOGIN PASSWORD '***' INHERIT IN ROLE foo; CREATE ROLE foo_yow WITH LOGIN PASSWORD '***' INHERIT IN ROLE foo, foo_bar, foo_baz; SELECT groname, array_agg(rolname) FROM pg_group JOIN pg_roles ON pg_roles.oid = ANY(grolist) WHERE groname IN ('foo', 'foo_bar', 'foo_baz', 'foo_yow') GROUP BY groname; SELECT r.rolname, ARRAY(SELECT b.rolname FROM pg_catalog.pg_auth_members m JOIN pg_catalog.pg_roles b ON (m.roleid = b.oid) WHERE m.member = r.oid) as memberof FROM pg_catalog.pg_roles r WHERE rolname IN ('foo', 'foo_bar', 'foo_baz', 'foo_yow'); ROLLBACK; The output is: BEGIN CREATE ROLE CREATE ROLE CREATE ROLE CREATE ROLE groname | array_agg -+--- foo | {foo_bar,foo_baz,foo_yow} (1 row) rolname | memberof -+--- foo | {} foo_bar | {foo} foo_baz | {foo} foo_yow | {foo,foo_bar,foo_baz} (4 rows) ROLLBACK My question is: why is the group membership of the foo_bar, foo_baz, and foo_yow roles not reflected in pg_group? Should it not have the same associations as pg_roles? A quick query shows that the only record in pg_group is for the foo group -- it doesn't even know that the foo_bar, foo_baz, and foo_yow roles also act as groups. Should it? Thanks, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] exec_execute_message crash
Tatsuo Ishii is...@postgresql.org writes: Done. Inclded are C test program along with modified fe-exec.c. The modification made to fe-exec.c is sending Sync after Parse, Bind and Describe. Pgpool-II does this in order to get current transaction status. I tried this but didn't have any luck crashing the backend. libpq gets tremendously confused by the extra ReadyForQuery responses, which is unsurprising. The postmaster log shows LOG: could not send data to client: Broken pipe ERROR: relation foo does not exist at character 15 STATEMENT: SELECT * FROM foo ERROR: unnamed prepared statement does not exist ERROR: current transaction is aborted, commands ignored until end of transaction block ERROR: current transaction is aborted, commands ignored until end of transaction block STATEMENT: SELECT NULL , n.nspname, ct.relname, a.attname, a.attnum, ci.relname FROM pg_catalog.pg_namespace n, pg_catalog.pg_class ct, pg_catalog.pg_class ci, pg_catalog.pg_attribute a, pg_catalog.pg_index i WHERE ct.oid=i.indrelid AND ci.oid=i.indexrelid AND a.attrelid=ci.oid AND i.indisprimary AND ct.relname = 'mst_Ucompany_feature_setting' AND ct.relnamespace = n.oid AND n.nspname = 'foo' ORDER BY 1, 2, 3 So the unnamed prepared statement does not exist bit seems to be related to what you are talking about, but it doesn't actually fail. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Group and Role Disagreement
David E. Wheeler da...@kineticode.com writes: My question is: why is the group membership of the foo_bar, foo_baz, and foo_yow roles not reflected in pg_group? Per the fine manual: The view pg_group exists for backwards compatibility: it emulates a catalog that existed in PostgreSQL before version 8.1. It shows the names and members of all roles that are marked as not rolcanlogin, which is an approximation to the set of roles that are being used as groups. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Group and Role Disagreement
On Dec 31, 2009, at 3:41 PM, Tom Lane wrote: My question is: why is the group membership of the foo_bar, foo_baz, and foo_yow roles not reflected in pg_group? Per the fine manual: The view pg_group exists for backwards compatibility: it emulates a catalog that existed in PostgreSQL before version 8.1. It shows the names and members of all roles that are marked as not rolcanlogin, which is an approximation to the set of roles that are being used as groups. Ah, hadn't noticed that, thanks for the pointer to TFM. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Serializable Isolation without blocking
Greg Stark wrote: aaah... I think I see where we've gone off track in previous discussions...you think postgres keeps row level locks in a shared memory data structure. It doesn't it stores all row level locks *in* the tuple itself. It only stores the lock in memory briefly while actually acquiring the lock. Once it acquires it the only record of the lock is the xid in the tuple itself. This means there are no memory limits on the number of records locked by a transaction. storing the lock data in the tuples won't work for you at all because you need to lock rows that don't exist yet at all.that's why where to store the lock is a critical blocking issue to figure out to know whether the plan is feasible at all. I'm probably not quite as clueless as you think on this; I realize that keeping SIREAD locks in memory will require many more slots for locks, escalation from tuple level to page or coarser when there are many on a table, or (most likely) both. Please have patience while I try to work out the details; until I get a bit farther, everyone will be spinning their wheels if we try to get too far into details -- it will all be speculation and/or hand-waving, with much mayhem to straw men. This much is fairly firm in my head, so I probably should share: What I do think is that the structures for regular locks seem usable to track the information I need without having to burden read operations with disk output, which I see as absolutely necessary for adequate performance. It also gives me somewhere to store locks related to search ranges where data doesn't currently exist, and the ability to store read locks from many transactions against the same data. An open question in my mind is whether I might need to keep write locks for serializable transactions in the shared memory lock hash table until commit or rollback; I rather suspect that I will, with similar granularity escalation policies. That's likely to raise a hue and cry, but like I've said before -- I won't try to force anybody to use this, and the structures involved are of reasonable size to allow using many of them. I suppose these more persistent write locks should be kept out of the DEFAULT lock method, too Thanks, and Happy New Year! -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Add hstore_to_json()
On Thu, Dec 31, 2009 at 5:37 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Anyhow, that brings me back to the question I asked upthread, which is Can/should we suck one of these libraries into our code base (and if so, which?) or do we need to add an analogue of --with-libxml so that we can link against an external library if present and omit the feature otherwise?. Count me as -1 for sucking in any sizable amount of code for this. I do not wish to be on the hook to maintain something like that. OK, that's why I ask these questions. :-) How much would be siz(e)able? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Serializable Isolation without blocking
On Thu, Dec 31, 2009 at 4:44 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: I wonder if you couldn't start with something ridiculously poor, like maybe an S2PL implementation with only table-level granularity - just make any operation that reads or writes a table grab an ACCESS EXCLUSIVE lock until transaction commit. There's an idea I hadn't thought of -- doing S2PL but with ACCESS EXCLUSIVE locks for the read locks to test the predicate locking. It would let me play around with testing that phase before I moved to the next with minimal wasted effort. What predicate locking? If you take ACCESS EXCLUSIVE locks on every read, that should serialize all access to every table. Predicate locking wouldn't do anything, because the table would be completely inaccessible to all competing transactions. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Serializable Isolation without blocking
Robert Haas wrote: What predicate locking? If you take ACCESS EXCLUSIVE locks on every read, that should serialize all access to every table. Predicate locking wouldn't do anything, because the table would be completely inaccessible to all competing transactions. Yeah, that's the benefit of starting with the ACCESS EXCLUSIVE locks, but once I've confirmed that I've found all the places to get the table level locks, the next step is to turn them into table level SIREAD locks, and then to implement the SSI. Locking against referenced objects is the only practical technique for implementing predicate locking for production environments that I've seen. The phase where I'm making each referenced table totally inaccessible to all competing transaction should be pretty short-lived. It just gives me an interim milestone to test that piece in isolation before going on to use it; which is great, but not a place to stop for long. Or have I totally misunderstood your suggestion? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Serializable Isolation without blocking
On Thu, Dec 31, 2009 at 7:45 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Haas wrote: What predicate locking? If you take ACCESS EXCLUSIVE locks on every read, that should serialize all access to every table. Predicate locking wouldn't do anything, because the table would be completely inaccessible to all competing transactions. Yeah, that's the benefit of starting with the ACCESS EXCLUSIVE locks, but once I've confirmed that I've found all the places to get the table level locks, the next step is to turn them into table level SIREAD locks, and then to implement the SSI. Locking against referenced objects is the only practical technique for implementing predicate locking for production environments that I've seen. The phase where I'm making each referenced table totally inaccessible to all competing transaction should be pretty short-lived. It just gives me an interim milestone to test that piece in isolation before going on to use it; which is great, but not a place to stop for long. Or have I totally misunderstood your suggestion? Nope, you're on target. Although - if I were you - I would post the ACCESS EXCLUSIVE lock version of the patch for feedback. I can't speak for anyone else, but I'll read it. (Just clearly label it as what it is, of course.) ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] about some parameters
Hi, Every time i explain what is the fsync parameter for, the next thing i always say is: never turn it off, and now that we have synchronous_commit there is no good reason for turn fsync off... so why are we still let it be in the postgresql.conf where it's so vulnerable to a misguided dba? even just not show it should be great. another parameter that is interesting is seq_page_cost, AFAIUI all the other cost parameters (in the planner section of course) are relative to this one. so what is the logic in allow changing it? going a little further, when we have this per tablespace what will mean to change a tablespace seq_page_cost? it will be relative to the general one? -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers