[HACKERS] proposal: tsearch dictionary initialization hook
Hello,

I propose a new hook type that helps control the life cycle of some tsearch dictionaries. The hook has minimal impact on performance: it is called once per session for a given tsearch configuration.

Regards
Pavel Stehule

ts_init_dict_hook.diff
Description: Binary data

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ps buffer is incorrectly padded on the (latest) OS X
On 04/09/10 22:41, Tom Lane wrote:
> I wrote:
>> I tried this on a PPC Mac running 10.4.11, which is the oldest Mac OS I have handy at the moment. It worked fine. The existing coding in ps_status.c dates from late 2001, which means that it was first tested against OS X 10.1, and most likely we have not rechecked the question of what PS_PADDING value to use since then. My guess is that Apple must have changed this in OS X 10.2 or 10.3, because the userland Unix utilities were pretty well settled after that.
>
> Just for the archives' sake: I dug through the OS X source code archives and confirmed that this behavior changed at 10.3: compare getproclline in 10.2.8
> http://www.opensource.apple.com/source/adv_cmds/adv_cmds-46/ps.tproj/print.c
> vs 10.3
> http://www.opensource.apple.com/source/adv_cmds/adv_cmds-63/ps.tproj/print.c
>
> So we don't need a version check unless you're worried about somebody trying to run Postgres 9.x on OS X 10.2 (which was retired in 2003).

What happens if someone does? Crash, or just wonky ps output? If it's the latter, seems safe to backpatch.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] git: uh-oh
On 05/09/10 03:55, Robert Haas wrote:
> On Sat, Sep 4, 2010 at 9:17 AM, Max Bowsher m...@f2s.com wrote:
>>> Can you post the repo you ended up with somewhere?
>>
>> Well, it's a Bazaar repository at the moment :-) But, I'll re-run it targeting git, and push it somewhere. github? anywhere better?
>
> No, that's fine.
>
>> I think we should start a git repository somewhere containing the precise conversion recipe - i.e.:
>> * cvs2git options file
>> * cvs2git invocation command line
>> * all scripts that massage the CVS repository before conversion, or the Git repository afterwards
>
> Yeah, that would be great.

For both, see http://github.com/maxb

Max.
[HACKERS] Re: Interruptible sleeps (was Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
On Fri, Sep 3, 2010 at 9:20 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
> So we now have the same process nested twice inside a semop() call. Looking at the Linux signal(7) man page, it is undefined what happens if semop() is re-entered in a signal handler while another semop() call is happening in main line of execution. Assuming it somehow works, which semop() call is going to return when the semaphore is incremented?

Fwiw I wouldn't be too worried about semop specifically. It's a syscall and will always return with EINTR. What makes functions async-unsafe is when they might do extra processing manipulating data structures in user space such as mallocing memory. POSIX seems to be giving semop license to do that but realistically I can't imagine any implementation doing so.

What I might wonder about is what happens if the signal is called just after the semop completes or just before it starts. And someone mentioned calling elog from within the signal handler -- doesn't elog call palloc() and sprintf? That can't be async-safe.

--
greg
Re: [HACKERS] lexing small ints as int2
On Sun, Sep 5, 2010 at 1:05 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> I am not too sure that the distinction between implicit casts and assignment casts is all that useful;
>
> We've been there and done that; it doesn't work. The current scheme was invented specifically because a two-way design didn't work.
>
> http://archives.postgresql.org/pgsql-hackers/2002-09/msg00900.php

Well, sure, if you remove the distinction between implicit and assignment casts *without doing anything else*, it's not going to work. But that's not what I proposed. And as Peter said in one of his responses:

> Finally, I believe this paints over the real problems, namely the inadequate and hardcoded type category preferences and the inadequate handling of numerical constants. Both of these issues have had adequate approaches proposed in the past and would solve this and a number of other issues.

I agree. We pride ourselves on having an extensible database product, but our current type system is fairly hostile to extension. The typispreferred stuff works OK for deciding between two types (which is not coincidentally the number of distinct values that can be represented by a Boolean column) but after that it breaks down pretty quickly. If you're adding specialized types to represent zoo animals or constellations or six-dimensional polyhedra, it works OK, but if you try to add additional stringy or numbery things, there are problems.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] Functional dependencies and GROUP BY
On 7 August 2010 03:51, Tom Lane t...@sss.pgh.pa.us wrote:
> Peter Eisentraut pete...@gmx.net writes:
>> Next version. Changed dependencies to pg_constraint, removed handling of unique constraints for now, and made some enhancements so that views track dependencies on constraints even in subqueries. Should be close to final now. :-)
>
> I've committed this with some revisions, notably:
>
> The view.c changes were fundamentally wrong. The right place to extract dependencies from a parsetree is in dependency.c, specifically find_expr_references_walker. The way you were doing it meant that dependencies on constraints would only be noticed for views, not for other cases such as stored plans.
>
> I rewrote the catalog search to look only at pg_constraint, not using pg_index at all. I think this will be easier to extend to the case of looking for UNIQUE + NOT NULL, whenever we get around to doing that. I also moved the search into catalog/pg_constraint.c, because it didn't seem to belong in parse_agg (as hinted by the large number of #include additions you had to make to put it there).
>
> I put in a bit of caching logic to prevent repeating the search for multiple Vars of the same relation. Tests or no tests, I can't believe that's going to be cheap enough that we want to repeat it over and over...
>
> regards, tom lane

I was testing out this feature this morning and discovered that the results may be non-deterministic if the PK is deferrable. I think that check_functional_grouping() should exclude any deferrable constraints, since in general, any inference based on a deferrable constraint can't be trusted when planning a query, since the constraint can't be guaranteed to be valid when the query is executed. That's also consistent with the SQL spec.

The original version of the patch had that check in it, but it vanished from the final committed version. Was that just an oversight, or an intentional change?

Regards,
Dean
Re: [HACKERS] ps buffer is incorrectly padded on the (latest) OS X
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
> On 04/09/10 22:41, Tom Lane wrote:
>> So we don't need a version check unless you're worried about somebody trying to run Postgres 9.x on OS X 10.2 (which was retired in 2003).
>
> What happens if someone does? Crash, or just wonky ps output? If it's the latter, seems safe to backpatch.

Wonky ps output. I don't recall exactly how wonky, but back in the day it looked better blank-padded.

regards, tom lane
Re: [HACKERS] Functional dependencies and GROUP BY
Dean Rasheed dean.a.rash...@gmail.com writes:
> On 7 August 2010 03:51, Tom Lane t...@sss.pgh.pa.us wrote:
> I was testing out this feature this morning and discovered that the results may be non-deterministic if the PK is deferrable.

Good point.

> The original version of the patch had that check in it, but it vanished from the final committed version. Was that just an oversight, or an intentional change?

I don't recall having thought about it one way or the other. What did the check look like?

regards, tom lane
Re: [HACKERS] Functional dependencies and GROUP BY
On 5 September 2010 16:15, Tom Lane t...@sss.pgh.pa.us wrote:
> Dean Rasheed dean.a.rash...@gmail.com writes:
>> On 7 August 2010 03:51, Tom Lane t...@sss.pgh.pa.us wrote:
>> I was testing out this feature this morning and discovered that the results may be non-deterministic if the PK is deferrable.
>
> Good point.
>
>> The original version of the patch had that check in it, but it vanished from the final committed version. Was that just an oversight, or an intentional change?
>
> I don't recall having thought about it one way or the other. What did the check look like?

Well, originally it was searching indexes rather than constraints, and funcdeps_check_pk() included the following check:

    if (!indexStruct->indisprimary || !indexStruct->indimmediate)
        continue;

Now it's looping over pg_constraint entries, so I guess anything with con->condeferrable == true should be ignored.

Regards,
Dean
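The filter proposed above can be sketched in a few lines. This is only an illustration, not the committed PostgreSQL code: the Constraint class is a hypothetical stand-in for a pg_constraint row, and usable_primary_key() and grouping_is_functional() are made-up helper names. Only the column names (conname, contype, condeferrable, conkey) are taken from the catalog.

```python
from dataclasses import dataclass


@dataclass
class Constraint:
    """Hypothetical stand-in for a pg_constraint row (only the fields used here)."""
    conname: str
    contype: str         # 'p' = primary key, 'u' = unique, 'c' = check, ...
    condeferrable: bool  # True if the constraint was declared DEFERRABLE
    conkey: tuple        # column numbers covered by the constraint


def usable_primary_key(constraints):
    """Return the first primary key that can be trusted at plan time, or None."""
    for con in constraints:
        if con.contype != "p":
            continue  # only primary keys are considered
        if con.condeferrable:
            continue  # a deferrable PK may be violated while the query runs
        return con
    return None


def grouping_is_functional(constraints, group_by_columns):
    """True if the GROUP BY columns include every column of a usable primary key."""
    pk = usable_primary_key(constraints)
    return pk is not None and set(pk.conkey) <= set(group_by_columns)
```

With this filter, a table whose only primary key is deferrable is treated as if it had no primary key at all for the purposes of the GROUP BY check.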
Re: [HACKERS] Functional dependencies and GROUP BY
Dean Rasheed dean.a.rash...@gmail.com writes:
> On 5 September 2010 16:15, Tom Lane t...@sss.pgh.pa.us wrote:
>> I don't recall having thought about it one way or the other. What did the check look like?
>
> Well, originally it was searching indexes rather than constraints, and funcdeps_check_pk() included the following check:
>
>     if (!indexStruct->indisprimary || !indexStruct->indimmediate)
>         continue;
>
> Now it's looping over pg_constraint entries, so I guess anything with con->condeferrable == true should be ignored.

Seems reasonable, will fix. Thanks for the report!

regards, tom lane
Re: [HACKERS] Streaming a base backup from master
On Sat, Sep 04, 2010 at 02:42:40PM +0100, Greg Stark wrote:
> On Fri, Sep 3, 2010 at 8:30 PM, Martijn van Oosterhout klep...@svana.org wrote:
>> rsync is not rocket science. All you need is for the receiving end to send a checksum for each block it has. The server side does the same checksum and for each block sends back same or new data.
>
> Well rsync is closer to rocket science than that. It does rolling checksums and can handle data being moved around, which vacuum does do so it's probably worthwhile.

Not sure. When vacuum moves rows around the chance that it will move rows as a block and that the line pointers will be the same is practically nil. I don't think rsync will pick up on blocks the size of a typical row. Vacuum changes the headers so you never have a copied block.

> *However* I think you're all headed in the wrong direction here. I don't think rsync is what anyone should be doing with their backups at all. It still requires scanning through *all* your data even if you've only changed a small percentage (which it seems is the use case you're concerned about) and it results in corrupting your backup while the rsync is in progress and having a window with no usable backup. You could address that with rsync --compare-dest but then you're back to needing space and i/o for whole backups every time even if you're only changing small parts of the database.

If you're working from a known good version of the database at some point, yes you are right you have more interesting options. If you don't you want something that will fix it.

Have a nice day,
--
Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/
Patriotism is when love of your own people comes first; nationalism, when hate for people other than your own comes first. - Charles de Gaulle
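The simple scheme described at the start of this exchange (the receiver sends one checksum per block, the sender answers "same" or new data for each block) can be sketched as follows. This is a toy illustration, not rsync's algorithm: there is no rolling checksum, the 8-byte block size and SHA-256 are arbitrary choices, and the function names are made up.

```python
import hashlib

BLOCK_SIZE = 8  # tiny block size for illustration only


def block_checksums(data, block_size=BLOCK_SIZE):
    """Receiver side: one checksum per fixed-size block of the old copy."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def build_delta(source, remote_checksums, block_size=BLOCK_SIZE):
    """Sender side: for each block, emit ("same", index) or ("data", bytes)."""
    delta = []
    for index, start in enumerate(range(0, len(source), block_size)):
        block = source[start:start + block_size]
        unchanged = (index < len(remote_checksums)
                     and hashlib.sha256(block).digest() == remote_checksums[index])
        delta.append(("same", index) if unchanged else ("data", block))
    return delta


def apply_delta(old, delta, block_size=BLOCK_SIZE):
    """Receiver side: rebuild the new copy from the old copy plus the delta."""
    out = bytearray()
    for kind, payload in delta:
        if kind == "same":
            out += old[payload * block_size:(payload + 1) * block_size]
        else:
            out += payload
    return bytes(out)
```

Because blocks are compared position by position, inserting a single byte at the front of the file shifts every block and forces all of them to be resent. Data that moves is the case rolling checksums are meant to handle, and the case the thread doubts will help for pages rewritten by vacuum.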
Re: Interruptible sleeps (was Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
On Fri, 2010-09-03 at 18:24 -0400, Tom Lane wrote:
> Now the HS case likewise appears to be set up so that the signal can only directly interrupt ProcWaitForSignal, so I think the core issue is whether any deadlock situations are possible. Given that this gets called from a low-level place like LockBufferForCleanup, I don't feel too comfortable about that.

LockBufferForCleanup is only ever called during recovery by heap_xlog_clean() or btree_xlog_vacuum(). The actions taken to replay a WAL record are independent of all other WAL records from a locking perspective, so replay of every WAL record starts with no LWLocks held by the startup process. LockBufferForCleanup is called early in the replay of a heap or btree cleanup record, so we can easily check that no other LWLocks are held while it is called.

> I certainly haven't seen any analysis or documentation of what locks can safely be held at that point. The deadlock checker only tries to take the LockMgr LWLocks, so extrapolating from whether it is safe to whether touching the ProcArrayLock is safe seems entirely unfounded.

So the startup process takes one LWLock, ProcArrayLock, and is not holding any other LWLock when it does. The deadlock checker attempts to get and hold all of the other lock partition locks. So the deadlock checker already does the thing you're saying might be dangerous, and the startup process doesn't.

The ProcArrayLock is only taken as a way of signaling other backends. If that is particularly unsafe we could redesign that aspect.

> It might be worth pointing out here that LockBufferForCleanup is already known to be a risk factor for undetected deadlocks, even without HS in the picture, because of the possibility of deadlocks involving a chain of both heavyweight locks and LWLocks. Whether HS makes it materially worse may be something that we need field experience to determine.

You may be right that it will be a problem.

The deadlock risk we're protecting against is a deadlock involving both normal locks and buffer pins. We're safer having this code than not having it, IMHO.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] returning multiple result sets from a stored procedure
On 04/09/10 17:16, Merlin Moncure wrote:
> Curious: is multiset handling as you see it supported by the current v3 protocol?

The manual says:

"The response to a SELECT query (or other queries that return row sets, such as EXPLAIN or SHOW) normally consists of RowDescription, zero or more DataRow messages, and then CommandComplete. COPY to or from the frontend invokes special protocol as described in Section 46.2.5. All other query types normally produce only a CommandComplete message. Since a query string could contain several queries (separated by semicolons), there might be several such response sequences before the backend finishes processing the query string. ReadyForQuery is issued when the entire string has been processed and the backend is ready to accept a new query string."

If multiple result sets from a procedure are returned just like multiple result sets from multiple queries, that's already covered by the protocol.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
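To illustrate the message flow quoted from the manual, here is a minimal sketch of how a client might split a simple-query response stream into one result per statement. It works on already-decoded (message name, payload) pairs rather than wire bytes; the message names come from the quoted text, while split_result_sets() and the dictionary layout are assumptions made for this example.

```python
def split_result_sets(messages):
    """Group a simple-query message stream into one response per statement.

    `messages` is an iterable of (message_name, payload) pairs.
    """
    results = []
    current = None
    for kind, payload in messages:
        if kind == "RowDescription":
            # Start of a row-returning response.
            current = {"columns": payload, "rows": [], "tag": None}
        elif kind == "DataRow":
            if current is None:
                raise ValueError("DataRow without a preceding RowDescription")
            current["rows"].append(payload)
        elif kind == "CommandComplete":
            if current is None:
                # Statement that returns no row set (e.g. UPDATE).
                current = {"columns": None, "rows": [], "tag": None}
            current["tag"] = payload
            results.append(current)
            current = None
        elif kind == "ReadyForQuery":
            break  # the whole query string has been processed
    return results
```

A client library that already loops like this can hand back several result sets from one query string. As the follow-ups in this thread point out, that covers only the simple query mode.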
Re: [HACKERS] returning multiple result sets from a stored procedure
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
> On 04/09/10 17:16, Merlin Moncure wrote:
>> Curious: is multiset handling as you see it supported by the current v3 protocol?
>
> If multiple result sets from a procedure are returned just like multiple result sets from multiple queries, that's already covered by the protocol.

Well, the protocol says you can do it, but it would likely require significant work to make clients deal with it sanely.

Also, the part of the protocol document Heikki is quoting is for the legacy simple query mode. We deliberately designed this behavior *out* of the extended query mode. So for example you couldn't use out-of-line parameters with such a feature, unless there's a protocol redesign.

regards, tom lane
Re: [HACKERS] returning multiple result sets from a stored procedure
On Fri, 2010-09-03 at 16:18 -0400, Tom Lane wrote:
> Part of the reason it's sat on TODO is lack of consensus about how such a feature ought to look/work; particularly since most of the discussion about it has considered that it'd go along with stored procedures executing outside of transactions.

It would probably be a mistake to tie these features together. They are tricky enough separately.
Re: [HACKERS] Functional dependencies and GROUP BY
On Sun, 2010-09-05 at 11:35 -0400, Tom Lane wrote:
> Dean Rasheed dean.a.rash...@gmail.com writes:
>> On 5 September 2010 16:15, Tom Lane t...@sss.pgh.pa.us wrote:
>>> I don't recall having thought about it one way or the other. What did the check look like?
>>
>> Well, originally it was searching indexes rather than constraints, and funcdeps_check_pk() included the following check:
>>
>>     if (!indexStruct->indisprimary || !indexStruct->indimmediate)
>>         continue;
>>
>> Now it's looping over pg_constraint entries, so I guess anything with con->condeferrable == true should be ignored.
>
> Seems reasonable, will fix. Thanks for the report!

Yes, the SQL standard explicitly requires the constraint in question to be immediate.
Re: [HACKERS] git: uh-oh
Max Bowsher m...@f2s.com writes:
> On 05/09/10 03:55, Robert Haas wrote:
>> Can you post the repo you ended up with somewhere?
>
> For both, see http://github.com/maxb

I took the trouble to run through a mechanical diff of this version's REL8_3_STABLE log history versus what I get from cvs2cl. Several cvs2cl bug fixes later :-(, I have a pretty darn close match. There are some discrepancies in what the two tools choose to regard as a single commit versus successive commits with the same log message, but that's probably OK.

The only real gripe I can find to make is that in the cases where a file is added to a back branch, the manufactured commit is invariably blamed on committer pgsql. Can't we arrange to blame it on the person who actually added the file? (I wonder whether this is related to the fact that the same commits have made-up timestamps, which we already griped about.)

regards, tom lane
Re: [HACKERS] returning multiple result sets from a stored procedure
On 9/5/2010 2:05 PM, Heikki Linnakangas wrote:
> On 04/09/10 17:16, Merlin Moncure wrote:
>> Curious: is multiset handling as you see it supported by the current v3 protocol?
>
> The manual says:
>
> "The response to a SELECT query (or other queries that return row sets, such as EXPLAIN or SHOW) normally consists of RowDescription, zero or more DataRow messages, and then CommandComplete. COPY to or from the frontend invokes special protocol as described in Section 46.2.5. All other query types normally produce only a CommandComplete message. Since a query string could contain several queries (separated by semicolons), there might be several such response sequences before the backend finishes processing the query string. ReadyForQuery is issued when the entire string has been processed and the backend is ready to accept a new query string."
>
> If multiple result sets from a procedure are returned just like multiple result sets from multiple queries, that's already covered by the protocol.

Just as a side note, libpqtypes can emulate this using composite arrays; a feature we abuse internally. It is actually the primary justification we had for developing that portion of libpqtypes; initially we stayed clear of arrays and composites.

    create table fork_t (fork_id, rev_id, size, block_ids int8[], ...)
    create table rev_t (rev_id, blah, blah, fork_t[]);

    /* this is my favorite part of libpqtypes */
    PGarray arr;
    PQgetf(result, tup_num, "%rev_t[]", field_num, &arr);

Now loop the array arr and getf(arr.res) for each rev_t, which allows you to getf each fork_t in the fork_t[], etc.

I *know* it is not pure multiset'n, but it sure gets the job done (in a completely different way, I know). However, I'm sure those reading this list can see the possibilities ;)

Andrew Chernow
eSilo, LLC.
Re: [HACKERS] git: uh-oh
Tom Lane wrote:
> Max Bowsher m...@f2s.com writes:
>> For both, see http://github.com/maxb
>
> [...]
> The only real gripe I can find to make is that in the cases where a file is added to a back branch, the manufactured commit is invariably blamed on committer pgsql. Can't we arrange to blame it on the person who actually added the file? (I wonder whether this is related to the fact that the same commits have made-up timestamps, which we already griped about.)

CVS does not record when a branch was created or by whom. If a git commit has to be created for such events, cvs2git attributes them to a configurable username, which Max has set to be pgsql. It chooses the latest possible timestamp that is consistent with other (timestamped) changesets that depend on it.

Does cvs2cl do something better? If so, how?

Michael
Re: [HACKERS] git: uh-oh
Michael Haggerty mhag...@alum.mit.edu writes:
> Tom Lane wrote:
>> [...]
>> The only real gripe I can find to make is that in the cases where a file is added to a back branch, the manufactured commit is invariably blamed on committer pgsql. Can't we arrange to blame it on the person who actually added the file? (I wonder whether this is related to the fact that the same commits have made-up timestamps, which we already griped about.)
>
> CVS does not record when a branch was created or by whom. If a git commit has to be created for such events, cvs2git attributes them to a configurable username, which Max has set to be pgsql. It chooses the latest possible timestamp that is consistent with other (timestamped) changesets that depend on it.
>
> Does cvs2cl do something better? If so, how?

I suspect what it's doing is attributing the branch creation to the user who makes the first commit on the branch for that file. In general I'd expect that to give a reasonable result --- better than choosing a guaranteed-to-be-wrong constant value anyway ;-)

regards, tom lane
Re: [HACKERS] git: uh-oh
Tom Lane wrote:
> Michael Haggerty mhag...@alum.mit.edu writes:
>> CVS does not record when a branch was created or by whom. If a git commit has to be created for such events, cvs2git attributes them to a configurable username, which Max has set to be pgsql. It chooses the latest possible timestamp that is consistent with other (timestamped) changesets that depend on it.
>>
>> Does cvs2cl do something better? If so, how?
>
> I suspect what it's doing is attributing the branch creation to the user who makes the first commit on the branch for that file. In general I'd expect that to give a reasonable result --- better than choosing a guaranteed-to-be-wrong constant value anyway ;-)

On the contrary, I prefer an obvious indication of "I don't know" to a value that might appear to be authoritative but is really just a guess. It could be that one user copied the file verbatim to the branch and a second user changed the file as part of an unrelated change.

The default value for these commits is "cvs2svn" (in your case "cvs2git" would probably be more appropriate), which I like because it makes it clearer than "pgsql" that the commit was generated as part of a conversion.

Michael
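The two attribution policies debated in this thread can be contrasted with a small sketch. Neither function is taken from cvs2git or cvs2cl: the commit dictionaries, the function names, and the SYNTHETIC_AUTHOR value are all hypothetical, and the second function is only the heuristic that cvs2cl is suspected to use.

```python
# Placeholder name; the point is only that it matches no real committer.
SYNTHETIC_AUTHOR = "cvs2git"


def attribute_constant(recorded_commits):
    """Policy 1: blame manufactured commits on a fixed, obviously synthetic name."""
    return SYNTHETIC_AUTHOR


def attribute_first_committer(recorded_commits):
    """Policy 2: guess that whoever made the first recorded commit on the
    branch for this file also added the file to the branch."""
    first = min(recorded_commits, key=lambda commit: commit["timestamp"])
    return first["author"]
```

In the scenario raised above, one user copies the file to the branch (an event CVS does not record) and a second user makes the first recorded change. Policy 2 then names the second user, which looks authoritative but is wrong, while policy 1 is visibly a placeholder.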
Re: [HACKERS] git: uh-oh
Michael Haggerty mhag...@alum.mit.edu writes:
> Tom Lane wrote:
>> I suspect what it's doing is attributing the branch creation to the user who makes the first commit on the branch for that file. In general I'd expect that to give a reasonable result --- better than choosing a guaranteed-to-be-wrong constant value anyway ;-)
>
> On the contrary, I prefer an obvious indication of "I don't know" to a value that might appear to be authoritative but is really just a guess. It could be that one user copied the file verbatim to the branch and a second user changed the file as part of an unrelated change.

Hm, I see.

> The default value for these commits is "cvs2svn" (in your case "cvs2git" would probably be more appropriate), which I like because it makes it clearer than "pgsql" that the commit was generated as part of a conversion.

If we can set it to a value different from any actual committer name, that would be a good thing to do.

regards, tom lane
Re: [HACKERS] string function - format function proposal
On Wed, Sep 1, 2010 at 1:29 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
>> * %v also doesn't quote boolean values, but t and f are not valid. You should use true and false (or 't' and 'f') for the cases.
>
> you have a true - it should be fixed

I found that quote_literal() prints boolean values as 'true' or 'false'. It uses casting to the text type rather than calling the output function. OTOH, the format functions (and concat funcs) use output functions.

Which should we use for such purposes? Consistent behavior is obviously preferred. Boolean might be the only type that is converted to a different representation by typoutput than by cast-to-text, but we should consider either having boolean-specific hardwired code, or casting all types to text instead of using output functions.

--
Itagaki Takahiro
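The inconsistency described above, and the "boolean-specific hardwired code" option, can be sketched as follows. This only models the two text forms mentioned in the message ('t'/'f' from the output function, 'true'/'false' from the cast to text). The function names and the special_case_bool flag are hypothetical and do not correspond to the actual patch.

```python
def bool_output_function(value):
    """Text form produced by the boolean output function: 't' or 'f'."""
    return "t" if value else "f"


def bool_cast_to_text(value):
    """Text form produced by casting boolean to text: 'true' or 'false'."""
    return "true" if value else "false"


def format_value(value, special_case_bool=True):
    """Render one argument for a format-style function (hypothetical helper).

    With special_case_bool=True, booleans are hardwired to the cast-to-text
    form so the result matches what quote_literal() prints; otherwise the
    output-function form is used, as for every other type.
    """
    if isinstance(value, bool):
        if special_case_bool:
            return bool_cast_to_text(value)
        return bool_output_function(value)
    return str(value)
```

The alternative raised in the message, casting every type to text instead of calling output functions, would remove the special case at the cost of changing the code path for all types.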