Re: [HACKERS] Single pass vacuum - take 1
On Mon, Jul 18, 2011 at 2:20 AM, Robert Haas wrote: > On Fri, Jul 15, 2011 at 7:23 AM, Simon Riggs wrote: >> My additional requests would be that we can easily tell which blocks >> have been modified like this, that we have a way to turn this off if >> we get bugs for next few releases, that we check it all works with Hot >> Standby just fine and that all the block inspection tools support it. > > To me, this seems like a bit too much of an internal change to justify > adding a GUC. I will be happy to remove it again when we have shown there are no bugs; getting this wrong is a data loss issue. > But it probably is a good idea to ping the authors of > the various block inspection tools -- does contrib/pageinspect care > about this sort of thing, or just the out-of-core stuff? I'm the author of the main sections of the pageinspect module, and yes, I care. :-) I want to be able to say easily and quickly "ahhh, that page was touched by the new code". As mentioned above, this code has potential for causing data loss and it is important to look at ways of diagnosing bugs. Pageinspect was written to allow us to confirm at a low level that HOT was working. > I'd also think that it would be a good idea to consider (most likely > as a separate patch) the idea you suggested upthread: skip phase 2 if > the number of dead tuples is insufficient to justify the cost of > scanning the indexes. I've been wondering if it might make sense to > couple that change with a decrease in vacuum_scale_factor - in effect, > vacuum more frequently, but don't scan the indexes every time. > > One possibly useful metric for benchmarking these sorts of changes is > to run a write workload for a while until the size of the tables > stabilize and then measure how big they are. If a given change causes > the table size to stabilize at a lower value, that suggests that the > change makes vacuum more effective at controlling bloat, which is > good. 
Conversely, if the change causes the table size to stabilize at > a higher value, that suggests that we've made vacuum less effective at > controlling bloat, which is bad. In fact, I'd venture to say that > anything that falls into the latter category is dead on arrival... -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reduced power consumption in WAL Writer process
On Mon, Jul 18, 2011 at 1:48 AM, Robert Haas wrote: > Might be good to start a new thread for each auxiliary process, or we > may get confused. +1 -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Error in PQsetvalue
Hello, Andrew. I hope you don't mind I've added this patch to CommitFest: https://commitfest.postgresql.org/action/patch_view?id=606 You wrote: AC> On 6/3/2011 10:26 PM, Andrew Chernow wrote: >> I disagree -- I think the fix is a one-liner. line 446: if (tup_num == res->ntups && !res->tuples[tup_num]) should just become if (tup_num == res->ntups) also the memset of the tuple slots when the slot array is expanded can be removed. (in addition, the tuple array expansion should really be abstracted, but that isn't strictly necessary here). >>> >>> All true. This is a cleaner fix to something that was in fact broken ;) You >>> want >> >> Attached a patch that fixes the OP's issue. PQsetvalue now uses pqAddTuple to >> grow the tuple table and has removed the remnants of an older idea that >> caused >> the bug. >> AC> Sorry, I attached the wrong patch. Here is the correct one. -- With best wishes, Pavel mailto:pa...@gf.microolap.com
Re: [HACKERS] fixing PQsetvalue()
Hello, Merlin. I hope it's OK that I've added Andrew's patch to CommitFest: https://commitfest.postgresql.org/action/patch_view?id=606 I did this because beta3 is already released, but nothing has been done about this bug. You wrote: MM> On Thu, Jun 23, 2011 at 7:54 AM, Andrew Chernow wrote: >>> you are creating as you iterate through. This behavior was >>> unnecessary in terms of what libpqtypes and friends needed and may (as >>> Tom suggested) come back to bite us at some point. As it turns out, >>> PQsetvalue's operation on results that weren't created via >>> PQmakeEmptyResult was totally busted because of the bug, so we have a >>> unique opportunity to tinker with libpq here: you could argue that you >>> >>> +1 >>> >>> Exactly at this moment I am thinking about using modifiable >>> (via PQsetvalue) PGresult instead of std::map in my C++ library >>> for store parameters for binding to executing command. >>> I am already designed how to implement it, and I supposed that >>> PQsetvalue is intended to work with any PGresult and not only >>> with those which has been created via PQmakeEmptyResult... >>> So, I am absolutely sure, that PQsetvalue should works with >>> any PGresult. >> >> All PGresults are created via PQmakeEmptyPGresult, including libpqtypes. >> Actually, libpqtypes calls PQcopyResult which calls PQmakeEmptyPGresult. >> >> It might be better to say a "server" result vs. a "client" result. >> Currently, PQsetvalue is broken when provided a "server" generated result. MM> er, right-- the divergence was in how the tuples were getting added -- MM> thanks for the clarification. essentially your attached patch fixes MM> that divergence. MM> merlin -- With best wishes, Pavel mailto:pa...@gf.microolap.com
Re: [HACKERS] Re: [COMMITTERS] pgsql: Add temp_file_limit GUC parameter to constrain temporary file sp
On sön, 2011-07-17 at 18:39 -0400, Bruce Momjian wrote: > Tom Lane wrote: > > Bruce Momjian writes: > > > Tom Lane wrote: > > >> Add temp_file_limit GUC parameter to constrain temporary file space > > >> usage. > > >> > > >> The limit is enforced against the total amount of temp file space used by > > >> each session. > > >> > > >> Mark Kirkwood, reviewed by Cédric Villemain and Tatsuo Ishii > > > > > Should we document that sessions that exceed this limit generate an > > > error? I don't see any mention of this in the docs. > > > > Huh? Seems like a waste of text to me. What else would happen? > > Well, if we exceed work_mem, we spill to temp disk. If we exceed temp > disk, we error out. Those different behaviors don't seem obvious to me. > The work_mem docs do mention it spills to disk though, so maybe it is > OK. Sounds like it would be good to document the behavior if the limit is exceeded in each case.
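The behavioral difference being debated can be summarized as a short postgresql.conf fragment. The values here are purely illustrative (not recommendations), and this assumes the usual GUC memory-unit suffixes; only the comments matter:

```
work_mem = 4MB          # a sort or hash exceeding this spills to temp files
temp_file_limit = 1GB   # a session exceeding this much temp file space errors out
```

The asymmetry the thread is pointing at: exceeding work_mem degrades gracefully (spill), while exceeding temp_file_limit aborts the query.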
[HACKERS] Reduced power consumption in autovacuum launcher process
On 18 July 2011 01:48, Robert Haas wrote: > Might be good to start a new thread for each auxiliary process, or we > may get confused. Right - I've done so in this reply. There isn't a problem as such with the AV Launcher patch - there's a problem with most or all remaining auxiliary processes that needs to be worked out once and for all. I refer to the generic signal handler/timeout invalidation issues already described. >> ISTM that these generic handlers ought to be handling this >> generically, and that there should be a Latch for just this purpose >> for each process within PGPROC. We already have this Latch in PGPROC: >> >> Latch waitLatch; /* allow us to wait for sync rep */ >> >> Maybe its purpose should be expanded to "current process Latch"? > > I think that could be a really good idea, though I haven't looked at > it closely enough to be sure. > >> Another concern is, what happens when we receive a signal, generically >> handled or otherwise, and have to SetLatch() to avoid time-out >> invalidation? Should we just live with a spurious >> AutoVacLauncherMain() iteration, or should we do something like check >> if the return value of WaitLatch indicates that we woke up due to a >> SetLatch() call, which must have been within a signal handler, and >> that we should therefore goto just before WaitLatch() and elide the >> spurious iteration? Given that we can expect some signals to occur >> relatively frequently, spurious iterations could be a real concern. > > Really? I suspect that it doesn't much matter exactly how many > machine language instructions we execute on each wake-up, within > reasonable bounds, of course. Maybe some testing is in order? There's only one way to get around the time-out invalidation problem that I'm aware of - call SetLatch() in the handler. I'd be happy to hear alternatives, but until we have an alternative, we're stuck managing this in each and every signal handler. 
Once we've had the latch set to handle this, and control returns to the auxiliary process loop, we now have to decide from within the auxiliary process if we can figure out that all that happened was a "required" wake-up, and thus we shouldn't really go through with another iteration. That, or we can simply do the iteration. I have my doubts that it is acceptable to wake up spuriously in response to routine events that there are generic handlers for. Maybe this needs to be decided on a case-by-case basis. > On another note, I might be inclined to write something like: > > if ((return_value_of_waitlatch & WL_POSTMASTER_DEATH) && !PostmasterIsAlive()) > proc_exit(1); > > ...so as to avoid calling that function unnecessarily on every iteration. Hmm. I'm not so sure. We're now relying on the return value of WaitLatch(), which isn't guaranteed to report all wake-up events (although I don't believe it would be a problem in this exact case). Previously, we called PostmasterIsAlive() once a second anyway, and that wasn't much of a problem. >> Incidentally, should I worry about the timeout long for WaitLatch() >> overflowing? > > How would that happen? struct timeval comprises two longs - one representing seconds, and the other representing the balance of microseconds. Previously, we didn't combine them into a single microsecond representation. Now, we do. There could perhaps be a very large "nap", as determined by launcher_determine_sleep(), so that the total number of microseconds passed to WaitLatch() would exceed the maximum long size that can be safely represented on some or all platforms. On most 32-bit machines, sizeof(long) == sizeof(int), which is just 4 bytes. (2^31) - 1 = 2,147,483,647 microseconds = only about 35 minutes. There are corner cases, such as if someone were to set autovacuum_naptime to something silly. 
-- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Reduced power consumption in autovacuum launcher process
On Jul 18, 2011, at 15:12, Peter Geoghegan wrote: > struct timeval is comprised of two longs - one representing seconds, > and the other represented the balance of microseconds. Previously, we > didn't combine them into a single microsecond representation. Now, we > do. I haven't actually looked at the code, so maybe I'm missing something obvious - but why don't we use a double precision float (double) instead of a long for the combined representation? best regards, Florian Pflug
Re: [HACKERS] Reduced power consumption in autovacuum launcher process
On Mon, Jul 18, 2011 at 9:12 AM, Peter Geoghegan wrote: >>> Another concern is, what happens when we receive a signal, generically >>> handled or otherwise, and have to SetLatch() to avoid time-out >>> invalidation? Should we just live with a spurious >>> AutoVacLauncherMain() iteration, or should we do something like check >>> if the return value of WaitLatch indicates that we woke up due to a >>> SetLatch() call, which must have been within a singal handler, and >>> that we should therefore goto just before WaitLatch() and elide the >>> spurious iteration? Given that we can expect some signals to occur >>> relatively frequently, spurious iterations could be a real concern. >> >> Really? I suspect that it doesn't much matter exactly how many >> machine language instructions we execute on each wake-up, within >> reasonable bounds, of course. Maybe some testing is in order? > > There's only one way to get around the time-out invalidation problem > that I'm aware of - call SetLatch() in the handler. I'd be happy to > hear alternatives, but until we have an alternative, we're stuck > managing this in each and every signal handler. > > Once we've had the latch set to handle this, and control returns to > the auxiliary process loop, we now have to decide from within the > auxiliary if we can figure out that all that happened was a "required" > wake-up, and thus we shouldn't really go through with another > iteration. That, or we can simply do the iteration. > > I have my doubts that it is acceptable to wake-up spuriously in > response to routine events that there are generic handlers for. Maybe > this needs to be decided on a case-by-case basis. I'm confused. If the process gets hit with a signal, it's already woken up, isn't it? 
Whatever system call it was blocked on may or may not get restarted depending on the platform and what the signal handler does, but from an OS perspective, the process has already been allocated a time slice and will run until either the time slice is exhausted or it again blocks. >> On another note, I might be inclined to write something like: >> >> if ((return_value_of_waitlatch & WL_POSTMASTER_DEATH) && >> !PostmasterIsAlive()) >> proc_exit(1); >> >> ...so as to avoid calling that function unnecessarily on every iteration. > > Hmm. I'm not so sure. We're now relying on the return value of > WaitLatch(), which isn't guaranteed to report all wake-up events > (although I don't believe it would be a problem in this exact case). > Previously, we called PostmasterIsAlive() once a second anyway, and > that wasn't much of a problem. Ah. OK. >>> Incidentally, should I worry about the timeout long for WaitLatch() >>> overflowing? >> >> How would that happen? > > struct timeval is comprised of two longs - one representing seconds, > and the other represented the balance of microseconds. Previously, we > didn't combine them into a single microsecond representation. Now, we > do. > > There could perhaps be a very large "nap", as determined by > launcher_determine_sleep(), so that the total number of microseconds > passed to WaitLatch() would exceed the maximum long size that can be > safely represented on some or all platforms. On most 32-bit machines, > sizeof(long) == sizeof(int), which is just 4 bytes. (2^31) - 1 = > 2,147,483,647 microseconds = only about 35 minutes. There are corner > cases, such as if someone were to set autovacuum_naptime to something > silly. OK. In that case, my feeling is "yes, you need to worry about that". I'm not sure exactly what the best solution is: we could either twiddle the WaitLatch interface some more, or restrict autovacuum_naptime to at most 30 minutes, or maybe there's some other option. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Reduced power consumption in autovacuum launcher process
Robert Haas writes: > On Mon, Jul 18, 2011 at 9:12 AM, Peter Geoghegan > wrote: >> There could perhaps be a very large "nap", as determined by >> launcher_determine_sleep(), so that the total number of microseconds >> passed to WaitLatch() would exceed the maximum long size that can be >> safely represented on some or all platforms. On most 32-bit machines, >> sizeof(long) == sizeof(int), which is just 4 bytes. (2^31) - 1 = >> 2,147,483,647 microseconds = only about 35 minutes. There are corner >> cases, such as if someone were to set autovacuum_naptime to something >> silly. > OK. In that case, my feeling is "yes, you need to worry about that". > I'm not sure exactly what the best solution is: we could either > twiddle the WaitLatch interface some more, or restrict > autovacuum_naptime to at most 30 minutes, or maybe there's some other > option. A wakeup once every half hour would surely not be an issue from a power consumption standpoint. However, I'm not sure I understand here: aren't we trying to remove the timeouts completely? regards, tom lane
Re: [HACKERS] Re: [COMMITTERS] pgsql: Add temp_file_limit GUC parameter to constrain temporary file sp
Bruce Momjian writes: > Tom Lane wrote: >> Huh? Seems like a waste of text to me. What else would happen? > Well, if we exceed work_mem, we spill to temp disk. If we exceed temp > disk, we error out. Those different behaviors don't seem obvious to me. [ shrug... ] Feel free to change it. regards, tom lane
Re: [HACKERS] proposal: a validator for configuration files
Robert Haas writes: > On Sun, Jul 17, 2011 at 12:59 AM, Tom Lane wrote: >> I agree custom_variable_classes is conceptually messy, but it's a >> reasonably lightweight compromise that gives some error checking without >> requiring a lot of possibly-irrelevant extensions to be loaded into >> every postgres process. > Hmm. Maybe what we need is a mechanism that allows the configuration > to be associated with a loadable module, and whenever that module is > loaded, we also load the associated configuration settings. This is > probably terrible syntax, but something like: > ALTER LOAD 'plpgsql' SET plpgsql.variable_conflict = 'whatever'; > AFAICS, that would remove the need to set variables in postgresql.conf > that can't be validated, and so we could just disallow it. No, that only fixes things for the case of setting a variable in the control file. It isn't useful for ALTER ROLE/DATABASE SET. And it still has the problem of forcing every process to load every extension, and as written it would also require the postmaster to read catalogs. > OTOH, maybe that's more trouble than can be justified by the size of > the problem. Yeah. regards, tom lane
Re: [HACKERS] Reduced power consumption in WAL Writer process
Simon Riggs writes: > On Sun, Jul 17, 2011 at 10:41 PM, Tom Lane wrote: >> Yeah, I agree with putting the actual SetLatch call after we release the >> lock ... but doesn't the calculation need to be done while we're still >> holding it? Otherwise it'd be using potentially-inconsistent values. > The calculation is just a heuristic, so doesn't need to be exactly > accurate. We need the latest values derived while holding the lock, > but we don't need to ensure they change while we make the calc. > The calc needs to say "if we are the ones who make the array more than > half full, then wake up the WALWriter". It might already be awake, it > might already be more than half full or it might even be less than > half full now - none of those cases are very important. Nonetheless, I think it'll be better to make the calculation and set a local bool variable (to tell whether to do SetLatch later) while we're in the midst of the advance-to-next-WAL-buffer code. Even if it's true that we would generally get an OK answer by reading the variables again after releasing the lock, that's highly unlikely to be a performance win, because of cache line contention effects (ie, having to wrest the line back from the next guy doing WALInsert, and then give it back to him afterwards). Once you release the lock, you should stop touching shared memory, period. regards, tom lane
Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE
On Thu, Jul 7, 2011 at 3:25 PM, Robert Haas wrote: > On Thu, Jul 7, 2011 at 3:21 PM, Noah Misch wrote: >> On Thu, Jul 07, 2011 at 03:06:46PM -0400, Robert Haas wrote: >>> On Thu, Jul 7, 2011 at 2:55 PM, Noah Misch wrote: >>> > CheckIndexCompatible() calls ComputeIndexAttrs() to resolve the new >>> > operator >>> > classes, collations and exclusion operators for each index column. It >>> > then >>> > checks those against the existing values for the same. I figured that was >>> > obvious enough, but do you want a new version noting that? >>> >>> I guess one question I had was... are we depending on the fact that >>> ComputeIndexAttrs() performs a bunch of internal sanity checks? Or >>> are we just expecting those to always pass, and we're going to examine >>> the outputs after the fact? >> >> Those checks can fail; consider an explicit operator class or collation that >> does not support the destination type. At that stage, we neither rely on >> those >> checks nor mind if they do fire. If we somehow miss the problem at that >> stage, >> DefineIndex() will detect it later. Likewise, if we hit an error in >> CheckIndexCompatible(), we would also hit it later in DefineIndex(). > > OK. Committed with minor comment and documentation changes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Reduced power consumption in autovacuum launcher process
On Mon, Jul 18, 2011 at 10:35 AM, Tom Lane wrote: > Robert Haas writes: >> On Mon, Jul 18, 2011 at 9:12 AM, Peter Geoghegan >> wrote: >>> There could perhaps be a very large "nap", as determined by >>> launcher_determine_sleep(), so that the total number of microseconds >>> passed to WaitLatch() would exceed the maximum long size that can be >>> safely represented on some or all platforms. On most 32-bit machines, >>> sizeof(long) == sizeof(int), which is just 4 bytes. (2^31) - 1 = >>> 2,147,483,647 microseconds = only about 35 minutes. There are corner >>> cases, such as if someone were to set autovacuum_naptime to something >>> silly. > >> OK. In that case, my feeling is "yes, you need to worry about that". >> I'm not sure exactly what the best solution is: we could either >> twiddle the WaitLatch interface some more, or restrict >> autovacuum_naptime to at most 30 minutes, or maybe there's some other >> option. > > A wakeup once every half hour would surely not be an issue from a power > consumption standpoint. However, I'm not sure I understand here: aren't > we trying to remove the timeouts completely? Well, in the case of the AV launcher, the issue is that the main loop is timed by definition, cf. autovacuum_naptime, and the WaitLatch() interface is apparently designed so that we can't sleep longer than 35 minutes. We can avoid waking every second simply to check whether the postmaster has died, but waking up every autovacuum_naptime seems de rigueur barring a major redesign of the mechanism. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [v9.1] sepgsql - userspace access vector cache
Hello KaiGai-san, I've been preparing to review this patch by reading both the pgsql-hackers history on sepgsql and the RHEL 6 guide on SELinux this weekend; today I installed GIT HEAD with --with-selinux on Scientific Linux 6 (developer installation), and so far almost everything is looking good. These things should probably be added to the 9.1beta3 documentation branch: 1) the line with for DBNAME in ... do postgres --single etc. lacks a -D argument and hence gives the error: postgres does not know where to find the server configuration file. 2) there is a dependency on objects outside of the postgresql installation tree in /etc/selinux/targeted/contexts/sepgsql_contexts, and that file has an error that is thrown when contrib/sepgsql is executed: /etc/selinux/targeted/contexts/sepgsql_contexts: line 33 has invalid object type db_blobs (same for db_language) I found your fix for the error on a forum on oss.tresys.com, but IMHO either contrib/sepgsql should mention that the dependency exists and that it might contain errors for (older) reference policies, or it should include a bug-free reference policy for sepgsql to replace the system-installed refpolicy with (and mention that in the install documentation). 3) sepgsql is currently a bit hard to find in the documentation. The www.postgresql.org website search doesn't find sepgsql, and selinux only refers to an old PostgreSQL redhat bug from 2005 on bugzilla.redhat.com. I had to manually remember it was a contrib module. Also, sepgsql isn't linked to from the SECURITY LABEL page. At the moment I'm unsure if I have seen all sepgsql-related sgml-based documentation. After fixing the refpolicy I proceeded with the contrib/sepgsql manual, with the goal of getting something easy done, like creating a top secret table like 'thisyearsbonusses' and a single user 'boss', and configuring sepgsql in such a way that only the boss can access the top secret table. 
I've read the contrib documentation and browsed the links at the bottom of the page, but until now I don't even have a clue how to proceed. Until I do, I don't feel it's appropriate for me to review the avc patch. Would you be willing to help me get a bit started? Specific questions are: 1) The contrib doc under DML permissions talks about 'db_table:select' etc. What are these things? They are not labels, since I do not see them listed in the output of 'select distinct label from pg_seclabel'. 2) The regression test label.sql introduces labels with types sepgsql_trusted_proc_exec_t, sepgsql_ro_table_t. My question is: where are these defined? What is their meaning? Can I define my own? 3) In the examples so far I've seen unconfined_u and system_u. Can I define my own? Thanks, Yeb Havinga On 2011-07-14 21:46, Kohei KaiGai wrote: Sorry, the syscache part was mixed into the contrib/sepgsql part in my previous post. Please see the attached revision. Although its functionality is simple enough (it just reduces the number of system-call invocations), its performance improvement is obvious. So, I hope someone will volunteer to review these patches. Thanks, 2011/7/11 Kohei KaiGai: I rebased the userspace access vector cache patch to the latest tree. I'll describe the background of this patch because this thread has not been active for more than a week. sepgsql asks the in-kernel selinux when it needs to make its access control decision, so it always causes system call invocations. However, the access control decision of selinux for a particular pair of security labels is consistent as long as its security policy is not reloaded. Thus, it is a good idea to cache recently used access control decisions in userspace. In addition, the current GetSecurityLabel() always opens the pg_seclabel catalog and scans it to fetch the security label of database objects, although it is a situation where we could utilize the syscache mechanism. The "uavc-syscache" patch adds a new SECLABELOID syscache. 
It also redefines pg_seclabel.provider as Name, instead of Text, according to the suggestion from Tom (to avoid a collation-conscious datatype). The "uavc-selinux-cache" patch adds a cache mechanism to contrib/sepgsql. Its internal API to communicate with selinux (sepgsql_check_perms) was replaced by the newer sepgsql_avc_check_perms, which checks the cached access control decision first, prior to system call invocations. The resulting performance improvement is obvious.

* Test 2. time to run 50,000 SELECTs from empty tables

 selinux |  SECLABELOID syscache
   avc   |  without  |   with
---------+-----------+-----------
 without | 185.59[s] | 178.38[s]
 with    |  23.58[s] |  21.79[s]

I strongly hope this patch (and security label support on shared objects) gets upstreamed in this commitfest, because it will serve as a basis for other upcoming features. Please volunteer to review it! Thanks, 2011/7/2 Kohei KaiGai: The attached patch re-defines pg_seclabel.provider as NameData, instead
Re: [HACKERS] Reduced power consumption in autovacuum launcher process
Robert Haas writes: > On Mon, Jul 18, 2011 at 10:35 AM, Tom Lane wrote: >> A wakeup once every half hour would surely not be an issue from a power >> consumption standpoint. However, I'm not sure I understand here: aren't >> we trying to remove the timeouts completely? > Well, in the case of the AV launcher, the issue is that the main loop > is timed by definition, cf. autovacuum_naptime, and the WaitLatch() > interface is apparently designed so that we can't sleep longer than 35 > minutes. Hmm. Well, it's not too late to rethink the WaitLatch API, if we think that that might be a significant limitation. Offhand I don't see that it is, though. regards, tom lane
Re: [HACKERS] FOR KEY LOCK foreign keys
"Kevin Grittner" wrote: > Noah Misch wrote: > >> With this patch in its final form, I have completed 180+ suite >> runs without a failure. > > The attached patch allows the tests to pass when > default_transaction_isolation is stricter than 'read committed'. Without these two patches the tests fail about one time out of three on my machine at the office at the 'read committed' transaction isolation level, and all the time at stricter levels. On my machine at home I haven't seen the failures at 'read committed'. I don't know if this is Intel (at work) versus AMD (at home) or what. With both Noah's patch and mine I haven't yet seen a failure in either environment, with a few dozen tries. -Kevin
Re: [HACKERS] [COMMITTERS] pgsql: Made ecpglib write double with a precision of 15 digits.
I wrote: > Michael Meskes writes: >> Made ecpglib write double with a precision of 15 digits. >> Patch originally by Akira Kurosawa . > I think you missed the 9.1 branch. ... not to mention that the buildfarm is now failing ecpg-check everywhere ... regards, tom lane
[HACKERS] Re: [COMMITTERS] pgsql: Made ecpglib write double with a precision of 15 digits.
On Mon, Jul 18, 2011 at 12:16:50PM -0400, Tom Lane wrote: > ... not to mention that the buildfarm is now failing ecpg-check > everywhere ... I seem to have messed up my local make check call, sorry, working on the fix. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org Jabber: michael.meskes at googlemail dot com VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
Re: [HACKERS] WIP: Fast GiST index build
Hi! New version of patch is attached. There are the following changes. 1) Since the proposed technique is not always a "fast" build, it was renamed everywhere in the patch to "buffering" build. 2) The parameter "buffering" now has 3 possible values: "yes", "no" and "auto". "auto" means automatic switching from a regular index build to a buffering one. Currently it just switches when the index size exceeds maintenance_work_mem. 3) Holding many buffers pinned is avoided. 4) Rebased against head. TODO: 1) Take care of ordered datasets in automatic switching. 2) Take care of concurrent backends in automatic switching. -- With best regards, Alexander Korotkov. Attachment: gist_fast_build-0.7.0.patch.gz (GNU Zip compressed data)
Re: [HACKERS] proposal: new contrib module plpgsql's embedded sql validator
Excerpts from Jim Nasby's message of dom jul 17 16:31:45 -0400 2011: > On a somewhat related note, I'd also really like to have the ability to parse > things like .sql files externally, to do things like LINT checking. We talked about this during PGCon. The idea that seemed to have consensus there was to export the parser similarly to how we build the ecpg parser, that is, with a set of perl scripts which take our gram.y as input and modify it to emit something different. What ecpg does with it is emit a different grammar, but it doesn't seem impossible to me to have it emit some sort of (lint) checker. I admit I haven't looked at the new Perl scripts we use in ecpg (they were rewritten in the 9.1 era). -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] patch for 9.2: enhanced errors
Hello Tom, Thank you for the review. I think your comments are clear and I'll respect them in the new version. There is only one issue that should be solved first. I introduced the non-standard diagnostics field "column_names", because it is not possible to get a "column_name" value for check constraints now. A correct implementation of the COLUMN_NAME field needs an explicit relation between pg_constraint and pg_attribute - maybe implemented as a new column in pg_constraint. Do you agree? Regards Pavel 2011/7/16 Tom Lane : > Pavel Stehule writes: >> I am sending an updated patch > > I looked over this patch a bit. I guess my main concern about it > is that the set of items to be reported seems to have been made up on > a whim. I think that we ought to follow the SQL standard, which has a > pretty clearly defined set of additional information items --- look at > the spec for the GET DIAGNOSTICS statement. (In SQL:2008, this is > section 23.1.) I don't feel that we need to > implement every field the standard calls for, at least not right away, > but we ought to have their list in mind. Conversely, implementing items > that *aren't* listed in the spec has to meet a considerably higher bar > than somebody just submitting a patch that does it. > > The standard information items that seem reasonable for us to implement > in the near future are > > COLUMN_NAME > CONSTRAINT_NAME > CONSTRAINT_SCHEMA > SCHEMA_NAME > TABLE_NAME > TRIGGER_NAME > TRIGGER_SCHEMA > > So I'd like to see the patch revised to use this terminology. We > probably also need to think a bit harder about the PG_DIAG_XXX code > letters to be used --- we're already just about at the limit of what > fields can have reasonably-mnemonic code letters, and not all of the > above have obvious choices, let alone the rest of what's in the spec > that we might someday want to implement. What assignment rule should > we use when we can't choose a mnemonic letter? > > Some other specific comments on the patch follow: > > 1. 
It's way short in the documentation department. protocol.sgml > certainly needs additions (see "Error and Notice Message Fields"), > also libpq.sgml's discussion of PQresultErrorField(), also > sources.sgml's "Reporting Errors Within the Server", and I'm not > sure where else. > ok > 2. I think you could drop the tuple-descriptor changes, because they're > only needed in service of an information item that is not found in the > standard and doesn't seem very necessary. The standard says to report > the name of the constraint, not what columns it involves. > > 3. errrel() is extremely poorly considered. The fact that it requires > utils/relcache.h to be #included by elog.h (and therefore by *every* > *single* *file* in the backend) is a symptom of that, but expecting > elog.c to do catalog lookups is as bad or worse from a modularity > standpoint. I think all the added elog functions should not take > anything higher-level than a C string. > > 4. Actually, it would probably be a good idea to avoid inventing a new > elog API function for each individual new information item; something > along the lines of "erritem(PG_DIAG_WHATEVER, string_value)" would be > more appropriate to cover the inevitable future expansions. > > 5. I don't think IndexRelationGetParentRelation is very appropriate > either --- in the use cases you have, the parent table's OID is easily > accessible, as is its namespace (which'll be the same as the index's) > and so you could just have the callers do get_rel_name(tableoid). > Doing a relcache open in an error reporting path seems like overkill. > > I'm going to mark this patch Returned With Feedback. > > regards, tom lane
Re: [HACKERS] patch: enhanced get diagnostics statement 2
Pavel Stehule writes: > 2011/7/14 Alvaro Herrera : >> Thanks ... I expect you're going to resubmit the patch based on David's >> last version and my comments? > yes, see the attachment Applied with some editorial adjustments. regards, tom lane
Re: [HACKERS] Single pass vacuum - take 1
On Mon, Jul 18, 2011 at 3:14 AM, Simon Riggs wrote: > On Mon, Jul 18, 2011 at 2:20 AM, Robert Haas > wrote: > > On Fri, Jul 15, 2011 at 7:23 AM, Simon Riggs > wrote: > >> My additional requests would be that we can easily tell which blocks > >> have been modified like this, that we have a way to turn this off if > >> we get bugs for next few releases, that we check it all works with Hot > >> Standby just fine and that all the block inspection tools support it. > > > > To me, this seems like a bit too much of an internal change to justify > > adding a GUC. > > I will be happy to remove it again when we have shown there are no > bugs. Getting this wrong is a data loss issue. Though I understand the fear of data loss, do we have much precedent for adding a GUC to control such a mechanism? Even for a complex feature like HOT we did not add any GUC to turn it off, and I don't think we missed it. So I would suggest we review the code and test the feature extensively and fix the bugs if any, but let's not add any GUC to turn it off. In fact, the code and algorithm itself is not that complex, and I suggest you take a look at the patch. > > But it probably is a good idea to ping the authors of > > the various block inspection tools -- does contrib/pageinspect care > > about this sort of thing, or just the out-of-core stuff? > > I'm the author of the main sections of the pageinspect module, and > yes, I care. :-) Yeah, we should probably teach pageinspect to see and report this additional information. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] patch for 9.2: enhanced errors
Pavel Stehule writes: > There is only one issue that should be solved first. I introduced a > non-standard diagnostics field "column_names", because it is not > possible to get a "column_name" value for check constraints now. A correct > implementation of the COLUMN_NAME field needs an explicit relation between > pg_constraint and pg_attribute - maybe implemented as a new column in > pg_constraint. Do you agree? No, I don't. You're adding complication to solve a problem that doesn't need to be solved. The standard says to return the name of the constraint for a constraint-violation failure. It does not say anything about naming the associated column(s). COLUMN_NAME is only supposed to be defined for certain kinds of errors, and this isn't one of them. regards, tom lane
Re: Initial Review: JSON contrib module was: Re: [HACKERS] Another swing at JSON
On Fri, Jul 15, 2011 at 3:56 PM, Joey Adams wrote: > On Mon, Jul 4, 2011 at 10:22 PM, Joseph Adams > wrote: >> I'll try to submit a revised patch within the next couple days. > > Sorry this is later than I said. > > I addressed the issues covered in the review. I also fixed a bug > where "\u0022" would become """, which is invalid JSON, causing an > assertion failure. > > However, I want to put this back into WIP for a number of reasons: > > * The current code accepts invalid surrogate pairs (e.g. > "\uD800\uD800"). The problem with accepting them is that it would be > inconsistent with PostgreSQL's Unicode support, and with the Unicode > standard itself. In my opinion: as long as the server encoding is > universal (i.e. UTF-8), decoding a JSON-encoded string should not fail > (barring data corruption and resource limitations). > > * I'd like to go ahead with the parser rewrite I mentioned earlier. > The new parser will be able to construct a parse tree when needed, and > it won't use those overkill parsing macros. > > * I recently learned that not all supported server encodings can be > converted to Unicode losslessly. The current code, on output, > converts non-ASCII characters to Unicode escapes under some > circumstances (see the comment above json_need_to_escape_unicode). > > I'm having a really hard time figuring out how the JSON module should > handle non-Unicode character sets. \u escapes in JSON literals > can be used to encode characters not available in the server encoding. > On the other hand, the server encoding can encode characters not > present in Unicode (see the third bullet point above). This means > JSON normalization and comparison (along with member lookup) are not > possible in general. I previously suggested that, instead of trying to implement JSON, you should just try to implement JSON-without-the-restriction-that-everything-must-be-UTF8. 
Most people are going to be using UTF-8 simply because it's the default, and if you forget about transcoding then ISTM that this all becomes a lot simpler. We don't, in general, have the ability to support data in multiple encodings inside PostgreSQL, and it seems to me that by trying to invent a mechanism for making that work as part of this patch, you are setting the bar for yourself awfully high. One thing to think about here is that transcoding between UTF-8 and the server encoding seems like the wrong thing all around. After all, the user does not want the data in the server encoding; they want it in their chosen client encoding. If you are transcoding between UTF-8 and the server encoding, then that suggests that there's some double-transcoding going on here, which creates additional opportunities for (1) inefficiency and (2) outright failure. I'm guessing that's because you're dealing with an interface that expects the internal representation of the datum on one side and the server encoding on the other side, which gets back to the point in the preceding paragraph. You'd probably need to revise that interface in order to make this really work the way it should, and that might be more than you want to get into. At any rate, it probably is a separate project from making JSON work. If in spite of the above you're bent on continuing down your present course, then it seems to me that you'd better make the on-disk representation UTF-8, with all \u escapes converted to the corresponding characters. If you hit an invalid surrogate pair, or a character that exists in the server encoding but not UTF-8, it's not a legal JSON object and you throw an error on input, just as you would for mismatched braces or similar. On output, you should probably just use \u to represent any unrepresentable characters - i.e. option 3 from your original list. That may be slow, but I think that it's not really worth devoting a lot of mental energy to this case. 
Most people are going to be using UTF-8 because that's the default, and those who are not shouldn't expect a data format built around UTF-8 to work perfectly in their environment, especially if they insist on using characters that are representable in only some of the encodings they are using. But, again, why not just forget about transcoding and define it as "JSON, if you happen to be using utf-8 as the server encoding, and otherwise some variant of JSON that uses the server encoding as its native format?". It seems to me that that would be a heck of a lot simpler and more reliable, and I'm not sure it's any less useful in practice. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
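[Editor's note: the invalid-surrogate-pair problem Joey raises ("\uD800\uD800") boils down to checking that every high-surrogate escape is immediately followed by a low-surrogate escape, and that no low surrogate appears on its own. A rough sketch of that check follows — in Python rather than the C of the actual patch; the function name is invented, and scanning escapes with a regex (ignoring escaped backslashes) is a simplification of what a real JSON lexer would do:]

```python
# Hypothetical sketch of surrogate-pair validation for \uXXXX escapes in a
# JSON string literal. Not the contrib module's actual code.
import re

ESCAPE_RE = re.compile(r'\\u([0-9a-fA-F]{4})')

def check_unicode_escapes(s: str) -> bool:
    """Return True if every \\uXXXX escape is valid: a high surrogate
    (D800-DBFF) must be immediately followed by a low surrogate (DC00-DFFF),
    and a low surrogate must never stand alone."""
    matches = list(ESCAPE_RE.finditer(s))
    i = 0
    while i < len(matches):
        code = int(matches[i].group(1), 16)
        if 0xD800 <= code <= 0xDBFF:                    # high surrogate
            nxt = matches[i + 1] if i + 1 < len(matches) else None
            # the partner escape must be textually adjacent
            if nxt is None or nxt.start() != matches[i].end():
                return False
            if not (0xDC00 <= int(nxt.group(1), 16) <= 0xDFFF):
                return False
            i += 2
        elif 0xDC00 <= code <= 0xDFFF:                  # lone low surrogate
            return False
        else:
            i += 1
    return True
```

With this check, the thread's example r'\uD800\uD800' is rejected (two high surrogates in a row), while a well-formed pair such as r'\uD834\uDD1E' passes.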
Re: [HACKERS] Reduced power consumption in autovacuum launcher process
On 18.07.2011 18:32, Tom Lane wrote:
> Robert Haas writes:
>> On Mon, Jul 18, 2011 at 10:35 AM, Tom Lane wrote:
>>> A wakeup once every half hour would surely not be an issue from a power consumption standpoint. However, I'm not sure I understand here: aren't we trying to remove the timeouts completely?
>> Well, in the case of the AV launcher, the issue is that the main loop is timed by definition, cf. autovacuum_naptime, and the WaitLatch() interface is apparently designed so that we can't sleep longer than 35 minutes.
> Hmm. Well, it's not too late to rethink the WaitLatch API, if we think that that might be a significant limitation.

Right, we can easily change the timeout argument to be in milliseconds instead of microseconds. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
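[Editor's note: the "35 minutes" figure is consistent with the timeout being a signed 32-bit value counted in microseconds (an inference on my part, not stated in the thread): INT_MAX microseconds is just under 35.8 minutes, while the same range counted in milliseconds stretches to nearly 25 days. A quick arithmetic check:]

```python
# Back-of-the-envelope check of the "can't sleep longer than 35 minutes"
# limit: a signed 32-bit timeout overflows at INT_MAX units.
INT_MAX = 2**31 - 1

max_minutes_us = INT_MAX / 1_000_000 / 60   # timeout counted in microseconds
max_days_ms = INT_MAX / 1_000 / 86_400      # timeout counted in milliseconds

print(f"max sleep (us timeout): {max_minutes_us:.1f} minutes")
print(f"max sleep (ms timeout): {max_days_ms:.1f} days")
```

This is why switching the argument to milliseconds would remove the practical ceiling for the autovacuum launcher, at the cost of the sub-millisecond granularity Tom worries about below.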
Re: [HACKERS] patch for 9.2: enhanced errors
Tom, > No, I don't. You're adding complication to solve a problem that doesn't > need to be solved. The standard says to return the name of the > constraint for a constraint-violation failure. It does not say anything > about naming the associated column(s). COLUMN_NAME is only supposed to > be defined for certain kinds of errors, and this isn't one of them. Are we talking about FK constraints here, or CHECK constraints? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] per-column generic option
On Mon, Jul 11, 2011 at 12:11 AM, Alvaro Herrera wrote: > Excerpts from Robert Haas's message of dom jul 10 21:21:19 -0400 2011: >> On Jul 9, 2011, at 10:49 PM, Alvaro Herrera >> wrote: >> > In short: in my opinion, attoptions and attfdwoptions need to be one >> > thing and the same. >> >> I feel the opposite. In particular, what happens when a future release of >> PostgreSQL adds an attoption that happens to have the same name as >> somebody's per-column FDW option? Something breaks, that's what... > > Hmm, if you follow my proposal above, that wouldn't actually happen, > because the core options do not apply to foreign columns. Well, not at the moment. But I think it's altogether likely that we might want them to in the future. The foreign data wrapper support we have right now is basically a stub until we get around to improving it, so we don't (for example) analyze foreign tables, which means that n_distinct is not relevant. But that's something we presumably want to change at some point. Eventually, I would anticipate that we'll have quite a few more column options and most will apply to both tables and foreign tables, so I'm not keen to bake in something that makes that potentially problematic. I think we should understand attoptions as things that modify the behavior of PostgreSQL, while attfdw/genoptions are there solely for the foreign data wrapper to use. An extra nullable field in pg_attribute isn't costing us anything non-trivial, and the syntactic and definitional clarity seems entirely worth it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Reduced power consumption in autovacuum launcher process
Heikki Linnakangas writes: > On 18.07.2011 18:32, Tom Lane wrote: >> Hmm. Well, it's not too late to rethink the WaitLatch API, if we think >> that that might be a significant limitation. > Right, we can easily change the timeout argument to be in milliseconds > instead of microseconds. On the whole I'd be more worried about giving up the shorter waits than the longer ones --- it's not too hard to imagine using submillisecond timeouts in the future, as machines get faster. If we really wanted to fix this, I think we need to move to a wider datatype. regards, tom lane
Re: Initial Review: JSON contrib module was: Re: [HACKERS] Another swing at JSON
Robert Haas writes: > On Fri, Jul 15, 2011 at 3:56 PM, Joey Adams > wrote: >> I'm having a really hard time figuring out how the JSON module should >> handle non-Unicode character sets. > But, again, why not just forget about transcoding and define it as > "JSON, if you happen to be using utf-8 as the server encoding, and > otherwise some variant of JSON that uses the server encoding as its > native format?". It seems to me that that would be a heck of a lot > simpler and more reliable, and I'm not sure it's any less useful in > practice. Right offhand, the only argument I can see against this is that we might someday want to pass JSON datums directly to third-party loadable libraries that are picky about UTF8-ness. But we could just document and/or enforce that such libraries can only be used in UTF8-encoded databases. BTW, could the \u problem be finessed by leaving such escapes in source form? regards, tom lane
Re: [HACKERS] patch for 9.2: enhanced errors
Josh Berkus writes: > Tom, >> No, I don't. You're adding complication to solve a problem that doesn't >> need to be solved. The standard says to return the name of the >> constraint for a constraint-violation failure. It does not say anything >> about naming the associated column(s). COLUMN_NAME is only supposed to >> be defined for certain kinds of errors, and this isn't one of them. > Are we talking about FK constraints here, or CHECK constraints? Either one. They both have the potential to reference more than one column, so if the committee had meant errors to try to identify the referenced columns, they'd have put something other than COLUMN_NAME into the standard. They didn't. regards, tom lane
Re: [HACKERS] per-column generic option
Robert Haas writes: > ... I think we should understand > attoptions as things that modify the behavior of PostgreSQL, while > attfdw/genoptions are there solely for the foreign data wrapper to > use. An extra nullable field in pg_attribute isn't costing us > anything non-trivial, and the syntactic and definitional clarity seems > entirely worth it. +1. We paid the price of allowing nullable columns in pg_attribute long ago. One more isn't going to cost anything, especially since virtually every row in that catalog already contains at least one null. I'm not too thrilled with the terminology of "generic options", though. I think this should be understood as specifically "FDW-owned options". If the column isn't reserved for the use of the FDW, then you get right back into the problem of who's allowed to use it and what if there's a collision. regards, tom lane
Re: [HACKERS] Reduced power consumption in autovacuum launcher process
On Mon, Jul 18, 2011 at 03:12:20PM -0400, Tom Lane wrote: > Heikki Linnakangas writes: > > On 18.07.2011 18:32, Tom Lane wrote: > >> Hmm. Well, it's not too late to rethink the WaitLatch API, if we think > >> that that might be a significant limitation. > > > Right, we can easily change the timeout argument to be in milliseconds > > instead of microseconds. > > On the whole I'd be more worried about giving up the shorter waits than > the longer ones --- it's not too hard to imagine using submillisecond > timeouts in the future, as machines get faster. If we really wanted to > fix this, I think we need to move to a wider datatype. > > regards, tom lane You could also tag the high bit to allow you to encode larger ranges by having microseconds for the values with the high bit = 0 and use milliseconds for the values with the high bit = 1. Then you could have the best of both without punching up the width of the datatype. Regards, Ken
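[Editor's note: Ken's high-bit tagging idea can be sketched as below. These helper names are invented purely for illustration; nothing like them exists in the actual WaitLatch API, and as Robert observes downthread the extra notational complexity may well outweigh the benefit. Note also that millisecond-tagged values silently lose sub-millisecond precision:]

```python
# Sketch of tagging the high bit of one 32-bit value: high bit clear means
# the remaining 31 bits count microseconds, high bit set means milliseconds.
# Invented for illustration of the idea only.
MS_FLAG = 1 << 31
MASK = MS_FLAG - 1          # low 31 bits

def encode_timeout_us(us: int) -> int:
    """Encode a microsecond timeout, falling back to millisecond
    granularity (high bit set) when the value won't fit in 31 bits."""
    if us <= MASK:
        return us                       # microseconds, high bit clear
    ms = us // 1000                     # drops sub-millisecond remainder
    if ms > MASK:
        raise ValueError("timeout too large even in milliseconds")
    return MS_FLAG | ms                 # milliseconds, high bit set

def decode_timeout_us(v: int) -> int:
    """Decode back to microseconds."""
    if v & MS_FLAG:
        return (v & MASK) * 1000
    return v
```

A 500 µs wait round-trips exactly, while an hour-long wait (3.6e9 µs, too large for 31 bits) is stored as 3,600,000 ms and decodes back to the same value.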
Re: [HACKERS] per-column generic option
On Mon, Jul 18, 2011 at 3:26 PM, Tom Lane wrote: > Robert Haas writes: >> ... I think we should understand >> attoptions as things that modify the behavior of PostgreSQL, while >> attfdw/genoptions are there solely for the foreign data wrapper to >> use. An extra nullable field in pg_attribute isn't costing us >> anything non-trivial, and the syntactic and definitional clarity seems >> entirely worth it. > > +1. We paid the price of allowing nullable columns in pg_attribute long > ago. One more isn't going to cost anything, especially since virtually > every row in that catalog already contains at least one null. > > I'm not too thrilled with the terminology of "generic options", though. > I think this should be understood as specifically "FDW-owned options". > If the column isn't reserved for the use of the FDW, then you get right > back into the problem of who's allowed to use it and what if there's a > collision. I concur. The SQL/MED standard is welcome to refer to them as generic options, but at least FTTB they are going to be entirely for FDWs in our implementation, and naming them that way is therefore a Good Thing. If the SQL committee decides to use them in other places and we choose to support that in some future release for some as-yet-unclear purpose, well, it won't be the first time we've modified the system catalog schema. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] patch for 9.2: enhanced errors
2011/7/18 Tom Lane : > Josh Berkus writes: >> Tom, >>> No, I don't. You're adding complication to solve a problem that doesn't >>> need to be solved. The standard says to return the name of the >>> constraint for a constraint-violation failure. It does not say anything >>> about naming the associated column(s). COLUMN_NAME is only supposed to >>> be defined for certain kinds of errors, and this isn't one of them. > >> Are we talking about FK constraints here, or CHECK constraints? > > Either one. They both have the potential to reference more than one > column, so if the committee had meant errors to try to identify the > referenced columns, they'd have put something other than COLUMN_NAME > into the standard. They didn't. Personally, I see a use for the COLUMN_NAME field only in relation to check constraints - for any other constraint, producing a COLUMN_NAME would require parsing the constraint rule, and I don't believe the standard intends that. A column check constraint is attached explicitly to one column, but this relationship should not be based on semantics. We could check DB2's implementation. Regards Pavel > > regards, tom lane
Re: Initial Review: JSON contrib module was: Re: [HACKERS] Another swing at JSON
On Mon, Jul 18, 2011 at 3:19 PM, Tom Lane wrote: > Robert Haas writes: >> On Fri, Jul 15, 2011 at 3:56 PM, Joey Adams >> wrote: >>> I'm having a really hard time figuring out how the JSON module should >>> handle non-Unicode character sets. > >> But, again, why not just forget about transcoding and define it as >> "JSON, if you happen to be using utf-8 as the server encoding, and >> otherwise some variant of JSON that uses the server encoding as its >> native format?". It seems to me that that would be a heck of a lot >> simpler and more reliable, and I'm not sure it's any less useful in >> practice. > > Right offhand, the only argument I can see against this is that we might > someday want to pass JSON datums directly to third-party loadable > libraries that are picky about UTF8-ness. But we could just document > and/or enforce that such libraries can only be used in UTF8-encoded > databases. Right. Or, if someone someday implements a feature to allow multiple server encodings within the same database, then we can tell people to use it if they want JSON to work in a spec-canonical way. > BTW, could the \u problem be finessed by leaving such escapes in > source form? Joey seems to want to canonicalize the input, which argues against that, and to detect invalid surrogate pairs, which likewise argues against that. You could argue that's overkill, but it seems to have at least some merit. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Reduced power consumption in autovacuum launcher process
On Mon, Jul 18, 2011 at 3:28 PM, k...@rice.edu wrote: > On Mon, Jul 18, 2011 at 03:12:20PM -0400, Tom Lane wrote: >> Heikki Linnakangas writes: >> > On 18.07.2011 18:32, Tom Lane wrote: >> >> Hmm. Well, it's not too late to rethink the WaitLatch API, if we think >> >> that that might be a significant limitation. >> >> > Right, we can easily change the timeout argument to be in milliseconds >> > instead of microseconds. >> >> On the whole I'd be more worried about giving up the shorter waits than >> the longer ones --- it's not too hard to imagine using submillisecond >> timeouts in the future, as machines get faster. If we really wanted to >> fix this, I think we need to move to a wider datatype. >> >> regards, tom lane >> > > You could also tag the high bit to allow you to encode larger ranges > by having microseconds for the values with the high bit = 0 and use > milliseconds for the values with the high bit = 1. Then you could > have the best of both without punching up the width of the datatype. Considering that we're just talking about a function call here, and not something that ever goes out to disk, that seems like entirely too much notational complexity. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] patch for 9.2: enhanced errors
Pavel Stehule writes: > 2011/7/18 Tom Lane : >>> Are we talking about FK constraints here, or CHECK constraints? >> Either one. They both have the potential to reference more than one >> column, so if the committee had meant errors to try to identify the >> referenced columns, they'd have put something other than COLUMN_NAME >> into the standard. They didn't. > Personally, I see a use for the COLUMN_NAME field only in relation to > check constraints - for any other constraint, producing a COLUMN_NAME > would require parsing the constraint rule, and I don't believe the > standard intends that. Read the standard. COLUMN_NAME is defined for use only in syntax_error_or_access_rule_violation errors that relate to a specific column. In fact, the spec is written as (SQL:2008 23.1 GR 4-h-ii): If the syntax error or access rule violation was for an inaccessible column, then the value of COLUMN_NAME is the <column name> of that column. Otherwise, the value of COLUMN_NAME is a zero-length string. which suggests that it might be meant *only* for use with INSUFFICIENT_PRIVILEGE errors that are thrown due to a column ACL. We can probably extend that to some other syntax errors, like unknown column or wrong datatype or what have you, but there is nothing here to suggest that we have to force the issue for errors that don't naturally relate to exactly one column. And CHECK constraints don't. Consider "CHECK (f1 > f2)". regards, tom lane
Re: [HACKERS] Commitfest Status: Sudden Death Overtime
On Fri, Jul 15, 2011 at 5:05 PM, Josh Berkus wrote: > As you can probably tell, we are not ready to end the commitfest. (I > think Robert gave me this CF to show why even talking about a one-week > mini-fest is a fantasy. If so, successful.). Heh, heh. Well, it wasn't that premeditated, but I'm always glad to reach a meeting of the minds. :-) > These are complex and need review by advanced hackers. They are also > often interdependent. As such, I'd like to automatically move his > patches to the next CF and ask that hackers who are engaged in working > on them continue to do so between CFs. There are only two patches left and I think we really ought to try to take a crack at doing something with them. Yeb is working on the userspace access vector cache patch, which I think is going to drag on longer than we want to keep the CommitFest open, but I'm OK with giving it a day or two to shake out. Aside from the benefit of reviewing the actual patch, I see that Yeb is pushing on KaiGai to do further work on the documentation, which I think is of great value. I will review the other patch - on shared object labels - RSN. > PATCHES WHICH I MOVED TO THE NEXT CF 9-2011: > > * Collect frequency statistics and selectivity estimation for arrays > * Allow multiple Postgres clusters running on the same machine to > distinguish themselves in the event log > * lazy vxid locks I'm not clear on your criteria for moving these patches, as they seem to be in somewhat different states. The first one is WIP, and it doesn't really matter whether you move it to the next CommitFest or just mark it returned with feedback. There is active development work going on there, so it's not going to get committed at this point. But the other two are basically just things we didn't get around to reviewing. 
With respect to the lazy vxid lock patch, it looks to me like Jeff has done a bit of review, and I think the real question here is whether or not we want to go ahead and make that change (with some stylistic and cosmetic corrections) or wait until we have a more complex solution to the spinlock contention problems mapped out. On a pgbench run with 8 clients on a 32-core machine, I see about a 2% speedup from that patch on pgbench -S, and it grows to 8% at 32 clients. At 80 clients (but still just a 32-core box), the results bounce around more, but taking the median of three five-minute runs, it's an 11% improvement. To me, that's enough to make it worth applying, especially considering that what is 11% on today's master is, in raw TPS, equivalent to maybe 30% of yesterday's master (prior to the fast relation lock patch being applied). More, it seems pretty clear that this is the conceptually right thing to do, even if it's going to require some work elsewhere to file down all the rough edges thus exposed. If someone objects to that, then OK, we should talk about that: but so far I don't think anyone has expressed strong opposition: in which case I'd like to fix it up and get it in. With respect to the event-log patch, MauMau has resubmitted that to the next CommitFest, so that's fine as far as it goes. But it might make sense to see if we can twist Magnus's arm into re-reviewing it now while it's fresh in his mind, rather than waiting another two or three months. > PATCHES WHICH HAVE BEEN UPDATED AND NEED FURTHER REVIEW: > > * Cascade Replication No update. > PATCHES STILL UNDER ACTIVE DISCUSSION: > > * Add ability to constrain backend temporary file space Now committed, by Tom. > PATCHES I CAN'T FIGURE OUT THE STATUS OF: > > * pg_comments system view I reviewed this and marked it Returned with Feedback. > * Bugfix for XPATH() if expression returns a scalar value No update. > So, down to a fairly limited list, except that we need a lot of committing. 
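Robert's point that the same absolute TPS gain reads as ~11% against today's master but ~30% against the pre-fast-relation-lock baseline is just arithmetic over medians. A small sketch, with hypothetical TPS figures chosen only to show the shape of the calculation (these are not the actual pgbench numbers):

```python
from statistics import median

def pct_speedup(base_tps, patched_tps):
    """Percentage improvement of patched over base."""
    return 100.0 * (patched_tps - base_tps) / base_tps

# Taking the median of three runs smooths out the noise seen at high
# client counts.
base_runs = [10000, 10300, 10100]     # hypothetical TPS, current master
patched_runs = [11200, 11000, 11500]  # hypothetical TPS, with lazy vxid locks

base, patched = median(base_runs), median(patched_runs)
gain = patched - base                  # absolute TPS gained by the patch

print(round(pct_speedup(base, patched), 1))   # 10.9 -- ~11% on today's master

# The same absolute gain measured against a lower baseline (e.g. before
# the fast relation lock patch raised throughput) is a larger percentage.
old_base = 4000                        # hypothetical pre-patch TPS
print(round(pct_speedup(old_base, old_base + gain), 1))   # 27.5
```

The moral is that percentage speedups compound: any comparison across patches has to be made against a common baseline, or in raw TPS.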
We're down to three patches that fall into this category: two XML patches by Florian, which have been somewhat derailed by an argument about whether values of type xml should, in fact, be required to be valid xml (I think you know my vote on this one...); and an error-reporting patch that Tom weighed in on over the weekend. This last suffers from the issue that it's not quite clear whether Tom is going to do the work or whether he's expecting the submitter to do it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch for 9.2: enhanced errors
2011/7/18 Tom Lane : > Pavel Stehule writes: >> 2011/7/18 Tom Lane : Are we talking about FK constraints here, or CHECK constraints? > >>> Either one. They both have the potential to reference more than one >>> column, so if the committee had meant errors to try to identify the >>> referenced columns, they'd have put something other than COLUMN_NAME >>> into the standard. They didn't. > >> Personally, I see a sense for COLUMN_NAME field only with relation to >> CHECK_CONSTRAINT - for any other constraint using a COLUMN_NAME is >> based on parsing a constraint rule - and I don't believe so the >> standard is based in it. > > Read the standard. COLUMN_NAME is defined for use only in > syntax_error_or_access_rule_violation errors that relate to a specific > column. In fact, the spec is written as (SQL:2008 23.1 GR 4-h-ii): > > If the syntax error or access rule violation was for an inaccessible > column, then the value of COLUMN_NAME is the <column name> of that > column. Otherwise, the value of COLUMN_NAME is a zero-length string. > > which suggests that it might be meant *only* for use with > INSUFFICIENT_PRIVILEGE errors that are thrown due to a column ACL. > We can probably extend that to some other syntax errors, like unknown > column or wrong datatype or what have you, but there is nothing here to > suggest that we have to force the issue for errors that don't naturally > relate to exactly one column. And CHECK constraints don't. Consider > "CHECK (f1 > f2)". > OK, this is relatively clean. But then, for example, do NOT NULL or DOMAIN constraints not affect COLUMN_NAME? These constraints have no name. regards Pavel > regards, tom lane > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest Status: Sudden Death Overtime
On Fri, Jul 15, 2011 at 10:05 PM, Josh Berkus wrote: > PATCHES WHICH HAVE BEEN UPDATED AND NEED FURTHER REVIEW: > > * Cascade Replication Sorry, too busy reviewing to take note of this. I expect to commit when it's tested and ready. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest Status: Sudden Death Overtime
Robert Haas writes: > We're down to three patches that fall into this category: two XML > patches by Florian, which have been somewhat derailed by an argument > about whether values of type xml should, in fact, be required to be > valid xml (I think you know my vote on this one...); Yeah, it's not clear to me either which of those are still reasonable candidates for committing as-is. > and an > error-reporting patch that Tom weighed in on over the weekend. This > last suffers from the issue that it's not quite clear whether Tom is > going to do the work or whether he's expecting the submitter to do it. If you mean the business about allowing GUCs in postgresql.conf to be applied even if there are semantic errors elsewhere, I'm just as happy to let Alexey or Florian have a go at it first, if they want. The real question at the moment is do we have consensus about changing that? Because if we do, the submitted patch is certainly not something to commit as-is, and should be marked Returned With Feedback. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.1] sepgsql - userspace access vector cache
Yeb, Thanks for volunteering. 2011/7/18 Yeb Havinga : > Hello KaiGai-san, > > I've been preparing to review this patch by reading both pgsql-hackers > history on sepgsql, and also the RHEL 6 guide on SELinux this weekend, today > I installed GIT HEAD with --with-selinux on Scientific Linux 6, developer > installation, so far almost everything looking good. > Scientific Linux 6 is not suitable, because its libselinux version is a bit older than this patch expects (libselinux-2.0.99 or later). My recommendation is Fedora 15 instead. > 1) the line with for DBNAME in ... do postgres --single etc, lacks a -D > argument and hence gives the error: > postgres does not know where to find the server configuration file. > OK. I intended users to adjust their own paths (including the -D option), but an explicit "-D /path/to/database" seems to me more helpful as an example. I'll submit a patch in a separate thread. > 2) there is a dependency to objects outside of the postgresql installation > tree in /etc/selinux/targeted/contexts/sepgsql_contexts, and that file has > an error that is thrown when contrib/sepgsql is executed: > /etc/selinux/targeted/contexts/sepgsql_contexts: line 33 has invalid object > type db_blobs > (same for db_language) > I found your fix for the error on a forum on oss.tresys.com, but IMHO either > the contrib/sepgsql should mention that the dependency exists and it might > contain errors for (older) reference policies, or it should include a > bugfree reference policy for sepgsql to replace the system installed > refpolicy with (and mention that in the install documentation). > It is not an error, but just a notification to inform users that the sepgsql_contexts file contains invalid lines. It is harmless, so we can ignore them. I don't think sepgsql.sgml should mention this noise, because it comes purely from problems in libselinux and refpolicy; these are external packages from PostgreSQL's point of view.
> 3) sepgsql is currently a bit hard to find in the documentation. > www.postgresql.org website search doesn't find sepgsql and selinux only > refers to an old PostgreSQL redhat bug in 2005 on bugzilla.redhat.com. I had > to manually remember it was a contrib module. Also sepgsql isn't linked to > at the SECURITY LABEL page. At the moment I'm unsure if I have seen all > sepgsql related sgml-based documentation. > Improvement of the documentation is an issue. wiki.postgresql.org might be an appropriate place, maybe. The reason why SECURITY LABEL does not point to sepgsql.sgml is that it is general-purpose infrastructure for all upcoming label-based security mechanisms, not only sepgsql. It was our consensus in v9.1 development. > After fixing the refpolicy I proceeded with the contrib/sepgsql manual, with > the goal to get something easy done, like create a top secret table like > 'thisyearsbonusses', and a single user 'boss' and configure sepgsql in such > a way that only the boss can access the top secret table. I've read the > contrib documentation, browsed links on the bottom of the page but until now > I don't even have a clue how to proceed. Until I do so, I don't feel it's > appropriate for me to review the avc patch. > At least, you don't need to fix the policy stuff at all. The point of this patch is the replacement of the existing mechanism (which always asks the in-kernel SELinux via system-call invocation) with a smart caching mechanism (which requires a minimal number of system-call invocations) without any user-visible changes. So, it is not necessary to define a new policy for testing. > Would you be willing to help me getting a bit started? Specific questions > are: > > 1) The contrib doc under DML permissions talks about 'db_table:select' etc? > What are these things? They are not labels since I do not see them listed in > the output of 'select distinct label from pg_seclabel'.
> The security label is something like user-id or ownership/object-acl in the default database access controls. It checks a relationship between the user-id and the ownership/object-acl of the target object. If this relationship allows particular actions like 'select', 'update' or others, they shall be allowed when the user requests those actions. In a similar way, 'db_table:select' is a type of action; 'select' on a table object, not an identifier of a user or object. SELinux defines a set of allowed actions (such as 'db_table:select') between a particular pair of security labels (such as 'staff_t' and 'sepgsql_table_t'). pg_seclabel holds only the security label of the object being referenced. So, you should look at /selinux/class/db_*/perms for the list of permissions defined in the security policy (but only a limited number of them are in use now). > 2) The regression test label.sql introduces labels with types > sepgsql_trusted_proc_exec_t, sepgsql_ro_table_t. My question is: where are > these defined? What is their meaning? Can I define my own? > The system's default security policy (selinux-policy package) defines all the
Re: [HACKERS] [v9.1] sepgsql - userspace access vector cache
On Mon, Jul 18, 2011 at 4:21 PM, Kohei KaiGai wrote: >> 3) sepgsql is currently a bit hard to find in the documentation. >> www.postgresql.org website search doesn't find sepgsql and selinux only >> refers to an old PostgreSQL redhat bug in 2005 on bugzilla.redhat.com. I had >> to manually remember it was a contrib module. Also sepgsql isn't linked to >> at the SECURITY LABEL page. At the moment I'm unsure if I have seen all >> sepgsql related sgml-based documentation. >> > Improvement of documentation is an issue. > The wiki.postgresql.org should be an appropriate place, maybe. > > The reason why SECURITY LABEL does not point to sepgsql.sgml is > that it is general purpose infrastructure for all the upcoming label based > security mechanism, not only sepgsql. > It was our consensus in v9.1 development. Actually, I think it's that way mostly because we committed the SECURITY LABEL stuff first. I'd be in favor of adding some kind of cross-link. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch for 9.2: enhanced errors
Pavel Stehule writes: > so for example, NULL or DOMAIN constraints doesn't affect a > COLUMN_NAME? These constraints has no name. Well, the executor's NOT NULL tests could certainly be extended to emit COLUMN_NAME --- I don't see any logical or implementation problem with that, even if it seems to be outside the scope of what the standard says to use the field for. But let's not get into modifying the system catalogs to produce error fields that are not required by the standard. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.1] sepgsql - userspace access vector cache
2011/7/18 Robert Haas : > On Mon, Jul 18, 2011 at 4:21 PM, Kohei KaiGai wrote: >>> 3) sepgsql is currently a bit hard to find in the documentation. >>> www.postgresql.org website search doesn't find sepgsql and selinux only >>> refers to an old PostgreSQL redhat bug in 2005 on bugzilla.redhat.com. I had >>> to manually remember it was a contrib module. Also sepgsql isn't linked to >>> at the SECURITY LABEL page. At the moment I'm unsure if I have seen all >>> sepgsql related sgml-based documentation. >>> >> Improvement of documentation is an issue. >> The wiki.postgresql.org should be an appropriate place, maybe. >> >> The reason why SECURITY LABEL does not point to sepgsql.sgml is >> that it is general purpose infrastructure for all the upcoming label based >> security mechanism, not only sepgsql. >> It was our consensus in v9.1 development. > > Actually, I think it's that way mostly because we committed the > SECURITY LABEL stuff first. I'd be in favor of adding some kind of > cross-link. > Hmm. OK, I'll submit a patch to update this documentation stuff tomorrow, including an explicit -D option as Yeb mentioned. Thanks, -- KaiGai Kohei -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch for 9.2: enhanced errors
Tom, > Either one. They both have the potential to reference more than one > column, so if the committee had meant errors to try to identify the > referenced columns, they'd have put something other than COLUMN_NAME > into the standard. They didn't. I'm less concerned about the standard here and more concerned about what helps our users. Having column names for an FK error is *extremely* useful for troubleshooting, particularly if the database has been upgraded from the 7.4 days and has non-useful FK names like "$3". I agree that column names for CHECK constraints is a bit of a tar baby, since check constraints can be on complex permutations of columns. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest Status: Sudden Death Overtime
Simon, >> * Cascade Replication > > Sorry, too busy reviewing to take note of this. I expect to commit > when its tested and ready. So, would that be this commitfest, or next commitfest? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest Status: Sudden Death Overtime
Robert, >> * Collect frequency statistics and selectivity estimation for arrays >> * Allow multiple Postgres clusters running on the same machine to >> distinguish themselves in the event log >> * lazy vxid locks > > I'm not clear on your criteria for moving these patches, as they seem > to be in somewhat different states. For the array statistics patch, it took me until the last week of the CF to find a reviewer. That reviewer found a number of issues, and the submitter submitted an updated patch. It looks quite likely that that patch will be updated in the next couple of weeks, but not immediately. For the eventlog patch, MauMau had submitted an updated patch, but all of our Windows hackers had let me know that they were unavailable for the next couple of weeks for review or commit. For lazy VXID locks, I had the impression that we had an ongoing discussion about whether we were going to apply it or not, and you were on vacation for the last week of the CF. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch for 9.2: enhanced errors
Josh Berkus wrote: > I'm less concerned about the standard here and more concerned > about what helps our users. Having column names for an FK error > is *extremely* useful for troubleshooting, particularly if the > database has been upgraded from the 7.4 days and has non-useful FK > names like "$3". If it gives a FK constraint name, isn't there a way to get from that to the columns used by the constraint? If we want to support something non-standard, we can always tell them to look at the text of the error detail, right? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest Status: Sudden Death Overtime
On Mon, Jul 18, 2011 at 4:19 PM, Tom Lane wrote: >> and an >> error-reporting patch that Tom weighed in on over the weekend. This >> last suffers from the issue that it's not quite clear whether Tom is >> going to do the work or whether he's expecting the submitter to do it. > > If you mean the business about allowing GUCs in postgresql.conf to be > applied even if there are semantic errors elsewhere, I'm just as happy > to let Alexey or Florian have a go at it first, if they want. The real > question at the moment is do we have consensus about changing that? > Because if we do, the submitted patch is certainly not something to > commit as-is, and should be marked Returned With Feedback. I'm not totally convinced. The proposed patch is pretty small, and seems to stand on its own two feet. I don't hear anyone objecting to your proposed plan, but OTOH it doesn't strike me as such a good plan that we should reject all other improvements in the meantime. Maybe I'm missing something... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Initial Review: JSON contrib modul was: Re: [HACKERS] Another swing at JSON
On Mon, Jul 18, 2011 at 3:19 PM, Tom Lane wrote: > BTW, could the \u problem be finessed by leaving such escapes in > source form? Yes, it could. However, it doesn't solve the problem of comparison (needed for member lookup), which requires canonicalizing the strings to be compared. Here's a Unicode handling policy I came up with. It is guaranteed to be lossless as long as the client and database encodings are the same. --- On input (json_in), if the text is valid JSON, it is condensed: * Whitespace characters surrounding tokens are removed. * Long escapes (like \u0022) are converted to short escapes (like \") where possible. * Unnecessary escapes of ASCII characters (e.g. \u0061 and even \u007F) are converted to their respective characters. * Escapes of non-ASCII characters (e.g. \u0080, \u266B, \uD834\uDD1E) are converted to their respective characters, but only if the database encoding is UTF-8. On output (json_out), non-ASCII characters are converted to \u escapes, unless one or more of these very likely circumstances hold: * The client encoding and database encoding are the same. No conversion is performed, so escaping characters will not prevent any conversion errors. * The client encoding is UTF-8. Escaping is not necessary because the client can encode all Unicode codepoints. * The client encoding and/or database encoding is SQL_ASCII. SQL_ASCII tells PostgreSQL to shirk transcoding in favor of speed. When a JSON-encoded string is unwrapped using from_json (e.g. from_json($$ "\u00A1Hola!" $$)), escapes will be converted to the characters they represent. If any escapes cannot be represented in the database encoding, an error will be raised. Note that: * If the database encoding is UTF-8, conversion will never fail. * If the database encoding is SQL_ASCII, conversion will fail if any escapes of non-ASCII characters are present. --- However, I'm having a really hard time figuring out how comparison would work in this framework. Here are a few options: 1. 
Convert the strings to UTF-8, convert the escapes to characters, and compare the strings. 2. Convert the escapes to the database encoding, then compare the strings. 3. If either string contains escapes of non-ASCII characters, do 1. Otherwise, do 2. Number 1 seems the most sane to me, but could lead to rare errors. Number 3 could produce confusing results. If character set X has three different representations of one Unicode codepoint, then we could have scenarios like this (simplified): "abc♫" != "aaa♫" but: "abc\u266b" == "aaa♫" I suppose a simple solution would be to convert all escapes and outright ban escapes of characters not in the database encoding. This would have the nice property that all strings can be unescaped server-side. Problem is, what if a browser or other program produces, say, \u00A0 (NO-BREAK SPACE), and tries to insert it into a database where the encoding lacks this character? On the other hand, converting all JSON to UTF-8 would be simpler to implement. It would probably be more intuitive, too, given that the JSON RFC says, "JSON text SHALL be encoded in Unicode." In any case, the documentation should say "Use UTF-8 for best results", as there seems to be no entirely satisfactory way to handle JSON in other database encodings. - Joey -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
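The member-lookup problem Joey describes — two JSON texts that differ byte-wise but denote the same string via different escape forms — can be sketched in a few lines of Python. This is only an illustration of the canonicalization idea using Python's json module, not the proposed C implementation:

```python
import json

def canonicalize(json_text):
    # Parsing converts \uXXXX escapes to the characters they denote;
    # re-serializing with ensure_ascii=False keeps non-ASCII characters
    # literal (the UTF-8 database case) and strips inter-token
    # whitespace, roughly like the proposed json_in condensing step.
    return json.dumps(json.loads(json_text),
                      ensure_ascii=False, separators=(',', ':'))

a = canonicalize('{ "note" : "\\u266b" }')   # long-escape form
b = canonicalize('{"note":"♫"}')             # literal-character form
# Both canonicalize to '{"note":"♫"}', so comparison (and therefore
# member lookup) sees them as equal.
```

With `ensure_ascii=True` instead, every non-ASCII character would come out as a \u escape — the "escape on output" case proposed for non-UTF-8 clients.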
Re: [HACKERS] Re: [COMMITTERS] pgsql: Add temp_file_limit GUC parameter to constrain temporary file sp
On 19/07/11 02:36, Tom Lane wrote: > Bruce Momjian writes: >> Tom Lane wrote: >>> Huh? Seems like a waste of text to me. What else would happen? >> Well, if we exceed work_mem, we spill to temp disk. If we exceed temp disk, we error out. Those different behaviors don't seem obvious to me. > [ shrug... ] Feel free to change it. No objections from me - can't see any harm in making it very clear what happens when the limit is exceeded :-) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] storing TZ along timestamps
On Jul 18, 2011, at 12:29 AM, Robert Haas wrote: > On Fri, Jul 8, 2011 at 11:13 AM, Stuart Bishop > wrote: >> On Mon, Jun 6, 2011 at 7:50 AM, Jim Nasby wrote: >>> On Jun 4, 2011, at 3:56 AM, Greg Stark wrote: On Thu, Jun 2, 2011 at 8:58 PM, Jim Nasby wrote: > > I'm torn between whether the type should store the original time or the > original time converted to GMT. This is the wrong way to think about it. We *never* store time "converted to GMT". When we want to represent a point in time we represent it as seconds since the epoch. >>> Right. Sorry, my bad. >>> The question here is how to represent more complex concepts than simply points in time. I think the two concepts under discussion are a) a composite type representing a point in time and a timezone it should be interpreted in for operations and display and b) the original input provided which is a text string with the constraint that it's a valid input which can be interpreted as a point in time. >>> >>> My fear with A is that something could change that would make it impossible >>> to actually get back to the time that was originally entered. For example, >>> a new version of the timezone database could change something. Though, that >>> problem also exists for timestamptz today, so presumably if it was much of >>> an issue we'd have gotten complaints by now. >> >> The common problem is daylight savings time being declared or >> cancelled. This happens numerous times throughout the year, often with >> short notice. >> >> If you want to store '6pm July 3rd 2014 Pacific/Fiji', and want that >> to keep meaning 6pm Fiji time no matter what decisions the Fijian >> government makes over the next two years, you need to store the >> wallclock (local) time and the timezone. The wallclock time remains >> fixed, but the conversion to UTC may float. >> >> If you are storing an point in time that remains stable no matter >> future political decisions, you store UTC time and an offset. 
The >> conversion to wallclock time may float, and your 6pm Fiji time meeting >> might change to 5pm or 7pm depending on the political edicts. > > This is a pretty good point. You would want the first of these > (probably) for the time a meeting is scheduled to start, and the > second for the time of a meteor shower (or some other natural event > that doesn't care what the politicians decide). > > I feel like the second of these is pretty well handled by our existing > timestamptz data type. Granted, you can't store the intended display > time zone, but putting it in a separate column is not really a > problem: at least, it has the right semantics. So maybe the first is > the one we should be aiming at. If so, storing a counter and a time > zone is the wrong approach: you need to record something like <year, month, day, hour, minute, second, tzname>. Right; you need a timestamp and you need to know what timezone that timestamp was entered in. That means you can always convert that time to whatever timezone you'd like (like timestamptz), but you also know what time was originally entered, and what timezone it was entered in. Technically you can do that with a separate field, but that seems really ugly to me. So what we're proposing is a new data type that stores a timestamp as well as the timezone that that time was originally entered in. We can't just store a 3 letter timezone abbreviation, because the mapping from 3 letter TZs to actual TZs is not fixed (see the timezone_abbreviations GUC). I believe the best way to handle this is a system table that stores the name of every timezone that the database has ever loaded from the timezone data files, along with an OID. That means that the storage for this is just a timestamp and an OID. -- Jim C.
Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
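The two semantics Stuart and Robert contrast — wallclock-plus-zone ("meeting") versus fixed UTC instant ("meteor shower") — can be sketched with Python's zoneinfo module. This illustrates the concepts only; the actual proposal concerns a new PostgreSQL type storing a timestamp plus a timezone-name OID:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+, requires system tzdata

fiji = ZoneInfo("Pacific/Fiji")

# "Meeting" semantics: store the wall-clock time plus the zone name.
# The UTC instant is derived on demand, so it floats if Fiji's DST
# rules change between now and 2014.
wall = datetime(2014, 7, 3, 18, 0)             # 6pm July 3rd 2014, local
as_utc = wall.replace(tzinfo=fiji).astimezone(timezone.utc)
# Fiji is UTC+12 in July, so this is 2014-07-03 06:00 UTC -- under
# *today's* rules; a future DST edict would shift the derived instant.

# "Meteor shower" semantics (what timestamptz provides): store the UTC
# instant; the wall-clock rendering is derived, and floats instead.
instant = datetime(2014, 7, 3, 6, 0, tzinfo=timezone.utc)
local = instant.astimezone(fiji)               # 18:00 under today's rules
```

The difference only matters when the zone's rules change after the value is stored, which is exactly the case the thread is worried about.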
Re: [HACKERS] storing TZ along timestamps
Jim, > Right; you need a timestamp and you need to know what timezone that timestamp > was entered in. That means you can always convert that time to whatever > timezone you'd like (like timestamptz), but you also know what time was > originally entered, and what timezone it was entered in. Technically you can > do that with a separate field, but that seems really ugly to me. I disagree. It's a good mapping of the actual data. The timestamp and the timezone in which that timestamp was entered are two separate pieces of data and *ought* to be in two separate fields. For one thing, the question of "what timezone was this entered in" is an application-specific question, since you have three different potential timezones: * the actual client timezone * the actual server timezone * the application timezone if the application has configurable timezones In a builtin data type, which of those three would you pick? Only the application knows. Additionally, if you have your timestamp-with-original-timezone data type, then you're going to need to recode every single timestamp-handling function and operator to handle the new type. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest Status: Sudden Death Overtime
On Jul18, 2011, at 22:19 , Tom Lane wrote: > Robert Haas writes: >> We're down to three patches that fall into this category: two XML >> patches by Florian, which have been somewhat derailed by an argument >> about whether values of type xml should, in fact, be required to be >> valid xml (I think you know my vote on this one...); > > Yeah, it's not clear to me either which of those are still reasonable > candidates for committing as-is. There are actually three XML-related patches from me in the queue. I'll try to give an overview of their states - as perceived by me, of course. If people want to comment on this, I suggest doing that either on the existing threads for these patches or on new ones, but not on this one - lest confusion ensue. * Bugfix for XPATH() if text or attribute nodes are selected https://commitfest.postgresql.org/action/patch_view?id=580 Makes XPATH() return valid XML if text or attribute nodes are selected. I'm not aware of any issues with that one, other than Radoslaw's unhappiness about the change of behaviour. Since the current behaviour is very clearly broken (as in dump, reload, ka-Woom), I'm not prepared to accept that as a show-stopper. The question about whether values of type XML should or should not be in fact valid XML is a red herring. That question has long ago been settled by the XML type's input function, which has a pretty clear and consistent idea about what constitutes a valid value of type XML. The patch simply makes XPATH() abide by those rules, instead of making up rules of its own pretty willy-nilly. * Bugfix for XPATH() if expression returns a scalar value https://commitfest.postgresql.org/action/patch_view?id=565 Makes XPATH() support XPath expressions which compute a scalar value, not a node set (i.e. apply a top-level function such as name()). Currently, we return an empty array in that case. Peter Eisentraut initially seemed to prefer creating separate functions for that (XPATH_STRING, ...).
I argued against that, on the grounds that that'd make it impossible to evaluate user-supplied XPath expressions (since you wouldn't know which function to use). I haven't heard back from him after that argument. Radoslaw liked the patch, but wanted the quoting removed - same theme as with the other patch. * XML error handling improvement to fix XPATH bug https://commitfest.postgresql.org/action/patch_view?id=579 Like the first patch, it fixes a bug where XPATH() returns invalid XML. The cause is completely different though. Here, libxml is (partially) at fault - its parsing methods always return a document instance if a document is *structurally* valid, even if it contains semantic errors like invalid namespace references. But it isn't fully prepared to actually handle these inconsistent documents - for example, when asked to render a namespace URI as text, it unconditionally assumes it doesn't have to escape it, because it may not contain special characters anyway. Which, of course, isn't necessarily true for structurally valid but semantically invalid documents. Fixing that wasn't possible without a major rewrite of the XML support's error handling - one must use the modern version of libxml's error handling infrastructure to actually be able to detect these semantic errors reliably and distinguish them from mere warnings. Noah Misch has reviewed that patch pretty extensively (Thanks again, Noah!), and I've resolved all his concerns regarding the implementation. I haven't seen a single argument *against* applying this so far. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
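The failure mode Florian describes — text spliced into XML output without escaping, yielding a document no parser will accept — is easy to reproduce in miniature. This is a Python illustration of the bug class, not libxml code, and the URI is made up:

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

uri = "http://example.com/ns?a=1&b=2"   # hypothetical URI containing "&"

broken = '<x ns="%s"/>' % uri           # spliced in verbatim (the bug)
fixed = '<x ns="%s"/>' % escape(uri)    # "&" correctly becomes "&amp;"

def well_formed(doc):
    """True if the document parses as well-formed XML."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False

print(well_formed(broken), well_formed(fixed))  # False True
```

The bare "&" makes the unescaped output unparseable, which is exactly how "render without escaping" turns a semantically invalid input document into structurally invalid output.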
Re: [HACKERS] proposal: a validator for configuration files
Tom, Florian, >> On the downside, the current behaviour prevents problems if someone changes >> two interrelated GUCs, but makes a mistake at one of them. For example, >> someone might drastically lower bgwriter_delay but might botch the matching >> adjustment of bgwriter_lru_maxpages. > > That's a fair point, but the current behavior only saves you if the > botch is such that the new value is detectably invalid, as opposed to > say just a factor of 100 off from what you meant. Not sure that that's > all that helpful. Hmmm. As someone who often deploys pg.conf changes as part of a production code rollout, I actually like the "atomic" nature of updating postgresql.conf -- that is, all your changes succeed, or they all fail. If we add this feature, I'd want there to be an option which allows getting the current all-or-none behavior. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest Status: Sudden Death Overtime
On Jul18, 2011, at 22:19 , Tom Lane wrote: >> and an >> error-reporting patch that Tom weighed in on over the weekend. This >> last suffers from the issue that it's not quite clear whether Tom is >> going to do the work or whether he's expecting the submitter to do it. > > If you mean the business about allowing GUCs in postgresql.conf to be > applied even if there are semantic errors elsewhere, I'm just as happy > to let Alexey or Florian have a go at it first, if they want. The real > question at the moment is do we have consensus about changing that? > Because if we do, the submitted patch is certainly not something to > commit as-is, and should be marked Returned With Feedback. Just to avoid false expectations here: I'd be happy to review further versions of this patch, but I won't write them. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch for 9.2: enhanced errors
"Kevin Grittner" writes: > Josh Berkus wrote: >> I'm less concerned about the standard here and more concerned >> about what helps our users. Having column names for an FK error >> is *extremely* useful for troubleshooting, particularly if the >> database has been upgraded from the 7.4 days and has non-useful FK >> names like "$3". > If it gives a FK constraint name, isn't there a way to get from that > to the columns used by the constraint? If we want to support > something non-standard, we can always tell them to look at the text > of the error detail, right? Yes. This is entirely *not* about friendliness to human users; they're going to read the existing primary/detail/hint fields, and probably aren't even going to see these new error fields by default. What the new fields are meant for is allowing client programs to do something useful without parsing the text of the human-oriented fields ... for instance, identify which FK constraint got violated. Somebody who's intending to use this functionality would presumably take care to give his constraints names that were helpful for his purposes. Moreover, if he's hoping to use that client code against more than one database, what he's going to want is SQL-standard functionality, not more nor less. As for the "my constraints have names like $3" argument, maybe an ALTER CONSTRAINT RENAME command would be the most helpful answer. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Initial Review: JSON contrib modul was: Re: [HACKERS] Another swing at JSON
On Jul 19, 2011, at 00:17, Joey Adams wrote:
> I suppose a simple solution would be to convert all escapes and
> outright ban escapes of characters not in the database encoding.

+1. Making JSON work like TEXT when it comes to encoding issues makes this all much simpler conceptually. It also avoids all kinds of weird issues if you extract textual values from a JSON document server-side.

If we really need more flexibility than that, we should look at ways to allow different columns to have different encodings. Doing that just for JSON seems wrong - especially because it doesn't really reduce the complexity of the problem, as your examples show. The essential problem here is, AFAICS, that there's really no sane way to compare strings in two different encodings, unless both encode only a subset of Unicode.

> This would have the nice property that all strings can be unescaped
> server-side. Problem is, what if a browser or other program produces,
> say, \u00A0 (NO-BREAK SPACE), and tries to insert it into a database
> where the encoding lacks this character?

They'll get an error - just as if they had tried to store that same character in a TEXT column.

> On the other hand, converting all JSON to UTF-8 would be simpler to
> implement. It would probably be more intuitive, too, given that the
> JSON RFC says, "JSON text SHALL be encoded in Unicode."

Yet the only reason I'm aware of for some people not to use UTF-8 as the server encoding is that it's pretty inefficient storage-wise for some scripts (AFAIR some Japanese scripts are an example, but I don't remember the details). By making JSON always store UTF-8 on disk, the JSON type gets less appealing to those people.

best regards,
Florian Pflug

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] storing TZ along timestamps
Excerpts from Josh Berkus's message of lun jul 18 18:37:15 -0400 2011:
> The timestamp and the timezone in which that timestamp was entered are
> two separate pieces of data and *ought* to be in two separate fields.
> For one thing, the question of "what timezone was this entered in" is an
> application-specific question, since you have three different potential
> timezones:
>
> * the actual client timezone
> * the actual server timezone
> * the application timezone if the application has configurable timezones
>
> In a builtin data type, which of those three would you pick? Only the
> application knows.

I think this whole discussion is built on the assumption that the client timezone and the application timezone are one and the same, and that the server timezone is not relevant at all. If the app TZ is not the client TZ, then the app will need to be fixed.

> Additionally, if you have your timestamp-with-original-timezone data
> type, then you're going to need to recode every single
> timestamp-handling function and operator to handle the new type.

I have my doubts about that, and I hope not. These details haven't been discussed at all; I only started this thread to get community approval on cataloguing the TZs.

-- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: new contrib module plpgsql's embeded sql validator
On Mon, Jul 18, 2011 at 02:05:42PM -0400, Alvaro Herrera wrote:
> Excerpts from Jim Nasby's message of dom jul 17 16:31:45 -0400 2011:
> > On a somewhat related note, I'd also really like to have the
> > ability to parse things like .sql files externally, to do things
> > like LINT checking.
>
> We talked about this during PGCon. The idea that seemed to have
> consensus there was to export the parser similarly to how we build
> the ecpg parser, that is, a set of perl scripts which take our
> gram.y as input and modify it to emit something different.

That solves the non-customized part of the problem, which is worth solving. A complete parser for a given DB would need catalog access, i.e. a live connection to that DB.

Cheers, David.

-- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch for 9.2: enhanced errors
Excerpts from Pavel Stehule's message of lun jul 18 16:02:43 -0400 2011:
> 2011/7/18 Tom Lane :
> > which suggests that it might be meant *only* for use with
> > INSUFFICIENT_PRIVILEGE errors that are thrown due to a column ACL.
> > We can probably extend that to some other syntax errors, like unknown
> > column or wrong datatype or what have you, but there is nothing here to
> > suggest that we have to force the issue for errors that don't naturally
> > relate to exactly one column. And CHECK constraints don't. Consider
> > "CHECK (f1 > f2)".
>
> ok, this is relatively clean, but
>
> so for example, don't NULL or DOMAIN constraints affect a
> COLUMN_NAME? These constraints have no names.

I dunno about domains, but NOT NULL constraints definitely have names according to the standard (and will have them in PG soon enough). Hmm, domain constraints are CHECK or NOT NULL, and both of them have or will have names.

-- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: new contrib module plpgsql's embeded sql validator
>> We talked about this during PGCon. The idea that seemed to have
>> consensus there was to export the parser similarly to how we build
>> the ecpg parser, that is, a set of perl scripts which take our
>> gram.y as input and modify it to emit something different.
>
> That solves the non-customized part of the problem, which is worth
> solving.

In addition to this, parsing SQL may need tree-walker functions and some support functions (e.g. memory management). These are the source files from pgpool-II's parser directory:

copyfuncs.c  kwlist.h     nodes.c       pg_list.h      pool_string.h  value.c
gram.c       kwlookup.c   nodes.h       pg_wchar.h     primnodes.h    value.h
gram.h       list.c       outfuncs.c    pool_memory.c  scan.c         wchar.c
gramparse.h  makefuncs.c  parsenodes.h  pool_memory.h  scanner.h
keywords.c   makefuncs.h  parser.c      pool_parser.h  scansup.c
keywords.h   memnodes.h   parser.h      pool_string.c  scansup.h

> A complete parser for a given DB would need catalog access,
> i.e. a live connection to that DB.

Yes, pgpool-II does some of this (for example, to know if a particular table is a temporary table, to expand "*" to real column lists, and so on). Just F.Y.I.

-- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: Allow \dd to show constraint comments
On Sun, Jul 17, 2011 at 11:25 PM, Robert Haas wrote: > On Sat, Jul 2, 2011 at 8:37 PM, Josh Kupershmidt wrote: >> On Sun, Jun 19, 2011 at 11:59 AM, Josh Kupershmidt >> wrote: > It seems funny to have is_system = true unconditionally for any object > type. Why'd you do it that way? Or maybe I should ask, why true > rather than false? Thanks for the review! Well, I was hoping to go by the existing psql backslash commands' notions about what qualifies as "system" and what doesn't; that worked OK for commands which supported the 'S' modifier, but not all do. For objects like tablespaces, where you must be a superuser to create one, it seems like they should all be considered is_system = true, on the vague premise that superuser => system. Other objects like roles are admittedly murkier, and I didn't find any precedent inside describe.c to help make such distinctions. But this is probably all a moot point, see below. >> I had to tweak the pg_proc.h entry for pg_opfamily_is_visible to play >> nice with the recent "transform function" change to that file. > > It seems like we ought to have this function for symmetry regardless > of what happens to the rest of this, so I extracted and committed this > part. Good idea. >> Issues still to be resolved: >> >> 1.) For now, I'm just ignoring the issue of visibility checks; I >> didn't see a simple way to support these checks \dd was doing: >> >> processSQLNamePattern(pset.db, &buf, pattern, true, false, >> "n.nspname", "p.proname", NULL, >> "pg_catalog.pg_function_is_visible(p.oid)"); >> >> I'm a bit leery of adding an "is_visible" column into pg_comments, but >> I'm not sure I have a feasible workaround if we do want to keep this >> visibility check. Maybe a big CASE statement for the last argument of >> processSQLNamePattern() would work... > > Yeah... or we could add a function pg_object_is_visible(classoid, > objectoid) that would dispatch to the relevant visibility testing > function based on object type. 
Not sure if that's too much of a > kludge, but it wouldn't be useful only here: you could probably use it > in combination with pg_locks, for example. Something like that might work, though an easy escape is just leaving the query style of \dd as it was originally (i.e. querying the various catalogs directly, and not using pg_comments): discussed a bit in the recent pg_comments thread w/ Josh Berkus. > The real problem with the is_system column is that it seems to be > entirely targeted around what psql happens to feel like doing. I'm > pretty sure we'll regret it if we choose to go that route. Yeah, it did turn out more messy than I had hoped, and I'm not sure it'd be possible to iron out the semantics of is_system in a way that leaves everyone happy. Probably best to just rip it out if \dd won't need it. I'll try to send an updated patch by this weekend. Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: Allow \dd to show constraint comments
On Mon, Jul 18, 2011 at 10:57 PM, Josh Kupershmidt wrote: > Well, I was hoping to go by the existing psql backslash commands' > notions about what qualifies as "system" and what doesn't; that worked > OK for commands which supported the 'S' modifier, but not all do. For > objects like tablespaces, where you must be a superuser to create one, > it seems like they should all be considered is_system = true, on the > vague premise that superuser => system. Other objects like roles are > admittedly murkier, and I didn't find any precedent inside describe.c > to help make such distinctions. I think that the idea is that system objects are the ones that the user probably doesn't care to see most of the time - i.e. the ones that are "built in". But... > But this is probably all a moot point, see below. ...yes. >>> Issues still to be resolved: >>> >>> 1.) For now, I'm just ignoring the issue of visibility checks; I >>> didn't see a simple way to support these checks \dd was doing: >>> >>> processSQLNamePattern(pset.db, &buf, pattern, true, false, >>> "n.nspname", "p.proname", NULL, >>> "pg_catalog.pg_function_is_visible(p.oid)"); >>> >>> I'm a bit leery of adding an "is_visible" column into pg_comments, but >>> I'm not sure I have a feasible workaround if we do want to keep this >>> visibility check. Maybe a big CASE statement for the last argument of >>> processSQLNamePattern() would work... >> >> Yeah... or we could add a function pg_object_is_visible(classoid, >> objectoid) that would dispatch to the relevant visibility testing >> function based on object type. Not sure if that's too much of a >> kludge, but it wouldn't be useful only here: you could probably use it >> in combination with pg_locks, for example. > > Something like that might work, though an easy escape is just leaving > the query style of \dd as it was originally (i.e. querying the various > catalogs directly, and not using pg_comments): discussed a bit in the > recent pg_comments thread w/ Josh Berkus. 
That's another approach. It seems a bit lame, but... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_upgrade and log file output on Windows
Has anyone successfully used pg_upgrade 9.0 with -l (log) on Windows?

I received a private email bug report that pg_upgrade 9.0 does not work with the -l/log option on Windows. The error is:

    Analyzing all rows in the new cluster
    ""c:/MinGW/msys/1.0/home/edb/inst/bin/vacuumdb" --port 55445 --username "edb" --all --analyze
    >> c:/MinGW/msys/1.0/home/edb/auxschedule/test.log 2>&1"
    The process cannot access the file because it is being used by another process.

What has me confused is that this same code exists in pg_migrator, which was fixed to work with -l on Windows by Hiroshi Saito with this change:

    /*
     * On Win32, we can't send both server output and pg_ctl output
     * to the same file because we get the error:
     * "The process cannot access the file because it is being used by another process."
     * so we have to send pg_ctl output to 'nul'.
     */
    sprintf(cmd, SYSTEMQUOTE "\"%s/pg_ctl\" -l \"%s\" -D \"%s\" "
            "-o \"-p %d -c autovacuum=off -c autovacuum_freeze_max_age=20\" "
            "start >> \"%s\" 2>&1" SYSTEMQUOTE,
            bindir, ctx->logfile, datadir, port,
    #ifndef WIN32
            ctx->logfile);
    #else
            DEVNULL);
    #endif

The fix was not to use the same log file and output file for pg_ctl. But as you can see, the corresponding vacuumdb code is unchanged:

    prep_status(ctx, "Analyzing all rows in the new cluster");
    exec_prog(ctx, true,
              SYSTEMQUOTE "\"%s/vacuumdb\" --port %d --username \"%s\" "
              "--all --analyze >> %s 2>&1" SYSTEMQUOTE,
              ctx->new.bindir, ctx->new.port, ctx->user, ctx->logfile);

I can't figure out if there is something odd about this user's setup or if there is a bug in pg_upgrade with -l on Windows.

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE
On mån, 2011-07-18 at 11:05 -0400, Robert Haas wrote: > On Thu, Jul 7, 2011 at 3:25 PM, Robert Haas wrote: > > On Thu, Jul 7, 2011 at 3:21 PM, Noah Misch wrote: > >> On Thu, Jul 07, 2011 at 03:06:46PM -0400, Robert Haas wrote: > >>> On Thu, Jul 7, 2011 at 2:55 PM, Noah Misch wrote: > >>> > CheckIndexCompatible() calls ComputeIndexAttrs() to resolve the new > >>> > operator > >>> > classes, collations and exclusion operators for each index column. It > >>> > then > >>> > checks those against the existing values for the same. I figured that > >>> > was > >>> > obvious enough, but do you want a new version noting that? > >>> > >>> I guess one question I had was... are we depending on the fact that > >>> ComputeIndexAttrs() performs a bunch of internal sanity checks? Or > >>> are we just expecting those to always pass, and we're going to examine > >>> the outputs after the fact? > >> > >> Those checks can fail; consider an explicit operator class or collation > >> that > >> does not support the destination type. At that stage, we neither rely on > >> those > >> checks nor mind if they do fire. If we somehow miss the problem at that > >> stage, > >> DefineIndex() will detect it later. Likewise, if we hit an error in > >> CheckIndexCompatible(), we would also hit it later in DefineIndex(). > > > > OK. > > Committed with minor comment and documentation changes. Please review and fix this compiler warning: indexcmds.c: In function ‘CheckIndexCompatible’: indexcmds.c:126:15: warning: variable ‘amoptions’ set but not used [-Wunused-but-set-variable] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest Status: Sudden Death Overtime
On Mon, 2011-07-18 at 15:59 -0400, Robert Haas wrote: > On a pgbench run with 8 > clients on a 32-core machine, I see about a 2% speedup from that patch > on pgbench -S, and it grows to 8% at 32 clients. At 80 clients (but > still just a 32-core box), the results bounce around more, but taking > the median of three five-minute runs, it's an 11% improvement. To me, > that's enough to make it worth applying, especially considering that > what is 11% on today's master is, in raw TPS, equivalent to maybe 30% > of yesterday's master (prior to the fast relation lock patch being > applied). More, it seems pretty clear that this is the conceptually > right thing to do, even if it's going to require some work elsewhere > to file down all the rough edges thus exposed. If someone objects to > that, then OK, we should talk about that: but so far I don't think > anyone has expressed strong opposition: in which case I'd like to fix > it up and get it in. Agreed. I certainly like the concept of the lazy vxid patch. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and log file output on Windows
On Mon, July 18, 2011 11:28 pm, Bruce Momjian wrote: > Has anyone successfully used pg_upgrade 9.0 with -l (log) on Windows? > > I received a private email bug report that pg_upgrade 9.0 does not work > with the -l/log option on Windows. The error is: > > Analyzing all rows in the new cluster > ""c:/MinGW/msys/1.0/home/edb/inst/bin/vacuumdb" --port 55445 --username > "edb" --all --analyze > >> c:/MinGW/msys/1.0/home/edb/auxschedule/test.log 2>&1" > The process cannot access the file because it is being used by another > process. > > What has me confused is this same code exists in pg_migrator, which was > fixed to work with -l on Windows by Hiroshi Saito with this change: > > /* >* On Win32, we can't send both server output and pg_ctl output >* to the same file because we get the error: >* "The process cannot access the file because it is being used by > another process." >* so we have to send pg_ctl output to 'nul'. >*/ > sprintf(cmd, SYSTEMQUOTE "\"%s/pg_ctl\" -l \"%s\" -D \"%s\" " > "-o \"-p %d -c autovacuum=off -c > autovacuum_freeze_max_age=20\" " > "start >> \"%s\" 2>&1" SYSTEMQUOTE, >bindir, ctx->logfile, datadir, port, > #ifndef WIN32 >ctx->logfile); > #else >DEVNULL); > #endif > > The fix was not to use the same log file and output file for pg_ctl. > But as you can see, the pg_ctl and vacuumdb code is unchanged: > > prep_status(ctx, "Analyzing all rows in the new cluster"); > exec_prog(ctx, true, > SYSTEMQUOTE "\"%s/vacuumdb\" --port %d --username \"%s\" " > "--all --analyze >> %s 2>&1" SYSTEMQUOTE, > ctx->new.bindir, ctx->new.port, ctx->user, ctx->logfile); > > I can't figure out of there is something odd about this user's setup or > if there is a bug in pg_upgrade with -l on Windows. > The Windows file system seems to have some asynchronicity regarding what files are locked. For that reason, the buildfarm code has long had a couple of "sleep(5)" calls where it calls pg_ctl. You might benefit from doing something similar. cheers andrew. 
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: a validator for configuration files
On sön, 2011-07-17 at 00:59 -0400, Tom Lane wrote: > Well, we *do* have a C API for that, of a sort. The problem is, what > do you do in processes that have not loaded the relevant extension? Those processes that have the extension loaded check the parameter settings in their namespace, those that don't ignore them. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] about EDITOR_LINENUMBER_SWITCH
Here's a patch to fix what has been discussed: * Change EDITOR_LINENUMBER_SWITCH to environment variable. * I also changed "switch" to "arg" because "switch" is a bit of a sloppy term. * So the environment variable is called PSQL_EDITOR_LINENUMBER_ARG. * Set "+" as hardcoded default value on Unix (since "vi" is the hardcoded default editor), so many users won't have to configure this at all. * I moved the documentation around a bit to centralize the editor configuration under environment variables, rather than repeating bits of it under every backslash command that invoked an editor. *** i/doc/src/sgml/ref/psql-ref.sgml --- w/doc/src/sgml/ref/psql-ref.sgml *** *** 1440,1464 testdb=> \r to cancel. - ! psql checks the environment ! variables PSQL_EDITOR, EDITOR, and ! VISUAL (in that order) for an editor to use. If ! all of them are unset, vi is used on Unix ! systems, notepad.exe on Windows systems. - ! If a line number is specified, psql will ! position the cursor on the specified line of the file or query buffer. ! This feature requires the EDITOR_LINENUMBER_SWITCH ! variable to be set, so that psql knows how ! to specify the line number to the editor. Note that if a single ! all-digits argument is given, psql assumes ! it is a line number not a file name. --- 1440,1460 \r to cancel. ! If a line number is specified, psql will ! position the cursor on the specified line of the file or query buffer. ! Note that if a single all-digits argument is given, ! psql assumes it is a line number ! not a file name. + ! See under for how to configure and ! customize your editor. + *** *** 1514,1526 Tue Oct 26 21:40:57 CEST 1999 If a line number is specified, psql will ! position the cursor on the specified line of the function body ! (note that the function body typically does not begin on the ! first line of the file). ! This feature requires the EDITOR_LINENUMBER_SWITCH ! variable to be set, so that psql knows how ! to specify the line number to the editor. 
--- 1510,1527 If a line number is specified, psql will ! position the cursor on the specified line of the function body. ! (Note that the function body typically does not begin on the first ! line of the file.) ! ! ! ! ! See under for how to configure and ! customize your editor. + *** *** 2599,2625 bar - EDITOR_LINENUMBER_SWITCH - - - When \edit or \ef is used with a - line number argument, this variable specifies the command-line switch - used to pass the line number to the user's editor. For editors such - as emacs or vi, you can simply set - this variable to a plus sign. Include a trailing space in the value - of the variable if there needs to be space between the switch name and - the line number. - Examples: - - - \set EDITOR_LINENUMBER_SWITCH + - \set EDITOR_LINENUMBER_SWITCH '--line ' - - - - - - ENCODING --- 2600,2605 *** *** 3167,3174 $endif ! ! Environment --- 3147,3154 ! ! Environment *** *** 3218,3225 $endif ! Editor used by the \e command. The variables ! are examined in the order listed; the first that is set is used. --- 3198,3238 ! Editor used by the \e and ! \ef commands. The variables are examined in ! the order listed; the first that is set is used. ! ! ! ! The built-in default editors are vi on Unix ! systems and notepad.exe on Windows systems. ! ! ! ! ! ! PSQL_EDITOR_LINENUMBER_ARG ! ! ! ! When \e or \ef is used ! with a line number argument, this variable specifies the ! command-line argument used to pass the starting line number to ! the user's editor. For editors such as Emacs or ! vi, this is a plus sign. Include a trailing ! space in the value of the variable if there needs to be space ! between the option name and the line number. Examples: ! ! PSQL_EDITOR_LINENUMBER_ARG='+' ! PSQL_EDITOR_LINENUMBER_ARG='--line ' ! !
[HACKERS] include host names in hba error messages
Since we are accepting host names in pg_hba.conf now, I figured it could be useful to also show the host names in error message, e.g., no pg_hba.conf entry for host "localhost" (127.0.0.1), user "x", database "y" Attached is an example patch. The question might be what criterion to use for when to show the host name. It could be if (port->remote_hostname_resolv == +1) that is, we have done the reverse and forward lookup, or if (port->remote_hostname_resolv >= 0) that is, we have only done the reverse lookup (which is consistent with log_hostname). Although this whole thing could be quite weird, because the message that a host name was rejected because the forward lookup didn't match the IP address is at DEBUG2, so it's usually never shown. So if we tell someone that there is 'no pg_hba.conf entry for host "foo"', even though there is clearly a line saying "foo" in the file, it would be confusing. Ideas? diff --git i/src/backend/libpq/auth.c w/src/backend/libpq/auth.c index 7799111..3701672 100644 --- i/src/backend/libpq/auth.c +++ w/src/backend/libpq/auth.c @@ -442,33 +442,61 @@ ClientAuthentication(Port *port) if (am_walsender) { #ifdef USE_SSL - ereport(FATAL, - (errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION), - errmsg("no pg_hba.conf entry for replication connection from host \"%s\", user \"%s\", %s", - hostinfo, port->user_name, - port->ssl ? _("SSL on") : _("SSL off"; + if (port->remote_hostname_resolv == +1) + ereport(FATAL, +(errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION), + errmsg("no pg_hba.conf entry for replication connection from host \"%s\" (%s), user \"%s\", %s", + port->remote_hostname, hostinfo, port->user_name, + port->ssl ? _("SSL on") : _("SSL off"; + else + ereport(FATAL, +(errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION), + errmsg("no pg_hba.conf entry for replication connection from host \"%s\", user \"%s\", %s", + hostinfo, port->user_name, + port->ssl ? 
_("SSL on") : _("SSL off"; #else - ereport(FATAL, - (errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION), - errmsg("no pg_hba.conf entry for replication connection from host \"%s\", user \"%s\"", - hostinfo, port->user_name))); + if (port->remote_hostname_resolv == +1) + ereport(FATAL, +(errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION), + errmsg("no pg_hba.conf entry for replication connection from host \"%s\" (%s), user \"%s\"", + port->remote_hostname, hostinfo, port->user_name))); + else + ereport(FATAL, +(errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION), + errmsg("no pg_hba.conf entry for replication connection from host \"%s\", user \"%s\"", + hostinfo, port->user_name))); #endif } else { #ifdef USE_SSL - ereport(FATAL, - (errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION), - errmsg("no pg_hba.conf entry for host \"%s\", user \"%s\", database \"%s\", %s", - hostinfo, port->user_name, - port->database_name, - port->ssl ? _("SSL on") : _("SSL off"; + if (port->remote_hostname_resolv == +1) + ereport(FATAL, +(errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION), + errmsg("no pg_hba.conf entry for host \"%s\" (%s), user \"%s\", database \"%s\", %s", + port->remote_hostname, hostinfo, port->user_name, + port->database_name, + port->ssl ? _("SSL on") : _("SSL off"; + else + ereport(FATAL, +(errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION), + errmsg("no pg_hba.conf entry for host \"%s\", user \"%s\", database \"%s\", %s", + hostinfo, port->user_name, + port->database_name, + port->ssl ? 
_("SSL on") : _("SSL off"; #else - ereport(FATAL, - (errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION), - errmsg("no pg_hba.conf entry for host \"%s\", user \"%s\", database \"%s\"", - hostinfo, port->user_name, - port->database_name))); + if (port->remote_hostname_resolv == +1) + ereport(FATAL, +(errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION), + errmsg("no pg_hba.conf entry for host \"%s\" (%s), user \"%s\", database \"%s\"", + port->remote_hostname, hostinfo, port->user_name, + port->database_name))); + else + ereport(FATAL, +(errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION), + errmsg("no pg_hba.conf entry for host \"%s\", user \"%s\", database \"%s\"", + hostinfo, port->user_name, + port->database_name))); #endif } break; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Cascading replication feature for streaming log-based replicatio
On Tue, Jul 19, 2011 at 11:44 AM, Simon Riggs wrote:
> Cascading replication feature for streaming log-based replication.
> Standby servers can now have WALSender processes, which can work with
> either WALReceiver or archive_commands to pass data. Fully updated
> docs, including new conceptual terms of sending server, upstream and
> downstream servers. WALSenders terminated when promote to master.
>
> Fujii Masao, review, rework and doc rewrite by Simon Riggs

Thanks a lot for the commit!

You added the new GUC category "Sending Server(s)" to the docs. To match
that change, guc.c and postgresql.conf.sample also need to be updated.
The attached patch does that.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
*************** const char *const config_group_names[] =
*** 557,562 ****
--- 557,564 ----
  	gettext_noop("Write-Ahead Log / Archiving"),
  	/* REPLICATION */
  	gettext_noop("Replication"),
+ 	/* REPLICATION_SENDING */
+ 	gettext_noop("Replication / Sending Server(s)"),
  	/* REPLICATION_MASTER */
  	gettext_noop("Replication / Master Server"),
  	/* REPLICATION_STANDBY */
*************** static struct config_int ConfigureNamesInt[] =
*** 1918,1924 ****
  	},
  
  	{
! 		{"wal_keep_segments", PGC_SIGHUP, REPLICATION_MASTER,
  			gettext_noop("Sets the number of WAL files held for standby servers."),
  			NULL
  		},
--- 1920,1926 ----
  	},
  
  	{
! 		{"wal_keep_segments", PGC_SIGHUP, REPLICATION_SENDING,
  			gettext_noop("Sets the number of WAL files held for standby servers."),
  			NULL
  		},
*************** static struct config_int ConfigureNamesInt[] =
*** 1986,1992 ****
  	{
  		/* see max_connections */
! 		{"max_wal_senders", PGC_POSTMASTER, REPLICATION_MASTER,
  			gettext_noop("Sets the maximum number of simultaneously running WAL sender processes."),
  			NULL
  		},
--- 1988,1994 ----
  	{
  		/* see max_connections */
! 		{"max_wal_senders", PGC_POSTMASTER, REPLICATION_SENDING,
  			gettext_noop("Sets the maximum number of simultaneously running WAL sender processes."),
  			NULL
  		},
*************** static struct config_int ConfigureNamesInt[] =
*** 1996,2002 ****
  	},
  
  	{
! 		{"wal_sender_delay", PGC_SIGHUP, REPLICATION_MASTER,
  			gettext_noop("WAL sender sleep time between WAL replications."),
  			NULL,
  			GUC_UNIT_MS
--- 1998,2004 ----
  	},
  
  	{
! 		{"wal_sender_delay", PGC_SIGHUP, REPLICATION_SENDING,
  			gettext_noop("WAL sender sleep time between WAL replications."),
  			NULL,
  			GUC_UNIT_MS
*************** static struct config_int ConfigureNamesInt[] =
*** 2007,2013 ****
  	},
  
  	{
! 		{"replication_timeout", PGC_SIGHUP, REPLICATION_MASTER,
  			gettext_noop("Sets the maximum time to wait for WAL replication."),
  			NULL,
  			GUC_UNIT_MS
--- 2009,2015 ----
  	},
  
  	{
! 		{"replication_timeout", PGC_SIGHUP, REPLICATION_SENDING,
  			gettext_noop("Sets the maximum time to wait for WAL replication."),
  			NULL,
  			GUC_UNIT_MS
*** a/src/backend/utils/misc/postgresql.conf.sample
--- b/src/backend/utils/misc/postgresql.conf.sample
***************
*** 194,212 ****
  # REPLICATION
  #--
  
! # - Master Server -
  
! # These settings are ignored on a standby server
  
  #max_wal_senders = 0		# max number of walsender processes
  				# (change requires restart)
  #wal_sender_delay = 1s		# walsender cycle time, 1-10000 milliseconds
  #wal_keep_segments = 0		# in logfile segments, 16MB each; 0 disables
- #vacuum_defer_cleanup_age = 0	# number of xacts by which cleanup is delayed
  #replication_timeout = 60s	# in milliseconds; 0 disables
  #synchronous_standby_names = ''	# standby servers that provide sync rep
  				# comma-separated list of application_name
  				# from standby(s); '*' = all
  
  # - Standby Servers -
--- 194,217 ----
  # REPLICATION
  #--
  
! # - Sending Server(s) -
  
! # These settings have effect on any server that is to send replication data
  
  #max_wal_senders = 0		# max number of walsender processes
  				# (change requires restart)
  #wal_sender_delay = 1s		# walsender cycle time, 1-10000 milliseconds
  #wal_keep_segments = 0		# in logfile segments, 16MB each; 0 disables
  #replication_timeout = 60s	# in milliseconds; 0 disables
+ 
+ # - Master Server -
+ 
+ # These settings are ignored on a standby server
+ 
  #synchronous_standby_names = ''	# standby servers that provide sync rep
  				# comma-separated list of application_name
  				# from standby(s); '*' = all
+ #vacuum_defer_cleanup_age = 0	# number of xacts by which cleanup is delayed
  
  # - Standby Servers -
*** a/src/include/utils/guc_tables.h
--- b/src/include/utils/guc_tables.h
*************** enum config_group
*** 69,74 ****
--- 69,75 ----
  	WAL_CHECKPOINTS,
  	WAL_ARCHIVING,
  	REPLICAT
[HACKERS] range types and ip4r
Just wondering, will the planned range type functionality also be able to absorb the functionality of the ip4r type as a range of the ip4 type (http://pgfoundry.org/projects/ip4r)? Maybe it's trivial, but since the ip types also have a kind of hierarchical structure, I figured I'd point it out in case you hadn't considered it.