Re: [HACKERS] jsonb and nested hstore
On Wed, Mar 12, 2014 at 01:58:14PM -0700, Peter Geoghegan wrote:
> The use case you describe here doesn't sound like something similar to
> full text search. It sounds like something identical. In any case, let's
> focus on what we have right now. I think that the indexing facilities
> proposed here are solid. In any case they do not preclude working on
> better indexing strategies as the need emerges.

Keep in mind that if we ship an index format, we are going to have trouble changing the layout because of pg_upgrade. pg_upgrade can mark the indexes as invalid and force users to reindex, but that is less than ideal.

--
Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB   http://enterprisedb.com
+ Everyone has their own god. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 13, 2014 at 6:15 AM, Bruce Momjian br...@momjian.us wrote:
> On Wed, Mar 12, 2014 at 01:58:14PM -0700, Peter Geoghegan wrote:
>> The use case you describe here doesn't sound like something similar to
>> full text search. It sounds like something identical. In any case, let's
>> focus on what we have right now. I think that the indexing facilities
>> proposed here are solid. In any case they do not preclude working on
>> better indexing strategies as the need emerges.
>
> Keep in mind that if we ship an index format, we are going to have
> trouble changing the layout because of pg_upgrade. pg_upgrade can mark
> the indexes as invalid and force users to reindex, but that is less
> than ideal.

Well, these are just normal GIN and GiST indexes. If we want to come up with new index operator classes we can still do that, and keep the old ones if necessary. Even that seems pretty unlikely from past experience.

I'm actually pretty sanguine even about keeping the GiST opclass. If it has bugs then the bugs only affect people who use this non-default opclass, and we can fix them. It doesn't risk calling into question any basic design choices in the patch.

It does sound like the main question here is which opclass should be the default. From the discussion there's a jsonb_hash_ops, which works on all input values but supports fewer operators, and a jsonb_ops, which supports more operators but can't handle json with larger individual elements. Perhaps it's better to make jsonb_hash_ops the default so that at least it's always safe to create a default GIN index?

--
greg
Re: [HACKERS] jsonb and nested hstore
Fwiw, I have a few questions -- but beware, I'm a complete neophyte when it comes to jsonb-style document databases, so these are more likely to represent misconceptions on my part than problems with jsonb.

I naively thought a GIN index on a jsonb column would help with queries like WHERE col->'prop' = 'val'. In fact it only seems to help with WHERE col ? 'prop'. To help with the former it looks like I need an expression index on col->'prop' -- is that right? There doesn't seem to be an operator that combines both a dereference and a value test into a single operator, so I don't think our index machinery can deal with this. Or am I supposed to use contains and construct a json object for the test?

I also find it awkward that col->'prop' returns the json representation of the property. If it's text, that means it's double-quoted. I would think that a user storing text in a json property would want a way to pull out the text that json property represents, so he doesn't have to write col->'prop' = '"foo"' and doesn't need to strip the quotes (and de-escape the string?) before displaying the value or passing it through other apis.
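The quoting issue raised above can be sketched outside the database. A rough Python analogy (illustrative only, not jsonb's implementation) of the difference between a property's JSON representation and the dequoted text it represents:

```python
import json

doc = {"prop": "foo"}

# Like col -> 'prop': the JSON representation of the value, so a text
# value comes back double-quoted (and escaped).
json_repr = json.dumps(doc["prop"])   # '"foo"'

# Like col ->> 'prop': the dequoted text the property represents.
text_repr = doc["prop"]               # 'foo'
```

Comparing against plain text therefore needs either the quoted form (`= '"foo"'`) or the dequoting accessor.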
Re: [HACKERS] 9a57858f1103b89a5674f0d50c5fe1f756411df6
On 2014-03-12 20:09:23 -0400, Robert Haas wrote:
> On the pgsql-packagers list, there has been some (OT for that list)
> discussion of whether commit 9a57858f1103b89a5674f0d50c5fe1f756411df6
> is sufficiently serious to justify yet another immediate minor release
> of 9.3.x. The relevant questions seem to be:
>
> 1. Is it really bad?

It breaks the ctid of concurrently updated/locked tuples during WAL replay, which can lead to all sorts of nastiness, like indexes not finding any rows. Since that kind of locking/updating is pretty common with foreign keys, it's not an unlikely scenario. Unfortunately FPIs won't save the day in all that many scenarios, because there'll normally be a XLOG_HEAP2_LOCK_UPDATED before the XLOG_HEAP_LOCK record which is replayed badly. Now, one could argue that it only affects replicas or servers that crashed at some point, but I think that's not much comfort.

> 2. Does it affect a lot of people or only a few?

It's been reported twice (Peter Geoghegan, Greg Stark) by Heroku, and one person on IRC could reproduce it repeatedly. The latter was what made me look into it again and find the bug. Greg has confirmed that it fixes the bug when replaying the WAL again.

> 3. Are there more, equally bad bugs that are unfixed, or perhaps even
> unreported, yet?

Uh. I have no idea. I don't know of any reports that can't be attributed to any of these, but as you also include unreported bugs in that question...

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Replication slots and footguns
On 2014-03-12 13:34:47 -0700, Josh Berkus wrote:
> On 03/12/2014 12:34 PM, Robert Haas wrote:
>> Urgh. That error message looks susceptible to improvement. How about:
>> replication slot %s cannot be dropped because it is currently in use
>>
>> I think that'd require duplicating some code between acquire and
>> drop, but how about: replication slot %s is in use by another backend
>>
>> Sold.
>
> Wait ... before you go further ... I object to dropping the word
> "active" from the error message. The column is called "active", and
> that's where a DBA should look; that word needs to stay in the error
> message. How about: replication slot %s is active in another backend

Alternatively we could replace the boolean "active" by the owner's pid, but that's not an entirely trivial change...

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] db_user_namespace a temporary measure
On 2014-03-12 20:54:36 -0400, Tom Lane wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Wed, Mar 12, 2014 at 9:19 AM, Andres Freund and...@2ndquadrant.com wrote:
>>> Except that we don't have the infrastructure to perform such checks
>>> (neither partial, nor expression indexes, nor exclusion constraints)
>>> on system tables atm. So it's not an entirely trivial thing to do.
>>
>> I'm probably woefully underinformed here, but it seems like getting
>> exclusion constraints working might be simpler than partial indexes or
>> expression indexes, because both of those involve being able to
>> evaluate arbitrary predicates, whereas exclusion constraints just
>> involve invoking index access methods to look for conflicting rows via
>> smarts built into your index AM. The latter seems to involve less risk
>> of circularity (but I might be wrong).
>>
>> Exclusion constraints support being partial... But I guess we could
>> forbid using that.
>
> You might be right. I don't think anyone's ever looked at what it would
> take to support that particular case. We have looked at the other cases
> and run away screaming ... but I think that was before exclusion
> constraints existed.

Hm. Is it actually that complicated to support checking predicates and computing expressions for system catalogs during index insertions? If we'd only create those indexes once the basic bootstrap is over, I don't see too many problems with circularity. Creating indexes on shared catalogs after the immediate bootstrap isn't entirely trivial, but should be doable.

I've searched for the "running away screaming", but even after extending the search criteria a bit I unfortunately came up empty. I don't really see much need for expression indexes on catalogs, but partial unique constraints would surely be useful.

Now, what I *do* see problems with would be trying to evaluate predicates/expressions when filling system caches. But it looks to me like the primary interest, at least here, is partial unique constraints?
Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] COPY table FROM STDIN doesn't show count tag
On 12 March 2014 23:57, Tom Lane wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Wed, Mar 12, 2014 at 12:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> My inclination now (see later traffic) is to suppress the status
>>> report when the COPY destination is the same as pset.queryFout (ie, a
>>> simple test whether the FILE pointers are equal). This would suppress
>>> the status report for \copy to stdout and COPY TO STDOUT cases, and
>>> also for \copy to pstdout if you'd not redirected queryFout with \o.

Based on my analysis, I observed that a file-pointer comparison alone may not be sufficient to decide whether to display the command tag. E.g. imagine the scenario below:

    psql.exe -d postgres -o 'file.dat' -c "\copy tbl to 'file.dat'";

Though both destinations are the same file, the FILE pointers will be different, and hence printing the status into 'file.dat' will overwrite some part of the data copied to the file. Also, we don't have any direct way of comparing the file names themselves.

As I see it, \copy ... TO ... will print the status only in the case of \copy to pstdout when the -o option is given. So instead of having so much confusion and inconsistency, and that too for one very specific case, I thought it better not to print the status for all cases of STDOUT and \COPY ... TO ...

>> This is reasonably similar to what we already do for SELECT, isn't it?
>> I mean, the server always sends back a command tag, but psql sometimes
>> opts not to print it.
>
> Right, the analogy to SELECT gives some comfort that this is reasonable.

I have modified the patch based on the above analysis:
1. In case of COPY ... TO STDOUT, the command tag will not be displayed.
2. In case of \COPY ... TO ..., the command tag will not be displayed.
3. In all other cases, the command tag will be displayed as before.

I have modified the corresponding documentation. Please find the attached revised patch.
Thanks and Regards,
Kumar Rajeev Rastogi

Attachment: psql-copy-count-tag-20140313.patch
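The pointer-versus-file-identity pitfall described above is easy to reproduce in any language. A rough Python analogy (psql's actual check compares C FILE pointers; the helper names here are just for illustration):

```python
import os
import tempfile

# Two independently opened handles on the same file: this mirrors psql
# having queryFout redirected with -o 'file.dat' while \copy also
# targets 'file.dat'.
path = os.path.join(tempfile.mkdtemp(), "file.dat")
out1 = open(path, "w")
out2 = open(path, "w")

# Comparing the stream objects says "different destinations" ...
streams_differ = out1 is not out2

# ... yet both write to the same underlying file, so printing a status
# tag through one stream can clobber data written through the other.
same_underlying_file = os.path.samefile(out1.name, out2.name)

out1.close()
out2.close()
```

This is why pointer equality alone cannot prove the destinations are distinct, and why the patch falls back to suppressing the tag categorically.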
Re: [HACKERS] 9a57858f1103b89a5674f0d50c5fe1f756411df6
On Thu, 2014-03-13 at 12:00 +0100, Andres Freund wrote:
> On 2014-03-12 20:09:23 -0400, Robert Haas wrote:
>> On the pgsql-packagers list, there has been some (OT for that list)
>> discussion of whether commit 9a57858f1103b89a5674f0d50c5fe1f756411df6
>> is sufficiently serious to justify yet another immediate minor release
>> of 9.3.x. The relevant questions seem to be:
>>
>> 1. Is it really bad?
>
> It breaks the ctid of concurrently updated/locked tuples during WAL
> replay, which can lead to all sorts of nastiness, like indexes not
> finding any rows. [...]
>
>> 2. Does it affect a lot of people or only a few?
>
> It's been reported twice (Peter Geoghegan, Greg Stark) by Heroku, and
> one person on IRC could reproduce it repeatedly. [...]
>
>> 3. Are there more, equally bad bugs that are unfixed, or perhaps even
>> unreported, yet?
>
> Uh. I have no idea. I don't know of any reports that can't be
> attributed to any of these, but as you also include unreported bugs in
> that question...

Does this also affect other branches? 9.2?

regards,
--
Jozef Mlich jml...@redhat.com
Associate Software Engineer - EMEA ENG Developer Experience
Mobile: +420 604 217 719
http://cz.redhat.com/
Red Hat, Inc.
Re: [HACKERS] 9a57858f1103b89a5674f0d50c5fe1f756411df6
On 2014-03-13 13:06:00 +0100, Jozef Mlich wrote:
> Does this also affect other branches? 9.2?

Nope, it's 9.3 only.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 13, 2014 at 1:21 PM, Greg Stark st...@mit.edu wrote:
> Well, these are just normal GIN and GiST indexes. If we want to come up
> with new index operator classes we can still do that, and keep the old
> ones if necessary. Even that seems pretty unlikely from past
> experience.
>
> I'm actually pretty sanguine even about keeping the GiST opclass. If it
> has bugs then the bugs only affect people who use this non-default
> opclass, and we can fix them. It doesn't risk calling into question any
> basic design choices in the patch.
>
> It does sound like the main question here is which opclass should be
> the default. From the discussion there's a jsonb_hash_ops, which works
> on all input values but supports fewer operators, and a jsonb_ops,
> which supports more operators but can't handle json with larger
> individual elements. Perhaps it's better to make jsonb_hash_ops the
> default so at least it's always safe to create a default GIN index?

A couple of thoughts from me:

1) We can evade the length limitation in the GIN index by truncating long values and setting the recheck flag. We can introduce some indicator of a truncated value, like a zero byte at the end.

2) jsonb_hash_ops can be extended to handle key queries too. We can reserve one bit in the hash as a flag indicating whether it's a hash of a key or a hash of a path to a value. For sure, such an index would be a bit larger. Also, jsonb_hash_ops could be split into two opclasses: with and without keys.

--
With best regards, Alexander Korotkov.
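Idea 2) above can be sketched as follows -- hypothetical helper names, with Python's builtin hash() standing in for whatever hash function jsonb_hash_ops actually uses:

```python
KEY_FLAG = 1 << 31  # reserve the top bit of a 32-bit hash as the "key" flag

def hash_entry(text, is_key):
    # Keep 31 bits of the underlying hash, then set the flag bit only
    # for key entries, so a key hash and a path/value hash for the
    # same text always differ on the flag bit.
    h = hash(text) & 0x7FFFFFFF
    return (h | KEY_FLAG) if is_key else h
```

With entries tagged this way, one opclass could answer both key-existence (?) and containment (@>) queries from the same index, at the cost of a somewhat larger index, as noted above.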
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 13, 2014 at 4:21 PM, Alexander Korotkov aekorot...@gmail.com wrote:
> On Thu, Mar 13, 2014 at 1:21 PM, Greg Stark st...@mit.edu wrote:
>> [...]
>> Perhaps it's better to make jsonb_hash_ops the default so at least
>> it's always safe to create a default GIN index?
>
> A couple of thoughts from me:
>
> 1) We can evade the length limitation in the GIN index by truncating
> long values and setting the recheck flag. We can introduce some
> indicator of a truncated value, like a zero byte at the end.
>
> 2) jsonb_hash_ops can be extended to handle key queries too. We can
> reserve one bit in the hash as a flag indicating whether it's a hash
> of a key or a hash of a path to a value. For sure, such an index would
> be a bit larger. Also, jsonb_hash_ops could be split into two
> opclasses: with and without keys.

That's right! Should we do these now? That's the question.
Re: [HACKERS] jsonb and nested hstore
Fwiw, the jsonb data doesn't actually seem to be any smaller than text json on this data set (this is avg(pg_column_size(col)), and I checked, they're both using the same amount of toast space):

 jsonb | json
-------+-------
 813.5 | 716.3
(1 row)

It's still more than 7x faster in cpu costs though:

stark=# select count(attrs->'properties'->'STREET') from citylots;
 count
--------
 196507
(1 row)

Time: 1026.678 ms

stark=# select count(attrs->'properties'->'STREET') from citylots_json;
 count
--------
 196507
(1 row)

Time: 7418.010 ms
Re: [HACKERS] inherit support for foreign tables
Hi Horiguchi-san,

Thank you for working on this patch!

(2014/03/10 17:29), Kyotaro HORIGUCHI wrote:
> Hello. As a minimal implementation, I made an attempt that emits a
> NOTICE message when ALTER TABLE affects foreign tables. It looks like
> the following:
>
> | =# alter table passwd add column added int, add column added2 int;
> | NOTICE: This command affects foreign relation "cf1"
> | NOTICE: This command affects foreign relation "cf1"
> | ALTER TABLE
> | =# select * from passwd;
> | ERROR: missing data for column "added"
> | CONTEXT: COPY cf1, line 1: "root:x:0:0:root:/root:/bin/bash"
> | =#
>
> This seems far better than silently performing the command, except for
> the duplicated message :( A new bitmap might be required to avoid the
> duplication...

As I said before, I think it would be better to show this kind of information for each of the affected tables, whether or not the affected one is foreign. I also think it would be better to show it only when the user has specified an option to do so, similar to the VERBOSE option of other commands. ISTM this should be implemented as a separate patch.

> I made the changes above and below as an attempt in the attached patch
> foreign_inherit-v4.patch.
>
> I think the problem is that foreign children in an inheritance tree
> prevent all members of the inheritance relation from using
> parameterized paths, correct?

Yes, I think so, too.

> Hmm. I tried a minimal implementation to do that. It omits cost
> recalculation but seems to work as expected. This seems enough if the
> cost recalculation is trivial here.

I think we should redo the cost/size estimate, because, for example, greater parameterization leads to a smaller rowcount estimate, if I understand correctly. In addition, I think this reparameterization should be done by the FDW itself, because the FDW has more knowledge about it than the PG core. So, I think we should introduce a new FDW routine for that, say ReparameterizeForeignPath(), as proposed in [1].

Attached is an updated version of the patch.
Due to the above reason, I removed from the patch the message-displaying function you added. Sorry for the delay.

[1] http://www.postgresql.org/message-id/530c7464.6020...@lab.ntt.co.jp

Best regards,
Etsuro Fujita

*** a/contrib/file_fdw/file_fdw.c
--- b/contrib/file_fdw/file_fdw.c
***************
*** 117,122 **** static void fileGetForeignRelSize(PlannerInfo *root,
--- 117,126 ----
  static void fileGetForeignPaths(PlannerInfo *root,
  				RelOptInfo *baserel,
  				Oid foreigntableid);
+ static ForeignPath *fileReparameterizeForeignPath(PlannerInfo *root,
+ 				RelOptInfo *baserel,
+ 				Path *path,
+ 				Relids required_outer);
  static ForeignScan *fileGetForeignPlan(PlannerInfo *root,
  				RelOptInfo *baserel,
  				Oid foreigntableid,
***************
*** 145,150 **** static bool check_selective_binary_conversion(RelOptInfo *baserel,
--- 149,155 ----
  static void estimate_size(PlannerInfo *root, RelOptInfo *baserel,
  				FileFdwPlanState *fdw_private);
  static void estimate_costs(PlannerInfo *root, RelOptInfo *baserel,
+ 				ParamPathInfo *param_info,
  				FileFdwPlanState *fdw_private,
  				Cost *startup_cost, Cost *total_cost);
  static int file_acquire_sample_rows(Relation onerel, int elevel,
***************
*** 163,168 **** file_fdw_handler(PG_FUNCTION_ARGS)
--- 168,174 ----
  	fdwroutine->GetForeignRelSize = fileGetForeignRelSize;
  	fdwroutine->GetForeignPaths = fileGetForeignPaths;
+ 	fdwroutine->ReparameterizeForeignPath = fileReparameterizeForeignPath;
  	fdwroutine->GetForeignPlan = fileGetForeignPlan;
  	fdwroutine->ExplainForeignScan = fileExplainForeignScan;
  	fdwroutine->BeginForeignScan = fileBeginForeignScan;
***************
*** 517,523 **** fileGetForeignPaths(PlannerInfo *root,
  					 (Node *) columns));

  	/* Estimate costs */
! 	estimate_costs(root, baserel, fdw_private,
  				   startup_cost, total_cost);

  	/*
--- 523,530 ----
  					 (Node *) columns));

  	/* Estimate costs */
! 	estimate_costs(root, baserel,
! 				   NULL, fdw_private,
  				   startup_cost, total_cost);

  	/*
***************
*** 542,547 **** fileGetForeignPaths(PlannerInfo *root,
--- 549,595 ----
  }

  /*
+  * fileReparameterizeForeignPath
+  *	Attempt to modify a given
Re: [HACKERS] jsonb and nested hstore
On 03/13/2014 06:53 AM, Greg Stark wrote:
> I also find it awkward that col->'prop' returns the json
> representation of the property. If it's text, that means it's
> double-quoted. I would think that a user storing text in a json
> property would want a way to pull out the text that json property
> represents, so he doesn't have to write col->'prop' = '"foo"' and
> doesn't need to strip the quotes (and de-escape the string?) before
> displaying the value or passing it through other apis.

->> returns dequoted text if the value it points to is a plain string. If it's not doing that then that's a bug.

andrew=# select jsonb '{"a": "the string"}' -> 'a';
   ?column?
--------------
 "the string"
(1 row)

andrew=# select jsonb '{"a": "the string"}' ->> 'a';
  ?column?
------------
 the string
(1 row)

cheers

andrew
Re: [HACKERS] [PATCH] Store Extension Options
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> I don't really think partial validation makes sense. We could just
> remove the whole topic, and tell extension authors that it's up to them
> to defend themselves against bizarre values stored for their table
> options. But I'm wondering if there's really so much use-case for a
> feature like that.

While I agree that validation would be a good thing to have, if we can figure out a way to make it work, I don't really see why that has a huge bearing on the use-cases for this feature overall. There's clearly a bunch of use-cases for "I need to add a bit of meta-data, for my own needs, about this table." Nasby is doing what Robert was originally advocating (having an independent table-of-tables) and rightfully pointed out that it basically sucks.

I feel like a lot of this has to do with reloptions being in some way/shape/form viewed as "ours" (as in, belonging to -core). I can get behind that idea, but it doesn't solve the use-case. The whole discussion around validation is interesting, but it would also eliminate a bunch of natural use-cases, as not everyone will want to build an extension or write C code just to have a place to store this extra meta-data (and indeed, we'd probably just end up with someone implementing a custom_reloptions extension which just allowed anything).

In the end, perhaps we should just add another field, called 'custom_reloptions', and allow that to be the wild west? With a few recommendations: that extension authors use a prefix of some kind, and that individual DBAs use either no namespace or one which isn't likely to conflict with real extensions. That would also avoid any possible conflict with what we want to do in core later on.
As for dealing with extensions which migrate to core, we might be able to teach pg_dump's binary-upgrade mode about that, and be able to migrate any custom_reloptions which were for the independent extension into the 'core' reloptions; or we could just punt on it and tell people they'll need to re-set the options or use whatever the new DDL is; or perhaps we'll update the extension to just pass through the options. In any case, that strikes me as a solvable problem, particularly if they're independent fields.

Perhaps one other option would be to add a new field which is the 'wild west' but then allow extensions to add to reloptions with appropriate validation, but I'm not sure that it's really necessary. Extensions should be able to validate the value when they go to use it for whatever they need it for, and complain if they don't understand it.

Thanks,

Stephen
Re: [HACKERS] gaussian distribution pgbench
On Tue, Mar 11, 2014 at 1:49 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote:
> (2014/03/09 1:49), Fabien COELHO wrote:
>> Hello Mitsumasa-san,
>>
>>> New \setrandom interface is here:
>>> \setrandom var min max [gaussian threshold | exponential threshold]
>>> The attached patch realizes this interface, but it has a little bit
>>> of ugly coding in executeStatement() and process_commands()...
>>
>> I think it is not too bad. The "ignore extra arguments on the line"
>> behavior is a little pre-existing mess anyway.
>
> All right.
>
>>> What do you think?
>>
>> I'm okay with this UI and its implementation.
>
> OK.

Shouldn't we have the same discussion for the UI of the command-line options? The patch adds two options, --gaussian and --exponential, but this UI seems to be a bit inconsistent with the UI for \setrandom. Instead, we could use something like --distribution=[uniform | gaussian | exponential].

Regards,

--
Fujii Masao
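For context, the gaussian form of \setrandom draws a normal deviate, rejects anything beyond the threshold, and rescales the survivor onto [min, max]. A loose Python sketch of that idea (illustrative only -- not pgbench's actual C implementation):

```python
import random

def setrandom_gaussian(min_v, max_v, threshold):
    # Reject deviates outside +/- threshold standard deviations; a
    # larger threshold therefore concentrates values more tightly
    # around the middle of the range.
    while True:
        g = random.gauss(0.0, 1.0)
        if abs(g) <= threshold:
            break
    # Rescale from [-threshold, threshold] onto the integer range
    # [min_v, max_v], clamping the upper edge case.
    frac = (g + threshold) / (2.0 * threshold)
    return min(min_v + int(frac * (max_v - min_v + 1)), max_v)
```

An exponential variant would be analogous, with a truncated exponential deviate in place of the gaussian one.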
Re: [HACKERS] [PATCH] Store Extension Options
On Thu, Mar 13, 2014 at 12:47 AM, Simon Riggs si...@2ndquadrant.com wrote:
> On 13 March 2014 02:14, Robert Haas robertmh...@gmail.com wrote:
>>> I'm not sure why this is being blocked. This is a community
>>> contribution that seeks to improve everybody's options. Blocking it
>>> does *nothing* to prevent individual extensions from providing
>>> table-level options -- we give them freedom to do whatever the hell
>>> they want. Validation is a pipe dream, not *ever* an achievable
>>> reality. Blocking is just the exercise of a veto for nobody's gain.
>>
>> Unsurprisingly, I don't agree with any of that.
>
> The point is that exercising a veto here is irrelevant. Blocking this
> patch does *nothing* to prevent extensions from adopting per-table
> options. All that is happening is that a single, coherent mechanism
> for such options is being blocked. Blocking this is like trying to
> block rain. We can all pretend the blocking viewpoint has succeeded,
> but all it does is bring Postgres core into disrepute. I have often
> heard from others that this is a business opportunity, not a problem.
> If that is true, it's not because we didn't try to act for the good of
> all.

It is very true that there are other ways for extensions to manage per-table options. In my mind, that's another reason NOT to throw open the door to unrestricted use of reloptions to store whatever anyone wants to throw in there, but rather to wait until we have a sound and well-thought-out design that we're comfortable with our ability to support and extend into the indefinite future.

The bottom line here is that, as in previous years, there are a certain number of people who show up near the end of CF4 and are unhappy that some patch didn't get committed.
Generally, they allege that (1) there's nothing wrong with the patch; (2) if there is something wrong with the patch, then it's the fault of the people objecting for not volunteering to fix it; and (3) that if the patch isn't committed despite the objections raised, it's going to be hideously bad for PostgreSQL. Josh Berkus chose to put his version of this rant on his blog:

http://www.databasesoup.com/2014/02/why-hstore2jsonb-is-most-important.html

But the reality is that most of the patches we reject are, in my opinion, rejected for good reasons (though some are rejected for bad reasons); that most of the ones that really matter emerge for a later release in greatly improved form; and that the product is better overall for those delays. Because on projects where people are quick to commit irrevocably to insufficiently-scrutinized design decisions, huge amounts of time and energy get spent digging out from under those bad decisions; or else nobody fixes it and the product just stinks. So, in my opinion, the time and care that we take to get things right is a feature, not a bug. Your mileage may, of course, vary.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] jsonb and nested hstore
On 03/13/2014 08:42 AM, Greg Stark wrote:
> Fwiw, the jsonb data doesn't actually seem to be any smaller than text
> json on this data set (this is avg(pg_column_size(col)), and I
> checked, they're both using the same amount of toast space):
>
>  jsonb | json
> -------+-------
>  813.5 | 716.3
> (1 row)

That's expected: you save on whitespace, quotes and punctuation, and spend on structural overhead (e.g. string lengths). The actual strings stored are virtually the same. Numbers are stored as numerics, which might or might not be longer. Nulls and booleans are about a wash.

> It's still more than 7x faster in cpu costs though:
>
> stark=# select count(attrs->'properties'->'STREET') from citylots;
>  count
> --------
>  196507
> (1 row)
>
> Time: 1026.678 ms
>
> stark=# select count(attrs->'properties'->'STREET') from citylots_json;
>  count
> --------
>  196507
> (1 row)
>
> Time: 7418.010 ms

That's also expected; it's one of the major benefits. With jsonb you're avoiding reparsing the json.

cheers

andrew
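The asymmetry described here -- comparable storage size, much cheaper access -- comes down to parse-once versus parse-per-row. A Python analogy, with json text standing in for the json type and a pre-parsed dict standing in for jsonb:

```python
import json

rows_text = ['{"properties": {"STREET": "MARKET"}}'] * 3

# json-style access: the stored value is text, so every dereference
# has to reparse the whole document.
streets_json = [json.loads(r)["properties"]["STREET"] for r in rows_text]

# jsonb-style access: parsing happened once (conceptually at insert
# time); each dereference is just a walk of the stored structure.
rows_binary = [json.loads(r) for r in rows_text]
streets_jsonb = [r["properties"]["STREET"] for r in rows_binary]
```

Both produce the same answers; only the per-access parsing cost differs, which is what the 7x timing gap above reflects.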
Re: [HACKERS] Is this a bug?
On Wed, Mar 12, 2014 at 11:11 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote:
> Hi all,
>
> Shouldn't the ALTER statements below raise an exception?
>
> fabrizio=# CREATE TABLE foo(bar SERIAL PRIMARY KEY);
> CREATE TABLE
> fabrizio=# SELECT relname, reloptions FROM pg_class WHERE relname ~ '^foo';
>    relname   | reloptions
> -------------+------------
>  foo         |
>  foo_bar_seq |
>  foo_pkey    |
> (3 rows)
>
> fabrizio=# ALTER TABLE foo RESET (noname);
> ALTER TABLE
> fabrizio=# ALTER INDEX foo_pkey RESET (noname);
> ALTER INDEX
> fabrizio=# ALTER TABLE foo ALTER COLUMN bar RESET (noname);
> ALTER TABLE
>
> If I try to SET an option called noname, obviously it will raise an
> exception:
>
> fabrizio=# ALTER TABLE foo SET (noname=1);
> ERROR: unrecognized parameter "noname"

Well, it's fairly harmless, but it might not be a bad idea to tighten that up.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Store Extension Options
On 2014-03-13 09:17:36 -0400, Robert Haas wrote:
> It is very true that there are other ways for extensions to manage
> per-table options.

You previously said that, but I really don't see any. Which way out there exists that a) doesn't leave garbage after the relation is dropped or renamed, b) is properly dumped by pg_dump, and c) is properly integratable with cache invalidations? Point c) is hackable by manually sending cache invalidations from C code when changing the associated information, and by using a relcache callback for cache invalidation, but the others really aren't solvable right now afaics.

> The bottom line here is that, as in previous years, there are a
> certain number of people who show up near the end of CF4 and are
> unhappy that some patch didn't get committed. Generally, they allege
> that (1) there's nothing wrong with the patch, (2) if there is
> something wrong with the patch, then it's the fault of the people
> objecting for not volunteering to fix it, and (3) that if the patch
> isn't committed despite the objections raised, it's going to be
> hideously bad for PostgreSQL.

I agree that this happens occasionally, but I don't really see evidence of it in this case. We seem to be discussing the merits of the patch itself, not the scheduling of an eventual commit.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Is this a bug?
On 13-03-2014 00:11, Fabrízio de Royes Mello wrote: Shouldn't the ALTER statements below raise an exception? For consistency, yes. Who cares? I mean, there is no harm in resetting an unrecognized parameter. Bear in mind that tightening it up could break scripts. In general, I'm in favor of validating things.

euler@euler=# reset noname;
ERROR: 42704: unrecognized configuration parameter noname
LOCAL: set_config_option, guc.c:5220

-- Euler Taveira Timbira - http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire
On Thu, Mar 13, 2014 at 12:45 AM, Amit Kapila amit.kapil...@gmail.com wrote: While attempting to operate in? That seems like unhelpful weasel-wording. I wonder if we ought to have separate messages for each possibility, like delete tuple (X,Y) when called from heap_delete(), update tuple (X,Y), check exclusion constraint on tuple (X,Y) when called from check_exclusion_constraint, etc. That seems like it would be handy information to have. Okay, below are the distinct places from which we need to pass such information.

heap_delete - delete tuple (X,Y)
heap_update - update tuple (X,Y)
heap_lock_tuple - lock tuple (X,Y)
heap_lock_updated_tuple_rec - lock updated tuple (X,Y)
_bt_doinsert - insert index tuple (X,Y) (here it will refer to index tuple location)

I don't think that giving the index tuple location is going to be very helpful; can we get the TID for the conflicting heap tuple?

IndexBuildHeapScan - scan tuple (X,Y)
EvalPlanQualFetch - fetch tuple (X,Y)

These two seem unhelpful to me. For EvalPlanQualFetch maybe recheck updated tuple would be good, and for IndexBuildHeapScan perhaps checking uniqueness of tuple.

check_exclusion_constraint - check exclusion constraint on tuple (X,Y)

I think it might not be a big deal to update the patch to pass such info. Won't it affect the translatability guidelines, as we need to have a different translation message for each op? Yes, we'll need a separate message for each. The other option could be that we need to ensure which places are safe to pass the tuple so that we can display the whole tuple instead of just the TID; for example, the tuple we are passing from heap_lock_tuple() has been fetched using a Dirty Snapshot (refer to EvalPlanQualFetch(), a caller of heap_lock_tuple()), but we can still use it in the error, as it has been decided that it is a live tuple and a transaction can update it by the time it reaches XactLockTableWaitWithInfo(), so it is safe.
I think we need to discuss and validate all places wherever we use a Dirty/Any Snapshot to ensure that we can pass the tuple from such a call; maybe in the end the result is that we can pass the tuple from most locations, but it still needs to be done carefully. Well, it's sounding like we can only display the whole tuple if (1) the message level is less than ERROR and (2) the snapshot is an MVCC snapshot. That's an annoying and hard-to-document set of limitations. But we should be able to display the TID always, so I think we should drop the part of the patch that tries to show tuple data and revisit that in a future release if need be. I don't feel too bad about that because it seems to me that showing the TID is a big improvement over the status quo; right now, when you get the information that transaction A is waiting for transaction B, you know they're fighting over some tuple, but you have no idea which one. Even just having the relation name would help a lot, I bet, but if you have the TID also, you can use a SELECT query with WHERE ctid = '(X,Y)' to find the specific tuple of interest. That's maybe not as convenient as having all the data printed out in the log, and there are certainly use cases it won't answer, but it's still a WHOLE lot better than having absolutely NO information, which is what we've got today. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
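The lookup Robert describes works directly from psql once the log reports the relation name and TID; a minimal sketch (the table name and TID value here are hypothetical, standing in for whatever the log reported):

```sql
-- Suppose the log reported a lock conflict on tuple (0,2) of table
-- "accounts"; fetch that specific tuple by its TID, together with the
-- xmin/xmax system columns that identify the transactions involved.
SELECT ctid, xmin, xmax, *
  FROM accounts
 WHERE ctid = '(0,2)';
```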
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 13, 2014 at 1:08 PM, Andrew Dunstan and...@dunslane.net wrote: - returns dequoted text if the value it points to is a plain string. If it's not doing that then that's a bug. Sorry, I must have gotten confused between various tests. It does seem to be doing that. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gaussian distribution pgbench
On 03/13/2014 03:17 PM, Fujii Masao wrote: On Tue, Mar 11, 2014 at 1:49 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: (2014/03/09 1:49), Fabien COELHO wrote: I'm okay with this UI and its implementation. OK. We should do the same discussion for the UI of command-line option? The patch adds two options --gaussian and --exponential, but this UI seems to be a bit inconsistent with the UI for \setrandom. Instead, we can use something like --distribution=[uniform | gaussian | exponential]. IMHO we should just implement the \setrandom changes, and not add any of these options to modify the standard test workload. If someone wants to run the TPC-B workload with a gaussian or exponential distribution, they can implement it as a custom script. The docs include the script for the standard TPC-B workload; just copy-paste that and modify the \setrandom lines. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
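Heikki's suggestion can be sketched as a custom script. The first \setrandom line below uses the gaussian syntax from the patch under discussion (the threshold value 5.0 is an arbitrary illustration); the rest mirrors pgbench's documented built-in TPC-B script:

```
\set nbranches 1 * :scale
\set ntellers 10 * :scale
\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts gaussian 5.0
\setrandom bid 1 :nbranches
\setrandom tid 1 :ntellers
\setrandom delta -5000 5000
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;
```

Saved as, say, gaussian_tpcb.sql, this would run with `pgbench -f gaussian_tpcb.sql`, leaving the built-in workload untouched.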
Re: [HACKERS] gaussian distribution pgbench
On Thu, Mar 13, 2014 at 10:51 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 03/13/2014 03:17 PM, Fujii Masao wrote: On Tue, Mar 11, 2014 at 1:49 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: (2014/03/09 1:49), Fabien COELHO wrote: I'm okay with this UI and its implementation. OK. We should do the same discussion for the UI of command-line option? The patch adds two options --gaussian and --exponential, but this UI seems to be a bit inconsistent with the UI for \setrandom. Instead, we can use something like --distribution=[uniform | gaussian | exponential]. IMHO we should just implement the \setrandom changes, and not add any of these options to modify the standard test workload. If someone wants to run the TPC-B workload with a gaussian or exponential distribution, they can implement it as a custom script. The docs include the script for the standard TPC-B workload; just copy-paste that and modify the \setrandom lines. Yeah, I'm OK with this. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Store Extension Options
On Thu, Mar 13, 2014 at 9:26 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-13 09:17:36 -0400, Robert Haas wrote: It is very true that there are other ways for extensions to manage per-table options. You previously said that, but I really don't see any. Which way out there exists that a) doesn't leave garbage after the relation is dropped or renamed b) is properly dumped by pg_dump c) is properly integrable with cache invalidations. c) is hackable by manually sending cache invalidations from C code when changing the associated information, and by using a relcache callback for cache invalidation, but the others really aren't solvable right now afaics. Well, I'm not going to claim that the methods that exist today are perfect. Things you can do include: (1) the table of tables approach, (2) abusing comments, and perhaps (3) abusing the security label machinery. SECURITY LABEL FOR bdr ON TABLE copy_me IS 'yes, please'? Only the first of those fails prong (a) of your proposed requirements, and they all pass prong (b). I'm not totally sure how well comments and security labels integrate with cache invalidation. An interesting point here is that the SECURITY LABEL functionality is arguably exactly what is wanted here, except for the name of the command. Tables (and almost every other type of object in the system, including columns, functions, etc.) can have an arbitrary number of security labels, each of which must be managed by a separate provider, which gets to validate those options at the time they're applied. Of course, the provider can simply choose to accept everything, if it wants. Dump-and-reload is handled by assuming that you need to have the applicable providers present at reload time (or ignore the errors you get when restoring the dump, or edit the dump). And an interesting point is that the SECURITY LABEL feature has been around since 9.1 and we've had zero complaints about the design. 
This either means that the design is excellent, or very few people have tried to use it for anything. But I think it would be worth considering to what extent that design (modulo the name) also meets the requirements here. Because it works on all object types, it's actually quite a bit more general than this proposal. And it wouldn't be very hard to drop the word SECURITY from the command and just let objects have labels. (We could even introduce alternate syntax, like ALTER object-type object-name SET LABEL FOR provider TO value, if that makes things nicer, though the confusion of having two completely different syntaxes might not be worth it.) On the other hand, if that design *doesn't* meet the requirements here, then it would be good to know why. What I think we certainly don't want to do is invent a very similar mechanism to what already exists, but with a slightly different set of warts. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
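For reference, the existing command already behaves as described; a sketch using a hypothetical provider name (the label string is validated by whatever provider registered under that name, and is stored per object/provider pair in the pg_seclabel catalog):

```sql
-- Hypothetical provider "bdr"; any registered label provider works the
-- same way, and may accept or reject the label text at apply time.
SECURITY LABEL FOR bdr ON TABLE copy_me IS 'replicate';

-- Labels live in pg_seclabel, keyed by object and provider.
SELECT objoid::regclass AS object, provider, label
  FROM pg_seclabel
 WHERE provider = 'bdr';
```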
Re: [HACKERS] Is this a bug?
On Thu, Mar 13, 2014 at 10:34 AM, Euler Taveira eu...@timbira.com.br wrote: On 13-03-2014 00:11, Fabrízio de Royes Mello wrote: Shouldn't the ALTER statements below raise an exception? For consistency, yes. Who cares? I mean, there is no harm in resetting an unrecognized parameter. Bear in mind that tightening it up could break scripts. In general, I'm in favor of validating things. I know this could break scripts, but I think the consistent behavior would be to raise an exception when an option doesn't exist. euler@euler=# reset noname; ERROR: 42704: unrecognized configuration parameter noname LOCAL: set_config_option, guc.c:5220 This is consistent behavior. Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] [PATCH] Store Extension Options
On 2014-03-13 10:03:03 -0400, Robert Haas wrote: On Thu, Mar 13, 2014 at 9:26 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-13 09:17:36 -0400, Robert Haas wrote: It is very true that there are other ways for extensions to manage per-table options. You previously said that, but I really don't see any. Which way out there exists that a) doesn't leave garbage after the relation is dropped or renamed b) is properly dumped by pg_dump c) is properly integrable with cache invalidations. c) is hackable by manually sending cache invalidations from C code when changing the associated information, and by using a relcache callback for cache invalidation, but the others really aren't solvable right now afaics. Well, I'm not going to claim that the methods that exist today are perfect. Things you can do include: (1) the table of tables approach, (2) abusing comments, and perhaps (3) abusing the security label machinery. SECURITY LABEL FOR bdr ON TABLE copy_me IS 'yes, please'? Only the first of those fails prong (a) of your proposed requirements, and they all pass prong (b). I'm not totally sure how well comments and security labels integrate with cache invalidation. The table of tables falls short of all of those, so it's pretty much unusable. Comments aren't usable because there's no way to coordinate between various users of the facility and it breaks their original usage. They also don't produce cache invalidations. But security labels are a nice idea, will think about it. AFAICs there's no builtin subdivision within the label for one provider which is a bit of a shame but solvable. The biggest issue I see is that it essentially seems to require that the provider is in {shared,local}_preload_libraries? You can't restore into a server otherwise afaics? They currently don't seem to create invalidations on the objects they are set upon, maybe we should change that? 
There seems to be pretty little reason to avoid that; the frequency of change really should never be high enough for it to be problematic. And an interesting point is that the SECURITY LABEL feature has been around since 9.1 and we've had zero complaints about the design. This either means that the design is excellent, or very few people have tried to use it for anything. Without saying that its design is bad, I am pretty sure it's because it's basically unused. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Store Extension Options
On 13 March 2014 13:17, Robert Haas robertmh...@gmail.com wrote: The bottom line here is that, as in previous years, there are a certain number of people who show up near the end of CF4 and are unhappy that some patch didn't get committed. Generally, they allege that (1) there's nothing wrong with the patch, (2) if there is something wrong with the patch, then it's the fault of the people objecting for not volunteering to fix it, and (3) that if the patch isn't committed despite the objections raised, it's going to be hideously bad for PostgreSQL. Josh Berkus chose to put his version of this rant on his blog: An interesting twist. 1) It's a simple patch and could be committed. Claiming otherwise would not be accurate. 2) Nobody has said it's the fault of the people objecting for not volunteering to fix it. 3) As I explained twice already, *not* committing the patch does *nothing* to prevent extension writers from making up their own mechanism, so blocking the patch does nothing. Writing the extra code required takes a while, but frankly it's quicker than pointless arguing. PostgreSQL will not explode if this patch is blocked, nor will it explode if we allow unvalidated options. Hmm, so actually none of those points stick. Perhaps we're talking about another patch that you think should be rejected? Not sure. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire
Robert Haas robertmh...@gmail.com writes: Well, it's sounding like we can only display the whole tuple if (1) the message level is less than ERROR and (2) the snapshot is an MVCC snapshot. That's an annoying and hard-to-document set of limitations. But we should be able to display the TID always, so I think we should drop the part of the patch that tries to show tuple data and revisit that in a future release if need be. +1. That avoids the thing that was really bothering me about this patch, which is that it seemed likely to create a whole new set of failure modes. Every case in which the data-printing effort could fail is a case where the patch makes things *less* debuggable, not more so. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Store Extension Options
[ forgot to respond to this part ] Andres Freund and...@2ndquadrant.com writes: They currently don't seem to create invalidations on the objects they are set upon, maybe we should change that? No, because relcache doesn't store security labels to start with. There's a separate catalog cache for security labels, I believe, and invalidating entries in that ought to be sufficient. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Store Extension Options
On 2014-03-13 10:24:09 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: But security labels are a nice idea, will think about it. AFAICs there's no builtin subdivision within the label for one provider which is a bit of a shame but solvable. The biggest issue I see is that it essentially seems to require that the provider is in {shared,local}_preload_libraries? You can't restore into a server otherwise afaics? Well, if you want to validate the settings then you pretty much have to require that in some form. If there were a CREATE SECURITY LABEL PROVIDER or something, with the catalog pointing to a validator function, we wouldn't necessarily... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
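Today the validator has to be a C hook registered by a preloaded library, which is exactly why a restore needs the library present; a sketch of the existing mechanism (the provider name "myext" and its accepted label values are made up for illustration):

```c
#include "postgres.h"
#include "fmgr.h"
#include "commands/seclabel.h"

PG_MODULE_MAGIC;

/*
 * Reject any label except the two values this hypothetical provider
 * understands; a permissive provider could simply return.
 */
static void
myext_relabel(const ObjectAddress *object, const char *seclabel)
{
	if (seclabel != NULL &&
		strcmp(seclabel, "on") != 0 &&
		strcmp(seclabel, "off") != 0)
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
				 errmsg("invalid label \"%s\" for provider \"myext\"",
						seclabel)));
}

void
_PG_init(void)
{
	/* Only takes effect when the library is preloaded, e.g. via
	 * shared_preload_libraries; hence Tom's point about restores. */
	register_label_provider("myext", myext_relabel);
}
```

A catalog-registered validator function, as Andres suggests, would remove the preload requirement by letting the server look the validator up at SECURITY LABEL time.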
Re: [HACKERS] [PATCH] Store Extension Options
On Thu, Mar 13, 2014 at 10:20 AM, Andres Freund and...@2ndquadrant.com wrote: Well, I'm not going to claim that the methods that exist today are perfect. Things you can do include: (1) the table of tables approach, (2) abusing comments, and perhaps (3) abusing the security label machinery. SECURITY LABEL FOR bdr ON TABLE copy_me IS 'yes, please'? Only the first of those fails prong (a) of your proposed requirements, and they all pass prong (b). I'm not totally sure how well comments and security labels integrate with cache invalidation. The table of tables falls short of all of those, so it's pretty much unusable. Comments aren't usable because there's no way to coordinate between various users of the facility and it breaks their original usage. They also don't produce cache invalidations. But security labels are a nice idea, will think about it. AFAICs there's no builtin subdivision within the label for one provider which is a bit of a shame but solvable. Why do we need that? Are we really going to have so many names here that a simple convention that an extension providing multiple names should prefix each one with $EXTENSION_NAME + _ is insufficient? The biggest issue I see is that it essentially seems to require that the provider is in {shared,local}_preload_libraries? You can't restore into a server otherwise afaics? Correct. They currently don't seem to create invalidations on the objects they are set upon, maybe we should change that? There seems to be pretty little reason to avoid that, the frequency of change really should never be high enough for it to be problematic. No objection. And an interesting point is that the SECURITY LABEL feature has been around since 9.1 and we've had zero complaints about the design. This either means that the design is excellent, or very few people have tried to use it for anything. Without saying that its design is bad, I am pretty sure it's because it's basically unused. Sure, that's my bet as well. 
I think the really interesting question here is how the dump-and-reload issue ought to be handled. As Tom says, it seems on the surface as though you can either require that the provider be loaded for that, or you can accept unvalidated settings. Between those, my vote is for the first, because I think that extensions are not likely to want to have to deal at runtime with the possibility of having arbitrary values where they expect values from a fixed list. Basically, my feeling is that if you install an extension that adds new table-level options, that's effectively a new version of the database, and expecting a dump from that version to restore into a vanilla database is about as reasonable as expecting 9.4 dumps to restore flawlessly on 8.4. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Store Extension Options
On 13 March 2014 13:17, Stephen Frost sfr...@snowman.net wrote: In the end, perhaps we should just add another field which is called 'custom_reloptions' and allow that to be the wild west? That makes sense. ... and allow that to be the wild west? But that would be an emotive phrase that doesn't help acceptance. As you say, this is just metadata. We have no reason to believe that a DBA would be less careful with metadata than they are with their data. We trust them to design their own tables and fill them with data. I figure we can trust them with options metadata too. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Store Extension Options
Andres Freund and...@2ndquadrant.com writes: But security labels are a nice idea, will think about it. AFAICs there's no builtin subdivision within the label for one provider which is a bit of a shame but solvable. The biggest issue I see is that it essentially seems to require that the provider is in {shared,local}_preload_libraries? You can't restore into a server otherwise afaics? Well, if you want to validate the settings then you pretty much have to require that in some form. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire
On Thu, Mar 13, 2014 at 7:10 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Mar 13, 2014 at 12:45 AM, Amit Kapila amit.kapil...@gmail.com wrote: _bt_doinsert - insert index tuple (X,Y) (here it will refer to index tuple location) I don't think that giving the index tuple location is going to be very helpful; can we get the TID for the conflicting heap tuple? Yes, each index tuple contains a reference to the TID of the heap tuple. IndexBuildHeapScan - scan tuple (X,Y) EvalPlanQualFetch - fetch tuple (X,Y) These two seem unhelpful to me. For EvalPlanQualFetch maybe recheck updated tuple would be good, and for IndexBuildHeapScan perhaps checking uniqueness of tuple. For IndexBuildHeapScan, checking uniqueness of tuple may not always be the case, as in case of HEAPTUPLE_DELETE_IN_PROGRESS, it waits on the tuple even for HOT. check_exclusion_constraint - check exclusion constraint on tuple (X,Y) I think it might not be a big deal to update the patch to pass such info. Won't it affect the translatability guidelines as we need to have a different translation message for each op? Yes, we'll need a separate message for each. In that case, can't we think of some generic word to use? Because in the log, along with this message, it will print the SQL statement causing this log as well, so it might not be completely vague to use some generic word. Well, it's sounding like we can only display the whole tuple if (1) the message level is less than ERROR and (2) the snapshot is an MVCC snapshot. That's an annoying and hard-to-document set of limitations. But we should be able to display the TID always, so I think we should drop the part of the patch that tries to show tuple data and revisit that in a future release if need be. Agreed. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Store Extension Options
On 13 March 2014 14:03, Robert Haas robertmh...@gmail.com wrote: On Thu, Mar 13, 2014 at 9:26 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-13 09:17:36 -0400, Robert Haas wrote: It is very true that there are other ways for extensions to manage per-table options. You previously said that, but I really don't see any. Which way out there exists that a) doesn't leave garbage after the relation is dropped or renamed b) is properly dumped by pg_dump c) is properly integrable with cache invalidations. c) is hackable by manually sending cache invalidations from C code when changing the associated information, and by using a relcache callback for cache invalidation, but the others really aren't solvable right now afaics. Well, I'm not going to claim that the methods that exist today are perfect. Things you can do include: (1) the table of tables approach, (2) abusing comments, and perhaps (3) abusing the security label machinery. SECURITY LABEL FOR bdr ON TABLE copy_me IS 'yes, please'? Only the first of those fails prong (a) of your proposed requirements, and they all pass prong (b). I'm not totally sure how well comments and security labels integrate with cache invalidation. An interesting point here is that the SECURITY LABEL functionality is arguably exactly what is wanted here, except for the name of the command. Tables (and almost every other type of object in the system, including columns, functions, etc.) can have an arbitrary number of security labels, each of which must be managed by a separate provider, which gets to validate those options at the time they're applied. Of course, the provider can simply choose to accept everything, if it wants. Dump-and-reload is handled by assuming that you need to have the applicable providers present at reload time (or ignore the errors you get when restoring the dump, or edit the dump). 
And an interesting point is that the SECURITY LABEL feature has been around since 9.1 and we've had zero complaints about the design. This either means that the design is excellent, or very few people have tried to use it for anything. But I think it would be worth considering to what extent that design (modulo the name) also meets the requirements here. Because it works on all object types, it's actually quite a bit more general than this proposal. And it wouldn't be very hard to drop the word SECURITY from the command and just let objects have labels. (We could even introduce alternate syntax, like ALTER object-type object-name SET LABEL FOR provider TO value, if that makes things nicer, though the confusion of having two completely different syntaxes might not be worth it.) I like that suggestion, all of it. Perhaps change it to METADATA LABEL? On the other hand, if that design *doesn't* meet the requirements here, then it would be good to know why. What I think we certainly don't want to do is invent a very similar mechanism to what already exists, but with a slightly different set of warts. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Store Extension Options
On Thu, Mar 13, 2014 at 10:22 AM, Simon Riggs si...@2ndquadrant.com wrote: On 13 March 2014 13:17, Robert Haas robertmh...@gmail.com wrote: The bottom line here is that, as in previous years, there are a certain number of people who show up near the end of CF4 and are unhappy that some patch didn't get committed. Generally, they allege that (1) there's nothing wrong with the patch, (2) if there is something wrong with the patch, then it's the fault of the people objecting for not volunteering to fix it, and (3) that if the patch isn't committed despite the objections raised, it's going to be hideously bad for PostgreSQL. Josh Berkus chose to put his version of this rant on his blog: An interesting twist. 1) It's a simple patch and could be committed. Claiming otherwise would not be accurate. 2) Nobody has said it's the fault of the people objecting for not volunteering to fix it 3) As I explained twice already, *not* committing the patch does *nothing* to prevent extension writers from making up their own mechanism, so blocking the patch does nothing. Writing the extra code required takes a while, but frankly its quicker than pointless arguing. PostgreSQL will not explode if this patch is blocked, nor will it explode if we allow unvalidated options. Hmm, so actually none of those points stick. Perhaps we're talking about another patch that you think should be rejected? Not sure. Well, I'm *trying* to talk about the fact that I think that any machinery that allows custom reloptions (or their equivalent) should also support mandatory validation. I think this subthread is somehow getting sidetracked from the meat of that conversation. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Store Extension Options
Robert Haas escribió: Basically, my feeling is that if you install an extension that adds new table-level options, that's effectively a new version of the database, and expecting a dump from that version to restore into a vanilla database is about as reasonable as expecting 9.4 dumps to restore flawlessly on 8.4. This seems a very reasonable principle to me. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Store Extension Options
On 2014-03-13 10:26:11 -0400, Tom Lane wrote: [ forgot to respond to this part ] Andres Freund and...@2ndquadrant.com writes: They currently don't seem to create invalidations on the objects they are set upon, maybe we should change that? No, because relcache doesn't store security labels to start with. There's a separate catalog cache for security labels, I believe, and invalidating entries in that ought to be sufficient. There doesn't seem to be any form of system-managed cache for security labels afaics. Every lookup does an index scan. I currently don't see how I could build a cache in userland that'd invalidate if either a) the underlying object changes b) the label changes. I don't have a better idea than triggering invalidations on the respective underlying object. If you have one... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Store Extension Options
On 13 March 2014 14:36, Simon Riggs si...@2ndquadrant.com wrote: I like that suggestion, all of it. Perhaps change it to METADATA LABEL ? Damn. It works, apart from the fact that we don't get parameter=value. That may not be critical, since most use cases I can think of are booleans. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Store Extension Options
On 2014-03-13 10:31:12 -0400, Robert Haas wrote: I think the really interesting question here is how the dump-and-reload issue ought to be handled. As Tom says, it seems on the surface as though you can either require that the provider be loaded for that, or you can accept unvalidated settings. Between those, my vote is for the first, because I think that extensions are not likely to want to have to deal at runtime with the possibility of having arbitrary values where they expect values from a fixed list. Basically, my feeling is that if you install an extension that adds new table-level options, that's effectively a new version of the database, and expecting a dump from that version to restore into a vanilla database is about as reasonable as expecting 9.4 dumps to restore flawlessly on 8.4. Pft. I don't expect a restore to succeed without the library present, but I think any such infrastructure should work with a CREATE EXTENSION installing the provider. Especially if we're trying to make this into something more generic than just for pure security labels. It might make sense to require the library always be loaded for selinux or whatnot, but much less so if it's for a schema management tool or something. Relying on shared_preload_libraries seems to run counter to the route pg's extensibility has taken. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire
In this loop,

+   for (i = 0; i < desc->natts; i++)
+   {
+       char   *val;
+       int     vallen;
+
+       vallen = strlen(val);
+       if (vallen <= maxfieldlen)
+           appendStringInfoString(buf, val);
+       else
+       {
+           vallen = pg_mbcliplen(val, vallen, maxfieldlen);
+           appendBinaryStringInfo(buf, val, vallen);
+           appendStringInfoString(buf, "...");
+       }
+   }

you're checking that each individual field doesn't go over maxfieldlen chars (30), but if the fields are numerous, this could end up being very long anyway. I think you need to limit total length as well, and as soon as buf.len exceeds some number of chars, exit the loop. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
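For illustration, here is a standalone sketch of the extra total-length check being suggested. This is not the actual patch code: StringInfo is replaced by snprintf into a caller-supplied buffer, pg_mbcliplen by a byte-level clip, and the MAXTOTALLEN cap and describe_fields name are made up for the example (it assumes the buffer is comfortably larger than the cap).

```c
#include <stdio.h>
#include <string.h>

#define MAXFIELDLEN 30   /* per-field clip, as in the quoted patch */
#define MAXTOTALLEN 200  /* hypothetical overall cap being suggested */

/* Append each field, clipping long fields to MAXFIELDLEN bytes and,
 * crucially, bailing out of the loop entirely once the whole buffer
 * passes MAXTOTALLEN.  Assumes bufsize > MAXTOTALLEN + MAXFIELDLEN + 16. */
static void
describe_fields(const char **vals, int natts, char *buf, size_t bufsize)
{
    size_t len = 0;

    buf[0] = '\0';
    for (int i = 0; i < natts; i++)
    {
        size_t vallen = strlen(vals[i]);

        if (i > 0)
            len += snprintf(buf + len, bufsize - len, ", ");

        if (vallen <= MAXFIELDLEN)
            len += snprintf(buf + len, bufsize - len, "%s", vals[i]);
        else
            len += snprintf(buf + len, bufsize - len, "%.*s...",
                            MAXFIELDLEN, vals[i]);

        /* the suggested extra check: stop once the total gets long */
        if (len > MAXTOTALLEN)
        {
            snprintf(buf + len, bufsize - len, " ...");
            break;
        }
    }
}
```

With many long fields this caps the output at roughly MAXTOTALLEN plus one clipped field, instead of growing without bound.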
Re: [HACKERS] jsonb and nested hstore
Another question. Is Peter's branch up to date with jsonb_populate_record() ? From discussions on list it sounds like the plan was to get rid of the use_json_as_text argument but his patch still has it. (Tangentially, I wonder if it wouldn't be possible to make this a plain cast. I'm not sure but I think it's possible to have a cast to a polymorphic type and peek at runtime at the record definition to determine what to do).
Re: [HACKERS] [PATCH] Store Extension Options
Andres Freund and...@2ndquadrant.com writes: On 2014-03-13 10:26:11 -0400, Tom Lane wrote: No, because relcache doesn't store security labels to start with. There's a separate catalog cache for security labels, I believe, and invalidating entries in that ought to be sufficient. There doesn't seem to be any form of system-managed cache for security labels afaics. Every lookup does an index scan. I currently don't see how I could build a cache in userland that'd invalidate if either a) the underlying object changes b) the label changes. If there's not a catcache for pg_seclabels, I'd have no objection to adding one. As for your userland cache objection, you certainly could build such a thing using the existing inval callbacks (if we had a catcache on pg_seclabels), and in any case what have userland caches got to do with relcache? regards, tom lane
Re: [HACKERS] [PATCH] Store Extension Options
On Thu, Mar 13, 2014 at 10:27 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-13 10:24:09 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: But security labels are a nice idea, will think about it. AFAICs there's no builtin subdivision within the label for one provider which is a bit of a shame but solvable. The biggest issue I see is that it essentially seems to require that the provider is in {shared,local}_preload_libraries? You can't restore into a server otherwise afaics? Well, if you want to validate the settings then you pretty much have to require that in some form. If there were a CREATE SECURITY LABEL PROVIDER or something, with the catalog pointing to a validator function, we wouldn't necessarily... I seriously doubt that's going to work nicely. Now you've implicitly introduced a dependency from every object that has a label to the label provider. pg_dump is going to have to restore the validator function before it restores anything that has a label, and before that it's going to have to restore the languages used to create those validator functions, and those languages might themselves be labeled, either by that provider or by other providers. Perhaps you could untangle that mess, but I'm disinclined to try because I can't see what real problem we're solving here. Extensions that just provide particular functions or datatypes can be loaded on demand, but those that change underlying system behavior need to be loaded by the postmaster, or at least at backend startup. We've tried to patch around that fact with GUCs and it seems to me that we've thoroughly destroyed validation in the process but without really buying ourselves much. There are five contrib modules that define custom variables: auth_delay, auto_explain, pg_stat_statements, sepgsql, and worker_spi. auth_delay, worker_spi and pg_stat_statements have to be loaded at postmaster startup time, and you have to decide whether you want sepgsql at *initdb* time. 
The only one of those that you can possibly load on the fly into an individual session is auto_explain, and that's probably not very useful: if you have control of the interactive session, you might as well just use EXPLAIN. Maybe there are better examples outside the core distribution, but to me it's looking like the idea that you can add GUCs on the fly into individual sessions is a big fizz. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Store Extension Options
On Thu, Mar 13, 2014 at 10:45 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-13 10:31:12 -0400, Robert Haas wrote: I think the really interesting question here is how the dump-and-reload issue ought to be handled. As Tom says, it seems on the surface as though you can either require that the provider be loaded for that, or you can accept unvalidated settings. Between those, my vote is for the first, because I think that extensions are not likely to want to have to deal at runtime with the possibility of having arbitrary values where they expect values from a fixed list. Basically, my feeling is that if you install an extension that adds new table-level options, that's effectively a new version of the database, and expecting a dump from that version to restore into a vanilla database is about as reasonable as expecting 9.4 dumps to restore flawlessly on 8.4. Pft. I don't expect a restore to succeed without the library present, but I think any such infrastructure should work with a CREATE EXTENSION installing the provider. Especially if we're trying to make this into something more generic than just for pure security labels. It might make sense to always require the library be always loaded for selinux or whatnot, but much less so if it's for a schema management tool or something. Relying on shared_preload_library seems to run counter the route pg's extensibility has taken. Well, I don't have a big problem with the idea that some sessions might not have a certain extension loaded. For some extensions, that might not lead to very coherent behavior, but I guess it's the extension developer's job to tell the user whether or not that extension needs shared_preload_libraries, needs either shared or local preload_libraries, or can be installed however. At the same time, I don't feel compelled to provide an autoload mechanism to cover the case where a user tries to set a label in a session which does not have the label provider preloaded. 
Such a mechanism will be complex and introduce many problems of its own for what is in my mind a pretty darn narrow benefit; and we sure as heck do not have time to engineer it for 9.4. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] jsonb and nested hstore
On 03/13/2014 10:49 AM, Greg Stark wrote: Another question. Is Peter's branch up to date with jsonb_populate_record() ? From discussions on list it sounds like the plan was to get rid of the use_json_as_text argument but his patch still has it. Yes, we're not changing that, and some people like it anyway. The API is intentionally the same as the legacy json_populate_record API. (Tangentially, I wonder if it wouldn't be possible to make this a plain cast. I'm not sure but I think it's possible to have a cast to a polymorphic type and peek at runtime at the record definition to determine what to do). If you can simplify it, be my guest. cheers andrew
Re: [HACKERS] [PATCH] Store Extension Options
On 2014-03-13 11:11:51 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-03-13 10:26:11 -0400, Tom Lane wrote: No, because relcache doesn't store security labels to start with. There's a separate catalog cache for security labels, I believe, and invalidating entries in that ought to be sufficient. There doesn't seem to be any form of system-managed cache for security labels afaics. Every lookup does an index scan. I currently don't see how I could build a cache in userland that'd invalidate if either a) the underlying object changes b) the label changes. If there's not a catcache for pg_seclabels, I'd have no objection to adding one. Ok. That's an easy enough patch; would anyone object to adding that now? As for your userland cache objection, you certainly could build such a thing using the existing inval callbacks (if we had a catcache on pg_seclabels), and in any case what have userland caches got to do with relcache? I don't think I've said anything about relcaches being required for this. It came up in 20140313132604.gg8...@awork2.anarazel.de, but that was because we were just talking table level there, and it's a tad easier to hook into relcache invalidation callbacks than catcache ones. That said, for a relation-level cache that refers to the table's definition, you really *do* need a relcache invalidation callback, not just a catcache callback. There's a fair number of places that do a CacheInvalidateRelcache() to trigger invals. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Postgresql XML parsing
Hi, Thanks for the input. I would look into JSON parsing as well, but the requirement is XML parsing. There is no DTD/Schema for the XML. Is there any way I could know what are the possible tags and their values? I am building my parser based on the output PostgreSQL produces (hard coding the tags) and I am afraid I would miss out on tags. Thank you. On Thu, Mar 13, 2014 at 5:47 AM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Hello, On 03/12/2014 09:36 AM, Ashoke wrote: Hi, I am working on adding a functionality to PostgreSQL. I need to parse the XML format query plan (produced by PostgreSQL v9.3) and save it in a simple data structure (say C structure). I was wondering if ... The only XML parsing we have is where Postgres is built with libxml, in which case we use its parser. But query plan XML is delivered to a client (or a log file, which means more or less the same thing here). As a HACKERS' matter, explain output can be obtained from ExplainPrintPlan() in any format in backend. I don't know if it is the case though. If you want to parse it then it should be parsed in the client - that's why we provide it. Inside postgres I don't see a point in parsing the XML rather than handling the query plan directly. The worst possible option would be to make a hand-cut XML parser, either in the client or the server - XML parsing has all sorts of wrinkles that can bite you badly. I agree with it. If XML input is not essential, JSON format would be parsed more easily than xml. 9.3 already intrinsically has a JSON parser infrastructure available for the purpose. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Regards, Ashoke
Re: [HACKERS] pg_archivecleanup bug
On Thu, Mar 13, 2014 at 1:48 AM, Bruce Momjian br...@momjian.us wrote: On Mon, Dec 9, 2013 at 11:27:28AM -0500, Robert Haas wrote: On Thu, Dec 5, 2013 at 6:15 PM, Tom Lane t...@sss.pgh.pa.us wrote: But the other usages seem to be in assorted utilities, which will need to do it right for themselves. initdb.c's walkdir() seems to have it right and might be a reasonable model to follow. Or maybe we should invent a frontend-friendly version of ReadDir() rather than duplicating all the error checking code in ten-and-counting places? If there's enough uniformity in all of those places to make that feasible, it certainly seems wise to do it that way. I don't know if that's the case, though - e.g. maybe some callers want to exit and others do not. pg_resetxlog wants to exit; pg_archivecleanup and pg_standby most likely want to print an error and carry on. I have developed the attached patch which fixes all cases where readdir() wasn't checking for errno, and cleaned up the syntax in other cases to be consistent. Thanks! While I am not a fan of backpatching, the fact we are ignoring errors in some critical cases means the non-cosmetic parts should be backpatched. While I haven't read the patch, I agree that this is a back-patchable bug fix. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Store Extension Options
Robert Haas escribió: Well, I don't have a big problem with the idea that some sessions might not have a certain extension loaded. For some extensions, that might not lead to very coherent behavior, but I guess it's the extension developer's job to tell the user whether or not that extension needs shared_preload_libraries, needs either shared or local preload_libraries, or can be installed however. At the same time, I don't feel compelled to provide an autoload mechanism to cover the case where a user tries to set a label in a session which does not have the label provider preloaded. Such a mechanism will be complex and introduce many problems of its own for what is in my mind a pretty darn narrow benefit; and we sure as heck do not have time to engineer it for 9.4. Eh? Why do we need an autoload mechanism? As far as I know, we already have one --- you call a function that's in an installed module, and if the module is not loaded, it will then be loaded. So if we have a registry of validator functions, it will just be a matter of calling the validator function, and the autoloader will load the module. Of course, the module needs to have been installed previously, but at least for extensions surely this is going to be the case. I don't really like the LABEL idea being proposed in this subthread to store options. The nice thing about reloptions is that the code to parse input, validate the option names and values, and put the values in a struct is all already there. All a module has to do is call a few appropriate functions and provide a parsing table that goes alongside a struct definition. With LABELs, each module is going to have to provide code to do this all over, unless you're all thinking of something that I'm missing. 
-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [PATCH] Store Extension Options
On Thu, Mar 13, 2014 at 11:11 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-03-13 10:26:11 -0400, Tom Lane wrote: No, because relcache doesn't store security labels to start with. There's a separate catalog cache for security labels, I believe, and invalidating entries in that ought to be sufficient. There doesn't seem to be any form of system-managed cache for security labels afaics. Every lookup does an index scan. I currently don't see how I could build a cache in userland that'd invalidate if either a) the underlying object changes b) the label changes. If there's not a catcache for pg_seclabels, I'd have no objection to adding one. As for your userland cache objection, you certainly could build such a thing using the existing inval callbacks (if we had a catcache on pg_seclabels), and in any case what have userland caches got to do with relcache? I avoided doing that for the same reasons that we've been careful to add no such cache to pg_largeobject_metadata: the number of large objects could be big enough to cause problems with backend memory consumption. Note that large objects are one of the object types to which security labels can be applied, so any concern that applies there also applies here. I have however had the thought before that it would be nice to allow for callbacks of invalidation functions of some kind even on catalogs that don't have catcaches. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Store Extension Options
On 2014-03-13 11:15:56 -0400, Robert Haas wrote: On Thu, Mar 13, 2014 at 10:27 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-13 10:24:09 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: But security labels are a nice idea, will think about it. AFAICs there's no builtin subdivision within the label for one provider which is a bit of a shame but solvable. The biggest issue I see is that it essentially seems to require that the provider is in {shared,local}_preload_libraries? You can't restore into a server otherwise afaics? Well, if you want to validate the settings then you pretty much have to require that in some form. If there were a CREATE SECURITY LABEL PROVIDER or something, with the catalog pointing to a validator function, we wouldn't necessarily... I seriously doubt that's going to work nicely. Now you've implicitly introduced a dependency from every object that has a label to the label provider. pg_dump is going to have to restore the validator function before it restores anything that has a label, and before that it's going to have to restore the languages used to create those validator functions, and those languages might themselves be labeled, either by that provider or by other providers. Aren't pretty much all of those problems already solved because we already need to be able to order all this to dump the extension for a datatype before the relation with a column of that type? Perhaps you could untangle that mess, but I'm disinclined to try because I can't see what real problem we're solving here. Extensions that just provide particular functions or datatypes can be loaded on demand, but those that change underlying system behavior need to be loaded by the postmaster, or at least at backend startup. Why is adding an annotation to a table changing the underlying system behaviour? There might be cases where it is, and those can easily require having been loaded via s_p_l. 
We've tried to patch around that fact with GUCs and it seems to me that we've thoroughly destroyed validation in the process but without really buying ourselves much. I think you're making a much bigger issue of GUC validation problems than there is. It's perfectly possible to assign datatypes, check functions et al. to custom GUCs? And there's EmitWarningsOnPlaceholders() to warn about unknown GUCs inside a namespace. There are five contrib modules that define custom variables: auth_delay, auto_explain, pg_stat_statements, sepgsql, and worker_spi. auth_delay, worker_spi and pg_stat_statements have to be loaded at postmaster startup time, and you have to decide whether you want sepgsql at *initdb* time. You forgot at least plpgsql. Which is already a good showcase why we want this to be per database, not per cluster, i.e. not preloaded. There are also pretty good reasons to use auto_explain at the session level, because you otherwise can't look inside a function's plans. Maybe there are better examples outside the core distribution, but to me it's looking like the idea that you can add GUCs on the fly into individual sessions is a big fizz. It seems to be a somewhat odd warpath against custom GUCs ;). I've used the capability to do so *dozens* of times. What problems have they actually caused? Note that postgresql.conf is parsed long before shared_preload_libraries et al. take effect, so even if we'd require libraries to be loaded before custom GUCs can be defined, we'd need to create an entirely new mechanism of loading libraries for it. With a very odd circularity, because to parse postgresql.conf you'd need to have it parsed to load the libraries. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [PATCH] Store Extension Options
On 2014-03-13 11:20:02 -0400, Robert Haas wrote: At the same time, I don't feel compelled to provide an autoload mechanism to cover the case where a user tries to set a label in a session which does not have the label provider preloaded. I don't think there's that much need for that to be supported for user-initiated setting, but pg_dump imo is a different case. Such a mechanism will be complex and introduce many problems of its own for what is in my mind a pretty darn narrow benefit; and we sure as heck do not have time to engineer it for 9.4. Yea, I think that's pretty clearly out of scope for 9.4, independent of the solution we can come up with. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [PATCH] Store Extension Options
On 2014-03-13 11:26:10 -0400, Robert Haas wrote: On Thu, Mar 13, 2014 at 11:11 AM, Tom Lane t...@sss.pgh.pa.us wrote: If there's not a catcache for pg_seclabels, I'd have no objection to adding one. As for your userland cache objection, you certainly could build such a thing using the existing inval callbacks (if we had a catcache on pg_seclabels), and in any case what have userland caches got to do with relcache? I avoided doing that for the same reasons that we've been careful to add no such cache to pg_largeobject_metadata: the number of large objects could be big enough to cause problems with backend memory consumption. Note that large objects are one of the object types to which security labels can be applied, so any concern that applies there also applies here. Good point. Are you primarily worried about the size of the cache, or about the size of the queued invalidations? I guess if it's the former we could just have the cache, but not use it when looking up values. But yuck. I think it'd be cleaner to trigger invalidations on the underlying objects... I have however had the thought before that it would be nice to allow for callbacks of invalidation functions of some kind even on catalogs that don't have catcaches. Unfortunately the format catcache invalidations have is pretty tightly tied to the hash function catcaches use internally. And we need something that can be included in the WAL, otherwise it won't work on HS nodes. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Is this a bug
fabriziomello wrote: On Thu, Mar 13, 2014 at 10:34 AM, Euler Taveira euler@.com wrote: On 13-03-2014 00:11, Fabrízio de Royes Mello wrote: Shouldn't the ALTER statements below raise an exception? For consistency, yes. Who cares? I mean, there is no harm in resetting an unrecognized parameter. Bear in mind that tightening it up could break scripts. In general, I'm in favor of validating things. I know this could break scripts, but I think a consistent behavior should be to raise an exception when an option doesn't exist. euler@euler=# reset noname; ERROR: 42704: unrecognized configuration parameter noname LOCAL: set_config_option, guc.c:5220 This is a consistent behavior. Regards, Probably shouldn't back-patch but a fix and release comment in 9.4 is warranted. Scripts resetting invalid parameters are probably already broken, they just haven't discovered their mistake yet. Do we need an IF EXISTS feature on these as well? ;) David J.
Re: [HACKERS] Postgresql XML parsing
On 03/13/2014 11:27 AM, Ashoke wrote: Hi, Thanks for the input. I would look into JSON parsing as well, but the requirement is XML parsing. There is no DTD/Schema for the XML. Is there any way I could know what are the possible tags and their values? I am building my parser based on the output PostgreSQL produces (hard coding the tags) and I am afraid I would miss out on tags. No, it's not possible, since modules can hook in and add their own nodes with arbitrary names (see for example the Postgres FDW which does this). You need to be able to handle arbitrary tags, even if it's by ignoring them. cheers andrew
Re: [HACKERS] [PATCH] Store Extension Options
Andres Freund and...@2ndquadrant.com writes: On 2014-03-13 11:26:10 -0400, Robert Haas wrote: I have however had the thought before that it would be nice to allow for callbacks of invalidation functions of some kind even on catalogs that don't have catcaches. Unfortunately the format catcache invalidations have is pretty tightly tied to the hash function catcaches use internally. And we need something that can be included in the WAL, otherwise it won't work on HS nodes. Note that the existence of a cache doesn't mean it's necessarily populated. In this example, a catcache on pg_seclabels could be used just fine, as long as it wasn't used to load labels for large objects. Even if it were never used at all, it would still provide a usable conduit for invalidation events. regards, tom lane
[HACKERS] Proposal to join the hackers list
Hi, We are computer engineering students from Maharashtra Institute of Technology, Pune, Maharashtra, India. We are pursuing a Bachelor of Engineering degree in Computer Engineering. As a part of the curriculum, we are supposed to perform a group project in the final year on a topic of our choice. We have decided to modify the storage of PostgreSQL for columnar storage along with row-based tuple storage. We are trying to modify the planner and optimiser to generate the plan using data stored in both row and columnar storage. We are planning to extend this project after the curriculum and will integrate the btrfs file system with PostgreSQL to store columnar data. So far we have added our own system catalog and have created different folders for row storage and column storage. We are working on the planner and will later integrate all the modules. Guidance from your team would be extremely helpful to us. We had mailed you last year as well regarding the same, but at that time we had just started studying the source code of PostgreSQL. Could you please share your thoughts on the idea? Thanks in advance for your time.

Aditi Munot (munot.ad...@gmail.com)
Rajashree Mandaogane (rajashree@gmail.com)
Swapnil Bhoite (swapnil.bho...@live.com)
Tanmay Deshpande (tp.deshpand...@gmail.com)
[HACKERS] JSON Patch (RFC 6902) support?
This is my first email to the PostgreSQL mailing lists so I hope this is the correct place. If not, please let me know. I was wondering if it would be possible and wise to support JSON Patch? https://tools.ietf.org/html/rfc6902 One of the problems I have as a user is how to update a portion of a JSON object efficiently. Right now I have to read the entire field from the database, update it, and then write it back. I am thinking JSON Patch might be a good way to solve this problem because it would allow partial updates and I think it could easily fit into the existing set of JSON functions such as:

// applies a JSON Patch
json_patch_apply(json, patch)

// diffs two JSON objects and produces a JSON Patch
json_patch_diff(json a, json b)

Thanks, Ryan Pedela
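For readers unfamiliar with RFC 6902: a patch document is a JSON array of operation objects, each with an "op" and a JSON Pointer "path". A minimal illustrative example (the paths and values here are made up, not from the proposal) of what the proposed json_patch_apply() would consume:

```json
[
  { "op": "replace", "path": "/user/name",   "value": "alice" },
  { "op": "add",     "path": "/user/tags/-", "value": "admin" },
  { "op": "remove",  "path": "/session" }
]
```

Applying such a document touches only the addressed portions of the object, which is what makes it attractive for partial updates.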
Re: [HACKERS] GIN improvements part2: fast scan
On 03/12/2014 07:52 PM, Alexander Korotkov wrote: * I just noticed that the dummy trueTriConsistentFn returns GIN_MAYBE, rather than GIN_TRUE. The equivalent boolean version returns 'true' without recheck. Is that a typo, or was there some reason for the discrepancy? Actually, there is no difference in the current implementation, but I implemented it so that trueTriConsistentFn can correctly work with shimBoolConsistentFn. In this case it should return GIN_MAYBE in the case when it has no GIN_MAYBE in the input (as an analogue of setting the recheck flag). So, it could return GIN_TRUE only if it checked that the input has GIN_MAYBE. However, checking would be just a waste of cycles. So I ended up with just GIN_MAYBE :-) I don't understand that. As it is, it's inconsistent with the boolean trueConsistent function. trueConsistent always returns TRUE with recheck=false. And in GIN_SEARCH_MODE_EVERYTHING mode, there are no regular scan keys. - Heikki
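To make the discrepancy concrete, here is a standalone sketch of how a ternary consistent result typically maps onto the boolean consistent API. The names mirror the real GIN code but this is not that code: GIN_MAYBE becomes "true, with recheck", so a dummy everything-matches function that returns GIN_MAYBE forces a heap recheck, whereas the boolean trueConsistent reports true with recheck=false.

```c
#include <stdbool.h>

/* Ternary result, as in the tri-consistent interface under discussion */
typedef enum { GIN_FALSE = 0, GIN_TRUE = 1, GIN_MAYBE = 2 } GinTernaryValue;

/* Shim mapping a ternary result onto the boolean API:
 * GIN_MAYBE means "might match, recheck the heap tuple". */
static bool
shim_bool_consistent(GinTernaryValue tri, bool *recheck)
{
    *recheck = (tri == GIN_MAYBE);
    return (tri != GIN_FALSE);
}

/* Dummy "everything matches" function: the patch returns GIN_MAYBE here,
 * which through the shim yields true + recheck; returning GIN_TRUE would
 * instead match trueConsistent's "true, no recheck". */
static GinTernaryValue
true_tri_consistent(void)
{
    return GIN_MAYBE;
}
```

Through the shim, the patch's choice is lossy where the boolean version is exact, which is the inconsistency being questioned.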
Re: [HACKERS] JSON Patch (RFC 6902) support?
On 03/13/2014 09:53 AM, Ryan Pedela wrote: This is my first email to the PostgreSQL mailing lists so I hope this is the correct place. If not, please let me know. I was wondering if it would be possible and wise to support JSON Patch? https://tools.ietf.org/html/rfc6902 One of the problems I have as a user is how to update a portion of a JSON object efficiently. Right now I have to read the entire field from the database, update it, and then write it back. I am thinking JSON Patch might be a good way to solve this problem because it would allow partial updates and I think it could easily fit into the existing set of JSON functions such as: // applies a JSON Patch json_patch_apply(json, patch) // diffs two JSON objects and produces a JSON Patch json_patch_diff(json a, json b) I can't speak to the technical difficulties, but *I* would use it. Note that on the backend Postgres is still going to re-write the entire JSON value. Also, we'd want both a json_patch and jsonb_patch, which would have the same syntax but different internal plumbing. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Replication slots and footguns
On 03/13/2014 04:07 AM, Andres Freund wrote: On 2014-03-12 13:34:47 -0700, Josh Berkus wrote: On 03/12/2014 12:34 PM, Robert Haas wrote: Urgh. That error message looks susceptible to improvement. How about: replication slot %s cannot be dropped because it is currently in use I think that'd require duplicating some code between acquire and drop, but how about replication slot %s is in use by another backend? Sold. Wait ... before you go further ... I object to dropping the word active from the error message. The column is called active, and that's where a DBA should look; that word needs to stay in the error message. replication slot %s is active in another backend? *for* another backend, but that works for me. I just want to keep the word active, because when I encountered that error in testing I knew *immediately* where to look because of the word. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Proposal to join the hackers list
On 03/13/2014 08:54 AM, Rajashree Mandaogane wrote: We have decided to modify the storage of PostgreSQL to support columnar storage alongside row-based tuple storage. We are trying to modify the planner and optimiser to generate plans using data stored in both row and columnar storage. We are thinking of extending this project after the curriculum and will integrate the btrfs file system with PostgreSQL to store columnar data. Huh? btrfs? Why would anything at the filesystem layer be involved? And why would you tie a PostgreSQL storage backend to a specific filesystem? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] 9a57858f1103b89a5674f0d50c5fe1f756411df6
All, First, I'll note that one of the reasons we haven't had a bunch of reports from the field about this is that a lot of our users have yet to apply 9.3.3, so if they have corruption issues they probably attribute them to the issues which are fixed in 9.3.3. I know that's the case with our customer base. As much as I hate extra releases, it might be better to push this one out; if we can get it out in the next 2 weeks, folks can skip the downtime for 9.3.3 and go straight to 9.3.4. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] 9a57858f1103b89a5674f0d50c5fe1f756411df6
On Thu, Mar 13, 2014 at 5:10 PM, Josh Berkus j...@agliodbs.com wrote: First, I'll note that one of the reasons we haven't had a bunch of reports from the field about this is that a lot of our users have yet to apply 9.3.3, so if they have corruption issues they probably attribute them to the issues which are fixed in 9.3.3. I know that's the case with our customer base. I was speculating that the reason we saw a sudden bunch after 9.3.3 might be that there might be a number of people who wait N releases before upgrading and the number of people for whom the value of N is 3 might be significant. Or it could be a coincidence. Users will only notice if they fail over to their standby or run queries on their standby. -- greg
Re: [HACKERS] JSON Patch (RFC 6902) support?
On 03/13/2014 01:01 PM, Josh Berkus wrote: On 03/13/2014 09:53 AM, Ryan Pedela wrote: This is my first email to the PostgreSQL mailing lists so I hope this is the correct place. If not, please let me know. I was wondering if it would be possible and wise to support JSON Patch? https://tools.ietf.org/html/rfc6902 One of the problems I have as a user is how to update a portion of a JSON object efficiently. Right now I have to read the entire field from the database, update it, and then write it back. I am thinking JSON Patch might be a good way to solve this problem because it would allow partial updates and I think it could easily fit into the existing set of JSON functions such as: // applies a JSON Patch json_patch_apply(json, patch) // diffs two JSON objects and produces a JSON Patch json_patch_diff(json a, json b) I can't speak to the technical difficulties, but *I* would use it. Note that on the backend Postgres is still going to re-write the entire JSON value. Also, we'd want both a json_patch and jsonb_patch, which would have the same syntax but different internal plumbing. Some of this will be less than trivial especially for text-format json. But without committing myself I'll be interested to see a patch. cheers andrew
Re: [HACKERS] COPY table FROM STDIN doesn't show count tag
Rajeev rastogi rajeev.rast...@huawei.com writes: [ updated patch ] I've committed this patch with additional revisions. Based on my analysis, I observed that just file pointer comparison may not be sufficient to decide whether to display command tag or not. E.g. imagine below scenario: psql.exe -d postgres -o 'file.dat' -c \copy tbl to 'file.dat'; I don't think it's our responsibility to avoid printing both data and status to the same place in such cases; arguably, in fact, that's exactly what the user told us to do. The important thing is to avoid printing both for the straightforward case of COPY TO STDOUT. For that, file pointer comparison is the right thing, since the option-parsing code will set copysource to match queryFout in exactly the relevant cases. In any case, this revised patch suppressed the status print in *all* COPY_OUT cases, which surely seems like throwing the baby out with the bathwater. regards, tom lane
Re: [HACKERS] GIN improvements part2: fast scan
On Thu, Mar 13, 2014 at 8:58 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 03/12/2014 07:52 PM, Alexander Korotkov wrote: * I just noticed that the dummy trueTriConsistentFn returns GIN_MAYBE, rather than GIN_TRUE. The equivalent boolean version returns 'true' without recheck. Is that a typo, or was there some reason for the discrepancy? Actually, there is not difference in current implementation, But I implemented it so that trueTriConsistentFn can correctly work with shimBoolConsistentFn. In this case it should return GIN_MAYBE in case when it have no GIN_MAYBE in the input (as analogue of setting recheck flag). So, it could return GIN_TRUE only if it checked that input has GIN_MAYBE. However, checking would be just wasting of cycles. So I end up with just GIN_MAYBE :-) I don't understand that. As it is, it's inconsistent with the boolean trueConsistent function. trueConsistent always returns TRUE with recheck=false. And in GIN_SEARCH_MODE_EVERYTHING mode, there are no regular scan keys. Ok, I see. I just messed it up. -- With best regards, Alexander Korotkov.
Re: [HACKERS] gaussian distribution pgbench
We should do the same discussion for the UI of command-line option? The patch adds two options --gaussian and --exponential, but this UI seems to be a bit inconsistent with the UI for \setrandom. Instead, we can use something like --distribution=[uniform | gaussian | exponential]. Hmmm. That is possible, obviously. Note that it does not need to resort to a custom script, if one can do something like --define=exp_threshold=5.6. If so, maybe one simpler named variable could be used, say threshold, instead of separate names for each options. However there is a catch: currently the option allows to check that the threshold is large enough so as to avoid loops in the generator. So this mean moving the check in the generator, and doing it over and over. Possibly this is a good idea, because otherwise a custom script could circumvent the check. Well, the current status is that the check can be avoided with --define... Also, a shorter possibly additional name, would be nice, maybe something like: --dist=exp|gauss|uniform? Not sure. I like long options not to be too long. -- Fabien.
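Moving the threshold check into the generator, as discussed above, can be sketched like this. This is a hedged Python illustration of a loop-free truncated-exponential mapping; the function name and the 2.0 cutoff are assumptions for the sketch, not pgbench's actual code or its actual minimum.

```python
import math
import random

def exponential_rand(lo, hi, threshold):
    """Draw an integer in [lo, hi], skewed toward lo.

    The uniform draw is squeezed into (exp(-threshold), 1] before
    taking -log(u)/threshold, so the result always lands in [0, 1)
    and no retry loop is needed; the generator itself can then
    reject too-small thresholds, closing the --define loophole."""
    if threshold < 2.0:  # assumed cutoff; a tiny threshold is ~uniform
        raise ValueError("threshold too small")
    u = random.random()
    x = -math.log(1.0 - u * (1.0 - math.exp(-threshold))) / threshold
    return lo + int((hi - lo + 1) * x)
```

With threshold=5.0 over [1, 100] the mean lands around 20 rather than 50, which is the skew \setrandom's exponential mode is after.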
Re: [HACKERS] Replication slots and footguns
On Thu, Mar 13, 2014 at 1:03 PM, Josh Berkus j...@agliodbs.com wrote: On 03/13/2014 04:07 AM, Andres Freund wrote: On 2014-03-12 13:34:47 -0700, Josh Berkus wrote: On 03/12/2014 12:34 PM, Robert Haas wrote: Urgh. That error message looks susceptible to improvement. How about: replication slot %s cannot be dropped because it is currently in use I think that'd require duplicating some code between acquire and drop, but how about replication slot %s is in use by another backend? Sold. Wait ... before you go further ... I object to dropping the word active from the error message. The column is called active, and that's where a DBA should look; that word needs to stay in the error message. replication slot %s is active in another backend? *for* another backend, but that works for me. I just want to keep the word active, because when I encountered that error in testing I knew *immediately* where to look because of the word. I think in use is just as clear as active, and I think the text Andres proposed previously reads a whole lot more nicely than this: replication slot %s is in use by another backend -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Store Extension Options
On Thu, Mar 13, 2014 at 12:11 PM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-03-13 11:26:10 -0400, Robert Haas wrote: I have however had the thought before that it would be nice to allow for callbacks of invalidation functions of some kind even on catalogs that don't have catcaches. Unfortunately the format catcache invalidations have is pretty tightly tied to the hash function catcaches use internally. And we need something that can be included in the WAL, otherwise it won't work on HS nodes. Note that the existence of a cache doesn't mean it's necessarily populated. In this example, a catcache on pg_seclabels could be used just fine, as long as it wasn't used to load labels for large objects. Even if it were never used at all, it would still provide a usable conduit for invalidation events. That'd need some commenting, but it seems like a possibly workable approach that wouldn't require changing much. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Store Extension Options
On Thu, Mar 13, 2014 at 11:42 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-13 11:26:10 -0400, Robert Haas wrote: On Thu, Mar 13, 2014 at 11:11 AM, Tom Lane t...@sss.pgh.pa.us wrote: If there's not a catcache for pg_seclabels, I'd have no objection to adding one. As for your userland cache objection, you certainly could build such a thing using the existing inval callbacks (if we had a catcache on pg_seclabels), and in any case what have userland caches got to do with relcache? I avoided doing that for the same reasons that we've been careful to add no such cache to pg_largeobject_metadata: the number of large objects could be big enough to cause problems with backend memory consumption. Note that large objects are one of the object types to which security labels can be applied, so any concern that applies there also applies here. Good point. Are you primarily worried about the size of the cache, or about the size of the queued invalidations? Mostly the former. I can't really see the latter being a big deal. I mean, if you do a lot of DDL, you'll get more sinval resets, but oh well. We can't optimize away re-examining the data when it actually is changing underneath us. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Store Extension Options
On Thu, Mar 13, 2014 at 11:37 AM, Andres Freund and...@2ndquadrant.com wrote: I seriously doubt that's going to work nicely. Now you've implicitly introduced a dependency from every object that has a label to the label provider. pg_dump is going to have to restore the validator function before it restores anything that has a label, and before that it's going to have to restore the languages used to create those validator functions, and those languages might themselves be labeled, either by that provider or by other providers. Aren't pretty much all of those problems already solved, because we already need to be able to order all this to dump the extension for a datatype before the relation with a column of that type? Well, there's dependency tracking in general. But security label providers don't exist as SQL objects today, so they don't participate in it. You'd need to make them dumpable objects and then add dependencies in pg_(sh)depend and then figure out how to break cycles. We could do that, but I'm not finding it very compelling. Perhaps you could untangle that mess, but I'm disinclined to try because I can't see what real problem we're solving here. Extensions that just provide particular functions or datatypes can be loaded on demand, but those that change underlying system behavior need to be loaded by the postmaster, or at least at backend startup. Why is adding an annotation to a table changing the underlying system behaviour? There might be cases where it is, and those can easily require having been loaded via s_p_l. I guess that's true. There are five contrib modules that define custom variables: auth_delay, auto_explain, pg_stat_statements, sepgsql, and worker_spi. auth_delay, worker_spi and pg_stat_statements have to be loaded at postmaster startup time, and you have to decide whether you want sepgsql at *initdb* time. You forgot at least plpgsql. Which is already a good showcase why we want this to be per database, not per cluster, i.e. not preloaded. 
OK, true. Maybe there are better examples outside the core distribution, but to me it's looking like the idea that you can add GUCs on the fly into individual sessions is a big fizz. It seems to be on a somewhat odd warpath against custom GUCs ;). I've used the capability to do so *dozens* of times. What problems have they actually caused? Note that postgresql.conf is parsed long before shared_preload_libraries et al take effect, so even if we'd require libraries to be loaded before custom GUCs can be defined, we'd need to create an entirely new mechanism of loading libraries for it. With a very odd circularity, because to parse postgresql.conf you'd need to have it parsed to load the libraries. Yes, that's part of the problem there. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Store Extension Options
On Thu, Mar 13, 2014 at 11:30 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Robert Haas escribió: Well, I don't have a big problem with the idea that some sessions might not have a certain extension loaded. For some extensions, that might not lead to very coherent behavior, but I guess it's the extension developer's job to tell the user whether or not that extension needs shared_preload_libraries, needs either shared or local preload_libraries, or can be installed however. At the same time, I don't feel compelled to provide an autoload mechanism to cover the case where a user tries to set a label in a session which does not have the label provider preloaded. Such a mechanism will be complex and introduce many problems of its own for what is in my mind a pretty darn narrow benefit; and we sure as heck do not have time to engineer it for 9.4. Eh? Why do we need an autoload mechanism? As far as I know, we already have one --- you call a function that's in an installed module, and if the module is not loaded, it will then be loaded. So if we have a registry of validator functions, it will just be a matter of calling the validator function, and the autoloader will load the module. We have an autoload mechanism for functions, but not for label providers. To make label providers autoload, we'd have to store them in a catalog with pointers to pg_proc entries for their validators - which seems like a can of worms, at least at this point in the release cycle. Of course, the module needs to have been installed previously, but at least for extensions surely this is going to be the case. I don't really like the LABEL idea being proposed in this subthread to store options. The nice thing about reloptions is that the code to parse input, validate the option names and values, and put the values in a struct is all already there. All a module has to do is call a few appropriate functions and provide a parsing table that goes alongside a struct definition. 
With LABELs, each module is going to have to provide code to do this all over, unless you're all thinking of something that I'm missing. Well, I'm not sure that's really any big deal, but I'm not wedded to the label idea. My principal concern is: I'm opposed to allowing unvalidated options into the database. I think it should be a requirement that if the validator can't be found and called, then the reloption is no good and you just reject it. So, if we go with the reloptions route, I'd want to see pg_register_option_namespace go away in favor of some solution that preserves that property. One thing I kind of like about the LABEL approach is that it applies to virtually every object type, meaning that we might not have to repeat this discussion when (as seems inevitable) people want to be able to do things to schemas or tablespaces or roles. But I don't hold that position so strongly as to be unwilling to entertain any other options. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] jsonb and nested hstore
On 13.3.2014 13:28, Oleg Bartunov wrote: On Thu, Mar 13, 2014 at 4:21 PM, Alexander Korotkov aekorot...@gmail.com wrote: On Thu, Mar 13, 2014 at 1:21 PM, Greg Stark st...@mit.edu wrote: Well these are just normal gin and gist indexes. If we want to come up with new index operator classes we can still do that and keep the old ones if necessary. Even that seems pretty unlikely from past experience. I'm actually pretty sanguine even about keeping the GIST opclass. If it has bugs then the bugs only affect people who use this non-default opclass and we can fix them. It doesn't risk questioning any basic design choices in the patch. It does sound like the main question here is which opclass should be the default. From the discussion there's a jsonb_hash_ops which works on all input values but supports fewer operators and a jsonb_ops which supports more operators but can't handle json with larger individual elements. Perhaps it's better to make jsonb_hash_ops the default so at least it's always safe to create a default gin index? A couple of thoughts from me: 1) We can evade the length limitation of the GIN index by truncating long values and setting the recheck flag. We can introduce some indicator of a truncated value, like a zero byte at the end. 2) jsonb_hash_ops can be extended to handle key queries too. We can reserve one bit in the hash as a flag indicating whether it's a hash of a key or a hash of a path to a value. For sure, such an index would be a bit larger. Also, jsonb_hash_ops can be split into two: with and without keys. That's right! Should we do these now, that's the question. Yeah, those are basically the two solutions I proposed a few messages back in this thread. I'm pleased I haven't proposed complete nonsense. The question whether to do that now or wait for 9.5 is a tough one. Doing both for 9.4 is certainly stretching the commitfest to its limits :-( My impression is that while (2) means rather significant implementation changes in jsonb_hash_ops, (1) is rather straightforward. 
Is that correct (e.g. how's the truncation going to work with arrays?). If that's true, I'd like to propose doing (1) for 9.4 and leaving (2) to 9.5. I'm ready to spend a non-trivial amount of time testing the changes required in (1). regards Tomas
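Alexander's point (2), reserving one bit of the hash to distinguish key entries from path/value entries, might look roughly like this. A hedged Python sketch only: crc32 stands in for GIN's real hashing, and all names and the choice of the top bit are assumptions.

```python
import zlib

KEY_FLAG = 0x80000000  # assumed: the hash's top bit marks "hash of a key"

def hash_key(key):
    # keep 31 bits of the hash, force the flag bit on for key entries
    return (zlib.crc32(key.encode()) & 0x7FFFFFFF) | KEY_FLAG

def hash_path_value(path, value):
    # chain crc32 over the path components plus the value
    h = 0
    for part in list(path) + [value]:
        h = zlib.crc32(repr(part).encode(), h)
    return h & 0x7FFFFFFF  # flag bit clear: a path-to-value entry

# key-existence probes and containment probes now live in disjoint
# halves of the 32-bit hash space, at the cost of one bit of hash width
assert hash_key("tags") & KEY_FLAG
assert not (hash_path_value(["tags", 0], "pg") & KEY_FLAG)
```

This is why the resulting index "would be a bit larger": every indexed key gains a second, flagged entry alongside its path/value entries.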
Re: [HACKERS] jsonb and nested hstore
On Mon, Mar 10, 2014 at 4:18 AM, Peter Geoghegan p...@heroku.com wrote: * Extensive additional documentation. References to the very new JSON RFC. I think that this revision is in general a lot more coherent, and I found that reflecting on what idiomatic usage should look like while writing the documentation brought clarity to my thoughts on how the code should be structured. The documentation is worth a read if you want to get a better sense of what the patch is about relatively quickly. The attached documentation is excellent -- wow. merlin
[HACKERS] jsonb status
Peter Geoghegan has been doing a lot of great cleanup of the jsonb code, after moving in the bits we wanted from nested hstore. You can see the current state of the code at https://github.com/feodor/postgres/tree/jsonb_and_hstore I've been working through some of his changes, I will probably make a couple of minor tweaks, but basically they look pretty good. I'll be travelling a good bit of tomorrow (Friday), but I hope Peter has finished by the time I am back on deck late tomorrow and that I am able to commit this on Saturday. cheers andrew
Re: [HACKERS] Add CREATE support to event triggers
Alvaro Herrera escribió: I also fixed the sequence OWNED BY problem simply by adding support for ALTER SEQUENCE. Of course, the intention is that all forms of CREATE and ALTER are supported, but this one seems reasonable standalone because CREATE TABLE uses it internally. I have been hacking on this on and off. This afternoon I discovered that interval typmod output can also be pretty unusual. Example: create table a (a interval year to month); For the column, we get this type spec (note the typmod): coltype: { is_array: false, schemaname: pg_catalog, typename: interval, typmod: year to month }, so the whole command output ends up being this: NOTICE: expanded: CREATE TABLE public.a (a pg_catalog.interval year to month )WITH (oids=OFF) However, this is not accepted on input: alvherre=# CREATE TABLE public.a (a pg_catalog.interval year to month ) WITH (oids=OFF); ERROR: syntax error at or near year LÍNEA 1: CREATE TABLE public.a (a pg_catalog.interval year to mon... ^ I'm not too sure what to do about this yet. I checked the catalogs and gram.y, and it seems that interval is the only type that allows such strange games to be played. I would hate to be forced to add a kludge specific to type interval, but that seems to be the only option. (This would involve checking the OID of the type in deparse_utility.c, and if it's INTERVALOID, then omit the schema qualification and quoting on the type name). I have also been working on adding ALTER TABLE support. So far it's pretty simple; here is an example. Note I run a single command which includes a SERIAL column, and on output I get three commands (just like a serial column on create table). 
alvherre=# alter table tt add column b numeric, add column c serial, alter column a set default extract(epoch from now()); NOTICE: JSON blob: { definition: [ { clause: cache, fmt: CACHE %{value}s, value: 1 }, { clause: cycle, fmt: %{no}s CYCLE, no: NO }, { clause: increment_by, fmt: INCREMENT BY %{value}s, value: 1 }, { clause: minvalue, fmt: MINVALUE %{value}s, value: 1 }, { clause: maxvalue, fmt: MAXVALUE %{value}s, value: 9223372036854775807 }, { clause: start, fmt: START WITH %{value}s, value: 1 }, { clause: restart, fmt: RESTART %{value}s, value: 1 } ], fmt: CREATE %{persistence}s SEQUENCE %{identity}D %{definition: }s, identity: { objname: tt_c_seq, schemaname: public }, persistence: } NOTICE: expanded: CREATE SEQUENCE public.tt_c_seq CACHE 1 NO CYCLE INCREMENT BY 1 MINVALUE 1 MAXVALUE 9223372036854775807 START WITH 1 RESTART 1 NOTICE: JSON blob: { fmt: ALTER TABLE %{identity}D %{subcmds:, }s, identity: { objname: tt, schemaname: public }, subcmds: [ { definition: { collation: { fmt: COLLATE %{name}D, present: false }, coltype: { is_array: false, schemaname: pg_catalog, typename: numeric, typmod: }, default: { fmt: DEFAULT %{default}s, present: false }, fmt: %{name}I %{coltype}T %{default}s %{not_null}s %{collation}s, name: b, not_null: , type: column }, fmt: ADD COLUMN %{definition}s, type: add column }, { definition: { collation: { fmt: COLLATE %{name}D, present: false }, coltype: { is_array: false, schemaname: pg_catalog, typename: int4, typmod: }, default: { default: pg_catalog.nextval('public.tt_c_seq'::pg_catalog.regclass), fmt: DEFAULT %{default}s }, fmt: %{name}I %{coltype}T %{default}s %{not_null}s %{collation}s, name: c, not_null: , type: column }, fmt: ADD COLUMN %{definition}s, type: add column }, { column: a, definition: pg_catalog.date_part('epoch'::pg_catalog.text, pg_catalog.now()),
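The interval special case floated above (check the type OID in deparse_utility.c and skip schema qualification) could be modeled like this. A hypothetical Python helper, not the actual C code; the function name is invented, though 1186 really is interval's OID in the core catalogs.

```python
INTERVALOID = 1186  # pg_catalog.interval's type OID in the core catalogs

def deparse_type(schemaname, typename, typmod, type_oid):
    """Hypothetical sketch of the proposed kludge: the grammar only
    accepts 'year to month'-style typmods after the bare keyword, so
    interval skips schema qualification and quoting entirely."""
    if type_oid == INTERVALOID:
        return f"{typename} {typmod}".strip() if typmod else typename
    qualified = f"{schemaname}.{typename}" if schemaname else typename
    return f"{qualified}{typmod}"

print(deparse_type("pg_catalog", "interval", "year to month", INTERVALOID))
# interval year to month
```

The cost, as noted, is a type-specific branch keyed on INTERVALOID, but interval appears to be the only type whose typmod spells out grammar keywords rather than a parenthesized list.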
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 13, 2014 at 2:21 AM, Greg Stark st...@mit.edu wrote: It does sound like the main question here is which opclass should be the default. From the discussion there's a jsonb_hash_ops which works on all input values but supports fewer operators and a jsonb_ops which supports more operators but can't handle json with larger individual elements. Perhaps it's better to make jsonb_hash_ops the default so at least it's always safe to create a default gin index? Personally, I don't think it's a good idea to change the default. I have yet to be convinced that if you hit the GIN limitation it's an indication of anything other than that you need to reconsider your indexing choices (how often have we heard that complaint of GIN before in practice?). Even if you don't hit the limitation directly, with something like jsonb_hash_ops you're still hashing a large nested structure, very probably uselessly. Are you really going to look for an exact match to an elaborate nested structure? I would think, probably not. Now, as Alexander says, there might be a role for another (jsonb_hash_ops) opclass that separately indexes values only. I still think that by far the simplest solution is to use expressional indexes, because we index key values and array element values indifferently. Of course, nothing we have here precludes the development of such an opclass. -- Peter Geoghegan
Re: [HACKERS] Replication slots and footguns
On 03/13/2014 01:17 PM, Robert Haas wrote: I think in use is just as clear as active, and I think the text Andres proposed previously reads a whole lot more nicely than this: replication slot %s is in use by another backend Then we should change the column name in the pg_stat_replication_slots view to in_use. My point is that the error message and the diagnostic view should use the same word, or we're needlessly confusing our users. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Replication slots and footguns
On Thu, Mar 13, 2014 at 6:45 PM, Josh Berkus j...@agliodbs.com wrote: On 03/13/2014 01:17 PM, Robert Haas wrote: I think in use is just as clear as active, and I think the text Andres proposed previously reads a whole lot more nicely than this: replication slot %s is in use by another backend Then we should change the column name in the pg_stat_replication_slots view to in_use. My point is that the error message and the diagnostic view should use the same word, or we're needlessly confusing our users. I see. That's an interesting point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Replication slots and footguns
On 03/13/2014 05:01 PM, Robert Haas wrote: On Thu, Mar 13, 2014 at 6:45 PM, Josh Berkus j...@agliodbs.com wrote: On 03/13/2014 01:17 PM, Robert Haas wrote: I think in use is just as clear as active, and I think the text Andres proposed previously reads a whole lot more nicely than this: replication slot %s is in use by another backend Then we should change the column name in the pg_stat_replication_slots view to in_use. My point is that the error message and the diagnostic view should use the same word, or we're needlessly confusing our users. I see. That's an interesting point As I said earlier, the fact that the current error message says active and the column in pg_stat_replication_slots is called active meant I knew *immediately* where to look. So I'm speaking from personal experience. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Replication slots and footguns
On Thu, Mar 13, 2014 at 8:09 PM, Josh Berkus j...@agliodbs.com wrote:
> On 03/13/2014 05:01 PM, Robert Haas wrote:
>> On Thu, Mar 13, 2014 at 6:45 PM, Josh Berkus j...@agliodbs.com wrote:
>>> On 03/13/2014 01:17 PM, Robert Haas wrote:
>>>> I think "in use" is just as clear as "active", and I think the text
>>>> Andres proposed previously reads a whole lot more nicely than this:
>>>>
>>>>     replication slot %s is in use by another backend
>>>
>>> Then we should change the column name in the pg_stat_replication_slots
>>> view to in_use. My point is that the error message and the diagnostic
>>> view should use the same word, or we're needlessly confusing our users.
>>
>> I see. That's an interesting point.
>
> As I said earlier, the fact that the current error message says "active"
> and the column in pg_stat_replication_slots is called "active" meant I
> knew *immediately* where to look. So I'm speaking from personal
> experience.

Well, we may have kind of hosed ourselves, because the in-memory data structures that represent the slots have an in_use flag that indicates whether the structure is allocated at all, and then an active flag that indicates whether some backend is using it. I never liked that naming much. Maybe we should go through and rename in_use -> allocated and active -> in_use.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Replication slots and footguns
On 03/13/2014 05:28 PM, Robert Haas wrote:
> Well, we may have kind of hosed ourselves, because the in-memory data
> structures that represent the slots have an in_use flag that indicates
> whether the structure is allocated at all, and then an active flag that
> indicates whether some backend is using it. I never liked that naming
> much. Maybe we should go through and rename in_use -> allocated and
> active -> in_use.

Wait, which one of those does pg_drop_replication_slot() care about?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] requested shared memory size overflows size_t
On 03/04/2014 10:53 PM, Yuri Levinsky wrote:
> Please advise me: I just downloaded the source and compiled it. Sun
> SPARC Solaris 9 is always 64-bit; I verified this with the sysadmin,
> though it can run 32-bit applications as well. Do I have to use some
> special option during compilation to verify that the compiled
> PostgreSQL is actually a 64-bit app?

Many platforms include both 32-bit and 64-bit target toolchains. So you might be on a 64-bit platform, but that doesn't mean you aren't compiling a 32-bit executable. Please run:

    grep '^#define SIZEOF' config.log

and post the results.

--
Craig Ringer
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] issue log message to suggest VACUUM FULL if a table is nearly empty
On Wed, Mar 12, 2014 at 12:22 PM, Haribabu Kommi kommi.harib...@gmail.com wrote:
> On Tue, Mar 11, 2014 at 2:59 PM, Amit Kapila amit.kapil...@gmail.com wrote:
>> By the way, have you checked whether FreeSpaceMapVacuum() can serve
>> your purpose? That call already traverses the FSM in depth-first order
>> to update the free space, so using it, or a wrapper on it that returns
>> the total free space in addition to updating it, might serve the need.
>
> Thanks for the information. We can get the table's free space by
> writing a wrapper or by modifying the FreeSpaceMapVacuum() function a
> little.

I think it might be okay to even change this API to return the free space, as the other place it is used is for index vacuum; so even if we don't intend to print such a message for indexes in this patch, similar information could be useful there as well, to suggest to a user that an index has a lot of free space.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
[HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
Hi,

In connection with my previous proposal to provide a catalog view of the pg_hba.conf file contents, I have developed the attached patch.

[Current situation] Currently, to view the pg_hba.conf contents, the DB admin has to access the file on the database server and read the settings. With huge or multiple HBA files, finding the appropriate HBA rules that are loaded is difficult and takes time.

[What this patch does] The attached patch provides a new view, pg_hba_settings, to admin users; public access to the view is restricted. This view displays basic information about the HBA settings of the PostgreSQL cluster. The information shown is taken from the parsed HBA lines, not read directly from the pg_hba.conf files. The documentation is updated to describe this new view under Chapter 47, "System Catalogs", and a new note is added in Chapter 19.1, "The pg_hba.conf File".

[Advantage] The advantage of having the pg_hba_settings view is that the admin can check which HBA rules are loaded at runtime via a database connection itself, and can thereby easily see all the users with their privileges in a single view in order to manage them.

Thanks & Regards,
Vaishnavi
Fujitsu Australia

Catalog_view_to_HBA_settings_patch.patch
Description: Catalog_view_to_HBA_settings_patch.patch
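A session with the proposed view might look like this sketch (the view name comes from the patch description above; the column set is defined by the patch and not shown here):

```sql
-- Hypothetical: inspect the HBA rules currently loaded, without
-- shell access to the server. Per the patch description, only
-- admin users may read this view.
SELECT * FROM pg_hba_settings;
```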
Re: [HACKERS] gaussian distribution pgbench
Hi,

(2014/03/14 4:21), Fabien COELHO wrote:
>> We should do the same discussion for the UI of the command-line
>> options? The patch adds two options, --gaussian and --exponential,
>> but this UI seems to be a bit inconsistent with the UI for \setrandom.
>> Instead, we can use something like
>> --distribution=[uniform | gaussian | exponential].
>
> Hmmm. That is possible, obviously. Note that it does not need to
> resort to a custom script, if one can do something like
> --define=exp_threshold=5.6.

Yeah, a threshold parameter is needed by the distribution-generating algorithms in my patch, and it is important that we can control the distribution pattern with this parameter.

> If so, maybe one simpler named variable could be used, say "threshold",
> instead of separate names for each option.

If we use a single separate threshold option, I think it becomes difficult to understand what the parameter depends on. Because "threshold" is a very general term, when we later add another new feature it will be hard to tell which option the parameter belongs to and which one is needed.

> However there is a catch: currently the option allows to check that the
> threshold is large enough so as to avoid loops in the generator. So
> this means moving the check into the generator, and doing it over and
> over. Possibly this is a good idea, because otherwise a custom script
> could circumvent the check. Well, the current status is that the check
> can be avoided with --define...
>
> Also, a shorter, possibly additional name would be nice, maybe
> something like --dist=exp|gauss|uniform? Not sure. I like long options
> not to be too long.

Well, if we run the standard benchmark in pgbench, we need not set any option, because it is the default benchmark and is the same as the uniform distribution. And if we run extra benchmarks in pgbench, such as with '-S' or '-N', we need to set an option, because they are non-standard benchmark settings, just as the gaussian and exponential distributions are. So the present UI keeps consistency and follows pgbench history.

> I like long options not to be too long.
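For reference, the custom-script alternative discussed in this thread would mean copying the TPC-B script from the pgbench docs and changing its \setrandom line; a minimal sketch, assuming the gaussian \setrandom syntax proposed in this thread (the exact argument form may differ in the final patch):

```
\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts gaussian 5
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
```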
Yes, I'd like that too. The present UI is very simple and useful in combination with options such as '-S' and '--gaussian', so I hope the UI does not change. For example:

    pgbench -S --gaussian=5
    pgbench -N --exponential=2 --sampling-rate=0.8

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
Re: [HACKERS] gaussian distribution pgbench
(2014/03/13 23:00), Fujii Masao wrote:
> On Thu, Mar 13, 2014 at 10:51 PM, Heikki Linnakangas
> hlinnakan...@vmware.com wrote:
>> On 03/13/2014 03:17 PM, Fujii Masao wrote:
>>> On Tue, Mar 11, 2014 at 1:49 PM, KONDO Mitsumasa
>>> kondo.mitsum...@lab.ntt.co.jp wrote:
>>>> (2014/03/09 1:49), Fabien COELHO wrote:
>>>> I'm okay with this UI and its implementation.
>>>
>>> OK. We should do the same discussion for the UI of the command-line
>>> options? The patch adds two options, --gaussian and --exponential,
>>> but this UI seems to be a bit inconsistent with the UI for
>>> \setrandom. Instead, we can use something like
>>> --distribution=[uniform | gaussian | exponential].
>>
>> IMHO we should just implement the \setrandom changes, and not add any
>> of these options to modify the standard test workload. If someone
>> wants to run the TPC-B workload with a gaussian or exponential
>> distribution, they can implement it as a custom script. The docs
>> include the script for the standard TPC-B workload; just copy-paste
>> that and modify the \setrandom lines.

Well, when we set '--gaussian=NUM' or '--exponential=NUM' on the command line, we can see the access probability of the top N records in the final output. The output looks like the following:

    [mitsu-ko@localhost pgbench]$ ./pgbench --exponential=10 postgres
    starting vacuum...end.
    transaction type: Exponential distribution TPC-B (sort of)
    scaling factor: 1
    exponential threshold: 10.0
    access probability of top 20%, 10% and 5% records: 0.86466 0.63212 0.39347
    ~

This feature helps the user understand the bias of the distribution when tuning the threshold parameter. Without it, it is difficult to understand the distribution of the access pattern, and it cannot be realized in a custom script, because the range of the distribution (min, max, and SQL pattern) is unknown to a custom script. So I think the present UI is not bad and should not change.
Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center