Re: [HACKERS] gaussian distribution pgbench
OK. I'm not sure which idea is the best, so I will wait for comments from the community :) Hmmm. Maybe you can do what Tom voted for, he is the committer:-) -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On 4 March 2014 01:07, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-03 19:15:27 -0500, Tom Lane wrote: Noah Misch n...@leadboat.com writes: Just to be clear, that list is not a commentary on the particular patch at hand. Those are merely the kinds of regressions to look for in a patch affecting this area of the code. A complaint on pgsql-bugs just now reminded me of a specific area that needs to be looked at hard: how bad are the implications for pg_dump? Up to now, pg_dump could be reasonably confident that once it had AccessShareLock on every table it intended to dump, there would be no schema changes happening on those tables until it got done. The guarantee wasn't actually that strong. It was already quite possible that indexes got created/dropped during that time, which is probably by far the most frequent DDL run in production. Good points. In most cases, DDL is applied manually after careful thought, so people seldom dump at the same time they upgrade the database. This is especially true for pg_dump, since it captures the logical definition of tables. So most people will be happy with the default locking, but we could make the lock level optional. Currently we use AccessShareLock. Locking out all DDL, even with this patch applied, would only require ShareUpdateExclusiveLock. Looking at the code, it will take about an hour to add an option to pg_dump that specifies the lock level used when dumping. I would be happy to include that as part of this patch. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
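For readers following the thread, a sketch of what the proposed pg_dump option amounts to at the SQL level (the table name is hypothetical):

```sql
BEGIN;
-- What pg_dump acquires today on each dumped table:
LOCK TABLE some_table IN ACCESS SHARE MODE;

-- The optional stronger level discussed above; it additionally blocks
-- concurrent DDL (e.g. CREATE INDEX takes a conflicting SHARE lock)
-- while still permitting normal reads and writes on the table:
-- LOCK TABLE some_table IN SHARE UPDATE EXCLUSIVE MODE;
COMMIT;
```

The point of ShareUpdateExclusive is that it conflicts with itself and with every DDL-strength lock, but not with the row-level locks taken by ordinary SELECT/INSERT/UPDATE/DELETE traffic.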
Re: [HACKERS] gaussian distribution pgbench
(2014/03/04 17:28), Fabien COELHO wrote: OK. I'm not sure which idea is the best, so I will wait for comments from the community :) Hmmm. Maybe you can do what Tom voted for, he is the committer:-) Yeah, but he might change his mind based on our discussion. So I will wait until tomorrow, and if there are no comments, I will start to fix it the way Tom voted for. Regards, -- Mitsumasa KONDO NTT Open Source Software Center
Re: [HACKERS] jsonb and nested hstore
Hi Oleg, On Mon, Mar 3, 2014 at 7:17 AM, Oleg Bartunov obartu...@gmail.com wrote: you can always look at our development repository: I think I found a bug:

[local]/postgres=# \d+ bar
                        Table "public.bar"
 Column | Type  | Modifiers | Storage  | Stats target | Description
--------+-------+-----------+----------+--------------+-------------
 i      | jsonb |           | extended |              |
Indexes:
    "f" gin (i)
Has OIDs: no

[local]/postgres=# insert into bar values ('{"firstName": "John", "lastName": "Smith", "age": 25, "address": {"streetAddress": "21 2nd Street", "city": "New York", "state": "NY", "postalCode": "10021"}, "phoneNumbers": [{"type": "home", "number": "212 555-1234"}, {"type": "fax", "number": "646 555-4567"}]}');
INSERT 0 1
Time: 7.635 ms
[local]/postgres=# select * from bar where i @> '{"age": 25.0}'::jsonb;
 i
---
(0 rows)

Time: 2.443 ms
[local]/postgres=# explain select * from bar where i @> '{"age": 25.0}'::jsonb;
                           QUERY PLAN
-----------------------------------------------------------------
 Bitmap Heap Scan on bar  (cost=16.01..20.02 rows=1 width=32)
   Recheck Cond: ((i)::hstore @> '"age"=>25.0'::hstore)
   ->  Bitmap Index Scan on f  (cost=0.00..16.01 rows=1 width=0)
         Index Cond: ((i)::hstore @> '"age"=>25.0'::hstore)
 Planning time: 0.161 ms
(5 rows)

[local]/postgres=# set enable_bitmapscan = off;
SET
Time: 6.052 ms
[local]/postgres=# select * from bar where i @> '{"age": 25.0}'::jsonb;
-[ RECORD 1 ]
i | {"age": 25, "address": {"city": "New York", "state": "NY", "postalCode": "10021", "streetAddress": "21 2nd Street"}, "lastName": "Smith", "firstName": "John", "phoneNumbers": [{"type": "home", "number": "212 555-1234"}, {"type": "fax", "number": "646 555-4567"}]}

Time: 6.479 ms
[local]/postgres=# explain select * from bar where i @> '{"age": 25.0}'::jsonb;
                     QUERY PLAN
-----------------------------------------------------
 Seq Scan on bar  (cost=0.00..26.38 rows=1 width=32)
   Filter: ((i)::hstore @> '"age"=>25.0'::hstore)
 Planning time: 0.154 ms
(3 rows)

Time: 6.565 ms

-- Peter Geoghegan
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
Good points. In most cases, DDL is applied manually after careful thought, so people seldom dump at the same time they upgrade the database. This is especially true for pg_dump, since it captures the logical definition of tables. So most people will be happy with the default locking, but we could make the lock level optional. Currently we use AccessShareLock. Locking out all DDL, even with this patch applied, would only require ShareUpdateExclusiveLock. Looking at the code, it will take about an hour to add an option to pg_dump that specifies the lock level used when dumping. I would be happy to include that as part of this patch. I think the use case for specifying multiple locks is pretty slim, given that ShareUpdateExclusiveLock is good enough for mostly everybody. If that's not the case, the user should be more careful about when backups are scheduled so that they don't conflict with DDL changes. I am not too comfortable with exposing the locking type to the user. That may be just me though. Regards, Atri -- Regards, Atri *l'apprenant*
Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire
On 4 March 2014 04:18, Amit Kapila amit.kapil...@gmail.com wrote: On Mon, Mar 3, 2014 at 3:38 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-02-28 20:55:20 +0530, Amit Kapila wrote: On Thu, Feb 27, 2014 at 4:14 PM, Christian Kruse Well, as I already stated: we don't. I copied the behavior we use in CHECK constraints (ExecBuildSlotValueDescription()). I think more people are now of the opinion that displaying the whole tuple is not useful. I believe it is good to go ahead by displaying just the primary key for this case and move ahead. The patch does not print, and has never printed, the full tuple. What it does is copy the behaviour of printing the tuple for check constraint violations, which is truncating the tuple after a certain length. I know that the patch truncates the values if they are longer than a certain length (30), but the point is: why is it not sufficient to have the tuple location (and primary key, if available), which uniquely identifies any tuple? The patch follows a pattern established elsewhere, so arguing for this would be a change in existing behaviour that is outside the scope of this patch. Please raise a new thread if you wish for that change, especially since it is opposed here. This patch is small, but adds important functionality for diagnosing users' locking problems. If you have alterations to the patch, please list them concisely so people can understand them and progress towards getting this committed or rejected. Thanks. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] plpgsql.warn_shadow
On Tue, Mar 4, 2014 at 12:55 AM, Tom Lane t...@sss.pgh.pa.us wrote: You're reasoning from a false premise: it's *not* necessarily an error. When wouldn't it be an error? Can you give a real-life example of when it would be a good idea to use the same name for an input parameter and a declared variable? Isn't this almost exactly the same situation as we had in 9.0? "PL/pgSQL now throws an error if a variable name conflicts with a column name used in a query (Tom Lane)" (http://www.postgresql.org/docs/9.1/static/release-9-0.html) Making variables that conflict with column names an error was a very good thing. Making variables that conflict with in/out parameters an error would also be a very good thing, by default. And for those who are unable to fix their code, let them turn the error off via a setting. Maybe we don't even need a new setting; maybe the existing plpgsql.variable_conflict could be reused?
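For readers following along, a minimal sketch of the shadowing being discussed (the function name is hypothetical):

```sql
CREATE FUNCTION add_one(n integer) RETURNS integer AS $$
DECLARE
    n integer;  -- silently shadows the input parameter
BEGIN
    -- This reads the local variable, which is NULL-initialized,
    -- not the caller's argument: the function always returns NULL.
    RETURN n + 1;
END;
$$ LANGUAGE plpgsql;
```

Today PL/pgSQL accepts this definition without complaint, which is exactly the silent-bug scenario the proposed warning/error is meant to catch.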
Re: [HACKERS] jsonb and nested hstore
Thanks, looks like a bug. On Tue, Mar 4, 2014 at 12:38 PM, Peter Geoghegan p...@heroku.com wrote: [snip] -- Peter Geoghegan
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On 4 March 2014 08:39, Atri Sharma atri.j...@gmail.com wrote: Good points. In most cases, DDL is applied manually after careful thought, so people seldom dump at the same time they upgrade the database. This is especially true for pg_dump, since it captures the logical definition of tables. So most people will be happy with the default locking, but we could make the lock level optional. Currently we use AccessShareLock. Locking out all DDL, even with this patch applied, would only require ShareUpdateExclusiveLock. Looking at the code, it will take about an hour to add an option to pg_dump that specifies the lock level used when dumping. I would be happy to include that as part of this patch. I think the use case for specifying multiple locks is pretty slim, given that ShareUpdateExclusiveLock is good enough for mostly everybody. Increasing the lock strength would be a change in behaviour that might adversely affect existing users. If that's not the case, the user should be more careful about when backups are scheduled so that they don't conflict with DDL changes. That is most certainly the wise choice. I am not too comfortable with exposing the locking type to the user. That may be just me though. Why would that be a problem? Hard reasons, please. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
Hi all, I think this patch has been completely forgotten, which feels very unfortunate :( Min, max, and stddev are basic statistics in general monitoring tools, so I'd like to push it. (2014/02/12 15:45), KONDO Mitsumasa wrote: (2014/01/29 17:31), Rajeev rastogi wrote: No issue, you can share the test cases with me, and I will take the performance report. The attached patch is updated for the latest pg_stat_statements. It includes min, max, and stddev statistics. Could you run a compile test in your Windows environments? I think the compile error has been fixed. We had a discussion in the community about which statistics are most useful; I think both kinds have strong and weak points. When we look at a rarely executed statement (2 or 3 runs), stddev will be meaningless because it cannot be estimated precisely from so few samples; in this situation, min and max work well because they are not estimates but observed facts. On the other hand, for frequently executed statements the positions are reversed: stddev becomes a very useful statistic for characterizing the whole set of executions, while min and max may just be extreme outliers. At the end of the day, these values complement each other when we want to understand actual statements. And my past experience showed no performance problems with this patch. So I'd like to implement all of these values in pg_stat_statements. Regards, -- Mitsumasa KONDO NTT Open Source Software Center
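To illustrate the small-sample point above, a quick sketch with made-up latencies:

```sql
-- With only two executions, the sample stddev is dominated by chance,
-- while min and max simply report exactly what was observed:
SELECT min(ms)         AS min_ms,
       max(ms)         AS max_ms,
       stddev_samp(ms) AS stddev_ms
FROM (VALUES (10.0), (1000.0)) AS t(ms);
```

With many executions the situation reverses: stddev summarizes the whole distribution, while min/max degrade into records of the two most extreme outliers ever seen.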
Re: [HACKERS] jsonb and nested hstore
On Tue, Mar 4, 2014 at 1:30 AM, Oleg Bartunov obartu...@gmail.com wrote: Thanks, looks like a bug. I guess this is down to the continued definition of gin_hstore_ops as an opclass with text storage?:

+ CREATE OPERATOR CLASS gin_hstore_ops
+ DEFAULT FOR TYPE hstore USING gin
+ AS
+        OPERATOR        7       @>,
+        OPERATOR        9       ?(hstore,text),
+        OPERATOR        10      ?|(hstore,text[]),
+        OPERATOR        11      ?&(hstore,text[]),
+        FUNCTION        1       bttextcmp(text,text),
+        FUNCTION        2       gin_extract_hstore(internal, internal),
+        FUNCTION        3       gin_extract_hstore_query(internal, internal, int2, internal, internal),
+        FUNCTION        4       gin_consistent_hstore(internal, int2, internal, int4, internal, internal),
+        STORAGE         text;

-- Peter Geoghegan
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
If that's not the case, the user should be more careful about when backups are scheduled so that they don't conflict with DDL changes. That is most certainly the wise choice. I am not too comfortable with exposing the locking type to the user. That may be just me though. Why would that be a problem? Hard reasons, please. Should we genuinely depend on the user's good judgement to decide the locking types? -- Regards, Atri *l'apprenant*
Re: [HACKERS] UNION ALL on partitioned tables won't use indices.
Hello, I examined your patch and it seems reasonable, but I have one question about it. You made ec_relids differ from the union of all EC members' em_relids. Is that right? At Mon, 03 Mar 2014 14:05:10 -0500, Tom Lane wrote: Noah Misch n...@leadboat.com writes: If you are convinced that a separate flattening pass is best, that suffices for me at this stage. Please submit the patch you have in mind, incorporating any improvements from the v7 patch that are relevant to both approaches. I went back and re-read the original message, and this time it struck me that really the issue here is that add_child_rel_equivalences() doesn't think it has to deal with the case of a parent rel being itself a child. Yes, that is what I tried to solve in one of my first patches, but the brute-force way led to failure :( That's not inherent though; it's just that it didn't occur to me at the time that such a situation could arise. It takes only very small changes to allow that to happen. I saw what you did. I marked not-full-fledged EC members as 'not child' (its real meaning there was full-fledged, in contradiction) to reach the child rels. In contrast, you loosened the ec_relids shortcut filter, and it seems to work, but... is it right to make ec_relids different from the union of all members' em_relids? This seems to affect joins afterwards. If you do that, as in the attached changes to equivclass.c, you get "could not find pathkey item to sort" errors from createplan.c; but that's just because create_merge_append_plan() is likewise not expecting the mergeappend's parent rel to be itself a child. The fix for that is a one-liner, i.e., pass down relids. This seems to just correct the oversimplification from the assumption that the best_path there cannot have a parent. That gets you to a point where the code generates a valid plan, but it's got nested MergeAppends, which are unnecessary expense. Kyotaro-san's original fix for that was overcomplicated. It's sufficient to teach accumulate_append_subpath() to flatten nested MergeAppendPaths. Ah, that was in the typ1-collapse patch. As you may have noticed there, I couldn't rely on the assumption that directly nested MergeAppendPaths share the same pathkeys (really?). It's no problem if that assumption holds. In short, the attached patch seems to fix it, for a lot less added complication than anything else that's been discussed on this thread. I've only lightly tested it (it could use a new regression test case), but unless someone can find a flaw I think we should use this approach. Mmm. That's mortifying, but you seem to be right :) Equipping this with some regression tests is now my task. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
[HACKERS] requested shared memory size overflows size_t
Dear Developers, Please help with the following problem. I am running PostgreSQL 9.2.3 on SUN Solaris 9. This is a 64-bit system with 32G swap and 16G RAM. I use the same configuration file as on Linux or SUN Solaris 10, where everything is OK. I am unable to set shared_buffers to 5G; the maximum possible value is 4G. When I decrease the configuration parameters and start the instance successfully, some queries fail with an out of memory error. I verified the kernel parameters: they look the same as on Solaris 10 and big enough. The only difference is: the Solaris 9 PostgreSQL version, in contrast to Solaris 10 and Linux, was compiled by me with default options. My kernel is:

set semsys:seminfo_semmap=64
set semsys:seminfo_semmni=4096
set semsys:seminfo_semmns=4096
set semsys:seminfo_semmnu=4096
set semsys:seminfo_semume=64
set semsys:seminfo_semmsl=500
set shmsys:shminfo_shmmax=0x
set shmsys:shminfo_shmmin=100
set shmsys:shminfo_shmmni=4096
set shmsys:shminfo_shmseg=100

Config:

shared_buffers = 3GB
temp_buffers = 2GB
work_mem = 1024MB

Sincerely yours, Yuri Levinsky, DBA Celltick Technologies Ltd., 32 Maskit St., Herzliya 46733, Israel Mobile: +972 54 6107703, Office: +972 9 9710239; Fax: +972 9 9710222
Re: [HACKERS] jsonb and nested hstore
I guess this is down to the continued definition of gin_hstore_ops as an opclass with text storage? No, the type of this storage describes the type of the keys. For gin_hstore_ops each key and each value will be stored as a text value. The root of the problem is JavaScript and/or our numeric type. In JavaScript (which was the base for the json type) you need to explicitly specify the type of comparison to prevent unpredictable results.

select '25.0'::numeric = '25'::numeric;
 ?column?
----------
 t

but

select '25.0'::numeric::text = '25'::numeric::text;
 ?column?
----------
 f

and

select '{"a": 25}'::json->'a' = '{"a": 25.0}'::json->'a';
 ?column?
----------
 f

In the example Peter pointed out, the inserted value has "age": 25 but the jsonb value being searched for has "age": 25.0. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
Re: [HACKERS] jsonb and nested hstore
select '{"a": 25}'::json->'a' = '{"a": 25.0}'::json->'a';
 ?column?
----------
 f

Although for the development version of hstore (not the current version):

# select 'a=>25'::hstore = 'a=>25.0'::hstore;
 ?column?
----------
 t

That is because compareJsonbValue compares numeric values with the help of numeric_cmp() instead of comparing the text representation. This inconsistency will be fixed. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
Re: [HACKERS] jsonb and nested hstore
On Tue, Mar 4, 2014 at 2:07 AM, Teodor Sigaev teo...@sigaev.ru wrote: No, the type of this storage describes the type of the keys. For gin_hstore_ops each key and each value will be stored as a text value. The root of the problem is JavaScript and/or our numeric type. In JavaScript (which was the base for the json type) you need to explicitly specify the type of comparison to prevent unpredictable results. That's what I meant, I think. But I'm not sure what you mean:

Native Chrome JavaScript. Copyright (c) 2013 Google Inc.
> 25 == 25
true
> 25 == 25.0
true

-- Peter Geoghegan
Re: [HACKERS] jsonb and nested hstore
On Tue, Mar 4, 2014 at 2:18 AM, Teodor Sigaev teo...@sigaev.ru wrote: That is because compareJsonbValue compares numeric values with the help of numeric_cmp() instead of comparing the text representation. This inconsistency will be fixed. Cool. -- Peter Geoghegan
Re: [HACKERS] jsonb and nested hstore
On Tue, Mar 4, 2014 at 2:18 AM, Peter Geoghegan p...@heroku.com wrote: On Tue, Mar 4, 2014 at 2:18 AM, Teodor Sigaev teo...@sigaev.ru wrote: That is because compareJsonbValue compares numeric values with the help of numeric_cmp() instead of comparing the text representation. This inconsistency will be fixed. Cool. Perhaps this is obvious, but: I expect that you intend to fix the inconsistency by using a native numeric comparison everywhere. Thanks -- Peter Geoghegan
Re: [HACKERS] requested shared memory size overflows size_t
On 03/04/2014 11:59 AM, Yuri Levinsky wrote: Dear Developers, Please help with the following problem. I am running PostgreSQL 9.2.3 on SUN Solaris 9. This is a 64-bit system with 32G swap and 16G RAM. I use the same configuration file as on Linux or SUN Solaris 10, where everything is OK. I am unable to set shared_buffers to 5G; the maximum possible value is 4G. When I decrease the configuration parameters and start the instance successfully, some queries fail with an out of memory error. I verified the kernel parameters: they look the same as on Solaris 10 and big enough. The only difference is: the Solaris 9 PostgreSQL version, in contrast to Solaris 10 and Linux, was compiled by me with default options. Note that if a query fails with out of memory, it does *not* mean that you should increase shared_buffers. On the contrary: the higher you set shared_buffers, the less memory there is left for other things. My kernel is:

set semsys:seminfo_semmap=64
set semsys:seminfo_semmni=4096
set semsys:seminfo_semmns=4096
set semsys:seminfo_semmnu=4096
set semsys:seminfo_semume=64
set semsys:seminfo_semmsl=500
set shmsys:shminfo_shmmax=0x
set shmsys:shminfo_shmmin=100
set shmsys:shminfo_shmmni=4096
set shmsys:shminfo_shmseg=100

Config:

shared_buffers = 3GB
temp_buffers = 2GB
work_mem = 1024MB

temp_buffers = 2GB seems very high. That setting is *per backend*, so if you have 10 backends that all use temporary tables, they will consume 20GB altogether for temp buffers. work_mem works similarly, except that a single query can use many times work_mem even within a single backend, so you need to be even more conservative with that. 1GB seems very high for work_mem. Try resetting these back to the defaults and see if that works for you. Increase them gradually, and only if you have a query where the higher value really helps. - Heikki
Re: [HACKERS] jsonb and nested hstore
I tried try.mongodb.com:

> 25 == 25.0
true

On Tue, Mar 4, 2014 at 2:18 PM, Peter Geoghegan p...@heroku.com wrote: On Tue, Mar 4, 2014 at 2:18 AM, Teodor Sigaev teo...@sigaev.ru wrote: That is because compareJsonbValue compares numeric values with the help of numeric_cmp() instead of comparing the text representation. This inconsistency will be fixed. Cool. -- Peter Geoghegan
Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation
On 03/03/2014 04:57 PM, Andres Freund wrote: On 2014-03-03 16:27:05 +0200, Heikki Linnakangas wrote: Attached is a rewritten version, which does the prefix/suffix tests directly in heapam.c and adds the prefix/suffix lengths directly as fields in the WAL record. I'd appreciate it if you could take one more look at this version, to check whether I've missed anything. Have you rerun the benchmarks? No. I'd guess the CPU overhead of this version is lower than that of earlier versions, That's what I would expect too. but seeing it tested won't be a bad idea. Agreed. Amit, do you have the test setup at hand? Can you check the performance one more time? Also, I removed the GUC and table-level options, on the assumption that this is cheap enough even when it's not helping that we don't need to make it configurable. This ought to be tested with the new logical decoding stuff, as it modifies the WAL update record format, which the logical decoding stuff also relies on; but I don't know anything about that. Hm, I think all it needs to do is disable delta encoding if need_tuple_data (which is dependent on wal_level=logical). That's a pity, but we can live with it. If we did this at a higher level and checked which columns have been modified, we could include just the modified fields in the record, which should be enough for logical decoding. It might be even more useful for logical decoding to know exactly which fields were changed. - Heikki
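For readers unfamiliar with the patch, the prefix/suffix idea can be sketched over text values (illustration only; the helper function is hypothetical, and the real code in heapam.c works on raw tuple bytes in C):

```sql
-- Compute how many leading and trailing characters the old and new
-- versions share; only the middle part would need to go into the
-- WAL record, along with the two lengths.
CREATE OR REPLACE FUNCTION delta_lengths(s_old text, s_new text,
                                         OUT prefix int, OUT suffix int)
AS $$
BEGIN
    prefix := 0;
    WHILE prefix < least(length(s_old), length(s_new))
          AND substr(s_old, prefix + 1, 1) = substr(s_new, prefix + 1, 1) LOOP
        prefix := prefix + 1;
    END LOOP;
    suffix := 0;
    WHILE suffix < least(length(s_old), length(s_new)) - prefix
          AND substr(s_old, length(s_old) - suffix, 1)
              = substr(s_new, length(s_new) - suffix, 1) LOOP
        suffix := suffix + 1;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

-- SELECT * FROM delta_lengths('abcdef', 'abXYef');  -- prefix = 2, suffix = 2
```

This is why typical UPDATEs, which change only a few columns in the middle of a tuple, produce much smaller WAL records under the patch.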
Re: [HACKERS] jsonb and nested hstore
Do we have a function to trim trailing zeros in a numeric? -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)
Hi, On 03/03/14 21:03, Heikki Linnakangas wrote: I spotted this in section 17.4.1 Shared Memory and Semaphores, under Linux: "The default maximum segment size is 32 MB, and the default maximum total size is 2097152 pages. A page is almost always 4096 bytes except in unusual kernel configurations with huge pages (use getconf PAGE_SIZE to verify)." It's not any more wrong now than it's always been, but I don't think huge pages ever affect PAGE_SIZE... Could I cajole you into rephrasing that, too? Hm… to be honest, I'm not sure how to change that. What about this? "The default maximum segment size is 32 MB, and the default maximum total size is 2097152 pages. A page is almost always 4096 bytes except in kernel configurations with <quote>huge pages</quote> (use <literal>cat /proc/meminfo | grep Hugepagesize</literal> to verify), but they have to be enabled explicitly via <xref linkend="guc-huge-pages">. See <xref linkend="linux-huge-pages"> for details." I attached a patch doing this change. Best regards, -- Christian Kruse http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 7f4a235..8811097 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -881,9 +881,12 @@ <option>SEMMAP</option>=256
      <para>
         The default maximum segment size is 32 MB, and the
         default maximum total size is 2097152
-        pages.  A page is almost always 4096 bytes except in unusual
+        pages.  A page is almost always 4096 bytes except in
         kernel configurations with <quote>huge pages</quote>
-        (use <literal>getconf PAGE_SIZE</literal> to verify).
+        (use <literal>cat /proc/meminfo | grep Hugepagesize</literal>
+        to verify), but they have to be enabled explicitly via
+        <xref linkend="guc-huge-pages">. See
+        <xref linkend="linux-huge-pages"> for details.
      </para>
Re: [HACKERS] requested shared memory size overflows size_t
Heikki, I changed postgresql.conf to decrease those parameters, but no change: FATAL: requested shared memory size overflows size_t. My kernel is:

set semsys:seminfo_semmap=64
set semsys:seminfo_semmni=4096
set semsys:seminfo_semmns=4096
set semsys:seminfo_semmnu=4096
set semsys:seminfo_semume=64
set semsys:seminfo_semmsl=500
set shmsys:shminfo_shmmax=0x
set shmsys:shminfo_shmmin=100
set shmsys:shminfo_shmmni=4096
set shmsys:shminfo_shmseg=100

shared_buffers = 5GB                 # min 16 or max_connections*2, 8KB each
temp_buffers = 256MB                 # min 100, 8KB each
max_prepared_transactions = 1000     # can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
# It is not advisable to set max_prepared_transactions nonzero unless you
# actively intend to use prepared transactions.
work_mem = 256MB                     # min 64, size in KB
maintenance_work_mem = 256MB         # min 1024, size in KB
max_stack_depth = 4MB

[L.Y.] temp_buffers = 2GB seems very high. That setting is *per backend*, so if you have 10 backends that all use temporary tables, they will consume 20GB altogether for temp buffers. work_mem works similarly, except that a single query can use many times work_mem even within a single backend, so you need to be even more conservative with that. 1GB seems very high for work_mem. Try resetting these back to the defaults and see if that works for you. Increase them gradually, and only if you have a query where the higher value really helps. - Heikki
Re: [HACKERS] jsonb and nested hstore
On Tue, Mar 4, 2014 at 2:44 AM, Teodor Sigaev teo...@sigaev.ru wrote: Do we have a function to trim trailing zeros in a numeric? I'm not sure why you ask. I hope it isn't because you want to fix this bug by making text comparisons work in place of numeric comparisons, patching over the exact problem I reported, because there are other similar problems, such as differences in lc_numeric settings, that your implementation cannot possibly account for. If that's not what you meant, I think it's okay if apparent trailing zeroes are output under similar circumstances to the numeric type proper. Isn't this kind of thing intentionally left undescribed by the relevant spec anyway? -- Peter Geoghegan
Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire
On Tue, Mar 4, 2014 at 2:18 PM, Simon Riggs si...@2ndquadrant.com wrote: On 4 March 2014 04:18, Amit Kapila amit.kapil...@gmail.com wrote: I know that patch truncates the values if they are greater than a certain length (30), but the point is why it is not sufficient to have the tuple location (and primary key if available) which uniquely identifies any tuple? The patch follows a pattern established elsewhere, so arguing for this change would be a change in existing behaviour that is outside the scope of this patch. Please raise a new thread if you wish that change, especially since it is opposed here. Okay, I very well got this point and I was also not completely sure what is the best thing to do for this specific point; that's why I had asked for the opinion of others upthread, and there is mixed feedback about it. I think the best thing is to leave this point for the final committer's decision and complete the other review/verification of the patch. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation
On 2014-03-04 12:43:48 +0200, Heikki Linnakangas wrote: This ought to be tested with the new logical decoding stuff, as it modified the WAL update record format, which the logical decoding stuff also relies on, but I don't know anything about that. Hm, I think all it needs to do is disable delta encoding if need_tuple_data (which is dependent on wal_level=logical). That's a pity, but we can live with it. Agreed. This is hardly the first optimization that only works for some wal_levels. If we did this at a higher level and checked which columns have been modified, we could include just the modified fields in the record, which should be enough for logical decoding. It might be even more useful for logical decoding to know exactly which fields were changed. Yea, I argued that way elsewhere in this thread. I do think we're going to need per-column info for further features in the near future. It's a bit absurd that we're computing various sets of changed columns (HOT, key, identity) plus the pre/postfix with this patchset. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
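The prefix/suffix delta idea under discussion can be sketched outside PostgreSQL as follows. This is a toy illustration, not the actual heapam.c code; the function name delta_lengths and the flat byte-array view of a tuple are invented for the example:

```c
#include <stddef.h>

/* Toy sketch of prefix/suffix delta detection between an old and a new
 * tuple image.  The real logic works on HeapTuple data in heapam.c; the
 * idea shown here is that WAL can store just the shared prefix length,
 * the shared suffix length, and the changed middle of the new tuple. */
void delta_lengths(const char *oldt, size_t oldlen,
                   const char *newt, size_t newlen,
                   size_t *prefix, size_t *suffix)
{
    size_t p = 0, s = 0;
    size_t minlen = oldlen < newlen ? oldlen : newlen;

    /* longest common prefix */
    while (p < minlen && oldt[p] == newt[p])
        p++;
    /* longest common suffix in the remaining bytes */
    while (s < minlen - p &&
           oldt[oldlen - 1 - s] == newt[newlen - 1 - s])
        s++;
    *prefix = p;
    *suffix = s;
}
```

With an update that changes only the middle of a tuple, only `newlen - prefix - suffix` bytes of payload need to go into the WAL record.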
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On 4 March 2014 09:31, Simon Riggs si...@2ndquadrant.com wrote: On 4 March 2014 08:39, Atri Sharma atri.j...@gmail.com wrote: Good points. In most cases, DDL is applied manually after careful thought, so people seldom dump at the same time they upgrade the database. This is especially true for pg_dump since it captures the logical definition of tables. So most people will be happy with the default locking, but we could make the lock level optional. Currently we use AccessShareLock. Locking out all DDL, even with this patch applied would only require ShareUpdateExclusiveLock. Looking at the code, it will take about an hour to add an option to pg_dump that specifies the lock level used when dumping. I would be happy to include that as part of this patch. I think the use case for specifying multiple locks is pretty slim given that a ShareUpdateExclusiveLock is good enough mostly for everybody. Increasing the lock strength would be a change in behaviour that might adversely affect existing users. The main impact I see is that this would block VACUUM while pg_dump runs. But then, while pg_dump runs VACUUM is ineffective anyway so perhaps that is no bad thing. Autovacuum requests VACOPT_NOWAIT so would skip the relations being dumped rather than waiting. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation
On Mon, Mar 3, 2014 at 7:57 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 02/16/2014 01:51 PM, Amit Kapila wrote: On Wed, Feb 5, 2014 at 5:29 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Thanks. I have to agree with Robert though that using the pglz encoding when we're just checking for a common prefix/suffix is a pretty crappy way of going about it [1]. As the patch stands, it includes the NULL bitmap when checking for a common prefix. That's probably not a good idea, because it defeats the prefix detection in the common case that you update a field from NULL to not-NULL or vice versa. Attached is a rewritten version, which does the prefix/suffix tests directly in heapam.c, and adds the prefix/suffix lengths directly as fields in the WAL record. If you could take one more look at this version, to check if I've missed anything.

I have verified the patch and found a few minor points:

1.
+extern bool heap_delta_encode(TupleDesc tupleDesc, HeapTuple oldtup,
+       HeapTuple newtup, char *encdata, uint32 *enclen);
+extern void heap_delta_decode (char *encdata, uint32 enclen, HeapTuple oldtup,
+       HeapTuple newtup);

Declarations for the above functions are not required now.

2.
+ * later, but will not cause any problem because this function is used only to
+ * identify whether EWT is required for update.
+ */
+bool
+XLogCheckBufferNeedsBackup(Buffer buffer)

Here, I think we can change the comment to avoid the term EWT (encoded WAL tuple), as we have now changed the compression mechanism and it is not used anywhere else.

One question:

+ rdata[1].data = (char *) xlrec;

Earlier it seemed to store the record header as the first segment rdata[0]; what is the reason for changing it?

I have verified the patch by doing crash recovery for the below scenarios and it worked fine:
a. no change in old and new tuple
b. all changed in new tuple
c. half changed (update half of the values to NULLs) in new tuple
d. only prefix same in new tuple
e. only suffix same in new tuple
f.
prefix-suffix same, other column values changed in new tuple.

Performance data

Non-default settings:
autovacuum = off
checkpoint_segments = 256
checkpoint_timeout = 15min
full_page_writes = off

Unpatched:

                testname                 | wal_generated |     duration
-----------------------------------------+---------------+------------------
 one short and one long field, no change |     573506704 | 9.56587505340576
 one short and one long field, no change |     575351216 | 9.97713398933411
 one short and one long field, no change |     573501848 | 9.76377606391907
 hundred tiny fields, all changed        |     364894056 | 13.3053929805756
 hundred tiny fields, all changed        |     364891536 | 13.3533811569214
 hundred tiny fields, all changed        |     364889264 | 13.3041989803314
 hundred tiny fields, half changed       |     365411920 | 14.1831648349762
 hundred tiny fields, half changed       |     365918216 | 13.6393811702728
 hundred tiny fields, half changed       |     366456552 | 13.6420011520386
 hundred tiny fields, half nulled        |     300705288 | 12.8859741687775
 hundred tiny fields, half nulled        |     301665624 | 12.6988201141357
 hundred tiny fields, half nulled        |     300700504 | 13.3536100387573
 9 short and 1 long, short changed       |     396983080 | 8.83671307563782
 9 short and 1 long, short changed       |     396987976 | 9.23769211769104
 9 short and 1 long, short changed       |     396984080 | 9.45178604125977

wal-update-prefix-suffix-5.patch:

                testname                 | wal_generated |     duration
-----------------------------------------+---------------+------------------
 one short and one long field, no change |     156278832 | 6.69434094429016
 one short and one long field, no change |     156277352 | 6.70855903625488
 one short and one long field, no change |     156280040 | 6.70657396316528
 hundred tiny fields, all changed        |     364895152 | 13.6677348613739
 hundred tiny fields, all changed        |     364892256 | 12.7107839584351
 hundred tiny fields, all changed        |     364890424 | 13.7760601043701
 hundred tiny fields, half changed       |     365970360 | 13.1902158260345
 hundred tiny fields, half changed       |     364895120 | 13.5730090141296
 hundred tiny fields, half changed       |     367031168 | 13.7023210525513
 hundred tiny fields, half nulled        |     204418576 | 12.1997199058533
 hundred tiny fields, half nulled        |     204422880 | 11.4583330154419
 hundred tiny fields, half nulled        |     204417464 | 12.0228970050812
 9 short and 1 long, short changed       |     220466016 | 8.14843511581421
 9 short and 1 long, short changed       |     220471168 | 8.03712797164917
 9 short and 1 long, short changed       |     220464464 | 8.55907511711121
(15 rows)

Conclusion is that the patch shows good WAL reduction and performance improvement for favourable cases without CPU overhead for
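As a sanity check on the numbers above: the best case ("one short and one long field, no change") drops from roughly 573 MB to 156 MB of WAL, i.e. a reduction of a bit over 70%. A quick helper to compute that (illustrative only, not part of the patch):

```c
/* Fractional WAL reduction between the unpatched and patched runs,
 * using the wal_generated figures from the benchmark tables above. */
double wal_reduction(double unpatched, double patched)
{
    return 1.0 - patched / unpatched;
}
```

The "hundred tiny fields, all changed" rows show essentially no reduction, as expected: with no common prefix or suffix there is nothing for the delta encoding to elide.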
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On Tue, Mar 4, 2014 at 6:57 AM, Simon Riggs si...@2ndquadrant.com wrote: The main impact I see is that this would block VACUUM while pg_dump runs. But then, while pg_dump runs VACUUM is ineffective anyway so perhaps that is no bad thing. Well, a vacuum that's already running when pg_dump starts up may be doing a lot of good, so it would be a shame to see pg_dump kill them all off. Also, this would put us in the surprising situation that you can't run two simultaneous dumps of overlapping sets of tables, which doesn't strike me as a great thing. I'd really like to see us find a way to apply some version of this patch. I was in favor of the concept 3 years ago when we did this the first time, and I've subsequently done quite a bit of work (viz., MVCC catalog snapshots) to eliminate the main objection that was raised at that time. But it's really hard to reason about what might happen with lowered lock levels, and convince yourself that there's absolutely nothing that can ever go wrong. I don't know what to do about that tension, but I think even modest improvements in this area stand to benefit an awful lot of users. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: option --if-exists for pg_dump
2014-03-04 8:55 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com: 2014-03-03 18:18 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule wrote: This patch has a redesigned implementation of --if-exists for pg_dumpall. Now it is not propagated to pg_dump, but used at the pg_dumpall level. Seems sane, thanks. BTW after this patch, I still don't see error-free output from restoring a database on top of itself. One problem is plpgsql, which is now an extension, so pg_dump emits this error message:

ERROR: cannot drop language plpgsql because extension plpgsql requires it
HINT: You can drop extension plpgsql instead.

Another problem is that some DROP commands don't work. For instance, if the public schema in the target database contains objects that haven't been dropped yet, the DROP command will fail:

ERROR: cannot drop schema public because other objects depend on it
DETAIL: function bt_metap(text) depends on schema public
function bt_page_items(text,integer) depends on schema public
function bt_page_stats(text,integer) depends on schema public
function f() depends on schema public
function get_raw_page(text,integer) depends on schema public
function heap_page_items(bytea) depends on schema public
function locate_tuple_corruption() depends on schema public
function page_header(bytea) depends on schema public
HINT: Use DROP ... CASCADE to drop the dependent objects too.

(The way I got this was by using my 8.2 installation, on which I ran the regression tests; then I dumped the resulting regression database. The database on which I restored wasn't clean, as it contained unrelated junk in the public schema.) I'll recheck the behavior of extensions. I rechecked extensions and it works - so it cannot be fully quiet when an old dump is imported, but importing fresh dumps should work.
Regards Pavel

On the other hand - usually, the preferred way is to use a dump matching the target PostgreSQL release.

Not sure what's the right answer to this problem, but it cannot be attributed to this patch anyway. I'm about to push this, since other than the above problems, this functionality seems to be working as designed. Thank you very much Regards Pavel -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] jsonb and nested hstore
On Tue, Mar 4, 2014 at 2:44 AM, Teodor Sigaev teo...@sigaev.ru wrote: Do we have function to trim right zeros in numeric? Fixed, pushed to github (https://github.com/feodor/postgres/tree/jsonb_and_hstore). Now it uses hash_numeric to index numeric values. As I can see, it provides the needed trim and doesn't depend on locale. A possible mismatch (the same hash value for different numeric values) will be rechecked anyway - the interested operations set the recheck flag. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
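The lossy-hash-plus-recheck pattern Teodor describes can be sketched as follows. This is a toy model, not the actual hstore/jsonb GIN code: the hash here is deliberately crude to force collisions, and both function names are invented. The point is that an equal hash only means "candidate match", and the recheck weeds out collisions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Deliberately lossy toy hash (mod 16) standing in for hash_numeric:
 * distinct values can collide, so the index alone cannot prove equality. */
uint32_t toy_hash(int64_t v)
{
    return (uint32_t)(v & 0xF);
}

/* Index probe: a hash mismatch proves inequality; a hash match is only
 * a candidate, so the real values are rechecked. */
bool index_matches(int64_t stored, int64_t probe)
{
    if (toy_hash(stored) != toy_hash(probe))
        return false;            /* definitely not equal */
    return stored == probe;      /* recheck resolves collisions */
}
```

Because collisions are resolved by the recheck, the hash itself need not preserve any locale- or formatting-sensitive properties of the value, which is why indexing on hash_numeric sidesteps the trailing-zeros question.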
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
I'd really like to see us find a way to apply some version of this patch. I was in favor of the concept 3 years ago when we did this the first time, and I've subsequently done quite a bit of work (viz., MVCC catalog snapshots) to eliminate the main objection that was raised at that time. But it's really hard to reason about what might happen with lowered lock levels, and convince yourself that there's absolutely nothing that can ever go wrong. I don't know what to do about that tension, but I think even modest improvements in this area stand to benefit an awful lot of users. Wouldn't MVCC's strictness rules pose harder restrictions on pg_dump instead of relaxing them? Regards, Atri -- Regards, Atri *l'apprenant*
Re: [HACKERS] Row-security on updatable s.b. views
On 04/03/14 02:36, Craig Ringer wrote: On 02/25/2014 01:28 AM, Dean Rasheed wrote: On 13 February 2014 04:12, Craig Ringer cr...@2ndquadrant.com wrote: It's crashing while pulling up the query over emp (hl7.employee) and part (hl7.participation). Given the simplicity of what the row-security code itself is doing, I'm wondering if this is a case that isn't handled in updatable s.b. views. I'll look into it. I'm not sure how much further you've got with this, but I think the issue is that the securityQuals that you're adding don't refer to the correct RTE. When adding securityQuals to an RTE, they are expected to have Vars whose varno matches the rt_index of the RTE (see for example the code in rewriteTargetView() which calls ChangeVarNodes() on viewqual before adding the qual to securityQuals or the main query jointree). prepend_row_security_quals() doesn't appear to have any similar code, and it would need to be passed the rt_index to do that. Thanks for the pointer. That was indeed the issue. I've pushed an update to the branch with the fix for varno handling. Thanks. It's tagged rls-9.4-upd-sb-views-v8. I've almost run out of time to spend on row security for this commitfest, unfortunately. I'm putting a blog together with a current status update. Frustrating, as it's coming together now. Open issues include:
- Passing plan inval items from rewriter into planner
- COPY support pending
- Clear syntax in DDL
Most of the rest are solved; it's actually looking pretty good. Hi Craig, I've tested the results from the minirim.sql that was posted earlier, and the v8 gives the same results as v4 :-) Thanks for all the work! Yeb -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender can ignore send failures in WalSndLoop
On 02/14/2014 01:13 PM, Andres Freund wrote: There's a small issue in abfd192b, namely one of the error cases wasn't changed when WalSndLoop was changed to be able to return. I don't think this is likely to have any grave consequences, we'll likely error out soon afterwards again. Patch attached. Fixed, thanks! - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Tue, Mar 4, 2014 at 6:48 AM, Teodor Sigaev teo...@sigaev.ru wrote: On Tue, Mar 4, 2014 at 2:44 AM, Teodor Sigaev teo...@sigaev.ru wrote: Do we have function to trim right zeros in numeric? Fixed, pushed to github (https://github.com/feodor/postgres/tree/jsonb_and_hstore). Now it uses hash_numeric to index numeric values. As I can see, it provides the needed trim and doesn't depend on locale. A possible mismatch (the same hash value for different numeric values) will be rechecked anyway - the interested operations set the recheck flag. Huh. What is the standard for equivalence? I guess we'd be following javascript ===, right? (http://dorey.github.io/JavaScript-Equality-Table/). merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On 4 March 2014 12:18, Robert Haas robertmh...@gmail.com wrote: On Tue, Mar 4, 2014 at 6:57 AM, Simon Riggs si...@2ndquadrant.com wrote: The main impact I see is that this would block VACUUM while pg_dump runs. But then, while pg_dump runs VACUUM is ineffective anyway so perhaps that is no bad thing. Well, a vacuum that's already running when pg_dump starts up may be doing a lot of good, so it would be a shame to see pg_dump kill them all off. Also, this would put us in the surprising situation that you can't run two simultaneous dumps of overlapping sets of tables, which doesn't strike me as a great thing. These changes in concurrency are the most serious objections and a definite change in previous behaviour. So we cannot pick a single lock level that suits all goals the user may have. I'd really like to see us find a way to apply some version of this patch. I was in favor of the concept 3 years ago when we did this the first time, and I've subsequently done quite a bit of work (viz., MVCC catalog snapshots) to eliminate the main objection that was raised at that time. But it's really hard to reason about what might happen with lowered lock levels, and convince yourself that there's absolutely nothing that can ever go wrong. I don't know what to do about that tension, but I think even modest improvements in this area stand to benefit an awful lot of users. Agreed. The question is, which subset? The issue just raised would affect whichever subset we choose, so reducing the scope of the patch does nothing to the impact of the pg_dump issue. I will add the option to change lock level for pg_dump. It's simple to use, clear as to why it would be needed and effective at removing this as an obstacle. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
On Mon, Mar 3, 2014 at 5:15 PM, Stephen Frost sfr...@snowman.net wrote: As I mentioned up-thread, I'd really like to see FDW join push-down, FDW aggregate push-down, parallel query execution, and parallel remote-FDW execution and I don't see this CustomScan approach as the right answer to any of those. In accordance with the above, what I'd like to see with this patch is removal of the postgres_fdw changes and any changes which were for that support. In addition, I'd like to understand why 'ctidscan' makes any sense to have as an example of what to use this for- if that's valuable, why wouldn't we simply implement that in core? I do want an example in contrib of how to properly use this capability, but I don't think that's it. I suggested that example to KaiGai at last year's PGCon. It may indeed be something we want to have in core, but right now we don't. More generally, I think this discussion is focusing on the wrong set of issues. The threshold issue for this patch is whether there is a set of hook points that enable a workable custom-scan functionality, and whether KaiGai has correctly identified them. In other words, I think we should be worrying about whether KaiGai's found all of the places that need to be modified to support a custom scan, and whether the modifications he's made to each of those places are correct and adequate. Whether he's picked the best possible example does not strike me as a matter of principal concern, and it's far too late to tell him he's got to go pick a different one at this point anyway. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
* Kouhei Kaigai (kai...@ak.jp.nec.com) wrote: Do you think it makes sense if my submission were only the interface portion, without a working example? No, we're pretty strongly against putting in interfaces which don't have working examples in contrib- for one thing, we want to know when we break it. The purpose of the ctidscan module is, similar to postgres_fdw, to demonstrate the usage of the custom-scan interface at a small enough code scale. If tons of example code were attached, nobody would want to review the patch. I gathered that's why it was included. Is the plan to eventually submit something larger to go into -contrib which will use this interface? Or will it always be external? The cache_scan module that Haribabu and I are discussing in another thread might also be a good demonstration of the custom-scan interface; however, its code scale is a bit larger than ctidscan. That does sound interesting though I'm curious about the specifics... For one thing, an example where you could have this CustomScan node calling other nodes underneath would be interesting. I realize the CTID scan can't do that directly but I would think your GPU-based system could; after all, if you're running a join or an aggregate with the GPU, the rows could come from nearly anything. Have you considered that, or is the expectation that users will just go off and access the heap and/or whatever indexes directly, like ctidscan does? How would such a requirement be handled? When a custom-scan node has underlying nodes, they shall be invoked using ExecProcNode, as built-in nodes do, so it will be able to fetch the tuples coming from the underlying nodes. Of course, a custom-scan provider can present tuples that come from somewhere else as if they came from the underlying relation. That is the responsibility of the extension module. In some cases, it will be required to return junk system attributes, like ctid, for row-level locks or table updating.
That is also the responsibility of the extension module (or it should not add a custom path if the custom-scan provider cannot perform as required). Right, tons of work to do to make it all fit together and play nice- what I was trying to get at is: has this actually been done? Has the GPU extension that you're talking about as the use case for this been written? How does it handle all of the above? Or are we going through all these gyrations in the vain hope that it'll actually all work when someone tries to use it for something real? Thanks, Stephen
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
On Tue, Mar 4, 2014 at 7:39 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Mar 3, 2014 at 5:15 PM, Stephen Frost sfr...@snowman.net wrote: As I mentioned up-thread, I'd really like to see FDW join push-down, FDW aggregate push-down, parallel query execution, and parallel remote-FDW execution and I don't see this CustomScan approach as the right answer to any of those. In accordance with the above, what I'd like to see with this patch is removal of the postgres_fdw changes and any changes which were for that support. In addition, I'd like to understand why 'ctidscan' makes any sense to have as an example of what to use this for- if that's valuable, why wouldn't we simply implement that in core? I do want an example in contrib of how to properly use this capability, but I don't think that's it. I suggested that example to KaiGai at last year's PGCon. It may indeed be something we want to have in core, but right now we don't. More generally, I think this discussion is focusing on the wrong set of issues. The threshold issue for this patch is whether there is a set of hook points that enable a workable custom-scan functionality, and whether KaiGai has correctly identified them. In other words, I think we should be worrying about whether KaiGai's found all of the places that need to be modified to support a custom scan, and whether the modifications he's made to each of those places are correct and adequate. Whether he's picked the best possible example does not strike me as a matter of principal concern, and it's far too late to tell him he's got to go pick a different one at this point anyway. There are so many places in the planner and optimizer code, where we create various types of paths and the number of such paths is again significant, if not large. If we want the custom scan contrib module to work in all those cases (which seems to be the intention here), then we have to expose so many hooks. I don't think all of those hooks have been identified. 
The second problem is that the functions which create those paths have signatures difficult enough to expose as hooks. Take the example of the join hook that was exposed. These function signatures do get changed from time to time, and thus the corresponding hooks need to be changed too. This is going to be a maintenance burden. So, unless we have some way of exposing these hooks such that the definitions of the hooks are independent of the internal function signatures, supporting custom scans looks difficult. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
* Ashutosh Bapat (ashutosh.ba...@enterprisedb.com) wrote: During EXPLAIN, ExecInitNode() is called. If ExecInitNode() fires queries to foreign servers, those would be fired while EXPLAINing a query as well. We want to avoid that. Instead, we can run EXPLAIN on that query at foreign server. But again, not all foreign servers would be able to EXPLAIN the query e.g. file_fdw. OR totally avoid firing query during ExecInitNode(), if it's for EXPLAIN (except for ANALYSE may be). Agreed that we wouldn't want to actually run a query when it's just being explain'd. If the FDW can't tell the difference then we'd need to address that, of course. A similar issue would, presumably, be around prepare/execute, though I haven't looked yet. These kinds of issues are why it was option '#2' instead of '#1'. :) I'm not sure they're able to be addressed. :/ Thanks, Stephen
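The usual way an executor node avoids doing real work during EXPLAIN is to test the executor flags at initialization time. The sketch below mirrors that pattern in isolation: the flag value is modeled on PostgreSQL's EXEC_FLAG_EXPLAIN_ONLY, while my_begin_scan and connect_and_fire_remote_query are invented for illustration:

```c
#include <stdbool.h>

/* Modeled on the executor flag of the same name; the real one is passed
 * down to node-initialization callbacks in eflags. */
#define EXEC_FLAG_EXPLAIN_ONLY 0x0001

bool remote_query_fired = false;

/* Stand-in for opening a connection and firing a remote query. */
void connect_and_fire_remote_query(void)
{
    remote_query_fired = true;
}

/* A scan provider's begin-scan callback: build local state, but skip
 * remote work entirely when the plan is only being EXPLAINed. */
void my_begin_scan(int eflags)
{
    if (eflags & EXEC_FLAG_EXPLAIN_ONLY)
        return;
    connect_and_fire_remote_query();
}
```

An FDW or custom-scan provider that follows this pattern can tell the difference Stephen mentions, at least for plain EXPLAIN; EXPLAIN ANALYZE, which really executes the plan, would not set the flag.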
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
2014-03-04 23:09 GMT+09:00 Robert Haas robertmh...@gmail.com: On Mon, Mar 3, 2014 at 5:15 PM, Stephen Frost sfr...@snowman.net wrote: As I mentioned up-thread, I'd really like to see FDW join push-down, FDW aggregate push-down, parallel query execution, and parallel remote-FDW execution and I don't see this CustomScan approach as the right answer to any of those. In accordance with the above, what I'd like to see with this patch is removal of the postgres_fdw changes and any changes which were for that support. In addition, I'd like to understand why 'ctidscan' makes any sense to have as an example of what to use this for- if that's valuable, why wouldn't we simply implement that in core? I do want an example in contrib of how to properly use this capability, but I don't think that's it. I suggested that example to KaiGai at last year's PGCon. It may indeed be something we want to have in core, but right now we don't. More generally, I think this discussion is focusing on the wrong set of issues. The threshold issue for this patch is whether there is a set of hook points that enable a workable custom-scan functionality, and whether KaiGai has correctly identified them. In other words, I think we should be worrying about whether KaiGai's found all of the places that need to be modified to support a custom scan, and whether the modifications he's made to each of those places are correct and adequate. Whether he's picked the best possible example does not strike me as a matter of principal concern, and it's far too late to tell him he's got to go pick a different one at this point anyway. That is definitely the point to be discussed here. Even though I *believe* I could put in the callbacks needed to implement an alternative join / scan, it may lead to a different conclusion from another person's viewpoint.
At least, I could implement a custom scan as an alternative to a join using postgres_fdw; however, it's uncertain whether I could cover all the possible cases we should care about. So, I'd like to see comments from others. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
* Robert Haas (robertmh...@gmail.com) wrote: More generally, I think this discussion is focusing on the wrong set of issues. The threshold issue for this patch is whether there is a set of hook points that enable a workable custom-scan functionality, and whether KaiGai has correctly identified them. Right- I was trying to hit on that in my email this morning. Thanks, Stephen
Re: [HACKERS] UNION ALL on partitioned tables won't use indices.
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes: Hello, I examined the your patch and it seemed reasonable, but I have one question about this patch. You made ec_relids differ to the union of all ec members' em_relids. Is it right? ec_relids has never included child relids. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] requested shared memory size overflows size_t
On Tue, Mar 4, 2014 at 6:05 AM, Yuri Levinsky yu...@celltick.com wrote: I changed postgresql.conf to decrease those parameters but no change: GMT54000FATAL: requested shared memory size overflows size_t I think this means you are running on a 32-bit operating system, or at least on a 32-bit build. That means you can't use more than 4GB of address space per process, which has to fit shared_buffers and everything else. Typically it's best not to set shared_buffers above 2-2.5GB on such systems, but the real solution is to use a 64-bit PostgreSQL. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
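The arithmetic behind the failure Robert describes is simple: the total shared memory request is computed in a size_t, and on a 32-bit build a 5GB shared_buffers alone cannot fit in 32 bits. A sketch of the check (fits_in_size_t is invented for illustration, not a PostgreSQL function):

```c
#include <stdint.h>

/* Does a shared memory request of request_bytes fit in a size_t of the
 * given width?  On a 32-bit build, size_t tops out at 2^32 - 1 (~4GB). */
int fits_in_size_t(uint64_t request_bytes, unsigned size_t_bits)
{
    uint64_t max = (size_t_bits >= 64)
        ? UINT64_MAX
        : (((uint64_t) 1 << size_t_bits) - 1);
    return request_bytes <= max;
}
```

In practice the usable limit on 32-bit systems is lower still, since the same address space must also hold the executable, stack, and per-backend allocations, which is why 2-2.5GB is the suggested ceiling for shared_buffers there.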
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Mon, Mar 3, 2014 at 12:08 PM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: Is the TODO item "make an unlogged table logged" [1] a good GSoC project? I'm pretty sure we found some problems in that design that we couldn't figure out how to solve. I don't have a pointer to the relevant -hackers discussion off-hand, but I think there was one. ISTR the discussion going something along the lines of "we'd have to WAL-log the entire table to do that, and if we have to do that, what's the point?". No, not really. The issue is more around what happens if we crash part way through. At crash recovery time, the system catalogs are not available, because the database isn't consistent yet and, anyway, the startup process can't be bound to a database, let alone every database that might contain unlogged tables. So the sentinel that's used to decide whether to flush the contents of a table or index is the presence or absence of an _init fork, which the startup process obviously can see just fine. The _init fork also tells us what to stick in the relation when we reset it; for a table, we can just reset to an empty file, but that's not legal for indexes, so the _init fork contains a pre-initialized empty index that we can just copy over. Now, to make an unlogged table logged, you've got to at some stage remove those _init forks. But this is not a transactional operation. If you remove the _init forks and then the transaction rolls back, you've left the system in an inconsistent state. If you postpone the removal until commit time, then you have a problem if it fails, particularly if it works for the first file but fails for the second. And if you crash at any point before you've fsync'd the containing directory, you have no idea which files will still be on disk after a hard reboot.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Row-security on updatable s.b. views
On 04/03/14 02:36, Craig Ringer wrote: On 02/25/2014 01:28 AM, Dean Rasheed wrote: On 13 February 2014 04:12, Craig Ringer cr...@2ndquadrant.com wrote: It's crashing while pulling up the query over emp (hl7.employee) and part (hl7.participation). Given the simplicity of what the row-security code itself is doing, I'm wondering if this is a case that isn't handled in updatable s.b. views. I'll look into it. I'm not sure how much further you've got with this, but I think the issue is that the securityQuals that you're adding don't refer to the correct RTE. When adding securityQuals to an RTE, they are expected to have Vars whose varno matches the rt_index of the RTE (see for example the code in rewriteTargetView() which calls ChangeVarNodes() on viewqual before adding the qual to securityQuals or the main query jointree). prepend_row_security_quals() doesn't appear to have any similar code, and it would need to be passed the rt_index to do that. Thanks for the pointer. That was indeed the issue. I've pushed an update to the branch with the fix for varno handling. Thanks. It's tagged rls-9.4-upd-sb-views-v8 . I've almost run out of time to spend on row security for this commitfest, unfortunately. I'm putting a blog together with a current status update. Frustrating, as it's coming together now. Open issues include:

- Passing plan inval items from rewriter into planner
- COPY support pending
- Clear syntax in DDL

Most of the rest are solved; it's actually looking pretty good. Hi Craig, I've tested the results from the minirim.sql that was posted earlier, and the v8 gives the same results as v4 :-) Thanks for all the work! Yeb -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
* Atri Sharma (atri.j...@gmail.com) wrote: If it's not the case, the user should be more careful about when he is scheduling backups so that they don't conflict with DDL changes. I'm not following this as closely as I'd like to, but I wanted to voice my opinion that this is just not acceptable as a general answer. There are a good many applications out there which do DDL as part of ongoing activity (part of ETL, or something else) and still need to be able to get a pg_dump done. It's not a design I'd recommend, but I don't think we get to just write it off either. Thanks, Stephen
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
Stephen Frost sfr...@snowman.net writes: * Ashutosh Bapat (ashutosh.ba...@enterprisedb.com) wrote: During EXPLAIN, ExecInitNode() is called. If ExecInitNode() fires queries to foreign servers, those would be fired while EXPLAINing a query as well. We want to avoid that. Instead, we can run EXPLAIN on that query at foreign server. But again, not all foreign servers would be able to EXPLAIN the query e.g. file_fdw. OR totally avoid firing query during ExecInitNode(), if it's for EXPLAIN (except for ANALYSE may be). Agreed that we wouldn't want to actually run a query when it's just being explain'd. If the FDW can't tell the difference then we'd need to address that, of course. EXEC_FLAG_EXPLAIN_ONLY ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On 2014-03-04 09:47:08 -0500, Robert Haas wrote: On Mon, Mar 3, 2014 at 12:08 PM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: Is the TODO item make an unlogged table logged [1] a good GSoC project? I'm pretty sure we found some problems in that design that we couldn't figure out how to solve. I don't have a pointer to the relevant -hackers discussion off-hand, but I think there was one. ISTR the discussion going something along the lines of we'd have to WAL log the entire table to do that, and if we have to do that, what's the point?. No, not really. The issue is more around what happens if we crash part way through. At crash recovery time, the system catalogs are not available, because the database isn't consistent yet and, anyway, the startup process can't be bound to a database, let alone every database that might contain unlogged tables. So the sentinel that's used to decide whether to flush the contents of a table or index is the presence or absence of an _init fork, which the startup process obviously can see just fine. The _init fork also tells us what to stick in the relation when we reset it; for a table, we can just reset to an empty file, but that's not legal for indexes, so the _init fork contains a pre-initialized empty index that we can just copy over. Now, to make an unlogged table logged, you've got to at some stage remove those _init forks. But this is not a transactional operation. If you remove the _init forks and then the transaction rolls back, you've left the system an inconsistent state. If you postpone the removal until commit time, then you have a problem if it fails, particularly if it works for the first file but fails for the second. And if you crash at any point before you've fsync'd the containing directory, you have no idea which files will still be on disk after a hard reboot. 
Can't that be solved by just creating the permanent relation in a new relfilenode? That's equivalent to a rewrite, yes, but we need to do that for anything but wal_level=minimal anyway. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] requested shared memory size overflows size_t
Robert, Please advise me: I just downloaded the source and compiled it. Sun Spark Solaris 9 is always 64 bit, I verified it with sys admin. He may run 32 bit applications as well. Have I use some special option during compilation to verify that compiled PostgreSQL is actually 64 bit app? Sincerely yours, Yuri Levinsky, DBA Celltick Technologies Ltd., 32 Maskit St., Herzliya 46733, Israel Mobile: +972 54 6107703, Office: +972 9 9710239; Fax: +972 9 9710222 -Original Message- From: Robert Haas [mailto:robertmh...@gmail.com] Sent: Tuesday, March 04, 2014 4:31 PM To: Yuri Levinsky Cc: Heikki Linnakangas; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] requested shared memory size overflows size_t On Tue, Mar 4, 2014 at 6:05 AM, Yuri Levinsky yu...@celltick.com wrote: I changed postgresql.conf to decrease those parameters but no change: GMT54000FATAL: requested shared memory size overflows size_t I think this means you are running on a 32-bit operating system, or at least on a 32-bit build. That means you can't use more than 4GB of address space per process, which has to fit shared_buffers and everything else. Typically it's best not to set shared_buffers above 2-2.5GB on such systems, but the real solution is to use a 64-bit PostgreSQL. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
2014-03-04 23:10 GMT+09:00 Stephen Frost sfr...@snowman.net: The cache_scan module that I and Haribabu are discussing in another thread also might be a good demonstration for custom-scan interface, however, its code scale is a bit larger than ctidscan. That does sound interesting though I'm curious about the specifics... This module caches a part of columns, but not all, thus allows to hold much larger number of records for a particular amount of RAM than the standard buffer cache. It is constructed on top of custom-scan node, and also performs a new hook for a callback on page vacuuming to invalidate its cache entry. (I originally designed this module for demonstration of on-vacuum hook because I already made ctidscan and postgres_fdw enhancement for custom-scan node, by the way.) For one thing, an example where you could have this CustomScan node calling other nodes underneath would be interesting. I realize the CTID scan can't do that directly but I would think your GPU-based system could; after all, if you're running a join or an aggregate with the GPU, the rows could come from nearly anything. Have you considered that, or is the expectation that users will just go off and access the heap and/or whatever indexes directly, like ctidscan does? How would such a requirement be handled? In case when custom-scan node has underlying nodes, it shall be invoked using ExecProcNode as built-in node doing, then it will be able to fetch tuples come from underlying nodes. Of course, custom-scan provider can perform the tuples come from somewhere as if it came from underlying relation. It is responsibility of extension module. In some cases, it shall be required to return junk system attribute, like ctid, for row-level locks or table updating. It is also responsibility of the extension module (or, should not add custom- path if this custom-scan provider cannot perform as required). 
Right, tons of work to do to make it all fit together and play nice - what I was trying to get at is: has this actually been done? Is the GPU extension that you're talking about as the use-case for this been written? It's a chicken-and-egg problem, because implementation of the extension module fully depends on the interface from the backend. Unlike the commit-fest, there is no deadline for my extension module, so I put higher priority on the submission of the custom-scan node than on the extension. However, the GPU extension is not fully theoretical stuff. I had implemented a prototype using FDW APIs, and it allowed sequential scans to be accelerated if the query has sufficiently complicated qualifiers. See the movie (from 2:45). The table t1 is a regular table, and t2 is a foreign table. Both of them have the same contents; however, response time of the query is much faster if GPU acceleration is working. http://www.youtube.com/watch?v=xrUBffs9aJ0 So, I'm confident that GPU acceleration will show a performance gain once it can run on regular tables, not only foreign tables. How does it handle all of the above? Or are we going through all these gyrations in vain hope that it'll actually all work when someone tries to use it for something real? I'm not talking about something difficult. If a junk attribute requires returning the ctid of the tuple, the custom-scan provider reads a tuple of the underlying relation then includes a correct item pointer. If this custom-scan is designed to run on the cache, all it needs to do is reconstruct a tuple with the correct item-pointer (thus this cache needs to have the ctid also). It's all I did in the cache_scan module. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpgsql.warn_shadow
Joel Jacobson j...@trustly.com writes: On Tue, Mar 4, 2014 at 12:55 AM, Tom Lane t...@sss.pgh.pa.us wrote: You're reasoning from a false premise: it's *not* necessarily an error. Isn't this almost exactly the same situation as we had in 9.0? PL/pgSQL now throws an error if a variable name conflicts with a column name used in a query (Tom Lane) No; the reason why the old behavior was problematic was precisely that it failed to conform to normal block-structured language design rules (namely that the most closely nested definition should win). If it had been like that to start with we'd probably have just left it that way. The complexity of behavior that you see there today is there to help people with debugging issues created by that change of behavior. While I don't necessarily have an objection to creating a way to help debug variable-name-shadowing issues, the idea that they're broken and we can just start throwing errors is *wrong*. The whole point of block structure in a language is that a block of code can be understood independently of what surrounds it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Securing make check (CVE-2014-0067)
On Sun, Mar 02, 2014 at 05:38:38PM -0500, Noah Misch wrote: Concerning the immediate fix for non-Windows systems, does any modern system ignore modes of Unix domain sockets? It appears to be a long-fixed problem: http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-1999-1402 http://unix.stackexchange.com/questions/83032/which-systems-do-not-honor-socket-read-write-permissions Nonetheless, it would be helpful for folks to test any rare platforms they have at hand. Start a postmaster with --unix-socket-permissions= and attempt to connect via local socket. If psql gives something other than psql: could not connect to server: Permission denied, please report it. Some results are in. Both Solaris 10 and omnios-6de5e81 (OmniOS v11 r151008) ignore socket modes. That justifies wrapping the socket in a directory. -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
huh. what is the standard for equivalence? I guess we'd be following javascript ===, right? (http://dorey.github.io/JavaScript-Equality-Table/). right. But in your link I don't understand the array (and object) equality rules. The hstore (and jsonb) compare function considers arrays equal if each pair of corresponding elements is equal.

postgres=# select 'a=>[]'::hstore = 'a=>[]'::hstore;
 ?column?
----------
 t
(1 row)

Time: 0,576 ms
postgres=# select 'a=>[0]'::hstore = 'a=>[0]'::hstore;
 ?column?
----------
 t
(1 row)

Time: 0,663 ms
postgres=# select 'a=>[0]'::hstore = 'a=>[1]'::hstore;
 ?column?
----------
 f

-- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Tue, Mar 4, 2014 at 3:31 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-04 01:10:50 -0300, Fabrízio de Royes Mello wrote: Today I do something like that:

1) create unlogged table tmp_foo ...
2) populate 'tmp_foo' table (ETL scripts or whatever)
3) start transaction
4) lock table tmp_foo in access exclusive mode
5) update pg_class set relpersistence = 'p' where oid = 'tmp_foo'::regclass
6) drop table foo; -- the old foo table
7) alter table tmp_foo rename to foo;
8) end transaction
9) run pg_repack on table 'foo'

I know it's very ugly, but it works... and works for standbys too... :-) No, it doesn't work. It just may happen to not fail loudly/visibly in some cases. You're absolutely risking corruption of this *and* other relations when doing so. Well, this has already worked for some time, but you are correct, the risk of corruption exists! But in my case, if everything runs without interruption, the relation is switched to logged. I do some checks before and after, and if something happens with this process we clean up everything and start from the beginning. Maybe I should run CLUSTER inside the transaction block after updating pg_class, and execute the DROP and RENAME afterwards, in a second phase. Maybe this way is more secure. Is it? If a crash occurs and PostgreSQL restarts, I check whether the unlogged table 'tmp_foo' exists and then drop it. Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] Securing make check (CVE-2014-0067)
On Mon, Mar 03, 2014 at 08:15:41PM -0500, Tom Lane wrote: Noah Misch n...@leadboat.com writes: On Mon, Mar 03, 2014 at 01:29:00AM -0500, Tom Lane wrote: What I was envisioning was that we'd be relying on the permissions of the containing directory to keep out bad guys. Permissions on the socket itself might be sufficient, but what does it save us to assume that? My first preference is to use the simplest code that POSIX requires to have the behavior we desire. POSIX specifies as implementation-defined whether connect() checks filesystem permissions. That's true of both directory search permissions and permissions on the socket itself. Surely you are misinterpreting that. If it worked as you suggest, connect() would provide a trivial method of bypassing directory permissions, at least to the extent of being able to probe for the existence of files within supposedly-unreadable directories. I can believe that there are implementations that don't examine the permissions on the socket file itself, but not that there are any that disregard directory permissions during the file lookup. Wherever POSIX makes something implementation-defined, it is possible to design a conforming system with detestable properties. That does not shake the fact that this behavior is indeed implementation-defined. I found no evidence either way concerning the prevalence of systems that ignore directory search permissions above sockets. That's because the question is ridiculous on its face, so nobody ever bothered to address it. I think the burden is on you to show that there has ever been any system that read the spec the way you propose. I doubt any exist. I don't care for interposing a directory based solely on the fact that some ancient systems needed that. Changing unix_socket_permissions is a one-liner in each test driver. Placing the socket in a directory entails setting PGHOST in the psql and postmaster environments and cleaning up the directory on exit. 
Placing the socket anywhere besides the default location will require setting PGHOST anyway, so I don't see that this argument holds much water. If we have moved the socket anyway, then the costs of moving the socket vanish? Yes, yes they do... Your responses have not added to this thread. -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
* Tom Lane (t...@sss.pgh.pa.us) wrote: Stephen Frost sfr...@snowman.net writes: * Ashutosh Bapat (ashutosh.ba...@enterprisedb.com) wrote: During EXPLAIN, ExecInitNode() is called. If ExecInitNode() fires queries to foreign servers, those would be fired while EXPLAINing a query as well. We want to avoid that. Instead, we can run EXPLAIN on that query at foreign server. But again, not all foreign servers would be able to EXPLAIN the query e.g. file_fdw. OR totally avoid firing query during ExecInitNode(), if it's for EXPLAIN (except for ANALYSE may be). Agreed that we wouldn't want to actually run a query when it's just being explain'd. If the FDW can't tell the difference then we'd need to address that, of course. EXEC_FLAG_EXPLAIN_ONLY ... Yeah, figured there should be a way. Still not sure that kicking the query off from ExecInitNode() is a good idea though. Perhaps it could be optional somehow. I really like the idea of being able to make Append work in an async mode where it's pulling data from multiple sources at the same time, but it's a fair bit of work. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
Stephen Frost escribió: * Atri Sharma (atri.j...@gmail.com) wrote: If it's not the case, the user should be more careful about when he is scheduling backups so that they don't conflict with DDL changes. I'm not following this as closely as I'd like to, but I wanted to voice my opinion that this is just not acceptable as a general answer. There are a good many applications out there which do DDL as part of ongoing activity (part of ETL, or something else) and still need to be able to get a pg_dump done. It's not a design I'd recommend, but I don't think we get to just write it off either. Agreed -- user caution is a recipe for trouble, because these things cannot always be planned in minute detail (or such planning creates an excessive cost.) One concern is schema changes that make a dump unrestorable, for instance if there's a foreign key relationship between tables A and B, such that pg_dump dumps the FK for table A but by the time it dumps table B the unique index has gone and thus restoring the FK fails. If this is a realistic failure scenario, then we need some mechanism to avoid it. One possible idea would be to create a new lock level which conflicts with DDL changes but not with regular operation including dumps; so it wouldn't self-conflict but it would conflict with ShareUpdateExclusive. pg_dump would acquire a lock of that level instead of AccessShare; thus two pg_dumps would be able to run on the same table simultaneously, but it would block and be blocked by DDL changes that grab SUE. The big hole in this is that pg_dump would still block vacuum, which is a problem. I hesitate to suggest two extra levels, one for dumps (which wouldn't conflict with SUE) and one for non-exclusive DDL changes (which would.) 
-- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_upgrade: allow multiple -o/-O options
Hello,

RFE: Consider that you want to run pg_upgrade via some script with some default '-o' option. But then you also want to give the script's user a chance to specify the old-server's options according to the user's needs. Then something like the following is not possible:

$ cat script
...
pg_upgrade ... -o 'sth' $PG_UPGRADE_OPT ...
...

I know that this problem is still script-able, but the fix should be innocent and it would simplify things.

Thanks for considering,
Pavel

From 44ac4867a6fb67ab086ba22db8d0ad2788e9860e Mon Sep 17 00:00:00 2001
From: Pavel Raiskup prais...@redhat.com
Date: Tue, 4 Mar 2014 15:58:16 +0100
Subject: [PATCH] pg_upgrade: allow passing multiple -o/-O options

The final options passed to subsequent servers are concatenated from particular option arguments.
---
 contrib/pg_upgrade/option.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/contrib/pg_upgrade/option.c b/contrib/pg_upgrade/option.c
index cd9f66d..d3d2460 100644
*** a/contrib/pg_upgrade/option.c
--- b/contrib/pg_upgrade/option.c
*** 25,30 ****
--- 25,31 ----
  static void usage(void);
  static void check_required_directory(char **dirpath, char **configpath,
  				char *envVarName, char *cmdLineOption, char *description);
+ static void append_opt(char **str, const char *str2);

  #define FIX_DEFAULT_READ_ONLY "-c default_transaction_read_only=false"

*** parseCommandLine(int argc, char *argv[])
*** 138,148 ****
  				break;

  			case 'o':
! 				old_cluster.pgopts = pg_strdup(optarg);
  				break;

  			case 'O':
! 				new_cluster.pgopts = pg_strdup(optarg);
  				break;
--- 139,149 ----
  				break;

  			case 'o':
! 				append_opt(&old_cluster.pgopts, optarg);
  				break;

  			case 'O':
! 				append_opt(&new_cluster.pgopts, optarg);
  				break;

*** parseCommandLine(int argc, char *argv[])
*** 232,237 ****
--- 233,252 ----
  }

+ static void
+ append_opt(char **str, const char *str2)
+ {
+ 	if (!*str)
+ 		*str = pg_strdup(str2);
+ 	else
+ 	{
+ 		*str = pg_realloc(*str, strlen(*str) + strlen(str2) + 2);
+ 		strcat(*str, " ");
+ 		strcat(*str, str2);
+ 	}
+ }
+
  static void
  usage(void)
  {
--
1.8.5.3
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On Tue, Mar 4, 2014 at 8:19 PM, Stephen Frost sfr...@snowman.net wrote: * Atri Sharma (atri.j...@gmail.com) wrote: If it's not the case, the user should be more careful about when he is scheduling backups so that they don't conflict with DDL changes. I'm not following this as closely as I'd like to, but I wanted to voice my opinion that this is just not acceptable as a general answer. There are a good many applications out there which do DDL as part of ongoing activity (part of ETL, or something else) and still need to be able to get a pg_dump done. It's not a design I'd recommend, but I don't think we get to just write it off either. Well, that will require something like MVCC or stricter locking in general. That is not in line with the aim of this patch, hence I raised this point. Regards, Atri *l'apprenant*
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Tue, Mar 4, 2014 at 11:50 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-04 09:47:08 -0500, Robert Haas wrote: On Mon, Mar 3, 2014 at 12:08 PM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: Is the TODO item make an unlogged table logged [1] a good GSoC project? I'm pretty sure we found some problems in that design that we couldn't figure out how to solve. I don't have a pointer to the relevant -hackers discussion off-hand, but I think there was one. ISTR the discussion going something along the lines of we'd have to WAL log the entire table to do that, and if we have to do that, what's the point?. No, not really. The issue is more around what happens if we crash part way through. At crash recovery time, the system catalogs are not available, because the database isn't consistent yet and, anyway, the startup process can't be bound to a database, let alone every database that might contain unlogged tables. So the sentinel that's used to decide whether to flush the contents of a table or index is the presence or absence of an _init fork, which the startup process obviously can see just fine. The _init fork also tells us what to stick in the relation when we reset it; for a table, we can just reset to an empty file, but that's not legal for indexes, so the _init fork contains a pre-initialized empty index that we can just copy over. Now, to make an unlogged table logged, you've got to at some stage remove those _init forks. But this is not a transactional operation. If you remove the _init forks and then the transaction rolls back, you've left the system an inconsistent state. If you postpone the removal until commit time, then you have a problem if it fails, particularly if it works for the first file but fails for the second. 
And if you crash at any point before you've fsync'd the containing directory, you have no idea which files will still be on disk after a hard reboot. Can't that be solved by just creating the permanent relation in a new relfilenode? That's equivalent to a rewrite, yes, but we need to do that for anything but wal_level=minimal anyway. Did you see this initial patch [1] from Leonardo Francalanci ? Regards, [1] http://www.postgresql.org/message-id/263033.9223...@web29013.mail.ird.yahoo.com -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] VACUUM FULL/CLUSTER doesn't update pg_class's pg_class.relfrozenxid
On Mon, Mar 3, 2014 at 7:52 AM, Robert Haas robertmh...@gmail.com wrote: But all that having been said, a deadline is a deadline, so if anyone wishes to declare this untimely please speak up. Hearing only crickets, committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpgsql.warn_shadow
On Tue, Mar 4, 2014 at 4:06 PM, Tom Lane t...@sss.pgh.pa.us wrote: Joel Jacobson j...@trustly.com writes: On Tue, Mar 4, 2014 at 12:55 AM, Tom Lane t...@sss.pgh.pa.us wrote: You're reasoning from a false premise: it's *not* necessarily an error. Isn't this almost exactly the same situation as we had in 9.0? PL/pgSQL now throws an error if a variable name conflicts with a column name used in a query (Tom Lane) No; the reason why the old behavior was problematic was precisely that it failed to conform to normal block-structured language design rules (namely that the most closely nested definition should win). If it had been like that to start with we'd probably have just left it that way. The complexity of behavior that you see there today is there to help people with debugging issues created by that change of behavior. While I don't necessarily have an objection to creating a way to help debug variable-name-shadowing issues, the idea that they're broken and we can just start throwing errors is *wrong*. The whole point of block structure in a language is that a block of code can be understood independently of what surrounds it. I agree it should be possible to reuse a variable in a new block, but I think the IN/OUT variable should be considered to be at the *same* block-level as the first block of code, thus an error should be thrown. Consider the same scenario in for instance Perl:

# Example 1
# Prints 1 and doesn't throw an error, which is perfectly OK.
use warnings;
my $foo = 1;
{
    my $foo = 2;
}
print $foo;

# Example 2
# "my" variable $foo masks earlier declaration in same scope at warn_shadow.pl line 3.
use warnings;
my $foo = 1;
my $foo = 2;
print $foo;

Or maybe this is a better example, since we are talking about functions:

# Example 3
# "my" variable $bar masks earlier declaration in same scope at warn_shadow.pl line 7.
use warnings;
sub foo {
    # IN-variables:
    my ($bar) = @_;
    # DECLARE:
    my $bar;
    # BEGIN:
    $bar = 1;
    return $bar;
}
foo(2);

I understand that from a technical perspective, the mandatory BEGIN...END you always need in a PL/pgSQL function is a new block, and the variables declared are perhaps technically in a new block, at a deeper level than the IN/OUT variables. But I would still argue that the expected behaviour of PL/pgSQL for a new user would be to consider the IN/OUT variables to be in the same block as the variables declared in the function's first block. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On Tue, Mar 4, 2014 at 10:17 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: One possible idea would be to create a new lock level which conflicts with DDL changes but not with regular operation including dumps; so it wouldn't self-conflict but it would conflict with ShareUpdateExclusive. pg_dump would acquire a lock of that level instead of AccessShare; thus two pg_dumps would be able to run on the same table simultaneously, but it would block and be blocked by DDL changes that grab SUE. The big hole in this is that pg_dump would still block vacuum, which is a problem. I hesitate two suggest two extra levels, one for dumps (which wouldn't conflict with SUE) and one for non-exclusive DDL changes (which would.) AFAIK, the only reason why vacuum takes ShareUpdateExclusive lock is because it can't run at the same time as another vacuum. I tend to think (and have thought for some time) that we really ought to have vacuum take AccessShareLock on the relation plus some other lock that is specific to vacuum (say, a relation vacuum lock, just as we have relation extension locks). Your idea of a lock strong enough to conflict with DDL but not self-conflicting is interesting, too, but I can't claim to have thought through it all that carefully just yet. I think this is all too late for 9.4, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
Alvaro Herrera alvhe...@2ndquadrant.com writes: One concern is schema changes that make a dump unrestorable, for instance if there's a foreign key relationship between tables A and B, Yeah. Ideally, what pg_dump would produce would be a consistent snapshot of the database state as of its transaction snapshot time. We have always had that guarantee so far as user data was concerned, but it's been shaky (and getting worse) so far as the database schema is concerned. What bothers me about the current patch is that it's going to make it a whole lot more worse. Also, I don't have any love at all for proposals that we increase the lock level that pg_dump holds. pg_dump tends to run for a long time. I've not been paying all that much attention to the logical-decoding patches, but wasn't there something in there about being able to see the catalog state as it was at some point in the past? If so, maybe we could leverage that to allow a backend to enter a pg_dump state wherein its view of the catalogs was frozen at its transaction start snapshot. We'd have to restrict it to read-only operation for safety, but that's surely no problem for pg_dump. If we had that, then this whole problem of server-side computations producing inconsistent results would go away. There might still be a window wherein tables visible at transaction start could be dropped before AccessShareLock could be acquired, but I think we could let pg_dump error out in that case. regards, tom lane
Re: [HACKERS] requested shared memory size overflows size_t
On Tue, Mar 4, 2014 at 9:53 AM, Yuri Levinsky yu...@celltick.com wrote: Robert, Please advise me: I just downloaded the source and compiled it. Sun SPARC Solaris 9 is always 64 bit, I verified it with the sys admin. He may run 32 bit applications as well. Must I use some special option during compilation to verify that the compiled PostgreSQL is actually a 64 bit app? Sorry, I'm not familiar with compiling PostgreSQL on Solaris. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
* Tom Lane (t...@sss.pgh.pa.us) wrote: Yeah. Ideally, what pg_dump would produce would be a consistent snapshot of the database state as of its transaction snapshot time. We have always had that guarantee so far as user data was concerned, but it's been shaky (and getting worse) so far as the database schema is concerned. What bothers me about the current patch is that it's going to make it a whole lot more worse. Also, I don't have any love at all for proposals that we increase the lock level that pg_dump holds. pg_dump tends to run for a long time. Agreed. I've not been paying all that much attention to the logical-decoding patches, but wasn't there something in there about being able to see the catalog state as it was at some point in the past? If so, maybe we could leverage that to allow a backend to enter a pg_dump state wherein its view of the catalogs was frozen at its transaction start snapshot. We'd have to restrict it to read-only operation for safety, but that's surely no problem for pg_dump. If we had that, then this whole problem of server-side computations producing inconsistent results would go away. That certainly sounds like a tempting idea. There might still be a window wherein tables visible at transaction start could be dropped before AccessShareLock could be acquired, but I think we could let pg_dump error out in that case. I don't have too much of an issue with the above, but I would like to have us figure out a solution to the deadlock problem with parallel pg_dump. The issue arises when pg_dump gets an AccessShareLock and then another process attempts to acquire an AccessExclusiveLock, which then blocks, and then the pg_dump worker process tries to get its AccessShareLock- we end up not being able to make any progress on anything at that point. 
One suggestion that was discussed at PGConf.EU was having processes which share the same snapshot (the pg_dump master and worker processes) able to either share the same locks or at least be able to jump the lock queue (that is, the worker process wouldn't have to wait behind the AEL to get an ASL, since the ASL was already acquired for the snapshot which was exported and shared with it). Thanks, Stephen
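The deadlock Stephen describes comes from FIFO lock queueing: the worker's AccessShareLock request queues behind the waiting AccessExclusiveLock request, which itself waits on the master's held lock. A minimal Python sketch of that rule (invented code, not the server's lock manager; ASL = AccessShareLock, AEL = AccessExclusiveLock):

```python
# Minimal model of FIFO lock queueing.  A request is granted only if it
# conflicts with no current holder and, unless it may jump the queue,
# with no earlier waiter either.

def conflicts(a, b):
    # In this tiny model only AEL conflicts with anything.
    return "AEL" in (a, b)

def can_grant(mode, holders, queue, jump_queue=False):
    if any(conflicts(mode, h) for h in holders):
        return False
    if not jump_queue and any(conflicts(mode, w) for w in queue):
        return False
    return True

holders = ["ASL"]   # the pg_dump master already holds AccessShareLock
queue = ["AEL"]     # concurrent DDL is queued, waiting on the master

# The worker's ASL request waits behind the queued AEL, which waits on
# the master -> nobody makes progress.
print(can_grant("ASL", holders, queue))
# If shared-snapshot workers may jump the queue (one of the PGConf.EU
# ideas), the worker's request goes through.
print(can_grant("ASL", holders, queue, jump_queue=True))
```

The other idea mentioned, sharing the master's locks outright, would make the worker count as a holder rather than a requester, with the same effect.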
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
Robert Haas robertmh...@gmail.com writes: I think this is all too late for 9.4, though. I agree with the feeling that a meaningful fix for pg_dump isn't going to get done for 9.4. So that leaves us with the alternatives of (1) put off the lock-strength-reduction patch for another year; (2) push it anyway and accept a reduction in pg_dump reliability. I don't care for (2). I'd like to have lock strength reduction as much as anybody, but it can't come at the price of reduction of reliability. The bigger picture here is that it seems like anytime I've thought for more than five minutes about the lock strength reduction patch, I've come up with some fundamental problem. That doesn't leave me with a warm feeling that we're getting close to having something committable. regards, tom lane
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
Stephen Frost sfr...@snowman.net writes: I don't have too much of an issue with the above, but I would like to have us figure out a solution to the deadlock problem with parallel pg_dump. The issue arises when pg_dump gets an AccessShareLock and then another process attempts to acquire an AccessExclusiveLock, which then blocks, and then the pg_dump worker process tries to get its AccessShareLock- we end up not being able to make any progress on anything at that point. The original ASL was acquired by the pg_dump master, right? One suggestion that was discussed at PGConf.EU was having processes which share the same snapshot (the pg_dump master and worker processes) able to either share the same locks or at least be able to jump the lock queue (that is, the worker process wouldn't have to wait behind the AEL to get an ASL, since the ASL was already acquired for the snapshot which was exported and shared with it). Yeah, it seems like we need lock export, not only snapshot export, to make this work nicely. But that sounds orthogonal to the issues being discussed in this thread. regards, tom lane
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
* Tom Lane (t...@sss.pgh.pa.us) wrote: Stephen Frost sfr...@snowman.net writes: I don't have too much of an issue with the above, but I would like to have us figure out a solution to the deadlock problem with parallel pg_dump. The issue arises when pg_dump gets an AccessShareLock and then another process attempts to acquire an AccessExclusiveLock, which then blocks, and then the pg_dump worker process tries to get its AccessShareLock- we end up not being able to make any progress on anything at that point. The original ASL was acquired by the pg_dump master, right? Yup. It goes through and gets ASLs on everything first. One suggestion that was discussed at PGConf.EU was having processes which share the same snapshot (the pg_dump master and worker processes) able to either share the same locks or at least be able to jump the lock queue (that is, the worker process wouldn't have to wait behind the AEL to get an ASL, since the ASL was already acquired for the snapshot which was exported and shared with it). Yeah, it seems like we need lock export, not only snapshot export, to make this work nicely. But that sounds orthogonal to the issues being discussed in this thread. Indeed, just figured I'd mention it since we're talking about pg_dump-related locking. Thanks, Stephen
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
Tom Lane escribió: I'd like to have lock strength reduction as much as anybody, but it can't come at the price of reduction of reliability. Can we have at least a cut-down version of it? If we can just reduce the lock level required for ALTER TABLE / VALIDATE, that would be an enormous improvement already. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
[HACKERS] The behavior of CheckRequiredParameterValues()
Hi all, I had doubts regarding the behavior of the CheckRequiredParameterValues() function. I could not start a standby server created by pg_basebackup with the following scenario.

1. Start the master server with 'wal_level = archive', 'hot_standby = on' and the other replication settings.
2. Create the standby server from the master server by using pg_basebackup.
3. Change the wal_level value of both master and standby server to 'hot_standby'.
4. Restart the master server.
5. Start the standby server.

In #5, I got the following error even though I set wal_level to 'hot_standby'.

FATAL: hot standby is not possible because wal_level was not set to hot_standby or higher on the master server

I tried to investigate this behaviour. Currently the CheckRequiredParameterValues() function uses the wal_level value taken from the ControlFile when comparing wal_level against WAL_LEVEL_HOT_STANDBY, in the following code:

xlog.c:6177
    if (ControlFile->wal_level < WAL_LEVEL_HOT_STANDBY)
        ereport(ERROR,
                (errmsg("hot standby is not possible because wal_level was not set to hot_standby or higher on the master server")));

So we have to start and stop the standby server with the changed wal_level (i.e., hot_standby) if we want to enable hot standby. In this case, I think that the standby server doesn't need to confirm the wal_level value from the ControlFile; I think that it should confirm the value which is written in postgresql.conf. I might be missing something. Please let me know if so. Regards, --- Sawada Masahiko
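The wal_level settings form an ordered scale, and the startup check above consults the value recorded in pg_control (i.e. what the master was running with when the WAL was generated), not the standby's own postgresql.conf. A toy Python model of why step 5 fails (invented code; the real check is in xlog.c):

```python
# wal_level values in increasing order of what they support.
WAL_LEVELS = ["minimal", "archive", "hot_standby"]

def level(name):
    return WAL_LEVELS.index(name)

def check_hot_standby(control_file_wal_level):
    """The standby checks the wal_level recorded in its control file,
    i.e. what the *master* was running with, not postgresql.conf."""
    if level(control_file_wal_level) < level("hot_standby"):
        raise RuntimeError("hot standby is not possible because wal_level "
                           "was not set to hot_standby or higher on the "
                           "master server")

# The base backup was taken while the master ran wal_level=archive, so
# the standby still sees 'archive' whatever its own postgresql.conf says:
try:
    check_hot_standby("archive")
except RuntimeError as e:
    print("FATAL:", e)

check_hot_standby("hot_standby")   # fine once the control data catches up
```

This is only a model of the reported behaviour, not a statement about what the check *should* do; whether consulting postgresql.conf instead would be safe is exactly the question Sawada-san raises.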
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On Tue, Mar 4, 2014 at 10:20 PM, Stephen Frost sfr...@snowman.net wrote: * Tom Lane (t...@sss.pgh.pa.us) wrote: Stephen Frost sfr...@snowman.net writes: I don't have too much of an issue with the above, but I would like to have us figure out a solution to the deadlock problem with parallel pg_dump. The issue arises when pg_dump gets an AccessShareLock and then another process attempts to acquire an AccessExclusiveLock, which then blocks, and then the pg_dump worker process tries to get its AccessShareLock- we end up not being able to make any progress on anything at that point. The original ASL was acquired by the pg_dump master, right? Yup. It goes through and gets ASLs on everything first. One suggestion that was discussed at PGConf.EU was having processes which share the same snapshot (the pg_dump master and worker processes) able to either share the same locks or at least be able to jump the lock queue (that is, the worker process wouldn't have to wait behind the AEL to get an ASL, since the ASL was already acquired for the snapshot which was exported and shared with it). Yeah, it seems like we need lock export, not only snapshot export, to make this work nicely. But that sounds orthogonal to the issues being discussed in this thread. Indeed, just figured I'd mention it since we're talking about pg_dump-related locking. What happens for foreign key constraints? For pg_dump, do we lock the tables referenced by the table which is being dumped right now? If that is the case, wouldn't an MVCC-based approach be the best way for this? Please ignore if I said anything silly. I am just trying to understand how it works here. Regards, Atri
Re: [HACKERS] jsonb and nested hstore
On 03/03/2014 09:06 PM, Peter Geoghegan wrote: On Mon, Mar 3, 2014 at 9:05 PM, Andrew Dunstan and...@dunslane.net wrote: What you're not welcome to do, from my POV, is move jsonb into the hstore extension. I strenuously object to any such plan. We both know that that isn't really the point of contention at all. Actually, I didn't know any such thing. Just a couple days ago, you were arguing fairly strongly for moving jsonb to the hstore extension. You weren't clear that you'd given up on that line of argument. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: Fwd: [HACKERS] patch: make_timestamp function
Pavel Stehule escribió: Hello updated version - a precheck is very simple, and from what I tested it is enough Okay, thanks. I pushed it after some more editorialization. I don't think I broke anything, but please have a look. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Tue, Mar 4, 2014 at 9:50 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-04 09:47:08 -0500, Robert Haas wrote: On Mon, Mar 3, 2014 at 12:08 PM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: Is the TODO item make an unlogged table logged [1] a good GSoC project? I'm pretty sure we found some problems in that design that we couldn't figure out how to solve. I don't have a pointer to the relevant -hackers discussion off-hand, but I think there was one. ISTR the discussion going something along the lines of we'd have to WAL log the entire table to do that, and if we have to do that, what's the point?. No, not really. The issue is more around what happens if we crash part way through. At crash recovery time, the system catalogs are not available, because the database isn't consistent yet and, anyway, the startup process can't be bound to a database, let alone every database that might contain unlogged tables. So the sentinel that's used to decide whether to flush the contents of a table or index is the presence or absence of an _init fork, which the startup process obviously can see just fine. The _init fork also tells us what to stick in the relation when we reset it; for a table, we can just reset to an empty file, but that's not legal for indexes, so the _init fork contains a pre-initialized empty index that we can just copy over. Now, to make an unlogged table logged, you've got to at some stage remove those _init forks. But this is not a transactional operation. If you remove the _init forks and then the transaction rolls back, you've left the system an inconsistent state. If you postpone the removal until commit time, then you have a problem if it fails, particularly if it works for the first file but fails for the second. 
And if you crash at any point before you've fsync'd the containing directory, you have no idea which files will still be on disk after a hard reboot. Can't that be solved by just creating the permanent relation in a new relfilenode? That's equivalent to a rewrite, yes, but we need to do that for anything but wal_level=minimal anyway. Yes, that would work. I've tended to view optimizing away the relfilenode copy as an indispensable part of this work, but that might be wrongheaded. It would certainly be a lot easier to make this happen if we didn't insist on that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
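Robert's point about the _init-fork sentinel and non-atomic removal can be sketched in a few lines of Python (invented code, not from the server; file names are illustrative):

```python
# At crash recovery the startup process cannot read the catalogs, so
# "should this relation be reset?" is decided purely by whether an
# _init fork file exists on disk.

def recovery_resets(files):
    """Relations whose _init fork survived the crash get reset."""
    return {f[:-len("_init")] for f in files if f.endswith("_init")}

def make_logged(files, relations, crash_after=None):
    """Turning unlogged tables logged by deleting _init forks in place
    is not atomic: a crash partway through leaves a mix of states."""
    files = set(files)
    for i, rel in enumerate(relations):
        if crash_after is not None and i >= crash_after:
            break                      # simulated crash mid-operation
        files.discard(rel + "_init")   # _init fork unlinked for rel
    return files

disk = {"a_main", "a_init", "b_main", "b_init"}  # two unlogged tables

# Crash after handling only table "a": recovery now resets "b" (still
# flagged unlogged) but not "a", even though the catalog change never
# committed -- the inconsistent state described above.
after_crash = make_logged(disk, ["a", "b"], crash_after=1)
print(sorted(recovery_resets(after_crash)))
```

Creating the permanent relation in a brand-new relfilenode sidesteps this: the old files, _init fork included, stay valid until the catalog swap commits, at the cost of a full copy.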
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe Reply-To:
On 2014-03-04 11:40:10 -0500, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: I think this is all too late for 9.4, though. I agree with the feeling that a meaningful fix for pg_dump isn't going to get done for 9.4. So that leaves us with the alternatives of (1) put off the lock-strength-reduction patch for another year; (2) push it anyway and accept a reduction in pg_dump reliability. I don't care for (2). I'd like to have lock strength reduction as much as anybody, but it can't come at the price of reduction of reliability. I am sorry, but I think this is vastly overstating the scope of the pg_dump problem. CREATE INDEX *already* doesn't require an AEL, and the amount of problems that has caused in the past is surprisingly low. If such a frequently used command doesn't cause problems, why are you assuming other commands to be that problematic? And I think it's hard to argue that the proposed changes are more likely to cause problems. Let's try to go at this a bit more methodically. The commands that - afaics - change their locklevel due to the latest patch (v21) are:

01) ALTER TABLE .. ADD CONSTRAINT
02) ALTER TABLE .. ADD CONSTRAINT ... USING
03) ALTER TABLE .. ENABLE | DISABLE [ REPLICA | ALWAYS ] TRIGGER [ ALL ]
04) ALTER TABLE .. ALTER CONSTRAINT
05) ALTER TABLE .. REPLICA IDENTITY
06) ALTER TABLE .. ALTER COLUMN .. SET NOT NULL (*not* DROP NULL)
    cmd_lockmode = ShareRowExclusiveLock;

07) ALTER TABLE ... ALTER COLUMN ... SET STATISTICS
08) ALTER TABLE ... CLUSTER ON ...
09) ALTER TABLE ... SET WITHOUT CLUSTER
10) ALTER TABLE ... SET (...)
11) ALTER TABLE ... RESET (...)
12) ALTER TABLE ... ALTER COLUMN ... SET (...)
13) ALTER TABLE ... ALTER COLUMN ... RESET (...)
14) ALTER TABLE ... VALIDATE CONSTRAINT constraint_name
    cmd_lockmode = ShareUpdateExclusiveLock;

I have my reservations about ADD CONSTRAINT (including SET NOT NULL) being unproblematic (mostly because I haven't thought through possible consequences for the planner making different choices with added constraints). From the perspective of pg_dump consistency, except ADD CONSTRAINT, none of those seem to have graver possible consequences than CREATE INDEX (and DROP INDEX CONCURRENTLY) already being unsafe. In fact all of those should actually end up being *safer*, even ending up always being dumped consistently since they are all reconstructed clientside by pg_dump. You argue elsewhere that that's a fragile coincidence. But so what, even if it changes, the consequences still are going to be *far* less significant than missing various index, trigger, and whatnot commands. I think the set of problems you mention are going to be really important when we someday get around to making stuff like ALTER TABLE ... ADD/DROP COLUMN require lower lock levels, but that's not what's proposed. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On 2014-03-04 12:54:02 -0500, Robert Haas wrote: On Tue, Mar 4, 2014 at 9:50 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-04 09:47:08 -0500, Robert Haas wrote: Can't that be solved by just creating the permanent relation in a new relfilenode? That's equivalent to a rewrite, yes, but we need to do that for anything but wal_level=minimal anyway. Yes, that would work. I've tended to view optimizing away the relfilenode copy as an indispensable part of this work, but that might be wrongheaded. It would certainly be a lot easier to make this happen if we didn't insist on that. I think it'd already be much better than today's situation, and it's a required codepath for wal_level=logical anyway. So even if somebody wants to make this work without the full copy for minimal, it'd still be a required codepath. So I am perfectly ok with a patch just adding that. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Trigger information for auto_explain.
Kyotaro HORIGUCHI wrote: Hi, I saw this patch has been moved into committed patches but only the first part (0001_..) for the core is committed as of 32001ab, but the rest for the extension side seems not to have been committed. Would you mind taking a look at that, Álvaro? Yep, pushed. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: Fwd: [HACKERS] patch: make_timestamp function
2014-03-04 19:12 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule escribió: Hello updated version - a precheck is very simple, and from what I tested it is enough Okay, thanks. I pushed it after some more editorialization. I don't think I broke anything, but please have a look. It looks well Thank you very much Pavel -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] plpgsql.warn_shadow
On 03/04/2014 11:23 AM, Joel Jacobson wrote: I understand that from a technical perspective, the mandatory BEGIN...END you always need in a PL/pgSQL function, is a new block, and the variables declared are perhaps technically in a new block, at a deeper level than the IN/OUT variables. But I would still argue the expected behaviour of PL/pgSQL for a new user would be to consider the IN/OUT variables to be in the same block as the variables declared in the function's first block. No they are not. Teaching a new user to consider them as the same is simply wrong. The parameters belong to a block that matches the function name. The outermost block has a different name if supplied (I usually use fn), or is otherwise anonymous. Lots of code quite correctly relies on this, including some I have written. This isn't a mere technical difference, and there is surely zero chance that we will label use of it an error under any circumstances. cheers andrew
Re: Fwd: [HACKERS] patch: make_timestamp function
Pavel Stehule escribió: 2014-03-04 19:12 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule escribió: Hello updated version - a precheck is very simple, and from what I tested it is enough Okay, thanks. I pushed it after some more editorialization. I don't think I broke anything, but please have a look. It looks well Coypu is showing a strange failure though: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=coypu&dt=2014-03-04%2018%3A22%3A31

  select make_interval(secs := 'inf');
! make_interval
! -----------------
!  @ 0.01 secs ago
! (1 row)

I realize that we have some hacks in float4in and float8in to deal with these portability issues ... Maybe the fix is just to take out the test. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
* Robert Haas (robertmh...@gmail.com) wrote: On Mon, Mar 3, 2014 at 5:15 PM, Stephen Frost sfr...@snowman.net wrote: In accordance with the above, what I'd like to see with this patch is removal of the postgres_fdw changes and any changes which were for that support. In addition, I'd like to understand why 'ctidscan' makes any sense to have as an example of what to use this for- if that's valuable, why wouldn't we simply implement that in core? I do want an example in contrib of how to properly use this capability, but I don't think that's it. I suggested that example to KaiGai at last year's PGCon. It may indeed be something we want to have in core, but right now we don't. Alright- so do you feel that the simple ctidscan use-case is a sufficient justification and example of how this can be generally useful that we should be adding these hooks to core..? I'm willing to work through the patch and clean it up this weekend if we agree that it's useful and unlikely to immediately be broken by expected changes.. Thanks, Stephen
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
I apologize for not having paid much attention to this thread so far. It kept getting stuck on my "to look at later" queue. Anyway, I've taken a preliminary look at the v7 patch now. While the patch seems roughly along the lines of what we talked about last PGCon, I share Stephen's unease about a lot of the details. It's not entirely clear that these hooks are really good for anything, and it's even less clear what APIs the hook functions should be expected to depend on. I really do not like the approach embodied in the later patches of "oh, we'll just expose whatever static planner functions seem convenient". That's not an approach that's available when doing actual external development of an extension, and even if it were that doesn't make it a good idea. The larger the exposed surface of functions the harder it is to know what's safe to change. Anyway, on to specifics: * Please drop the whole register_custom_provider/get_custom_provider API. There is no reason other than debugging for a provider to have a name at all, and if we expect providers to have unique names then that creates a collision risk for independently-created extensions. AFAICS, it's sufficient for a function hooked into one of the add-a-path hooks to include a pointer to a struct-of-function-pointers in the Path object it returns, and similarly the CustomScan Plan object can contain a pointer inserted when it's created. I don't object to having a name field in the function pointer structs for debugging reasons, but I don't want any lookups being done on it. * The function-struct pointers should be marked const in the referencing nodes, to indicate that the core code won't be modifying or copying them. In practice they'd probably be statically allocated constants in the extensions anyway. * The patch does lots of violence to the separation between planner and executor, starting with the decision to include nodes/relation.h in executor.h. That will not do at all.
I see that you did that because you wanted to make ExecSupportsMarkRestore take a Path, but we need some other answer. One slightly grotty answer is to invent two different customscan Plan types, one that supports mark/restore and one that doesn't, so that ExecSupportsMarkRestore could still just look at the Plan type tag. (BTW, path->pathtype is supposed to contain the node tag of the Plan node that the path would produce. Putting T_CustomPath in it is entirely tone-deaf.) Another way would be to remove ExecSupportsMarkRestore in favor of some new function in the planner; but it's good to keep it in execAmi.c since that has other knowledge of which plan types support mark/restore. * More generally, I'm not convinced that exactly one Path type and exactly one Plan type is going to get us very far. It seems rather ugly to use the same Plan type for both scan and join nodes, and what will happen if somebody wants to build a custom Append node, or something else that has neither zero nor two subplans? * nodeCustom.h is being completely abused. That should only export the functions in nodeCustom.c, which are going to be pretty much one-liners anyway. The right place to put the function pointer struct definitions is someplace else. I'd be inclined to start by separating the function pointers into two structs, one for functions needed for a Path and one for functions needed for a Plan, so that you don't have this problem of having to import everything the planner knows into an executor header or vice versa. Most likely you could just put the Path function pointer struct declaration next to CustomPath in relation.h, and the one for Plans next to CustomPlan (or the variants thereof) in plannodes.h. * The set of fields provided in CustomScan seems nonsensical.
I'm not even sure that it should be derived from Scan; that's okay for plan types that actually are scans of a base relation, but it's confusing overhead for anything that's say a join, or a custom sort node, or anything like that. Maybe one argument for multiple plan node types is that one would be derived from Scan and one directly from Plan. * More generally, what this definition for CustomScan exposes is that we have no idea whatever what fields a custom plan node might need. I'm inclined to think that what we should be assuming is that any custom path or plan node is really an object of a struct type known only to its providing extension, whose first field is the CustomPath or CustomPlan struct known to the core backend. (Think C++ subclassing.) This would imply that copyfuncs/equalfuncs/outfuncs support would have to be provided by the extension, which is in principle possible if we add function pointers for those operations to the struct linked to from the path/plan object. (Notationally this might be a bit of a pain, since the macros that we use in the functions in copyfuncs.c etc aren't public. Not sure if it's worth exposing those somewhere, or if people should just copy/paste them.) This
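The "C++ subclassing" idiom Tom describes relies on the core struct being the first field of the extension's struct, so that a pointer to the subclass is also a valid pointer to the base. A small sketch of that memory layout using Python's ctypes (type and field names are invented stand-ins, not the actual PostgreSQL definitions):

```python
import ctypes

# Stand-in for the core node struct; the real CustomPath in
# nodes/relation.h is more elaborate.
class CustomPath(ctypes.Structure):
    _fields_ = [("type", ctypes.c_int),      # node tag
                ("name", ctypes.c_char_p)]   # debugging name only

# An extension's "subclass": the core struct is the FIRST field, so a
# MyPath pointer can be safely treated as a CustomPath pointer.
class MyPath(ctypes.Structure):
    _fields_ = [("base", CustomPath),
                ("my_private_state", ctypes.c_int)]  # extension-only field

mp = MyPath(CustomPath(42, b"ctidscan"), 7)

# The core backend only ever sees a CustomPath pointer...
cp = ctypes.cast(ctypes.pointer(mp), ctypes.POINTER(CustomPath))
print(cp.contents.type, cp.contents.name)

# ...and the extension "downcasts" it back to recover its private state.
again = ctypes.cast(cp, ctypes.POINTER(MyPath))
print(again.contents.my_private_state)
```

This is also why copyfuncs/equalfuncs/outfuncs support would have to come from the extension: only the extension knows the full size and layout behind the base pointer.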
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On 4 March 2014 16:27, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@2ndquadrant.com writes: One concern is schema changes that make a dump unrestorable, for instance if there's a foreign key relationship between tables A and B, Yeah. Ideally, what pg_dump would produce would be a consistent snapshot of the database state as of its transaction snapshot time. We have always had that guarantee so far as user data was concerned, but it's been shaky (and getting worse) so far as the database schema is concerned. What bothers me about the current patch is that it's going to make it a whole lot more worse. While thinking this through it occurs to me that there is no problem at all. Your earlier claim that the dump is inconsistent just isn't accurate. We now have MVCC catalogs, so any dump is going to see a perfectly consistent set of data plus DDL. OK the catalogs may change AFTER the snapshot was taken for the dump, but then so can the data change - that's just MVCC. I was going to add an option to increase lock level, but I can't see why you'd want it even. The dumps are consistent... -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On March 4, 2014 8:39:55 PM CET, Simon Riggs si...@2ndquadrant.com wrote: On 4 March 2014 16:27, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@2ndquadrant.com writes: One concern is schema changes that make a dump unrestorable, for instance if there's a foreign key relationship between tables A and B, Yeah. Ideally, what pg_dump would produce would be a consistent snapshot of the database state as of its transaction snapshot time. We have always had that guarantee so far as user data was concerned, but it's been shaky (and getting worse) so far as the database schema is concerned. What bothers me about the current patch is that it's going to make it a whole lot more worse. While thinking this through it occurs to me that there is no problem at all. Your earlier claim that the dump is inconsistent just isn't accurate. We now have MVCC catalogs, so any dump is going to see a perfectly consistent set of data plus DDL. OK the catalogs may change AFTER the snapshot was taken for the dump, but then so can the data change - that's just MVCC. I was going to add an option to increase lock level, but I can't see why you'd want it even. The dumps are consistent... MVCC scans only guarantee that individual scans are consistent, not that separate scans are. Each individual scan takes a new snapshot if there's been DDL. Andres -- Please excuse brevity and formatting - I am writing this on my mobile phone. Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
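Andres's objection can be made concrete with a toy model: each catalog scan is individually consistent, but a later scan in the same pg_dump run may use a newer snapshot if DDL committed in between, so the dump as a whole need not describe any single point in time (invented code, purely illustrative):

```python
# Toy model: per-scan snapshots vs. one dump-wide snapshot.
catalog_history = []   # committed catalog versions, oldest first

def commit_ddl(state):
    catalog_history.append(state)

def catalog_scan():
    """Each scan sees the latest committed state (a fresh snapshot)."""
    return catalog_history[-1]

commit_ddl({"t": ["a"]})
first = catalog_scan()            # pg_dump lists t's columns
commit_ddl({"t": ["a", "b"]})     # concurrent ALTER TABLE commits
second = catalog_scan()           # a later scan touching t again

# Each scan was internally consistent, but the two disagree:
print(first == second)
```

Under the "frozen catalog snapshot" idea Tom floats upthread, catalog_scan() would instead return the state as of the dump's start, and first and second would match by construction.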
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Tue, Mar 4, 2014 at 3:29 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-04 12:54:02 -0500, Robert Haas wrote: On Tue, Mar 4, 2014 at 9:50 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-04 09:47:08 -0500, Robert Haas wrote: Can't that be solved by just creating the permanent relation in a new relfilenode? That's equivalent to a rewrite, yes, but we need to do that for anything but wal_level=minimal anyway. Yes, that would work. I've tended to view optimizing away the relfilenode copy as an indispensable part of this work, but that might be wrongheaded. It would certainly be a lot easier to make this happen if we didn't insist on that. I think it'd already be much better than today's situation, and it's a required codepath for wal_level=logical anyway. So even if somebody wants to make this work without the full copy for minimal, it'd still be a required codepath. So I am perfectly ok with a patch just adding that. Then is this a good idea for a GSoC project? I don't know these internals very well, but I am willing to learn, and I think the GSoC is a good opportunity. Are any of you willing to mentor this project? Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
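The rewrite-based approach being agreed on here can be caricatured in a few lines (a toy Python model with invented names, not backend code): converting unlogged to logged by copying every page into a new relfilenode while emitting a WAL record for each page, rather than trying to retroactively log the existing file in place.

```python
def set_logged_via_rewrite(pages, wal):
    """Toy model: 'rewrite' an unlogged relation into a new, logged one.

    Every page is copied into a fresh relfilenode and a WAL record is
    emitted for it, so crash recovery (or a standby) can reconstruct
    the table -- the property unlogged tables lack.  Invented names;
    a sketch of the idea only.
    """
    new_relfilenode = []
    for pageno, page in enumerate(pages):
        wal.append(("full_page_image", pageno, page))  # WAL-log the copy
        new_relfilenode.append(page)
    return new_relfilenode

wal = []
old = ["page0", "page1", "page2"]
new = set_logged_via_rewrite(old, wal)
print(len(wal))  # one WAL record per page
```

This is the "equivalent to a rewrite" cost mentioned above: the whole relation is copied and WAL-logged once, which is also why it is the required codepath for any wal_level other than minimal.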
Re: Fwd: [HACKERS] patch: make_timestamp function
2014-03-04 20:20 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule wrote: 2014-03-04 19:12 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule wrote: Hello updated version - a precheck is very simple, and what I tested is enough Okay, thanks. I pushed it after some more editorialization. I don't think I broke anything, but please have a look. It looks good. Coypu is showing a strange failure though: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=coypu&dt=2014-03-04%2018%3A22%3A31

select make_interval(secs := 'inf');
!  make_interval
! -----------------
!  @ 0.01 secs ago
! (1 row)

I realize that we have some hacks in float4in and float8in to deal with these portability issues ... Maybe the fix is just to take out the test. I have no idea how to fix it now and have to leave the office. Tomorrow I'll try to fix it. Regards Pavel -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
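The failure above comes from feeding an infinity through float input on a platform that formats it differently. A minimal sketch of the kind of precheck under discussion (plain Python with an invented function name; the real make_interval is C code in the backend) is to reject non-finite seconds up front instead of letting them flow into the interval arithmetic:

```python
import math

def make_interval_secs(secs):
    """Return whole microseconds for an interval's seconds field.

    Defensive precheck in the spirit of the thread: reject non-finite
    input explicitly rather than relying on platform-dependent float
    behaviour downstream.  (Illustrative sketch only.)
    """
    if math.isinf(secs) or math.isnan(secs):
        raise ValueError("interval out of range")
    return round(secs * 1_000_000)

print(make_interval_secs(1.5))  # 1500000
```

With such a check, `make_interval(secs := 'inf')` would fail cleanly everywhere instead of producing a platform-dependent result like the `@ 0.01 secs ago` seen on coypu.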
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
Andres Freund and...@2ndquadrant.com writes: On 2014-03-04 11:40:10 -0500, Tom Lane wrote: I don't care for (2). I'd like to have lock strength reduction as much as anybody, but it can't come at the price of reduction of reliability. I am sorry, but I think this is vastly overstating the scope of the pg_dump problem. CREATE INDEX *already* doesn't require an AEL, and the amount of problems that has caused in the past is surprisingly low. CREATE INDEX happens to be okay because pg_dump won't try to dump indexes it doesn't see in its snapshot, ie the list of indexes to dump is created client-side. CREATE INDEX CONCURRENTLY, otoh, already did break pg_dump, and we had to hack things to fix it; see commit 683abc73dff549e94555d4020dae8d02f32ed78b. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 3 March 2014 23:00, Florian Pflug f...@phlo.org wrote: * In show_windowagg_info(), this calculation looks suspicious to me: double tperrow = winaggstate->aggfwdtrans / (inst->nloops * inst->ntuples); If the node is executed multiple times, aggfwdtrans will be reset in each loop, so the transitions per row figure will be under-estimated. ISTM that if you want to report on this, you'd need aggfwdtrans to be reset once per query, but I'm not sure exactly how to do that. ... Actually, I think it's misleading to only count forward transition function calls, because a call to the inverse transition function still represents a state transition, and is likely to be around the same cost. For a window of size 2, there would not be much advantage to using inverse transition functions, because it would be around 2 transitions per row either way. True. In fact, I pondered whether to avoid using the inverse transition function for windows of 2 rows. In the end, I didn't because I felt that it makes custom aggregates harder to test. On the question of whether to count inverse transition function calls - the idea of the EXPLAIN VERBOSE ANALYZE output isn't really to show the number of state transitions, but rather to show whether the aggregation has O(n) or O(n^2) behaviour. The idea being that a value close to 1 means inverse transition function works as expected, and larger values mean not working so well. Regarding multiple evaluations - I think I based the behaviour on how ntuples works, which also only reports the value of the last evaluation I think. But maybe I'm confused about this. No, it doesn't look like that's correct for multiple loops.
Consider this example:

explain (verbose, analyse)
select * from (values (10), (20), (30), (40)) v(x),
  lateral (select sum(i) over (rows between 4 preceding and current row)
           from generate_series(1, x) i) t;

QUERY PLAN
Nested Loop (cost=0.00..170.06 rows=4000 width=12) (actual time=0.027..0.414 rows=100 loops=1)
  Output: *VALUES*.column1, (sum(i.i) OVER (?))
  ->  Values Scan on *VALUES* (cost=0.00..0.05 rows=4 width=4) (actual time=0.002..0.006 rows=4 loops=1)
        Output: *VALUES*.column1
  ->  WindowAgg (cost=0.00..22.50 rows=1000 width=4) (actual time=0.019..0.094 rows=25 loops=4)
        Output: sum(i.i) OVER (?)
        Transitions Per Row: 0.2
        ->  Function Scan on pg_catalog.generate_series i (cost=0.00..10.00 rows=1000 width=4) (actual time=0.010..0.015 rows=25 loops=4)
              Output: i.i
              Function Call: generate_series(1, *VALUES*.column1)

It turns out that show_windowagg_info() is only called once at the end, with ntuples=100, nloops=4 and aggfwdtrans=100, so it's computing tperrow=100/(4*100)=0.25, which then gets truncated to 0.2. So to get 1, you'd have to use this formula: double tperrow = winaggstate->aggfwdtrans / inst->ntuples; I'm still not convinced that's the most useful thing to report though. Personally, I'd prefer to just see the separate counts, e.g.:

  ->  WindowAgg (cost=0.00..22.50 rows=1000 width=4) (actual time=0.019..0.094 rows=25 loops=4)
        Output: sum(i.i) OVER (?)
        Forward transitions: 25
        Inverse transitions: 25

IMO that gives a clearer picture of what's going on. Thoughts? Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
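The tradeoff being discussed can be sketched with a toy sliding-window sum (plain Python; names invented, not the executor's code): each row entering the frame costs one forward transition and each row leaving it one inverse transition, so both counters stay O(n), and reporting them separately shows exactly that.

```python
def windowed_sums(values, preceding):
    """Sums over ROWS BETWEEN `preceding` PRECEDING AND CURRENT ROW,
    maintained incrementally with forward/inverse transitions (toy model)."""
    state = 0
    forward = inverse = 0
    out = []
    for i, v in enumerate(values):
        state += v                     # forward transition: incoming row
        forward += 1
        tail = i - preceding - 1
        if tail >= 0:
            state -= values[tail]      # inverse transition: departing row
            inverse += 1
        out.append(state)
    return out, forward, inverse

# 25 rows, frame "4 preceding and current row", as in the example plan:
sums, fwd, inv = windowed_sums(list(range(1, 26)), 4)
print(fwd, inv)  # 25 forward, 20 inverse: about one transition pair per row
```

Without inverse transitions the same query would recompute each frame from scratch, roughly 5 forward transitions per row here and O(n^2) in general, which is the distinction the counters are meant to expose.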
[HACKERS] GSoC propostal - CREATE SCHEMA ... LIKE ...
Hi all, Is the TODO item CREATE SCHEMA ... LIKE ... [1] a good GSoC project? Regards [1] http://wiki.postgresql.org/wiki/Todo -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] plpgsql.warn_shadow
On Tue, Mar 4, 2014 at 8:04 PM, Andrew Dunstan and...@dunslane.net wrote: Lots of code quite correctly relies on this, including some I have written. I really cannot see when it would be a good coding practise to do so; there must be something I don't understand. I would greatly appreciate it if you could give a real-life example of such a PL/pgSQL function. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
On Tue, Mar 4, 2014 at 2:34 PM, Stephen Frost sfr...@snowman.net wrote: Alright- so do you feel that the simple ctidscan use-case is a sufficient justification and example of how this can be generally useful that we should be adding these hooks to core..? I'm willing to work through the patch and clean it up this weekend if we agree that it's useful and unlikely to immediately be broken by expected changes.. Yeah, I think it's useful. But based on Tom's concurrently-posted review, I think there's probably a good deal of work left here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
* Robert Haas (robertmh...@gmail.com) wrote: On Tue, Mar 4, 2014 at 2:34 PM, Stephen Frost sfr...@snowman.net wrote: Alright- so do you feel that the simple ctidscan use-case is a sufficient justification and example of how this can be generally useful that we should be adding these hooks to core..? I'm willing to work through the patch and clean it up this weekend if we agree that it's useful and unlikely to immediately be broken by expected changes.. Yeah, I think it's useful. But based on Tom's concurrently-posted review, I think there's probably a good deal of work left here. Yeah, it certainly looks like it. KaiGai- will you have time to go over and address Tom's concerns..? Thanks! Stephen
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On Tue, Mar 4, 2014 at 2:39 PM, Simon Riggs si...@2ndquadrant.com wrote: On 4 March 2014 16:27, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@2ndquadrant.com writes: One concern is schema changes that make a dump unrestorable, for instance if there's a foreign key relationship between tables A and B, Yeah. Ideally, what pg_dump would produce would be a consistent snapshot of the database state as of its transaction snapshot time. We have always had that guarantee so far as user data was concerned, but it's been shaky (and getting worse) so far as the database schema is concerned. What bothers me about the current patch is that it's going to make it a whole lot more worse. While thinking this through it occurs to me that there is no problem at all. Your earlier claim that the dump is inconsistent just isn't accurate. We now have MVCC catalogs, so any dump is going to see a perfectly consistent set of data plus DDL. OK the catalogs may change AFTER the snapshot was taken for the dump, but then so can the data change - that's just MVCC. Unfortunately, this isn't correct. The MVCC snapshots taken for catalog scans are instantaneous; that is, we take a new, current snapshot for each catalog scan. If all of the ruleutils.c stuff were using the transaction snapshot rather than instantaneous snapshots, this would be right. But as has been previously discussed, that's not the case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plpgsql.warn_shadow
On 03/04/2014 03:40 PM, Joel Jacobson wrote: On Tue, Mar 4, 2014 at 8:04 PM, Andrew Dunstan and...@dunslane.net wrote: Lots of code quite correctly relies on this, including some I have written. I really cannot see when it would be a good coding practise to do so, there must be something I don't understand, I would greatly appreciate if you can give a real-life example of such a PL/pgSQL function. I can't give you one because it's not mine to share. But I can tell you a couple of ways I have seen it come about. One is when a piece of code is re-used in another function which happens to have a parameter name which is the same. Another is when translating some code and this is the simplest way to do it, with the least effort involved. If I am writing a piece of green fields code, then, like you, I avoid this. But the vast majority of what I do for people is not green fields code. In any case, it's not our responsibility to enforce a coding standard. That's a management issue, in the end, not a technological issue. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
2014-03-05 5:52 GMT+09:00 Stephen Frost sfr...@snowman.net: * Robert Haas (robertmh...@gmail.com) wrote: On Tue, Mar 4, 2014 at 2:34 PM, Stephen Frost sfr...@snowman.net wrote: Alright- so do you feel that the simple ctidscan use-case is a sufficient justification and example of how this can be generally useful that we should be adding these hooks to core..? I'm willing to work through the patch and clean it up this weekend if we agree that it's useful and unlikely to immediately be broken by expected changes.. Yeah, I think it's useful. But based on Tom's concurrently-posted review, I think there's probably a good deal of work left here. Yeah, it certainly looks like it. KaiGai- will you have time to go over and address Tom's concerns..? Yes, I need to do that. Let me take it through the latter half of this week and the weekend. So, I'd like to submit a revised one by next Monday. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
Robert Haas robertmh...@gmail.com writes: On Tue, Mar 4, 2014 at 2:39 PM, Simon Riggs si...@2ndquadrant.com wrote: Your earlier claim that the dump is inconsistent just isn't accurate. We now have MVCC catalogs, so any dump is going to see a perfectly consistent set of data plus DDL. OK the catalogs may change AFTER the snapshot was taken for the dump, but then so can the data change - that's just MVCC. Unfortunately, this isn't correct. The MVCC snapshots taken for catalog scans are instantaneous; that is, we take a new, current snapshot for each catalog scan. If all of the ruleutils.c stuff were using the transaction snapshot rather than instantaneous snapshots, this would be right. But as has been previously discussed, that's not the case. Yeah. And that's *necessary* for catalog lookups in a normally functioning backend, because we have to see latest data (eg, it wouldn't do for a backend to fail to enforce a just-added CHECK constraint because it was committed after the backend's transaction started). However, it seems possible that we could have a mode in which a read-only session did all its catalog fetches according to the transaction snapshot. That would get us to a situation where the backend-internal lookups that ruleutils relies on would give the same answers as queries done by pg_dump. Robert's work on getting rid of SnapshotNow has probably moved that much closer than it was before, but it's still not exactly a trivial patch. Meanwhile, Andres claimed upthread that none of the currently-proposed reduced-lock ALTER commands affect data that pg_dump is using ruleutils to fetch. If that's the case, then maybe this is a problem that we can punt till later. I've not gone through the list to verify it though. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers