Re: [HACKERS] ERROR: argument to pg_get_expr() must come from system catalogs
On Fri, Jul 30, 2010 at 12:17 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Dave Page dp...@pgadmin.org writes:
>> We had a report of the above error from a pgAdmin user testing 1.12.0b3
>> with PG 9.0b3. The (highly simplified) query below works fine as a
>> superuser:
>>
>> SELECT pg_get_expr(proargdefaults, 'pg_catalog.pg_class'::regclass)
>> FROM pg_proc pr
>> LEFT OUTER JOIN pg_description des ON des.objoid=pr.oid
>>
>> Run as a regular user though, we get the error.
>
> I've applied a (rather hurried) patch for this for 9.0beta4.

Thanks. Bruce seemed to think it affected 8.4.4 as well - would that be
the case, or is it something else?

-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch (for 9.1) string functions ( correct patch attached )
Hello

2010/7/29 Erik Rijkers e...@xs4all.nl:
> On Thu, July 29, 2010 22:43, Erik Rijkers wrote:
>> Hi Pavel,
>>
>> In xfunc.sgml, I came across a function example (for use of VARIADIC
>> in polymorphic functions), where the function name is concat():
>> (in the manual: 35.4.10. Polymorphic SQL Functions).
>>
>> Although that is not strictly wrong, it seems better to change that
>> name when concat goes into core, as seems to be the plan. If you
>> agree, it seems best to include this change in your patch and change
>> that example function's name when the stringfunc patch gets applied.
>
> My apologies, the previous email had the wrong doc-patch attached.
> Here is the correct one.

it is good idea, thank you

Pavel
Re: [HACKERS] On Scalability
2010/7/29 Josh Berkus j...@agliodbs.com:
>> Or maybe checking against the source code and its documentation, if
>> any.
>
> No, not really. What you really want to know is: what's the real
> planner overhead of having dozens/hundreds of partial indexes? What's
> the write overhead? There's no way you can derive that from the source
> code faster than you can test it.

Again, as the test would be rather killing for my group at this stage,
I think that knowing whether certain parts have been implemented with
linear or sub-linear (or whatever else) algorithms would give good
insights about scalability.

At a first glance it seems that for inheritance some bottleneck is
hindering a full exploit for table partitioning.
Is there anyone who knows whether those algorithms are linear or not?

And of course, I agree that real tests on real data will provide the
real thing.

-- 
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
NON QVIETIS MARIBVS NAVTA PERITVS
Re: [HACKERS] On Scalability
On Fri, Jul 30, 2010 at 11:24 AM, Vincenzo Romano
vincenzo.rom...@notorand.it wrote:
> At a first glance it seems that for inheritance some bottleneck is
> hindering a full exploit for table partitioning.

There have been lengthy discussions of how to implement partitioning to
fix these precise problems, yes.

> Is there anyone who knows whether those algorithms are linear or not?

They're linear in both cases. But they happen at plan time rather than
query execution time. So if your application prepares all its queries
and then uses them many times it would not slow down query execution
but would slow down the query planning time. In some applications this
is much better, but in others unpredictable run-times is as bad as long
run-times.

Also in the case of having many partial indexes it would slow down
inserts and updates as well, though to a lesser degree, and that would
happen at execution time.

-- 
greg
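Greg's point about plan-time linearity can be sketched in miniature. The following is a hypothetical illustration, not PostgreSQL planner code: it only shows why constraint exclusion does work proportional to the number of partitions even when a single partition survives.

```python
# Hypothetical sketch (not PostgreSQL source): constraint exclusion as
# described above is a linear scan -- the planner tests the query's
# predicate against every partition's CHECK constraint, so plan time
# grows with the partition count even when only one partition matches.

def plan(partitions, key):
    """Return the partitions that might hold `key`, checking each one."""
    checks = 0
    surviving = []
    for lo, hi in partitions:          # one constraint test per partition
        checks += 1
        if lo <= key < hi:             # the partition's CHECK (lo <= x < hi)
            surviving.append((lo, hi))
    return surviving, checks

# 100 range partitions of width 10: [0,10), [10,20), ...
parts = [(i * 10, i * 10 + 10) for i in range(100)]
surviving, checks = plan(parts, 42)
print(surviving, checks)   # [(40, 50)] 100
```

Only one partition survives, but all 100 constraints were examined; a sub-linear scheme would need an ordered or hashed lookup over the partition bounds instead of a per-partition test.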
Re: [HACKERS] TwoPO: experimental join order algorithm
On 24/07/10 15:20, Adriano Lange wrote:
> Hi,

Hi!

> I'd like to release the last version of my experimental join order
> algorithm (TwoPO - Two Phase Optimization [1]):
>
> http://git.c3sl.ufpr.br/gitweb?p=lbd/ljqo.git;a=summary
>
> This algorithm is not production-ready, but an experimental set of
> ideas, which need to be refined and evaluated. As the join order
> optimization is a hard problem, the evaluation of a search strategy is
> also a hard task. Therefore, I think the most important TODO item
> related to replacement of GEQO algorithm is to define a set of
> evaluation criteria considered as relevant.

Good to hear from you -- I don't know if you are aware of a simulated
annealing join search module that I'm working on. When I first started,
I planned to base it on the patch that you sent to -hackers some time
ago (and actually, if you look at the repo for my module, twopo.c is
still in there) but ended up doing it from scratch. However, reading
your code was a big help in the beginning. I gave a talk at this year's
PGCon (http://www.pgcon.org/2010/schedule/events/211.en.html) and you
will find your name in the acknowledgments section of the presentation :)

I'll make sure to read your new code and compare the approaches. My
results so far are not perfect, but also not very pessimistic. I think
there is actually a chance to replace GEQO with SA in the future, or at
least to ship more join search modules with the standard distribution
and gather field reports.

Cheers,
Jan
Re: [HACKERS] reducing NUMERIC size for 9.1
On Fri, Jul 30, 2010 at 1:13 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> But, looking at it a bit more carefully, isn't the maximum-size logic
>> for numeric rather bogus?
>
> Perhaps, but I think you're confused on at least one point.
> numeric(2,1) has to be able to hold 2 decimal digits, not 2
> NumericDigits (which'd actually be 8 decimal digits given the current
> code).

I get that. The point is: if one of those 2 decimal digits is before
the decimal point and the other is after it, then two NumericDigits
will be used. The value '11'::numeric is only size 10 (untoasted), but
the value '1.1'::numeric is size 12 (untoasted).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] On Scalability
2010/7/30 Greg Stark gsst...@mit.edu:
> On Fri, Jul 30, 2010 at 11:24 AM, Vincenzo Romano
> vincenzo.rom...@notorand.it wrote:
>> At a first glance it seems that for inheritance some bottleneck is
>> hindering a full exploit for table partitioning.
>
> There have been lengthy discussions of how to implement partitioning
> to fix these precise problems, yes.

Any reference?

>> Is there anyone who knows whether those algorithms are linear or not?
>
> They're linear in both cases. But they happen at plan time rather than
> query execution time. So if your application prepares all its queries
> and then uses them many times it would not slow down query execution
> but would slow down the query planning time. In some applications this
> is much better, but in others unpredictable run-times is as bad as
> long run-times.

Hmmm ... maybe I'm missing the inner meaning of your remarks, Greg.
By using PREPARE I run the query planner sooner and reuse the plan at
the later execution. You can bet that some of the PREPAREd query
variables will pertain to either the child tables' CHECK constraints
(for table partitioning) or to the partial indexes' WHERE conditions
(for index partitioning). It's exactly at this point (execution time)
where the linearity will kill the query over a largely partitioned
table. Is this what you meant? :-)

> Also in the case of having many partial indexes it would slow down
> inserts and updates as well, though to a lesser degree, and that would
> happen at execution time.

This makes full sense to me.

-- 
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
NON QVIETIS MARIBVS NAVTA PERITVS
Re: [HACKERS] review: xml_is_well_formed
Hi Pavel,

Thanks for taking the time to review my patch. Attached is a new
version addressing your concerns.

On 29/07/10 14:21, Pavel Stehule wrote:
> I have a few issues:
>
> * broken regress test (fedora 13 - xmllint: using libxml version 20707)
>
> postgres=# SELECT xml_is_well_formed('<pg:foo xmlns:pg="http://postgresql.org/stuff";>bar</pg:foo>');
>  xml_is_well_formed
> --------------------
>  f
> (1 row)
>
> this xml is broken - but in regress tests is ok
>
> [pa...@pavel-stehule ~]$ xmllint xxx
> xxx:1: parser error : error parsing attribute name
> <pg:foo xmlns:pg="http://postgresql.org/stuff";>bar</pg:foo>

This is a problem that was observed recently by Thom Brown - the commit
fest webapp adds the semicolon after the quote. If you look at the
attachment I sent in an email client you'll find there is no semicolon.
The patch attached here will also not have the semicolon.

> * xml_is_well_formed returns true for simple text
>
> postgres=# SELECT xml_is_well_formed('');
>  xml_is_well_formed
> --------------------
>  t
> (1 row)
>
> it is probably wrong result - is it ok??

Yes, this is OK: pure text is valid XML content.

> * I don't understand to this fragment
>
> PG_TRY();
> + {
> +     size_t count;
> +     xmlChar *version = NULL;
> +     int standalone = -1;
> +
> +     res_code = parse_xml_decl(string, &count, &version, NULL, &standalone);
> +     if (res_code != 0)
> +         xml_ereport_by_code(ERROR, ERRCODE_INVALID_XML_CONTENT,
> +                 "invalid XML content: invalid XML declaration",
> +                 res_code);
> +
> +     doc = xmlNewDoc(version);
> +     doc->encoding = xmlStrdup((const xmlChar *) "UTF-8");
> +     doc->standalone = 1;
>
> why? This function can raise exception when declaration is wrong. It
> is wrong - I think, this function should return false instead of an
> exception.

You're quite right, I should be returning false here and not causing an
exception. I have corrected this in the attached patch.
Regards,

-- 
Mike Fowler
Registered Linux user: 379787

*** a/contrib/xml2/xpath.c
--- b/contrib/xml2/xpath.c
***************
*** 27,33 ****
  PG_MODULE_MAGIC;

  /* externally accessible functions */
- Datum		xml_is_well_formed(PG_FUNCTION_ARGS);
  Datum		xml_encode_special_chars(PG_FUNCTION_ARGS);
  Datum		xpath_nodeset(PG_FUNCTION_ARGS);
  Datum		xpath_string(PG_FUNCTION_ARGS);
--- 27,32 ----
***************
*** 70,97 ****
  pgxml_parser_init(void)
  	xmlLoadExtDtdDefaultValue = 1;
  }

- /* Returns true if document is well-formed */
-
- PG_FUNCTION_INFO_V1(xml_is_well_formed);
-
- Datum
- xml_is_well_formed(PG_FUNCTION_ARGS)
- {
- 	text	   *t = PG_GETARG_TEXT_P(0);	/* document buffer */
- 	int32		docsize = VARSIZE(t) - VARHDRSZ;
- 	xmlDocPtr	doctree;
-
- 	pgxml_parser_init();
-
- 	doctree = xmlParseMemory((char *) VARDATA(t), docsize);
- 	if (doctree == NULL)
- 		PG_RETURN_BOOL(false);	/* i.e. not well-formed */
- 	xmlFreeDoc(doctree);
- 	PG_RETURN_BOOL(true);
- }
-
- /* Encodes special characters as XML entities */

  PG_FUNCTION_INFO_V1(xml_encode_special_chars);
--- 69,74 ----

*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***************
*** 8554,8562 ****
  SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
  ]]></screen>
     </para>
    </sect3>

    <sect3>
!    <title>XML Predicates</title>

     <indexterm>
      <primary>IS DOCUMENT</primary>
--- 8554,8566 ----
  ]]></screen>
     </para>
    </sect3>
+  </sect2>
+
+  <sect2>
+   <title>XML Predicates</title>

    <sect3>
!    <title>IS DOCUMENT</title>

     <indexterm>
      <primary>IS DOCUMENT</primary>
***************
*** 8574,8579 ****
--- 8578,8675 ----
      between documents and content fragments.
     </para>
    </sect3>
+
+   <sect3>
+    <title>xml_is_well_formed</title>
+
+    <indexterm>
+     <primary>xml_is_well_formed</primary>
+     <secondary>well formed</secondary>
+    </indexterm>
+
+ <synopsis>
+ <function>xml_is_well_formed</function>(<replaceable>text</replaceable>)
+ </synopsis>
+
+    <para>
+     The function <function>xml_is_well_formed</function> evaluates whether
+     the <replaceable>text</replaceable> is well formed XML content,
+     returning a boolean.
+    </para>
+    <para>
+     Example:
+ <screen><![CDATA[
+ SELECT xml_is_well_formed('<foo>bar</foo>');
+  xml_is_well_formed
+ --------------------
+  t
+ (1 row)
+
+ SELECT xml_is_well_formed('<foo>bar</foo');
+  xml_is_well_formed
+ --------------------
+  f
+ (1 row)
+
+ SELECT xml_is_well_formed('<foo><bar>stuff</foo>');
+  xml_is_well_formed
+ --------------------
+  f
+ (1 row)
+ ]]></screen>
+    </para>
+    <para>
+     In addition to the structure checks, the function ensures that
+     namespaces are correctly matched.
+ <screen><![CDATA[
+ SELECT xml_is_well_formed('<pg:foo
Re: [HACKERS] review: xml_is_well_formed
Hello

2010/7/30 Mike Fowler m...@mlfowler.com:
> Hi Pavel,
>
> Thanks for taking the time to review my patch. Attached is a new
> version addressing your concerns.

ok - the main regress test is ok now. Next I checked the contrib test
for xml2 (make installcheck inside contrib/xml2), and there is a
problem:

  SET client_min_messages = warning;
  \set ECHO none
+ psql:pgxml.sql:10: ERROR: could not find function xml_is_well_formed in file /usr/local/pgsql.xwf/lib/pgxml.so
+ psql:pgxml.sql:15: ERROR: could not find function xml_is_well_formed in file /usr/local/pgsql.xwf/lib/pgxml.so
  RESET client_min_messages;
  select query_to_xml('select 1 as x',true,false,'');

you have to remove the declaration from pgxml.sql.in and
uninstall_pgxml.sql, and other related files in contrib/xml2/

>> regress test
>>
>> postgres=# SELECT xml_is_well_formed('<pg:foo xmlns:pg="http://postgresql.org/stuff";>bar</pg:foo>');
>>  xml_is_well_formed
>> --------------------
>>  f
>> (1 row)
>>
>> this xml is broken - but in regress tests is ok
>>
>> [pa...@pavel-stehule ~]$ xmllint xxx
>> xxx:1: parser error : error parsing attribute name
>> <pg:foo xmlns:pg="http://postgresql.org/stuff";>bar</pg:foo>
>
> This is a problem that was observed recently by Thom Brown - the
> commit fest webapp adds the semicolon after the quote. If you look at
> the attachment I sent in an email client you'll find there is no
> semicolon. The patch attached here will also not have the semicolon.

ok - the attached patch is correct ... please, can you remove the
broken patch?

>> * xml_is_well_formed returns true for simple text
>>
>> postgres=# SELECT xml_is_well_formed('');
>>  xml_is_well_formed
>> --------------------
>>  t
>> (1 row)
>>
>> it is probably wrong result - is it ok??
>
> Yes, this is OK: pure text is valid XML content.

It is interesting for me - is there some documentation about it
somewhere? My colleagues say something else :)

http://zvon.org/comp/r/tut-XML.html#Pages~MinimalQ20XnumberQ20XofQ20Xelements
http://www.w3.org/TR/REC-xml/#sec-prolog-dtd

I am not a specialist on XML - so I just don't know.

>> * I don't understand to this fragment
>>
>> PG_TRY();
>> + {
>> +     size_t count;
>> +     xmlChar *version = NULL;
>> +     int standalone = -1;
>> +
>> +     res_code = parse_xml_decl(string, &count, &version, NULL, &standalone);
>> +     if (res_code != 0)
>> +         xml_ereport_by_code(ERROR, ERRCODE_INVALID_XML_CONTENT,
>> +                 "invalid XML content: invalid XML declaration",
>> +                 res_code);
>> +
>> +     doc = xmlNewDoc(version);
>> +     doc->encoding = xmlStrdup((const xmlChar *) "UTF-8");
>> +     doc->standalone = 1;
>>
>> why? This function can raise exception when declaration is wrong. It
>> is wrong - I think, this function should return false instead of an
>> exception.
>
> You're quite right, I should be returning false here and not causing
> an exception. I have corrected this in the attached patch.

ok

> Regards,
>
> -- 
> Mike Fowler
> Registered Linux user: 379787
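A side note on the text-is-valid-content question above: the XML spec distinguishes a *document* (which must have exactly one root element) from *content* (character data plus zero or more elements), so bare text is well-formed content. This can be illustrated with a rough Python stdlib sketch -- not the libxml2 code path the patch actually uses -- that approximates a content check by wrapping the candidate string in a hypothetical dummy root element:

```python
# Rough illustration only: XML "content" is a sequence of character data
# and elements, so bare text is acceptable even though it is not a
# well-formed *document*.  Wrapping the string in a dummy root element
# approximates a content check (it misfires on input that itself contains
# the literal '</dummy>' -- a real implementation parses content directly).
import xml.etree.ElementTree as ET

def is_well_formed_content(s):
    try:
        ET.fromstring('<dummy>' + s + '</dummy>')
        return True
    except ET.ParseError:
        return False

print(is_well_formed_content('plain text'))      # True: text alone is content
print(is_well_formed_content('<foo>bar</foo>'))  # True: one element
print(is_well_formed_content('<foo>bar'))        # False: unclosed element
```

The first result is exactly the behavior Pavel observed: `xml_is_well_formed('')` returning true is consistent with treating the argument as content rather than as a document.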
Re: [HACKERS] knngist - 0.8
I think that queries like this:

select * from test where val <-> 500 < 1 order by val <-> 500;

can also be optimized using knngist. In case of btree_gist this query
can be easily rewritten:

select * from test where val > 499 and val < 501 order by val <-> 500;

But in pg_trgm it makes it possible to combine different similarity
levels in one query. For example:

select * from test_trgm order by t <-> 'asdf' < 0.5 or t <-> 'qwer' < 0.4;

Is there any chance to handle this syntax also?

With best regards,
Alexander Korotkov.
Re: [HACKERS] patch (for 9.1) string functions
On Mon, Jul 26, 2010 at 8:48 PM, Robert Haas robertmh...@gmail.com wrote:
> On Mon, Jul 26, 2010 at 3:49 PM, Merlin Moncure mmonc...@gmail.com wrote:
>> concat() is not a variadic text function. it is variadic any that
>> happens to do text coercion (not casting) inside the function. The
>> assumption that concat is casting internally is probably wrong.
>> Suppose I had hacked the int->text cast to call a custom function --
>> I would very much expect concat() not to produce output from that
>> function, just the vanilla output text (I could always force the cast
>> if I wanted to). concat is just a function that does something highly
>> similar to casting.
>>
>> suppose I had a function count_memory(variadic any) that summed
>> memory usage of input args -- forcing casts would make no sense in
>> that context (I'm not suggesting that you think so -- just bringing
>> up a case that illustrates how forcing cast into the function can
>> change behavior in subtle ways).
>
> Right, but I already said I wasn't objecting to the use of variadic
> ANY in cases like that - only in cases where, as here, you were
> basically taking any old arguments and forcing them all to text.

I believe that another unpleasant side effect of this is that CONCAT()
will have to be declared stable rather than immutable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] ERROR: argument to pg_get_expr() must come from system catalogs
Dave Page dp...@pgadmin.org writes:
> On Fri, Jul 30, 2010 at 12:17 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>> I've applied a (rather hurried) patch for this for 9.0beta4.
>
> Thanks. Bruce seemed to think it affected 8.4.4 as well - would that
> be the case, or is it something else?

He's mistaken. The bug is in all the branches, but there have been no
releases with it except 9.0beta3. I will work on back-patching the
older branches this morning.

			regards, tom lane
Re: [HACKERS] reducing NUMERIC size for 9.1
Robert Haas robertmh...@gmail.com writes:
> On Fri, Jul 30, 2010 at 1:13 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Perhaps, but I think you're confused on at least one point.
>> numeric(2,1) has to be able to hold 2 decimal digits, not 2
>> NumericDigits (which'd actually be 8 decimal digits given the current
>> code).
>
> I get that. The point is: if one of those 2 decimal digits is before
> the decimal point and the other is after it, then two NumericDigits
> will be used.

Ah, I see. Maybe we should allow for one more NumericDigit in the
calculation for such cases. I guess you could look at the scale too to
detect if the case is possible, but not sure it's worth the trouble.

			regards, tom lane
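For readers following along, the storage arithmetic under discussion can be modeled in a few lines. This is a simplified, hypothetical model of NUMERIC's base-10000 digits (DEC_DIGITS = 4), not the real numeric.c logic: the key property is that the digits are aligned on the decimal point, so the integer and fractional parts cannot share a NumericDigit.

```python
# Simplified model of NUMERIC storage (base-10000 "digits" aligned on the
# decimal point), showing why '11' fits in one NumericDigit while '1.1'
# needs two: the integer part and the fractional part each round up to a
# whole number of NumericDigits.

DEC_DIGITS = 4  # decimal digits per NumericDigit in the current code

def numeric_digits(int_part: str, frac_part: str) -> int:
    """Count base-10000 digits needed, integer and fraction kept separate."""
    ndigits = 0
    if int_part.strip('0'):                          # non-zero integer part
        ndigits += -(-len(int_part) // DEC_DIGITS)   # ceiling division
    if frac_part.strip('0'):                         # non-zero fraction
        ndigits += -(-len(frac_part) // DEC_DIGITS)
    return ndigits

print(numeric_digits('11', ''))   # 1 -> both decimal digits share one NumericDigit
print(numeric_digits('1', '1'))   # 2 -> one NumericDigit on each side of the point
```

At 2 bytes per NumericDigit this lines up with the untoasted sizes Robert quotes: '11' and '1.1' differ by one NumericDigit, i.e. 10 versus 12 bytes.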
Re: [HACKERS] patch for check constraints using multiple inheritance
On Thu, Jul 29, 2010 at 6:57 AM, Henk Enting h.d.ent...@mgrid.net wrote:
> We ran into a problem on 9.0beta3 with check constraints using table
> inheritance in a multi-level hierarchy with multiple inheritance. A
> test script is provided below and a proposed patch is attached to this
> email.

Thanks for the report. This bug also appears to exist in 8.4; I'm not
sure yet how far back it goes. I'm not so sure your proposed patch is
the right fix, though; it seems like it ought to be the job of
AddRelationNewConstraints() and MergeWithExistingConstraint() to make
sure that the right thing happens, here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] patch for check constraints using multiple inheritance
Robert Haas robertmh...@gmail.com writes:
> On Thu, Jul 29, 2010 at 6:57 AM, Henk Enting h.d.ent...@mgrid.net wrote:
>> We ran into a problem on 9.0beta3 with check constraints using table
>> inheritance in a multi-level hierarchy with multiple inheritance.
>
> Thanks for the report. This bug also appears to exist in 8.4; I'm not
> sure yet how far back it goes. I'm not so sure your proposed patch is
> the right fix, though; it seems like it ought to be the job of
> AddRelationNewConstraints() and MergeWithExistingConstraint() to make
> sure that the right thing happens, here.

The original design idea was that coninhcount/conislocal would act
exactly like attinhcount/attislocal do for multiply-inherited columns.
Where did we fail to copy that logic?

			regards, tom lane
Re: [HACKERS] patch for check constraints using multiple inheritance
On Fri, Jul 30, 2010 at 10:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> The original design idea was that coninhcount/conislocal would act
> exactly like attinhcount/attislocal do for multiply-inherited columns.
> Where did we fail to copy that logic?

We didn't. That logic is broken, too. Using the OP's test setup:

rhaas=# alter table level_0_parent add column a_new_column integer;
NOTICE:  merging definition of column a_new_column for child level_1_child
NOTICE:  merging definition of column a_new_column for child level_2_child
NOTICE:  merging definition of column a_new_column for child level_2_child
ALTER TABLE
rhaas=# SELECT t.oid, t.relname, a.attinhcount
          FROM pg_class t
          JOIN pg_attribute a ON (a.attrelid = t.oid)
          JOIN pg_namespace n ON (t.relnamespace = n.oid)
         WHERE n.nspname = 'test_inheritance'
           AND a.attname = 'a_new_column'
         ORDER BY t.oid;
  oid  |    relname     | attinhcount
-------+----------------+-------------
 16420 | level_0_parent |           0
 16423 | level_0_child  |           1
 16429 | level_1_parent |           1
 16432 | level_1_child  |           2
 16438 | level_2_parent |           1
 16441 | level_2_child  |           3
(6 rows)

The attached patch (please review) appears to fix the coninhcount case.
I haven't tracked down the attinhcount case yet.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] patch for check constraints using multiple inheritance
Robert Haas robertmh...@gmail.com writes:
> On Fri, Jul 30, 2010 at 10:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>> The original design idea was that coninhcount/conislocal would act
>> exactly like attinhcount/attislocal do for multiply-inherited
>> columns. Where did we fail to copy that logic?
>
> We didn't. That logic is broken, too.

Uh, full stop there. If you think that the multiply-inherited column
logic is wrong, it's you that is mistaken --- or at least you're going
to have to do a lot more than just assert that you don't like it. We
spent a *lot* of time hashing that behavior out, back around 7.3.

			regards, tom lane
Re: [HACKERS] patch for check constraints using multiple inheritance
On Fri, Jul 30, 2010 at 10:23 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Uh, full stop there. If you think that the multiply-inherited column
> logic is wrong, it's you that is mistaken --- or at least you're going
> to have to do a lot more than just assert that you don't like it. We
> spent a *lot* of time hashing that behavior out, back around 7.3.

Well, I'm glad you spent a lot of time hashing it out, but you've got a
bug. :-) Since the output in the previous email apparently wasn't
sufficient for you to understand what the problem is, here it is in
more detail.

Initial setup:

DROP SCHEMA IF EXISTS test_inheritance CASCADE;
CREATE SCHEMA test_inheritance;
SET search_path TO test_inheritance;

CREATE TABLE level_0_parent (i int);
CREATE TABLE level_0_child (a text) INHERITS (level_0_parent);
CREATE TABLE level_1_parent() INHERITS (level_0_parent);
CREATE TABLE level_1_child() INHERITS (level_0_child, level_1_parent);
CREATE TABLE level_2_parent() INHERITS (level_1_parent);
CREATE TABLE level_2_child() INHERITS (level_1_child, level_2_parent);

Then:

rhaas=# \d level_2_child
Table "test_inheritance.level_2_child"
 Column |  Type   | Modifiers
--------+---------+-----------
 i      | integer |
 a      | text    |
Inherits: level_1_child,
          level_2_parent

rhaas=# ALTER TABLE level_0_parent ADD COLUMN a_new_column integer;
NOTICE:  merging definition of column a_new_column for child level_1_child
NOTICE:  merging definition of column a_new_column for child level_2_child
NOTICE:  merging definition of column a_new_column for child level_2_child
ALTER TABLE
rhaas=# ALTER TABLE level_0_parent DROP COLUMN a_new_column;
ALTER TABLE
rhaas=# \d level_2_child
Table "test_inheritance.level_2_child"
    Column    |  Type   | Modifiers
--------------+---------+-----------
 i            | integer |
 a            | text    |
 a_new_column | integer |
Inherits: level_1_child,
          level_2_parent

Adding a column to the toplevel parent of the inheritance hierarchy and
then dropping it again shouldn't leave a leftover copy of the column in
the grandchild.

rhaas=# ALTER TABLE level_2_child DROP COLUMN a_new_column;
ERROR:  cannot drop inherited column a_new_column

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] lock_timeout GUC patch - Review
The Thursday 29 July 2010 13:55:38, Boszormenyi Zoltan wrote:
> I fixed this by adding CheckLockTimeout() function that works like
> CheckStatementTimeout() and ensuring that the same start time is used
> for both deadlock_timeout and lock_timeout if both are active. The
> preference of errors if their timeout values are equal is:
> statement_timeout > lock_timeout > deadlock_timeout

As soon as lock_timeout is bigger than deadlock_timeout, it doesn't
work with this new version. Keeping deadlock_timeout at 1s, when
lock_timeout >= 1001, lock_timeout doesn't trigger anymore.

* Consider the changes to the code in the context of the project as a
  whole:
  * Is everything done in a way that fits together coherently with
    other features/modules?

I have a feeling that enable_sig_alarm/enable_sig_alarm_for_lock_timeout
tries to solve a very specific problem, and it gets complicated because
there is no infrastructure in the code to handle several timeouts at
the same time with sigalarm, so each timeout has its own dedicated and
intertwined code. But I'm still discovering this part of the code.

> This WIP patch is also attached for reference, too. I would prefer
> this way, but I don't have more time to work on it and there are some
> interdependencies in the signal handler when e.g.
> disable_sig_alarm(true); means to disable ALL timers, not just the
> statement_timeout. The specifically coded lock_timeout patch goes with
> the flow and doesn't change the semantics and works. If someone wants
> to pick up the timer framework patch and can make it work, fine. But I
> am not explicitly submitting it for the commitfest. The original patch
> with the fixes works and needs only a little more review.

Ok, understood. But I like the principle of this framework much more
(the rest of the code seems simpler to me as a consequence of this
framework). But it goes far beyond the initial intent of the patch.
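The "timeout framework" idea discussed above -- several logical timeouts sharing one interval timer -- can be sketched in miniature. This is a toy illustration under my own naming (`TimeoutMux` is hypothetical), not the actual patch: the essence is keeping the pending deadlines sorted and always arming the single alarm for the nearest one.

```python
# Toy sketch of a timeout framework: statement, lock, and deadlock
# timeouts share one hardware timer by keeping their deadlines in a heap
# and arming the single alarm for the earliest deadline.  Illustration
# only -- not the PostgreSQL signal-handler code.
import heapq

class TimeoutMux:
    def __init__(self):
        self.heap = []                       # (deadline, name) pairs

    def enable(self, name, deadline):
        heapq.heappush(self.heap, (deadline, name))
        return self.next_deadline()          # what the one timer is armed for

    def next_deadline(self):
        return self.heap[0][0] if self.heap else None

    def fire(self, now):
        """Handle the alarm: return every timeout whose deadline has passed,
        leaving the rest queued so the timer can be re-armed."""
        fired = []
        while self.heap and self.heap[0][0] <= now:
            fired.append(heapq.heappop(self.heap)[1])
        return fired

mux = TimeoutMux()
mux.enable('statement_timeout', 5.0)
mux.enable('deadlock_timeout', 1.0)
mux.enable('lock_timeout', 2.0)
# The single alarm is armed for the earliest deadline (1.0); when it
# fires, only deadlock_timeout triggers, then the timer is re-armed.
print(mux.fire(1.0))          # ['deadlock_timeout']
print(mux.next_deadline())    # 2.0
```

With this structure a lock_timeout larger than deadlock_timeout cannot shadow it: each deadline stays queued independently, which is exactly the property the per-timeout hand-coded version got wrong in the reviewed revision.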
Re: [HACKERS] patch for check constraints using multiple inheritance
Robert Haas robertmh...@gmail.com writes:
> On Fri, Jul 30, 2010 at 10:23 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Uh, full stop there. If you think that the multiply-inherited column
>> logic is wrong, it's you that is mistaken --- or at least you're
>> going to have to do a lot more than just assert that you don't like
>> it. We spent a *lot* of time hashing that behavior out, back around
>> 7.3.
>
> Since the output in the previous email apparently wasn't sufficient
> for you to understand what the problem is, here it is in more detail.
> ... Adding a column to the toplevel parent of the inheritance
> hierarchy and then dropping it again shouldn't leave a leftover copy
> of the column in the grandchild.

Actually, it probably should. The inheritance rules were intentionally
designed to avoid dropping inherited columns that could conceivably
still contain valuable data. There isn't enough information in the
inhcount/islocal data structure to recognize that a multiply-inherited
column is ultimately derived from only one distant ancestor, but it was
agreed that an exact tracking scheme would be more complication than it
was worth. Instead, the mechanism is designed to ensure that no column
will be dropped if it conceivably is still wanted --- not that columns
might not be left behind and require another drop step.

*Please* go re-read the old discussions before you propose tampering
with this behavior. In particular I really really do not believe that
any one-line fix is going to make things better --- almost certainly it
will break other cases. Being materially more intelligent would require
a more complex data structure.

			regards, tom lane
Re: [HACKERS] patch for check constraints using multiple inheritance
Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: Since the output in the previous email apparently wasn't sufficient for you to understand what the problem is, here it is in more detail. ... Adding a column to the toplevel parent of the inheritance hierarchy and then dropping it again shouldn't leave a leftover copy of the column in the grandchild. Actually, it probably should. The inheritance rules were intentionally designed to avoid dropping inherited columns that could conceivably still contain valuable data. There isn't enough information in the inhcount/islocal data structure to recognize that a multiply-inherited column is ultimately derived from only one distant ancestor, but it was agreed that an exact tracking scheme would be more complication than it was worth. Instead, the mechanism is designed to ensure that no column will be dropped if it conceivably is still wanted --- not that columns might not be left behind and require another drop step. This is not about dropping columns or not, but about proper maintenance of the housekeeping of coninhcount, defined as The number of direct inheritance ancestors this constraint has. A constraint with a nonzero number of ancestors cannot be dropped nor renamed. Regard the following lattice (direction from top to bottom): 1 |\ 2 3 \|\ 4 5 \| 6 When adding a constraint to 1, the proper coninhcount for that constraint on relation 6 is 2. But the code currently counts to 3, since 6 is reached by paths 1-2-4-5, 1-3-4-6, 1-3-5-6. This wrong number is a bug. *Please* go re-read the old discussions before you propose tampering with this behavior. In particular I really really do not believe that any one-line fix is going to make things better --- almost certainly it will break other cases. Our (more than one line) patch was explicitly designed to walk from a specific parent to a child exactly once. It passes regression tests. 
regards, Yeb Havinga -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
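The disagreement over the right count can be made concrete with a small simulation of Yeb's lattice; this is a toy model with invented names, not PostgreSQL code, comparing root-to-node paths (what the buggy code effectively counts) against direct parents carrying the constraint (the documented meaning of coninhcount):

```python
# parents[n] lists the direct parents of relation n in Yeb's lattice.
parents = {1: [], 2: [1], 3: [1], 4: [2, 3], 5: [3], 6: [4, 5]}

def path_count(node, root=1):
    # number of distinct paths from root down to node
    if node == root:
        return 1
    return sum(path_count(p, root) for p in parents[node])

def coninhcount(node, root=1):
    # number of direct parents from which the constraint is inherited
    def has_constraint(n):
        return n == root or any(has_constraint(p) for p in parents[n])
    return sum(1 for p in parents[node] if has_constraint(p))

print(path_count(6))   # 3 paths: 1-2-4-6, 1-3-4-6, 1-3-5-6
print(coninhcount(6))  # 2: inherited via direct parents 4 and 5
```

The two counts disagree precisely because relation 6 is reachable from the root along more paths than it has constraint-bearing direct parents.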
Re: [HACKERS] patch for check constraints using multiple inheritance
On Fri, Jul 30, 2010 at 10:46 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Fri, Jul 30, 2010 at 10:23 AM, Tom Lane t...@sss.pgh.pa.us wrote: Uh, full stop there. If you think that the multiply-inherited column logic is wrong, it's you that is mistaken --- or at least you're going to have to do a lot more than just assert that you don't like it. We spent a *lot* of time hashing that behavior out, back around 7.3. Since the output in the previous email apparently wasn't sufficient for you to understand what the problem is, here it is in more detail. ... Adding a column to the toplevel parent of the inheritance hierarchy and then dropping it again shouldn't leave a leftover copy of the column in the grandchild. Actually, it probably should. The inheritance rules were intentionally designed to avoid dropping inherited columns that could conceivably still contain valuable data. There isn't enough information in the inhcount/islocal data structure to recognize that a multiply-inherited column is ultimately derived from only one distant ancestor, but it was agreed that an exact tracking scheme would be more complication than it was worth. Instead, the mechanism is designed to ensure that no column will be dropped if it conceivably is still wanted --- not that columns might not be left behind and require another drop step. While I certainly agree that accidentally dropping a column that should be retained is a lot worse than accidentally retaining a column that should be dropped, that doesn't make the current behavior correct. I don't think that's really an arguable point. Another drop step doesn't help: the leftover column is undroppable short of nuking the whole table. So let's focus on what the actual problem is here, what is required to fix it, and whether or not and how far the fix can be back-patched. There are two separate bugs here, so we should consider them individually. 
In each case, the problem is that with a certain (admittedly rather strange) inheritance hierarchy, adding a column to the top-level parent results in the inheritance count in the bottom-most child ending up as 3 rather than 2. 2 is the correct value because the bottom-most child has 2 immediate parents. The consequence of ending up with a count of 3 rather than 2 is that if the constraint/column added at the top-level is subsequently dropped, necessarily also from the top-level, the bottom-level child ends up with the constraint/column still in existence and with an inheritance count of 1. This isn't the correct behavior, and furthermore the child object can't be dropped, because it still believes that it's inherited (even though its parents no longer have a copy to inherit). In the case of coninhcount, I believe that the fix actually is quite simple, although of course it's possible that I'm missing something. Currently, ATAddConstraint() first calls AddRelationNewConstraints() to add the constraint, or possibly merge it into an existing, inherited constraint; it then adds the constraint to the phase-3 work queue so that it will be checked; and finally it recurses to its own children. It seems to me that it should only recurse to its children if the constraint was really added, rather than merged into an existing constraint, because if it was merged into an existing constraint, then the children already have it. I'm still trying to figure out why this bug doesn't show up in simpler cases, but I think that's the root of the issue. attinhcount exhibits analogous misbehavior, but if you're saying the fix for that case is going to be a lot more complicated, I think I agree with you. In that case, it seems that we traverse the inheritance hierarchy during the prep phase, at which point we don't yet know whether we're going to end up merging anything. It might be that a fix for this part of the problem ends up being too complex to back-patch. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch for check constraints using multiple inheritance
Yeb Havinga yebhavi...@gmail.com writes: Regard the following lattice (direction from top to bottom):

1
|\
2 3
\|\
4 5
 \|
  6

When adding a constraint to 1, the proper coninhcount for that constraint on relation 6 is 2. But the code currently counts to 3, since 6 is reached by the paths 1-2-4-6, 1-3-4-6, and 1-3-5-6. Mph. I'm not sure that 3 is wrong. You have to consider what happens during a DROP CONSTRAINT, which as far as I saw this patch didn't address. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch for check constraints using multiple inheritance
On Fri, Jul 30, 2010 at 11:39 AM, Tom Lane t...@sss.pgh.pa.us wrote: Yeb Havinga yebhavi...@gmail.com writes: Regard the following lattice (direction from top to bottom):

1
|\
2 3
\|\
4 5
 \|
  6

When adding a constraint to 1, the proper coninhcount for that constraint on relation 6 is 2. But the code currently counts to 3, since 6 is reached by the paths 1-2-4-6, 1-3-4-6, and 1-3-5-6. Mph. I'm not sure that 3 is wrong. You have to consider what happens during a DROP CONSTRAINT, which as far as I saw this patch didn't address. The behavior of DROP CONSTRAINT matches the fine documentation. The behavior of ADD CONSTRAINT does not. Perhaps there's a definition of coninhcount that would make 3 the right answer, but (a) I'm not sure what that definition would be and (b) it's certainly not the one in the manual. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] merge command - GSoC progress
Boxuan Zhai wrote: I create a clean patch file (no debug clauses). See the attachment. Some general coding feedback for you on this. It's not obvious to people who might want to try this out what exactly they are supposed to do. Particularly for complicated patches like this, where only a subset of the final feature might actually be implemented, some sort of reviewer guide suggesting what should and shouldn't work would be extremely helpful. I recall there was some sort of patch design guide in an earlier version of this patch; it doesn't seem to be there anymore. Don't remember if that had any examples in it. Ultimately, the examples of working behavior for this patch will need to be put into code. The way that happens is by adding working examples into the regression tests for the database. If you had those done for this patch, I wouldn't have to ask for code examples; I could just look at the source to the regression output and see how to use it against the standard database the regression samples create, and then execute against. This at least lets you avoid having to generate some test data, because there will already be some in the regression database for you to use. There is an intro to this topic at http://wiki.postgresql.org/wiki/Regression_test_authoring Another common way to generate test data is to run pgbench, which creates 4 tables and populates them with data. As far as the patch itself goes, you still have some cleanup work left that you'll need to do eventually. What I would suggest is actually reading the patch itself; don't just generate it and send it, read through the whole thing like someone new to it would do. One way you can improve what you've done already is to find places where you have introduced changes to the code structure just through editing. Here's an example of what I'm talking about, from line 499 of your patch: -break; +break; This happened because you added two invisible tabs to the end of this line. 
This makes the patch larger for no good reason and tends to infuriate people who work on the code. There are quite a few extra lines added in here too that should go. You should consider reducing the ultimate size of the patch in terms of lines a worthwhile use of your time, even if it doesn't change how things work. There are lots of examples in this one where you put two or three lines between two statements when a single one would match the look of the code in that file. A round of reading the diff looking for that sort of problem would be useful. Another thing you should do is limit how long each line is when practical. You have lots of seriously wide comment lines here right now in particular. While there are some longer lines in the PostgreSQL code right now, that's code, not comments. And when you have a long line and a long comment, don't tack the comment onto the end. Put it on the line above instead. Also, don't use // in the middle of comments the way you've done in a few places here. Getting some examples sorted out and starting on regression tests is more important than the coding style parts, I think; I just wanted to pass those along while I noticed them reading the patch, so you could start looking out for them more as you continue to work on it. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch for check constraints using multiple inheritance
On Fri, Jul 30, 2010 at 11:35 AM, Robert Haas robertmh...@gmail.com wrote: In the case of coninhcount, I believe that the fix actually is quite simple, although of course it's possible that I'm missing something. Currently, ATAddConstraint() first calls AddRelationNewConstraints() to add the constraint, or possibly merge it into an existing, inherited constraint; it then adds the constraint to the phase-3 work queue so that it will be checked; and finally it recurses to its own children. It seems to me that it should only recurse to its children if the constraint was really added, rather than merged into an existing constraint, because if it was merged into an existing constraint, then the children already have it. I'm still trying to figure out why this bug doesn't show up in simpler cases, but I think that's the root of the issue. OK, it looks like level_2_parent is actually irrelevant to this issue. So here's a slightly simplified test case: DROP SCHEMA IF EXISTS test_inheritance CASCADE; CREATE SCHEMA test_inheritance; SET search_path TO test_inheritance; CREATE TABLE top (i int); CREATE TABLE mid1 () INHERITS (top); CREATE TABLE mid2 () INHERITS (top); CREATE TABLE bottom () INHERITS (mid1, mid2); CREATE TABLE basement () INHERITS (bottom); ALTER TABLE top ADD CONSTRAINT a_check_constraint CHECK (i = 0); You can also trigger it this way: DROP SCHEMA IF EXISTS test_inheritance CASCADE; CREATE SCHEMA test_inheritance; SET search_path TO test_inheritance; CREATE TABLE top1 (i int); CREATE TABLE top2 (i int); CREATE TABLE bottom () INHERITS (top1, top2); CREATE TABLE basement () INHERITS (bottom); ALTER TABLE top1 ADD CONSTRAINT a_check_constraint CHECK (i = 0); ALTER TABLE top2 ADD CONSTRAINT a_check_constraint CHECK (i = 0); In other words, the problem occurs when table A inherits a constraint from multiple parents, and table B inherits from table A. Most of the code (including ALTER TABLE .. DROP CONSTRAINT and ALTER TABLE .. 
[NO] INHERIT) maintains the invariant that coninhcount is the number of direct parents from which the constraint is inherited, but the ADD CONSTRAINT fails to do so in this particular case. So at this point, I'm fairly confident that this is the correct fix. I think the reason we haven't noticed this before is that it's most likely to occur when you have a diamond inheritance pattern with another child hanging off the bottom, which is not something most people probably do. It appears that the coninhcount mechanism was introduced in 8.4; prior to that, we didn't really make an effort to get this right. Therefore, I believe the patch I posted upthread should be applied to HEAD, REL9_0_STABLE, and REL8_4_STABLE. This still leaves open the question of what to do about the similar case involving attributes: DROP SCHEMA IF EXISTS test_inheritance CASCADE; CREATE SCHEMA test_inheritance; SET search_path TO test_inheritance; CREATE TABLE top (i int); CREATE TABLE mid1 () INHERITS (top); CREATE TABLE mid2 () INHERITS (top); CREATE TABLE bottom () INHERITS (mid1, mid2); CREATE TABLE basement () INHERITS (bottom); ALTER TABLE top ADD COLUMN a_table_column integer; That problem looks significantly more difficult to solve, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] merge command - GSoC progress
On Fri, Jul 30, 2010 at 12:21 PM, Greg Smith g...@2ndquadrant.com wrote: If you had those done for this patch, I wouldn't have to ask for code examples; I could just look at the source to the regression output and see how to use it against the standard database the regression samples create, and then execute against. I agree. While not every feature needs regression tests, something this complex certainly does. Also, running the regression tests (frequently!) can help you realize when you've broken the existing code before too much time goes by and it's no longer easy to figure out which change is responsible for the breakage. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] merge command - GSoC progress
On Friday 30 July 2010 18:21:49 Greg Smith wrote: -break; +break; This happened because you added two invisible tabs to the end of this line. This makes the patch larger for no good reason and tends to If you want to remove changes like this, git checkout -p HEAD is very useful if you're using git. It allows you to selectively revert hunks of not-yet-checked-in changes... Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gincostestimate
OK, here's a review, as much as I was able to do it without understanding deeply how GIN works. The patch is a context diff, applies cleanly to HEAD, compiles without warnings and passes regression tests. Using the script from http://archives.postgresql.org/pgsql-performance/2009-10/msg00393.php I was able to get an index scan with commonterm40, while with the unpatched source I was getting an index scan only for commonterm20, so it indeed improves the situation as far as cost estimation is concerned. Codewise I have one question: the patch changes a loop in ginvacuumcleanup from for (blkno = GIN_ROOT_BLKNO + 1; blkno < npages; blkno++) to for (blkno = GIN_ROOT_BLKNO; blkno < npages; blkno++) why should it now go through all blocks? I couldn't immediately see why it was not OK to do it before and why it is OK to do it now, but I don't really get how GIN works internally. I guess a comment would be good to have there in any case. The patch has lots of statements like if ( GinPageIsLeaf(page) ), that is with extra space between the outer parenthesis and the condition, which AFAIK is not the project style. I guess pgindent fixes that, so it's no big deal. There are lines with elog(NOTICE) that are inside #if 0; they probably should either become elog(DEBUGX) or get removed. As for performance, I tried running the attached script a couple of times. I used the standard config file, only changed checkpoint_segments to 30 and shared_buffers to 512MB. The timings were: HEAD INSERT 0 50 Time: 13487.450 ms VACUUM Time: 337.673 ms INSERT 0 50 Time: 13751.110 ms VACUUM Time: 315.812 ms INSERT 0 50 Time: 12691.259 ms VACUUM Time: 312.320 ms HEAD + gincostestimate INSERT 0 50 Time: 13961.894 ms VACUUM Time: 355.798 ms INSERT 0 50 Time: 14114.975 ms VACUUM Time: 341.822 ms INSERT 0 50 Time: 13679.871 ms VACUUM Time: 340.576 ms so there is no immediate slowdown for a quick test with one client. 
Since there were no stability or performance issues and it solves the problem, I am marking this as ready for committer, although it might be beneficial if someone more acquainted with GIN takes another look at it before the committer review. I will be travelling during the whole of August and will only have intermittent email access, so in case of any questions with regard to the review, the response time can be a few days. Cheers, Jan drop table if exists x; create table x(f tsvector); create index x_idx on x using gin(f); insert into x (select 'test words taken at random but a few of them indeed' from generate_series(1, 50) g); vacuum; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gincostestimate
On Fri, Jul 30, 2010 at 1:19 PM, Jan Urbański wulc...@wulczer.org wrote: The patch has lots of statements like if ( GinPageIsLeaf(page) ), that is with extra space between the outer parenthesis and the condition, which AFAIK is not the project style. I guess pgindent fixes that, so it's no big deal. It's better if these get cleaned up. pgindent will fix it eventually, but the less stuff pgindent has to touch, the less likelihood there is of breaking outstanding patches when it's run. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing NUMERIC size for 9.1
Robert Haas robertmh...@gmail.com writes: Maybe something like this, obviously with a suitable comment which I haven't written yet: numeric_digits = (precision + 6) / 4; return (numeric_digits * sizeof(int16)) + NUMERIC_HDRSZ; This is OK for the base-10K case, but there's still code in there for the base-10 and base-100 cases. Can you express this logic in terms of DEC_DIGITS and sizeof(NumericDigit) ? I think you might find it was actually clearer that way, cf Polya. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
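For what it's worth, a sketch of the rewrite Tom is asking for might look like the following, with the constants spelled out; the NUMERIC_HDRSZ value here is a placeholder assumption for illustration, not taken from the source:

```python
DEC_DIGITS = 4            # decimal digits per NumericDigit (base-10000 build)
SIZEOF_NUMERIC_DIGIT = 2  # sizeof(int16)
NUMERIC_HDRSZ = 10        # placeholder header size, for illustration only

def numeric_maximum_size(precision):
    # Worst case: a lone decimal digit sits on one side of the decimal
    # point, so each side can waste most of a NumericDigit.  With
    # DEC_DIGITS = 4 this reduces to Robert's (precision + 6) / 4.
    numeric_digits = (precision + 2 * (DEC_DIGITS - 1)) // DEC_DIGITS
    return numeric_digits * SIZEOF_NUMERIC_DIGIT + NUMERIC_HDRSZ
```

Expressing the slop as 2 * (DEC_DIGITS - 1) makes the reasoning visible: up to DEC_DIGITS - 1 decimal positions can be wasted on each side of the decimal point.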
Re: [HACKERS] reducing NUMERIC size for 9.1
[ forgot to answer this part ] Robert Haas robertmh...@gmail.com writes: Another question here is whether we should just fix this in CVS HEAD, or whether there's any reason to back-patch it. Agreed, fixing it in HEAD seems sufficient. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On Scalability
Is there anyone who knows whether those algorithms are linear or not? Read the code? It's really very accessible, and there's lots and lots of comments. While the person who wrote the code is around, isn't it better to see the real implementation? -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch for check constraints using multiple inheritance
Robert Haas wrote: It seems to me that it should only recurse to its children if the constraint was really added, rather than merged into an existing constraint, because if it was merged into an existing constraint, then the children already have it. Yes. (then the children already have it - already have it from the current parent) - now I understand how AddRelationNewConstraints is used; this effectively blocks propagating more than once to children from the same parent, and that is what our patch was written to do too. OK, it looks like level_2_parent is actually irrelevant to this issue. So here's a slightly simplified test case: CREATE TABLE top (i int); CREATE TABLE mid1 () INHERITS (top); CREATE TABLE mid2 () INHERITS (top); CREATE TABLE bottom () INHERITS (mid1, mid2); CREATE TABLE basement () INHERITS (bottom); This is, probably provably, the smallest case where this problem can occur. Removing any table would mean that this bug is not hit anymore. This still leaves open the question of what to do about the similar case involving attributes: That problem looks significantly more difficult to solve, though. I'm looking at ATPrepAddColumn right now, where there are actually some comments about getting the right attinhcount in the case of multiple inherited children, but it's the first time I'm looking at this part of PostgreSQL and it needs some time to sink in. It looks a bit like at the place of the comment /* child should see column as singly inherited */, maybe the recursion could be stopped there as well, i.e. not only setting inhcount to 1, but actually adding the child relation only one time to the wqueue. regards, Yeb Havinga -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On Scalability
2010/7/30 Josh Berkus j...@agliodbs.com: Is there anyone who knows whether those algorithms are linear or not? Read the code? It's really very accessible, and there's lots and lots of comments. While the person who wrote the code is around, isn't it better to see the real implementation? If the programmer(s) who wrote that part is around, a simple hint would suffice. Even a hint about where to look in the code would be very appreciated: the query planner is not as simple as the ls command (which is not that simple any more, though). It looks like I need to go the hard way ... Starting from postgresql-8.4.4/src/backend/optimizer -- Vincenzo Romano at NotOrAnd Information Technologies Software Hardware Networking Training Support Security -- cel +393398083886 fix +390823454163 fax +3902700506964 gtalk. vincenzo.rom...@notorand.it skype. notorand.it -- NON QVIETIS MARIBVS NAVTA PERITVS -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch for check constraints using multiple inheritance
Yeb Havinga wrote: Robert Haas wrote: This still leaves open the question of what to do about the similar case involving attributes: That problem looks significantly more difficult to solve, though. I'm looking at ATPrepAddColumn right now, where there are actually some comments about getting the right attinhcount in the case of multiple inherited children, but it's the first time I'm looking at this part of PostgreSQL and it needs some time to sink in. It looks a bit like at the place of the comment /* child should see column as singly inherited */, maybe the recursion could be stopped there as well, i.e. not only setting inhcount to 1, but actually adding the child relation only one time to the wqueue. I believe the crux is to stop double recursion from a parent in ATOneLevelRecursion. I did a quick test by adding a globally defined static List *visitedrels = NIL; together with adding in front of ATOneLevelRecursion this: if (list_member_oid(visitedrels, relid)) return; visitedrels = lappend_oid(visitedrels, relid); Running Robert's example gives the correct attinhcount for the basement. SELECT t.oid, t.relname, a.attinhcount FROM pg_class t JOIN pg_attribute a ON (a.attrelid = t.oid) JOIN pg_namespace n ON (t.relnamespace = n.oid) WHERE n.nspname = 'test_inheritance' AND a.attname = 'a_table_column' ORDER BY oid;

  oid  | relname  | attinhcount
-------+----------+-------------
 16566 | top      |           0
 16569 | mid1     |           1
 16572 | mid2     |           1
 16575 | bottom   |           2
 16578 | basement |           1

regards, Yeb Havinga PS: forgot to say thanks for looking into this problem earlier in the thread. Thanks! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
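The idea behind Yeb's quick hack can be shown in miniature (this is an illustration only, not the actual ATOneLevelRecursion code): increment a relation once per direct inheritance edge, but walk into each relation only once, even when it is reachable through several parents:

```python
# children maps each relation to its direct children, mirroring Robert's
# test schema: top -> mid1, mid2 -> bottom -> basement.
children = {'top': ['mid1', 'mid2'], 'mid1': ['bottom'], 'mid2': ['bottom'],
            'bottom': ['basement'], 'basement': []}

def recurse(rel, visited, inhcount):
    for child in children[rel]:
        # one increment per direct inheritance edge...
        inhcount[child] = inhcount.get(child, 0) + 1
        # ...but recurse into each child only once
        if child in visited:
            continue
        visited.add(child)
        recurse(child, visited, inhcount)

counts = {}
recurse('top', set(), counts)
print(counts)  # matches Yeb's attinhcount column: bottom 2, the rest 1
```

With the visited set, bottom is incremented once from mid1 and once from mid2 (count 2), but basement is reached only on the first walk into bottom (count 1), which is the table Yeb reports.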
Re: [HACKERS] On Scalability
Vincenzo Romano wrote: By using PREPARE I run the query planner sooner and I should use the plan with the later execution. You can bet that some of the PREPAREd query variables will pertain to either the child table's CHECK constraints (for table partitions) or to the partial index's WHERE condition (for index partitioning). Prepared statements are not necessarily a cure for long query planning time, because the sort of planning decisions made with partitioned child tables and index selection can need to know the parameter values to execute well; that's usually the situation rather than the exception with partitions. You run the risk that the generic prepared plan will end up looking at all the partitions, because at plan preparation time it can't figure out which can be excluded. For some types of queries, it can only figure that out once the actual values are in there. I think you aren't quite lined up with the people suggesting test it in terms of what that means. The idea is not that you should build a full-on application test case yet, which can be very expensive. The idea is that you might explore things like when I partition this way, increasing the partitions from 1 to n, does query time go up linearly? by measuring with fake data and a machine-generated schema. What's happened in some of these cases is that, despite the theory, some constant or external overhead ends up dominating behavior for lower numbers. As an example, it was recognized that the amount of statistics for a table collected with default_statistics_target had a quadratic impact on some aspects of performance. But it turned out that for the range of interesting values to most people, the measured runtime did not go up with the square as feared. The only way that was sorted out was to build a simple simulation. 
Here's a full example from that discussion that shows the sort of tests you probably want to try, and comments on the perils of guessing based on theory rather than testing: http://archives.postgresql.org/pgsql-hackers/2008-12/msg00601.php http://archives.postgresql.org/pgsql-hackers/2008-12/msg00687.php generate_series can be very helpful here, and you can even use that to generate timestamps if you need them in the data set. That said, anecdotally everyone agrees that partitions don't scale well into even the very low hundreds for most people, and doing multi-level ones won't necessarily drop query planning time--just the cost of maintaining the underlying tables and indexes. My opinion is that building a simple partitioned case and watching how the EXPLAIN plans change as you adjust things will be more instructive for you than either asking about it or reading the source. Vary the parameters, watch the plans, measure things and graph them if you want to visualize the behavior better. Same thing goes for large numbers of partial indexes, which have a similar query planning impact, but unlike partitions I haven't seen anyone analyze them via benchmarks. I'm sure you could get help here (probably the performance list is a better spot though) with getting your test case right if you wanted to try and nail that down. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
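As one way to follow Greg's advice, here is a throwaway generator for a machine-built partition schema (all table and column names are invented) whose output can be piped into psql while varying the partition count:

```python
# Emit SQL that creates a parent plus n CHECK-constrained children filled
# via generate_series, so planner behavior can be timed as n grows.
def partition_test_sql(n_partitions, rows_per_partition=1000):
    stmts = ["CREATE TABLE parent (id int, created date, payload text);"]
    for i in range(n_partitions):
        lo = i * rows_per_partition
        hi = (i + 1) * rows_per_partition
        stmts.append("CREATE TABLE child_%d (CHECK (id >= %d AND id < %d)) "
                     "INHERITS (parent);" % (i, lo, hi))
        stmts.append("INSERT INTO child_%d SELECT g, current_date, 'x' "
                     "FROM generate_series(%d, %d) g;" % (i, lo, hi - 1))
    return "\n".join(stmts)

# One child table per partition range
print(partition_test_sql(2).count("INHERITS (parent)"))  # 2
```

Feed the result to psql for n = 10, 100, 300, ..., then time EXPLAIN for a point query such as SELECT * FROM parent WHERE id = 42 at each size and graph the planning time.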
Re: [HACKERS] On Scalability
On Fri, Jul 30, 2010 at 3:50 PM, Vincenzo Romano vincenzo.rom...@notorand.it wrote: 2010/7/30 Josh Berkus j...@agliodbs.com: Is there anyone who knows whether those algorithms are linear or not? Read the code? It's really very accessible, and there's lots and lots of comments. While the person who wrote the code is around, isn't it better to see the real implementation? If the programmer(s) who wrote that part is around, a simple hint would suffice. Even an hint to where look into the code would be very appreciated: the query planner is not as simple as the ls command (which is not that simple any more, though). It looks like I need to go the hard way ... Starting from postgresql-8.4.4/src/backend/optimizer I think you're approaching this in the wrong way. You've repeatedly said you don't want to do all the work of setting up a test, but trying to search the code for algorithms that might not be linear is not going to be easier. I've been reading this thread and I'm fairly familiar with this code, and I even understand the algorithms pretty well, and I don't know whether they're going to be linear for what you want to do or not. Certainly, the overall task of join planning is exponential-time in the number of *tables*, but if you're just doing SELECT statements on a single table, will that be linear? Tough to say. Certainly, there are a lot of linked lists in there, so if we have any place where we have two nested loops over the list of indices, it won't be linear. I can't think of a place where we do that, but that doesn't mean there isn't one. And even if they are linear, or n log n or something, the coefficients could still be lousy. Theoretical computer science is one of my favorite subjects, but, it doesn't always tell you what you want to know about the real world. It doesn't seem like it should be very hard to figure this out empirically. Just create a big database full of random data. 
Maybe you could populate one of the columns with something like (random() * 1000)::int. Then you could create partial indices ON (some_other_column) WHERE that_column = blat for blat in 0..999. Then you could run some test queries and see how you make out. Or, I mean, you can read the source code. That's fine, too. It's just... I've already read the source code quite a few times, and I still don't know the answer. Admittedly, I wasn't trying to answer this specific question, but still - I don't think it's an easy question to answer that way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
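Robert's suggested setup can likewise be scripted; a hedged sketch, with made-up table and column names:

```python
# Emit SQL for a table with a bucket column and one partial index per
# bucket value, to measure planner overhead as the index count grows.
def partial_index_sql(n_indexes, n_rows=100000):
    stmts = ["CREATE TABLE t (bucket int, some_other_column int);",
             "INSERT INTO t SELECT (random() * %d)::int, g "
             "FROM generate_series(1, %d) g;" % (n_indexes - 1, n_rows)]
    for blat in range(n_indexes):
        stmts.append("CREATE INDEX t_idx_%d ON t (some_other_column) "
                     "WHERE bucket = %d;" % (blat, blat))
    return "\n".join(stmts)

# One partial index per bucket value
print(partial_index_sql(10).count("CREATE INDEX"))  # 10
```

Running EXPLAIN on queries with WHERE bucket = k predicates while varying n_indexes would show empirically whether planning time grows linearly with the number of partial indexes.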
Re: [HACKERS] patch for check constraints using multiple inheritance
On Fri, Jul 30, 2010 at 10:22, Robert Haas robertmh...@gmail.com wrote: OK, it looks like level_2_parent is actually irrelevant to this issue. So here's a slightly simplified test case: DROP SCHEMA IF EXISTS test_inheritance CASCADE; CREATE SCHEMA test_inheritance; SET search_path TO test_inheritance; CREATE TABLE top (i int); CREATE TABLE mid1 () INHERITS (top); CREATE TABLE mid2 () INHERITS (top); CREATE TABLE bottom () INHERITS (mid1, mid2); CREATE TABLE basement () INHERITS (bottom); ALTER TABLE top ADD CONSTRAINT a_check_constraint CHECK (i = 0); The other problem with the current code is it creates a dump that is not consistent with \d. The dump gets it right in the sense that basement does not have the constraint. We could debate what coninhcount should be, but clearly there is a bug here. [ FYI with your patch the dump is, of course, consistent again (no a_check_constraint) ] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing NUMERIC size for 9.1
On Fri, Jul 30, 2010 at 9:16 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Fri, Jul 30, 2010 at 1:13 AM, Tom Lane t...@sss.pgh.pa.us wrote: Perhaps, but I think you're confused on at least one point. numeric(2,1) has to be able to hold 2 decimal digits, not 2 NumericDigits (which'd actually be 8 decimal digits given the current code). I get that. The point is: if one of those 2 decimal digits is before the decimal point and the other is after it, then two NumericDigits will be used. Ah, I see. Maybe we should allow for one more NumericDigit in the calculation for such cases. I guess you could look at the scale too to detect if the case is possible, but not sure it's worth the trouble. Since this just needs to be a reasonable upper bound, probably not.

Another small but related issue is that this comment is out-of-date:

    /* Numeric stores 2 decimal digits/byte, plus header */
    return (precision + 1) / 2 + NUMERIC_HDRSZ;

Currently, numeric stores 4 decimal digits per 2 bytes, which is not quite the same thing. I think that for clarity it would make sense to break this calculation into two parts. First, estimate the number of NumericDigits that could be needed; then, estimate the amount of space that could be needed to store them. Maybe something like this, obviously with a suitable comment which I haven't written yet:

    numeric_digits = (precision + 6) / 4;
    return (numeric_digits * sizeof(int16)) + NUMERIC_HDRSZ;

The first line would be incorrect for precision 0 (it would produce 1, not 0), but precision 0 isn't allowed anyway, and overestimating just that one case would be OK even if it did. As far as I can see, it's correct for all higher values. If you have 1 decimal digit, you can't need more than one NumericDigit, but if you have 2 decimal digits, you can need one for each side of the decimal point.
But you can't manage to get a third NumericDigit until you get up to 6 decimal digits, because with only 5 you can split them 1-4 or 3-2 across the decimal point, but getting to a second NumericDigit on either side of the decimal point uses up all 5 digits, leaving nothing for the other side. Similarly, to get a fourth NumericDigit, you have to have enough decimal digits to split them 1-4-4-1 (with the decimal point somewhere in the middle), so 10 is the minimum.

Another question here is whether we should just fix this in CVS HEAD, or whether there's any reason to back-patch it. I can't see much reason to back-patch, unless there's third-party code depending on this for something more than what we use it for. The only callers of type_maximum_size are get_typavgwidth(), which doesn't need to be exact, and needs_toast_table(). So it seems that the worst thing that could happen as a result of this problem is that we might fail to create a toast table when one is needed, and even that seems extremely unlikely.

First, you'd need to have a table containing no text or unlimited-length varchar columns, because those will force toast table creation anyway. Second, numerics are typically going to be stored with a short varlena header on disk, so a 2-byte underestimate of the max size is going to be swamped by a 3-byte savings on the varlena header. Third, even if you can find a case where the above doesn't matter, it's not that hard to force a toast table to get created.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
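Robert's counting argument is easy to check by brute force. The sketch below (plain Python; the function names are mine, not from the patch) compares the proposed upper bound (precision + 6) / 4 against the worst case over every possible split of the decimal digits across the decimal point:

```python
from math import ceil

DEC_DIGITS = 4  # base-10000: four decimal digits per NumericDigit

def worst_case_numeric_digits(precision):
    """Most NumericDigits that `precision` decimal digits can need,
    maximized over every split into integer and fractional parts."""
    return max(ceil(left / DEC_DIGITS) + ceil((precision - left) / DEC_DIGITS)
               for left in range(precision + 1))

def estimated_numeric_digits(precision):
    # The proposed upper bound from the mail, using integer division.
    return (precision + 6) // 4

# The bound holds for every precision in a generous range...
assert all(estimated_numeric_digits(p) >= worst_case_numeric_digits(p)
           for p in range(1, 101))

# ...and is tight at the thresholds cited in the mail.
assert worst_case_numeric_digits(2) == 2    # one digit on each side
assert worst_case_numeric_digits(5) == 2    # 5 digits still fit in two
assert worst_case_numeric_digits(6) == 3    # a 5-1 split forces a third
assert worst_case_numeric_digits(10) == 4   # the 1-4-4-1 split
```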
[HACKERS] reducing NUMERIC size for 9.1, take two
Here's a second version of the main patch, in which I have attempted to respond to Tom's concerns/suggestions. (There is still a small, side issue with numeric_maximum_size() which I plan to fix, but this patch is the good stuff.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company numeric_2b-v2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock_timeout GUC patch - Review
Excerpts from Boszormenyi Zoltan's message of Thu Jul 29 07:55:38 -0400 2010: Hi, Marc Cousin wrote: Hi, I've been reviewing this patch for the last few days. Here it is: ... * Are there dangers? As it is a GUC, it could be set globally; is that a danger? I haven't added any new code covering supplemental (e.g. autovacuum) processes; the interactions are yet to be discovered. Setting it globally is not recommended.

FWIW there is some code in autovacuum and other auxiliary processes that forcibly sets statement_timeout to 0. I think this patch should do likewise. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing NUMERIC size for 9.1
On Fri, Jul 30, 2010 at 2:08 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Maybe something like this, obviously with a suitable comment which I haven't written yet:

    numeric_digits = (precision + 6) / 4;
    return (numeric_digits * sizeof(int16)) + NUMERIC_HDRSZ;

This is OK for the base-10K case, but there's still code in there for the base-10 and base-100 cases. Can you express this logic in terms of DEC_DIGITS and sizeof(NumericDigit)? I think you might find it was actually clearer that way, cf Polya.

It appears to work out to:

    numeric_digits = (precision + 2 * (DEC_DIGITS - 1)) / DEC_DIGITS;
    return (numeric_digits * sizeof(NumericDigit)) + NUMERIC_HDRSZ;

The smallest value for precision which requires 2 numeric_digits is always 2, and the required number of numeric_digits increases by 1 each time the number of base-10 digits increases by DEC_DIGITS. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
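The DEC_DIGITS-parameterized form can be sanity-checked the same way (a hedged sketch; the helper names are mine). For DEC_DIGITS = 4 it reduces exactly to the earlier (precision + 6) / 4, and it remains a valid upper bound for the base-10 and base-100 cases as well:

```python
from math import ceil

def numeric_digits(precision, dec_digits):
    # The generalized estimate from the mail.
    return (precision + 2 * (dec_digits - 1)) // dec_digits

def worst_case(precision, dec_digits):
    # Exact worst case over all splits across the decimal point.
    return max(ceil(l / dec_digits) + ceil((precision - l) / dec_digits)
               for l in range(precision + 1))

# Reduces to the base-10K constants...
assert all(numeric_digits(p, 4) == (p + 6) // 4 for p in range(1, 100))

# ...and is an upper bound for DEC_DIGITS = 1, 2, and 4 alike.
for d in (1, 2, 4):
    for p in range(1, 50):
        assert numeric_digits(p, d) >= worst_case(p, d)
```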
[HACKERS] ANALYZE versus expression indexes with nondefault opckeytype
I've been poking at the FTS problem example recently submitted by Artur Dabrowski. It contains an index declared like so:

    CREATE INDEX idx_keywords_ger ON search_tab
        USING gin (to_tsvector('german'::regconfig, keywords));

which can be queried like so:

    select * from search_tab
        where to_tsvector('german', keywords) @@ to_tsquery('german', 'whatever');

which is all fine and matches suggestions in our documentation. But I was dismayed to discover that ANALYZE wasn't storing any statistics for this expression index, which rather hobbles the planner's ability to produce useful estimates for the selectivity of such WHERE clauses. The reason for that turns out to be this bit in analyze.c:

    /*
     * Can't analyze if the opclass uses a storage type
     * different from the expression result type. We'd get
     * confused because the type shown in pg_attribute for
     * the index column doesn't match what we are getting
     * from the expression. Perhaps this can be fixed
     * someday, but for now, punt.
     */
    if (exprType(indexkey) != Irel[ind]->rd_att->attrs[i]->atttypid)
        continue;

Since a tsvector index does have a storage type different from tsvector (in fact, it's shown as TEXT), this code decides to punt and not analyze the column. I think it's past time we did something about this.

After a bit of study of the code, it appears to me that it's not too difficult to fix: we just have to use the expression's result type rather than the index column's atttypid in the subsequent processing. ANALYZE never actually looks at the index column contents (indeed cannot, since our index AMs provide no API for extracting the contents), so atttypid isn't really all that relevant to it. However, this'll imply an API/ABI break for custom typanalyze functions: we'll need to store the actual type OID in struct VacAttrStats, and use that rather than looking to attr->atttypid or any of the other type-dependent fields of the Form_pg_attribute. So that seems to make it not practical to back-patch.
I thought of an ugly hack that would avoid an API/ABI break: since VacAttrStats.attr is a copy of the pg_attribute data, we could scribble on it, changing atttypid, attlen, attbyval, etc. to match the index expression result type. This seems pretty grotty, but it would allow the fix to be back-patched into existing branches.

Comments? I'm not sure which way to jump here. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
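The missing-statistics symptom Tom describes can be observed from SQL alone. A hypothetical sketch, reusing the index from Artur's example (table and column names as in that report; untested outline):

```sql
CREATE INDEX idx_keywords_ger ON search_tab
    USING gin (to_tsvector('german'::regconfig, keywords));

ANALYZE search_tab;

-- Statistics for expression-index columns show up in pg_stats under
-- the index's name. On an affected build this query returns no rows,
-- because analyze.c punted on the expression column.
SELECT attname, n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'idx_keywords_ger';
```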
[HACKERS] rbtree code breaks GIN's adherence to maintenance_work_mem
Another misbehavior that I noted while playing with Artur's example is that, while GIN index build seems to adhere pretty well to whatever maintenance_work_mem limit you give it in 8.4, it blows right by that limit in 9.0 and HEAD --- I observed it happily eating something north of 128MB with a maintenance_work_mem of 70MB. It looks to me like the reason is the new rbtree.c code, which adds a whole lot of data structure that's not being counted against the maintenance_work_mem limit.

Now the first question that might be asked is what we'd need to do to rbtree.c's API to allow its memory consumption to be tracked. However, I think there's a considerably more pressing question, which is what is it that rbtree.c is doing that justifies a 2X bloat factor in GIN index build --- which is pretty much what it's costing us, given the observation above. We know that the amount of memory available for the build has an enormous effect on build time. If we just do the obvious thing of including the rbtree data structure in the maintenance_work_mem calculation, what we're going to get is a very substantial slowdown in build speed for the same maintenance_work_mem setting, compared to the way 8.4 worked.

So, I would like somebody to show cause why that whole module shouldn't be ripped out and the code reverted to where it was in 8.4. My recollection is that the argument for adding it was to speed things up in corner cases, but what I think it's actually going to do is slow things down in every case. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers