Re: [HACKERS] ERROR: argument to pg_get_expr() must come from system catalogs

2010-07-30 Thread Dave Page
On Fri, Jul 30, 2010 at 12:17 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Dave Page dp...@pgadmin.org writes:
 We had a report of the above error from a pgAdmin user testing
 1.12.0b3 with PG 9.0b3. The (highly simplified) query below works fine
 as a superuser:

 SELECT pg_get_expr(proargdefaults, 'pg_catalog.pg_class'::regclass)
   FROM pg_proc pr
   LEFT OUTER JOIN pg_description des ON des.objoid=pr.oid

 Run as a regular user though, we get the error.

 I've applied a (rather hurried) patch for this for 9.0beta4.

Thanks. Bruce seemed to think it affected 8.4.4 as well - would that
be the case, or is it something else?


-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch (for 9.1) string functions ( correct patch attached )

2010-07-30 Thread Pavel Stehule
Hello

2010/7/29 Erik Rijkers e...@xs4all.nl:
 On Thu, July 29, 2010 22:43, Erik Rijkers wrote:
 Hi Pavel,

 In xfunc.sgml, I came across a function example (for use of VARIADIC in 
 polymorphic functions),
 where the function name is concat():  (in the manual: 35.4.10. Polymorphic 
 SQL Functions).
 Although that is not strictly wrong, it seems better to change that name 
 when concat goes into
 core, as seems to be the plan.

 If you agree, it seems best to include this change in your patch and change 
 that example
 function's name when the stringfunc patch gets applied.


 My apologies, the previous email had the wrong doc-patch attached.

 Here is the correct one.


It is a good idea, thank you

Pavel


 Erik Rijkers




Re: [HACKERS] On Scalability

2010-07-30 Thread Vincenzo Romano
2010/7/29 Josh Berkus j...@agliodbs.com:

 Or maybe checking against the source code and its documentation, if any.

 No, not really.  What you really want to know is: what's the real
 planner overhead of having dozens/hundreds of partial indexes?  What's
 the write overhead?  There's no way you can derive that from the source
 code faster than you can test it.

Again, running such a test would be rather prohibitive for my group at this stage.

I think that knowing whether certain parts have been implemented
with linear or sub-linear (or whatever else) algorithms would
give good insights about scalability.

At first glance it seems that, for inheritance, some bottleneck is
hindering full exploitation of table partitioning.

Is there anyone who knows whether those algorithms are linear or not?

And of course, I agree that real tests on real data will provide the real thing.

-- 
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
NON QVIETIS MARIBVS NAVTA PERITVS



Re: [HACKERS] On Scalability

2010-07-30 Thread Greg Stark
On Fri, Jul 30, 2010 at 11:24 AM, Vincenzo Romano
vincenzo.rom...@notorand.it wrote:
 At first glance it seems that, for inheritance, some bottleneck is
 hindering full exploitation of table partitioning.

There have been lengthy discussions of how to implement partitioning
to fix these precise problems, yes.


 Is there anyone who knows whether those algorithms are linear or not?

They're linear in both cases. But they happen at plan time rather than
query execution time. So if your application prepares all its queries
and then uses them many times it would not slow down query execution
but would slow down the query planning time. In some applications this
is much better but in others unpredictable run-times are as bad as long
run-times.

Also in the case of having many partial indexes it would slow down
inserts and updates as well, though to a lesser degree, and that would
happen at execution time.
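
To make the prepare-once/execute-many point above concrete, here is a
minimal sketch (the table, column, and statement names are illustrative,
not from the thread):

```sql
-- Planning, where the cost linear in the number of partitions or
-- partial indexes is paid, happens once at PREPARE time.
PREPARE fetch_reading (int) AS
    SELECT * FROM readings WHERE sensor_id = $1;

-- Each EXECUTE reuses the stored plan, so repeated executions do not
-- pay the planning cost again.
EXECUTE fetch_reading (42);
EXECUTE fetch_reading (43);
```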


-- 
greg



Re: [HACKERS] TwoPO: experimental join order algorithm

2010-07-30 Thread Jan Urbański

On 24/07/10 15:20, Adriano Lange wrote:

Hi,


Hi!



I'd like to release the latest version of my experimental join order
algorithm (TwoPO - Two Phase Optimization [1]):

http://git.c3sl.ufpr.br/gitweb?p=lbd/ljqo.git;a=summary

This algorithm is not production-ready, but an experimental set of
ideas which need to be refined and evaluated. As join order
optimization is a hard problem, the evaluation of a search strategy is
also a hard task. Therefore, I think the most important TODO item
related to the replacement of the GEQO algorithm is to define a set of
evaluation criteria considered relevant.


Good to hear from you -- I don't know if you are aware of a 
simulated annealing join search module that I'm working on. When I first 
started, I planned to base it on the patch that you sent to -hackers 
some time ago (and actually, if you look at the repo for my module, 
twopo.c is still in there) but ended up doing it from scratch. However, 
reading your code was a big help in the beginning.


I gave a talk at this year's PGCon 
(http://www.pgcon.org/2010/schedule/events/211.en.html) and you will 
find your name in the acknowledgments section of the presentation :)


I'll make sure to read your new code and compare the approaches. My 
results so far are not perfect, but also not very pessimistic. I think 
there is actually a chance to replace GEQO with SA in the future, or at 
least to ship more join search modules with the standard distribution 
and gather field reports.


Cheers,
Jan



Re: [HACKERS] reducing NUMERIC size for 9.1

2010-07-30 Thread Robert Haas
On Fri, Jul 30, 2010 at 1:13 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 But, looking at it a bit more carefully, isn't the maximum-size logic
 for numeric rather bogus?

 Perhaps, but I think you're confused on at least one point.
 numeric(2,1) has to be able to hold 2 decimal digits, not 2
 NumericDigits (which'd actually be 8 decimal digits given
 the current code).

I get that.  The point is: if one of those 2 decimal digits is before
the decimal point and the other is after it, then two NumericDigits
will be used.  The value '11'::numeric is only size 10 (untoasted),
but the value '1.1'::numeric is size 12 (untoasted).
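
The size difference described here is easy to check directly with
pg_column_size(); a quick sketch (exact byte counts can vary with
server version and varlena header size):

```sql
-- '11' packs into a single NumericDigit; '1.1' needs one NumericDigit
-- before the decimal point and one after, costing an extra 2 bytes.
SELECT pg_column_size('11'::numeric)  AS size_11,
       pg_column_size('1.1'::numeric) AS size_1_1;
```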

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] On Scalability

2010-07-30 Thread Vincenzo Romano
2010/7/30 Greg Stark gsst...@mit.edu:
 On Fri, Jul 30, 2010 at 11:24 AM, Vincenzo Romano
 vincenzo.rom...@notorand.it wrote:
 At first glance it seems that, for inheritance, some bottleneck is
 hindering full exploitation of table partitioning.

 There have been lengthy discussions of how to implement partitioning
 to fix these precise problems, yes.

Any reference?

 Is there anyone who knows whether those algorithms are linear or not?

 They're linear in both cases. But they happen at plan time rather than
 query execution time. So if your application prepares all its queries
 and then uses them many times it would not slow down query execution
 but would slow down the query planning time. In some applications this
 is much better but in others unpredictable run-times are as bad as long
 run-times.

Hmmm ... maybe I'm missing the inner meaning of your remarks, Greg.
By using PREPARE I run the query planner sooner and reuse the
plan at execution time.
You can bet that some of the PREPAREd query parameters will
pertain to either the child tables' CHECK constraints (for table partitioning)
or to the partial indexes' WHERE conditions (for index partitioning).

It's exactly at this point (execution time) that the linearity will
kill queries over a heavily partitioned table.

Is this what you meant?  :-)

 Also in the case of having many partial indexes it would slow down
 inserts and updates as well, though to a lesser degree, and that would
 happen at execution time.

This makes full sense to me.


-- 
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
NON QVIETIS MARIBVS NAVTA PERITVS



Re: [HACKERS] review: xml_is_well_formed

2010-07-30 Thread Mike Fowler

Hi Pavel,

Thanks for taking the time to review my patch. Attached is a new version 
addressing your concerns.


On 29/07/10 14:21, Pavel Stehule wrote:

I have a few issues:
* broken regress test (fedora 13 - xmllint: using libxml version 20707)

postgres=# SELECT xml_is_well_formed('<pg:foo
xmlns:pg="http://postgresql.org/stuff";>bar</pg:foo>');
 xml_is_well_formed
--------------------
 f
(1 row)

this xml is broken - but in the regress tests it is ok

[pa...@pavel-stehule ~]$ xmllint xxx
xxx:1: parser error : error parsing attribute name
<pg:foo xmlns:pg="http://postgresql.org/stuff";>bar</pg:foo>
                                               ^


This is a problem that was observed recently by Thom Brown - the 
commit fest webapp adds the semicolon after the quote. If you look at 
the attachment I sent in an email client you'll find there is no 
semicolon. The patch attached here will also not have the semicolon.



* xml_is_well_formed returns true for simple text

postgres=# SELECT xml_is_well_formed('');
 xml_is_well_formed
--------------------
 t
(1 row)

it is probably wrong result - is it ok??
   


Yes this is OK, pure text is valid XML content.
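
A quick illustration of that claim (a sketch based on the behavior
described in this thread, not output captured from the patch itself):

```sql
-- Bare character data is well-formed XML *content*, even though it is
-- not a well-formed XML *document*.
SELECT xml_is_well_formed('just plain text');
-- expected: t, per the behavior discussed above
```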


* I don't understand this fragment

	PG_TRY();
+	{
+		size_t		count;
+		xmlChar	   *version = NULL;
+		int			standalone = -1;
+
+		res_code = parse_xml_decl(string, &count, &version,
NULL, &standalone);
+		if (res_code != 0)
+			xml_ereport_by_code(ERROR, ERRCODE_INVALID_XML_CONTENT,
+						"invalid XML content: invalid XML declaration",
+						res_code);
+
+		doc = xmlNewDoc(version);
+		doc->encoding = xmlStrdup((const xmlChar *) "UTF-8");
+		doc->standalone = 1;

Why? This function can raise an exception when the declaration is wrong.
I think that is wrong - this function should return false instead of
raising an exception.

   


You're quite right,  I should be returning false here and not causing an 
exception. I have corrected this in the attached patch.


Regards,

--
Mike Fowler
Registered Linux user: 379787

*** a/contrib/xml2/xpath.c
--- b/contrib/xml2/xpath.c
*************** PG_MODULE_MAGIC;
*** 27,33 ****
  
  /* externally accessible functions */
  
- Datum		xml_is_well_formed(PG_FUNCTION_ARGS);
  Datum		xml_encode_special_chars(PG_FUNCTION_ARGS);
  Datum		xpath_nodeset(PG_FUNCTION_ARGS);
  Datum		xpath_string(PG_FUNCTION_ARGS);
--- 27,32 ----
*************** pgxml_parser_init(void)
*** 70,97 ****
  	xmlLoadExtDtdDefaultValue = 1;
  }
  
- 
- /* Returns true if document is well-formed */
- 
- PG_FUNCTION_INFO_V1(xml_is_well_formed);
- 
- Datum
- xml_is_well_formed(PG_FUNCTION_ARGS)
- {
- 	text	   *t = PG_GETARG_TEXT_P(0);		/* document buffer */
- 	int32		docsize = VARSIZE(t) - VARHDRSZ;
- 	xmlDocPtr	doctree;
- 
- 	pgxml_parser_init();
- 
- 	doctree = xmlParseMemory((char *) VARDATA(t), docsize);
- 	if (doctree == NULL)
- 		PG_RETURN_BOOL(false);	/* i.e. not well-formed */
- 	xmlFreeDoc(doctree);
- 	PG_RETURN_BOOL(true);
- }
- 
- 
  /* Encodes special characters (<, >, &, " and \r) as XML entities */
  
  PG_FUNCTION_INFO_V1(xml_encode_special_chars);
--- 69,74 ----
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***************
*** 8554,8562 ****
  SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
  ]]></screen>
     </para>
    </sect3>
  
    <sect3>
!    <title>XML Predicates</title>
  
     <indexterm>
      <primary>IS DOCUMENT</primary>
--- 8554,8566 ----
  SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
  ]]></screen>
     </para>
    </sect3>
+   </sect2>
+ 
+   <sect2>
+    <title>XML Predicates</title>
  
    <sect3>
!    <title>IS DOCUMENT</title>
  
     <indexterm>
      <primary>IS DOCUMENT</primary>
***************
*** 8574,8579 ****
--- 8578,8675 ----
     between documents and content fragments.
    </para>
   </sect3>
+ 
+  <sect3>
+   <title>xml_is_well_formed</title>
+ 
+   <indexterm>
+    <primary>xml_is_well_formed</primary>
+    <secondary>well formed</secondary>
+   </indexterm>
+ 
+  <synopsis>
+  <function>xml_is_well_formed</function>(<replaceable>text</replaceable>)
+  </synopsis>
+ 
+  <para>
+   The function <function>xml_is_well_formed</function> evaluates whether
+   the <replaceable>text</replaceable> is well formed XML content, returning
+   a boolean.
+  </para>
+  <para>
+  Example:
+  <screen><![CDATA[
+ SELECT xml_is_well_formed('<foo>bar</foo>');
+  xml_is_well_formed
+ --------------------
+  t
+ (1 row)
+ 
+ SELECT xml_is_well_formed('<foo>bar</foo');
+  xml_is_well_formed
+ --------------------
+  f
+ (1 row)
+ 
+ SELECT xml_is_well_formed('<foo><bar>stuff</foo>');
+  xml_is_well_formed
+ --------------------
+  f
+ (1 row)
+ ]]></screen>
+  </para>
+  <para>
+  In addition to the structure checks, the function ensures that namespaces are correctly matched.
+  <screen><![CDATA[
+ SELECT xml_is_well_formed('<pg:foo 

Re: [HACKERS] review: xml_is_well_formed

2010-07-30 Thread Pavel Stehule
Hello

2010/7/30 Mike Fowler m...@mlfowler.com:
 Hi Pavel,

 Thanks for taking the time to review my patch. Attached is a new version
 addressing your concerns.

 On 29/07/10 14:21, Pavel Stehule wrote:

 I have a few issues:
 * broken regress test (fedora 13 - xmllint: using libxml version 20707)

ok - main regress test is ok now; next I checked the contrib test for xml2

running make installcheck inside contrib/xml2 shows a problem:

  SET client_min_messages = warning;
  \set ECHO none
+ psql:pgxml.sql:10: ERROR:  could not find function
"xml_is_well_formed" in file "/usr/local/pgsql.xwf/lib/pgxml.so"
+ psql:pgxml.sql:15: ERROR:  could not find function
"xml_is_well_formed" in file "/usr/local/pgsql.xwf/lib/pgxml.so"
  RESET client_min_messages;
  select query_to_xml('select 1 as x',true,false,'');

you have to remove the declaration from pgxml.sql.in and
uninstall_pgxml.sql, and from the other related files in the
contrib/xml2 regression test
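
For clarity, the sort of entry to remove would look like this (a sketch
of what pgxml.sql.in and uninstall_pgxml.sql contain; the exact text in
the files may differ):

```sql
-- In contrib/xml2/pgxml.sql.in, delete the old declaration:
CREATE OR REPLACE FUNCTION xml_is_well_formed(text) RETURNS bool
    AS 'MODULE_PATHNAME'
    LANGUAGE C STRICT IMMUTABLE;

-- and in contrib/xml2/uninstall_pgxml.sql, the matching:
DROP FUNCTION xml_is_well_formed(text);
```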



 postgres=# SELECT xml_is_well_formed('<pg:foo
 xmlns:pg="http://postgresql.org/stuff";>bar</pg:foo>');
  xml_is_well_formed
 --------------------
  f
 (1 row)

 this xml is broken - but in the regress tests it is ok

 [pa...@pavel-stehule ~]$ xmllint xxx
 xxx:1: parser error : error parsing attribute name
 <pg:foo xmlns:pg="http://postgresql.org/stuff";>bar</pg:foo>


 This is a problem that was observed recently by Thom Brown - the commit
 fest webapp adds the semicolon after the quote. If you look at the
 attachment I sent in an email client you'll find there is no semicolon. The
 patch attached here will also not have the semicolon.


ok - the attached patch is correct ... please, can you remove the broken patch?


 * xml_is_well_formed returns true for simple text

 postgres=# SELECT xml_is_well_formed('');
  xml_is_well_formed
 --------------------
  t
 (1 row)

 it is probably wrong result - is it ok??


 Yes this is OK, pure text is valid XML content.

That is interesting to me - is there some documentation about it somewhere?

My colleagues say something else :)

http://zvon.org/comp/r/tut-XML.html#Pages~MinimalQ20XnumberQ20XofQ20Xelements
http://www.w3.org/TR/REC-xml/#sec-prolog-dtd

I am not a specialist in XML - so I just don't know


 * I don't understand this fragment

        PG_TRY();
 +       {
 +               size_t          count;
 +               xmlChar    *version = NULL;
 +               int                     standalone = -1;
 +
 +               res_code = parse_xml_decl(string, &count, &version,
 NULL, &standalone);
 +               if (res_code != 0)
 +                       xml_ereport_by_code(ERROR,
 ERRCODE_INVALID_XML_CONTENT,
 +                               "invalid XML content: invalid XML declaration",
 +                               res_code);
 +
 +               doc = xmlNewDoc(version);
 +               doc->encoding = xmlStrdup((const xmlChar *) "UTF-8");
 +               doc->standalone = 1;

 Why? This function can raise an exception when the declaration is wrong.
 I think that is wrong - this function should return false instead of
 raising an exception.



 You're quite right,  I should be returning false here and not causing an
 exception. I have corrected this in the attached patch.


ok





 Regards,

 --
 Mike Fowler
 Registered Linux user: 379787





Re: [HACKERS] knngist - 0.8

2010-07-30 Thread Alexander Korotkov
I think that queries like this:
select * from test where val <-> 500 < 1 order by val <-> 500;
can also be optimized using knngist. In case of btree_gist this query can be
easily rewritten:
select * from test where val > 499 and val < 501 order by val <-> 500;
But, in pg_trgm it makes it possible to combine different similarity levels
in one query. For example:
select * from test_trgm order by t <-> 'asdf' < 0.5 or t <-> 'qwer' < 0.4;
Is there any chance to handle this syntax also?


With best regards,
Alexander Korotkov.


Re: [HACKERS] patch (for 9.1) string functions

2010-07-30 Thread Robert Haas
On Mon, Jul 26, 2010 at 8:48 PM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Jul 26, 2010 at 3:49 PM, Merlin Moncure mmonc...@gmail.com wrote:
 concat() is not a variadic text function. It is variadic "any" that
 happens to do text coercion (not casting) inside the function. The
 assumption that concat is casting internally is probably wrong.
 Suppose I had hacked the int->text cast to call a custom function -- I
 would very much expect concat() not to produce output from that
 function, just the vanilla output text (I could always force the cast
 if I wanted to).

 concat is just a function that does something highly similar to
 casting.  suppose I had a function count_memory(variadic any) that
 summed memory usage of input args -- forcing casts would make no sense
 in that context (I'm not suggesting that you think so -- just bringing
 up a case that illustrates how forcing cast into the function can
 change behavior in subtle ways).

 Right, but I already said I wasn't objecting to the use of variadic
 ANY in cases like that - only in cases where, as here, you were
 basically taking any old arguments and forcing them all to text.

I believe that another unpleasant side effect of this is that CONCAT()
will have to be declared stable rather than immutable.
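
A sketch of why immutability fails, assuming the proposed concat()
coerces each argument via its type's output function: the text form of
some types depends on session settings, so identical arguments can
produce different results between calls:

```sql
SET datestyle = 'ISO, MDY';
SELECT concat('day: ', date '2010-07-30');   -- day: 2010-07-30

SET datestyle = 'SQL, MDY';
SELECT concat('day: ', date '2010-07-30');   -- day: 07/30/2010
```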

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] ERROR: argument to pg_get_expr() must come from system catalogs

2010-07-30 Thread Tom Lane
Dave Page dp...@pgadmin.org writes:
 On Fri, Jul 30, 2010 at 12:17 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I've applied a (rather hurried) patch for this for 9.0beta4.

 Thanks. Bruce seemed to think it affected 8.4.4 as well - would that
 be the case, or is it something else?

He's mistaken.  The bug is in all the branches, but there have been no
releases with it except 9.0beta3.  I will work on back-patching the
older branches this morning.

regards, tom lane



Re: [HACKERS] reducing NUMERIC size for 9.1

2010-07-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Jul 30, 2010 at 1:13 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Perhaps, but I think you're confused on at least one point.
 numeric(2,1) has to be able to hold 2 decimal digits, not 2
 NumericDigits (which'd actually be 8 decimal digits given
 the current code).

 I get that.  The point is: if one of those 2 decimal digits is before
 the decimal point and the other is after it, then two NumericDigits
 will be used.

Ah, I see.  Maybe we should allow for one more NumericDigit in the
calculation for such cases.  I guess you could look at the scale too
to detect whether the case is possible, but I'm not sure it's worth the trouble.

regards, tom lane



Re: [HACKERS] patch for check constraints using multiple inheritance

2010-07-30 Thread Robert Haas
On Thu, Jul 29, 2010 at 6:57 AM, Henk Enting h.d.ent...@mgrid.net wrote:
 We ran into a problem on 9.0beta3 with check constraints using table
 inheritance in a multi-level hierarchy with multiple inheritance.

 A test script is provided below and a proposed patch is attached to this
 email.

Thanks for the report.  This bug also appears to exist in 8.4; I'm not
sure yet how far back it goes.  I'm not so sure your proposed patch is
the right fix, though; it seems like it ought to be the job of
AddRelationNewConstraints() and MergeWithExistingConstraint() to make
sure that the right thing happens, here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] patch for check constraints using multiple inheritance

2010-07-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Jul 29, 2010 at 6:57 AM, Henk Enting h.d.ent...@mgrid.net wrote:
 We ran into a problem on 9.0beta3 with check constraints using table
 inheritance in a multi-level hierarchy with multiple inheritance.

 Thanks for the report.  This bug also appears to exist in 8.4; I'm not
 sure yet how far back it goes.  I'm not so sure your proposed patch is
 the right fix, though; it seems like it ought to be the job of
 AddRelationNewConstraints() and MergeWithExistingConstraint() to make
 sure that the right thing happens, here.

The original design idea was that coninhcount/conislocal would act
exactly like attinhcount/attislocal do for multiply-inherited columns.
Where did we fail to copy that logic?

regards, tom lane



Re: [HACKERS] patch for check constraints using multiple inheritance

2010-07-30 Thread Robert Haas
On Fri, Jul 30, 2010 at 10:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Thu, Jul 29, 2010 at 6:57 AM, Henk Enting h.d.ent...@mgrid.net wrote:
 We ran into a problem on 9.0beta3 with check constraints using table
 inheritance in a multi-level hierarchy with multiple inheritance.

 Thanks for the report.  This bug also appears to exist in 8.4; I'm not
 sure yet how far back it goes.  I'm not so sure your proposed patch is
 the right fix, though; it seems like it ought to be the job of
 AddRelationNewConstraints() and MergeWithExistingConstraint() to make
 sure that the right thing happens, here.

 The original design idea was that coninhcount/conislocal would act
 exactly like attinhcount/attislocal do for multiply-inherited columns.
 Where did we fail to copy that logic?

We didn't.  That logic is broken, too.  Using the OP's test setup:

rhaas=# alter table level_0_parent add column a_new_column integer;
NOTICE:  merging definition of column "a_new_column" for child "level_1_child"
NOTICE:  merging definition of column "a_new_column" for child "level_2_child"
NOTICE:  merging definition of column "a_new_column" for child "level_2_child"
ALTER TABLE
rhaas=# SELECT t.oid, t.relname, a.attinhcount
FROM pg_class t
JOIN pg_attribute a ON (a.attrelid = t.oid)
JOIN pg_namespace n ON (t.relnamespace = n.oid)
WHERE n.nspname = 'test_inheritance' AND a.attname = 'a_new_column'
ORDER BY t.oid;
  oid  |    relname     | attinhcount
-------+----------------+-------------
 16420 | level_0_parent |           0
 16423 | level_0_child  |           1
 16429 | level_1_parent |           1
 16432 | level_1_child  |           2
 16438 | level_2_parent |           1
 16441 | level_2_child  |           3
(6 rows)

The attached patch (please review) appears to fix the coninhcount
case.  I haven't tracked down the attinhcount case yet.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


multiple_inheritance-v1.patch
Description: Binary data



Re: [HACKERS] patch for check constraints using multiple inheritance

2010-07-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Jul 30, 2010 at 10:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The original design idea was that coninhcount/conislocal would act
 exactly like attinhcount/attislocal do for multiply-inherited columns.
 Where did we fail to copy that logic?

 We didn't.  That logic is broken, too.

Uh, full stop there.  If you think that the multiply-inherited column
logic is wrong, it's you that is mistaken --- or at least you're going
to have to do a lot more than just assert that you don't like it.
We spent a *lot* of time hashing that behavior out, back around 7.3.

regards, tom lane



Re: [HACKERS] patch for check constraints using multiple inheritance

2010-07-30 Thread Robert Haas
On Fri, Jul 30, 2010 at 10:23 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Jul 30, 2010 at 10:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The original design idea was that coninhcount/conislocal would act
 exactly like attinhcount/attislocal do for multiply-inherited columns.
 Where did we fail to copy that logic?

 We didn't.  That logic is broken, too.

 Uh, full stop there.  If you think that the multiply-inherited column
 logic is wrong, it's you that is mistaken --- or at least you're going
 to have to do a lot more than just assert that you don't like it.
 We spent a *lot* of time hashing that behavior out, back around 7.3.

Well, I'm glad you spent a lot of time hashing it out, but you've got
a bug.  :-)

Since the output in the previous email apparently wasn't sufficient
for you to understand what the problem is, here it is in more detail.
Initial setup:

DROP SCHEMA IF EXISTS test_inheritance CASCADE;
CREATE SCHEMA test_inheritance;
SET search_path TO test_inheritance;
CREATE TABLE level_0_parent (i int);
CREATE TABLE level_0_child (a text) INHERITS (level_0_parent);
CREATE TABLE level_1_parent() INHERITS (level_0_parent);
CREATE TABLE level_1_child() INHERITS (level_0_child, level_1_parent);
CREATE TABLE level_2_parent() INHERITS (level_1_parent);
CREATE TABLE level_2_child() INHERITS (level_1_child, level_2_parent);

Then:

rhaas=# \d level_2_child
Table "test_inheritance.level_2_child"
 Column |  Type   | Modifiers
--------+---------+-----------
 i      | integer |
 a      | text    |
Inherits: level_1_child,
          level_2_parent

rhaas=# ALTER TABLE level_0_parent ADD COLUMN a_new_column integer;
NOTICE:  merging definition of column "a_new_column" for child "level_1_child"
NOTICE:  merging definition of column "a_new_column" for child "level_2_child"
NOTICE:  merging definition of column "a_new_column" for child "level_2_child"
ALTER TABLE
rhaas=# ALTER TABLE level_0_parent DROP COLUMN a_new_column;
ALTER TABLE
rhaas=# \d level_2_child
Table "test_inheritance.level_2_child"
    Column    |  Type   | Modifiers
--------------+---------+-----------
 i            | integer |
 a            | text    |
 a_new_column | integer |
Inherits: level_1_child,
          level_2_parent

Adding a column to the toplevel parent of the inheritance hierarchy
and then dropping it again shouldn't leave a leftover copy of the
column in the grandchild.

rhaas=# ALTER TABLE level_2_child DROP COLUMN a_new_column;
ERROR:  cannot drop inherited column "a_new_column"

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] lock_timeout GUC patch - Review

2010-07-30 Thread Marc Cousin
On Thursday 29 July 2010 13:55:38, Boszormenyi Zoltan wrote:
 I fixed this by adding CheckLockTimeout() function that works like
 CheckStatementTimeout() and ensuring that the same start time is
 used for both deadlock_timeout and lock_timeout if both are active.
 The preference of errors if their timeout values are equal is:
 statement_timeout > lock_timeout > deadlock_timeout

As soon as lock_timeout is bigger than deadlock_timeout, it doesn't
work with this new version.

Keeping the deadlock_timeout at 1s, when lock_timeout >= 1001,
lock_timeout doesn't trigger anymore.


 
  * Consider the changes to the code in the context of the project as a whole:
* Is everything done in a way that fits together coherently with
  other features/modules?
  I have a feeling that
  enable_sig_alarm/enable_sig_alarm_for_lock_timeout tries to solve a
  very specific problem, and it gets complicated because there is no
  infrastructure in the code to handle several timeouts at the same time
  with sigalarm, so each timeout has its own dedicated and intertwined
  code. But I'm still discovering this part of the code.

 

This WIP patch is attached for reference, too. I would prefer
this way, but I don't have more time to work on it and there are some
interdependencies in the signal handler where e.g. disable_sig_alarm(true)
means disabling ALL timers, not just the statement_timeout.
The specifically coded lock_timeout patch goes with the flow, doesn't
change the semantics, and works. If someone wants to pick up the timer
framework patch and can make it work, fine. But I am not explicitly
submitting it for the commitfest. The original patch with the fixes works
and needs only a little more review.

Ok, understood. But I like the principle of this framework much more (the rest 
of the code seems simpler to me as a consequence of this framework).

But it goes far beyond the initial intent of the patch.



Re: [HACKERS] patch for check constraints using multiple inheritance

2010-07-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Jul 30, 2010 at 10:23 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Uh, full stop there.  If you think that the multiply-inherited column
 logic is wrong, it's you that is mistaken --- or at least you're going
 to have to do a lot more than just assert that you don't like it.
 We spent a *lot* of time hashing that behavior out, back around 7.3.

 Since the output in the previous email apparently wasn't sufficient
 for you to understand what the problem is, here it is in more detail.
 ...
 Adding a column to the toplevel parent of the inheritance hierarchy
 and then dropping it again shouldn't leave a leftover copy of the
 column in the grandchild.

Actually, it probably should.  The inheritance rules were intentionally
designed to avoid dropping inherited columns that could conceivably
still contain valuable data.  There isn't enough information in the
inhcount/islocal data structure to recognize that a multiply-inherited
column is ultimately derived from only one distant ancestor, but it was
agreed that an exact tracking scheme would be more complication than it
was worth.  Instead, the mechanism is designed to ensure that no column
will be dropped if it conceivably is still wanted --- not that columns
might not be left behind and require another drop step.

*Please* go re-read the old discussions before you propose tampering
with this behavior.  In particular I really really do not believe that
any one-line fix is going to make things better --- almost certainly
it will break other cases.  Being materially more intelligent would
require a more complex data structure.

regards, tom lane



Re: [HACKERS] patch for check constraints using multiple inheritance

2010-07-30 Thread Yeb Havinga

Tom Lane wrote:

Robert Haas robertmh...@gmail.com writes:

Since the output in the previous email apparently wasn't sufficient
for you to understand what the problem is, here it is in more detail.
...
Adding a column to the toplevel parent of the inheritance hierarchy
and then dropping it again shouldn't leave a leftover copy of the
column in the grandchild.


Actually, it probably should.  The inheritance rules were intentionally
designed to avoid dropping inherited columns that could conceivably
still contain valuable data.  There isn't enough information in the
inhcount/islocal data structure to recognize that a multiply-inherited
column is ultimately derived from only one distant ancestor, but it was
agreed that an exact tracking scheme would be more complication than it
was worth.  Instead, the mechanism is designed to ensure that no column
will be dropped if it conceivably is still wanted --- not that columns
might not be left behind and require another drop step.
This is not about dropping columns or not, but about proper maintenance 
of the housekeeping of coninhcount, defined as "the number of direct 
inheritance ancestors this constraint has. A constraint with a nonzero 
number of ancestors cannot be dropped nor renamed."


Regard the following lattice (direction from top to bottom):

1
|\
2 3
\|\
 4 5
  \|
   6

When adding a constraint to 1, the proper coninhcount for that 
constraint on relation 6 is 2. But the code currently counts to 3, since 
6 is reached by the paths 1-2-4-6, 1-3-4-6, 1-3-5-6.


This wrong number is a bug.
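The difference between counting inheritance paths and counting direct parents can be sketched outside SQL (a hypothetical illustration of the arithmetic, not PostgreSQL code):

```python
# Sketch: why coninhcount should be 2, not 3, for relation 6.
# The lattice from the example, as child -> list of direct parents.
parents = {
    2: [1],
    3: [1],
    4: [2, 3],
    5: [3],
    6: [4, 5],
}

def count_paths(node, ancestor):
    """Number of distinct inheritance paths from ancestor down to node."""
    if node == ancestor:
        return 1
    return sum(count_paths(p, ancestor) for p in parents.get(node, []))

# Propagating the constraint once per path gives 3 ...
assert count_paths(6, 1) == 3
# ... but the documented meaning of coninhcount is the number of
# *direct* parents that carry the constraint, which is 2 here.
assert len(parents[6]) == 2
```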

*Please* go re-read the old discussions before you propose tampering
with this behavior.  In particular I really really do not believe that
any one-line fix is going to make things better --- almost certainly
it will break other cases.
Our (more than one line) patch was explicitly designed to walk from a 
specific parent to a child exactly once. It passes regression tests.


regards,
Yeb Havinga




Re: [HACKERS] patch for check constraints using multiple inheritance

2010-07-30 Thread Robert Haas
On Fri, Jul 30, 2010 at 10:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Jul 30, 2010 at 10:23 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Uh, full stop there.  If you think that the multiply-inherited column
 logic is wrong, it's you that is mistaken --- or at least you're going
 to have to do a lot more than just assert that you don't like it.
 We spent a *lot* of time hashing that behavior out, back around 7.3.

 Since the output in the previous email apparently wasn't sufficient
 for you to understand what the problem is, here it is in more detail.
 ...
 Adding a column to the toplevel parent of the inheritance hierarchy
 and then dropping it again shouldn't leave a leftover copy of the
 column in the grandchild.

 Actually, it probably should.  The inheritance rules were intentionally
 designed to avoid dropping inherited columns that could conceivably
 still contain valuable data.  There isn't enough information in the
 inhcount/islocal data structure to recognize that a multiply-inherited
 column is ultimately derived from only one distant ancestor, but it was
 agreed that an exact tracking scheme would be more complication than it
 was worth.  Instead, the mechanism is designed to ensure that no column
 will be dropped if it conceivably is still wanted --- not that columns
 might not be left behind and require another drop step.

While I certainly agree that accidentally dropping a column that
should be retained is a lot worse than accidentally retaining a column
that should be dropped, that doesn't make the current behavior
correct.  I don't think that's really an arguable point.  Another drop
step doesn't help: the leftover column is undroppable short of nuking
the whole table.

So let's focus on what the actual problem is here, what is required to
fix it, and whether or not and how far the fix can be back-patched.
There are two separate bugs here, so we should consider them
individually.  In each case, the problem is that with a certain
(admittedly rather strange) inheritance hierarchy, adding a column to
the top-level parent results in the inheritance count in the
bottom-most child ending up as 3 rather than 2.  2 is the correct
value because the bottom-most child has 2 immediate parents.  The
consequence of ending up with a count of 3 rather than 2 is that if
the constraint/column added at the top-level is subsequent dropped,
necessarily also from the top-level, the bottom-level child ends up
with the constraint/column still in existence and with an inheritance
count of 1.  This isn't the correct behavior, and furthermore the
child object can't be dropped, because it still believes that it's
inherited (even though its parents no longer have a copy to inherit).

In the case of coninhcount, I believe that the fix actually is quite
simple, although of course it's possible that I'm missing something.
Currently, ATAddConstraint() first calls AddRelationNewConstraints()
to add the constraint, or possibly merge it into an existing,
inherited constraint; it then adds the constraint to the phase-3 work
queue so that it will be checked; and finally it recurses to its own
children.  It seems to me that it should only recurse to its children
if the constraint was really added, rather than merged into an
existing constraint, because if it was merged into an existing
constraint, then the children already have it.  I'm still trying to
figure out why this bug doesn't show up in simpler cases, but I think
that's the root of the issue.

attinhcount exhibits analogous misbehavior, but if you're saying the
fix for that case is going to be a lot more complicated, I think I
agree with you.  In that case, it seems that we traverse the
inheritance hierarchy during the prep phase, at which point we don't
yet know whether we're going to end up merging anything.  It might be
that a fix for this part of the problem ends up being too complex to
back-patch.
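The merge-then-stop behavior described above can be modeled in a few lines (a hypothetical simulation of the proposed recursion, not the actual backend code):

```python
# Sketch of the proposed ADD CONSTRAINT recursion: recurse to children
# only when the constraint was genuinely added, not merged into an
# already-inherited copy.  Hierarchy is parent -> list of children,
# using the diamond-plus-grandchild example from this thread.
children = {1: [2, 3], 2: [4], 3: [4, 5], 4: [6], 5: [6]}
inhcount = {}

def add_constraint(rel, inherited=False):
    if rel in inhcount:
        # Constraint already present: merge, bump the count, and stop --
        # the children of rel were already visited when it was first added.
        inhcount[rel] += 1
        return
    inhcount[rel] = 1 if inherited else 0
    for child in children.get(rel, []):
        add_constraint(child, inherited=True)

add_constraint(1)
# Each relation ends up with inhcount equal to its number of direct
# parents, matching the documented invariant (2 for relations 4 and 6).
assert inhcount == {1: 0, 2: 1, 3: 1, 4: 2, 5: 1, 6: 2}
```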

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] patch for check constraints using multiple inheritance

2010-07-30 Thread Tom Lane
Yeb Havinga yebhavi...@gmail.com writes:
 Regard the following lattice (direction from top to bottom):

 1
 |\
 2 3
  \|\
   4 5
    \|
     6

 When adding a constraint to 1, the proper coninhcount for that 
 constraint on relation 6 is 2. But the code currently counts to 3, since 
 6 is reached by paths 1-2-4-6, 1-3-4-6, 1-3-5-6.

Mph.  I'm not sure that 3 is wrong.  You have to consider what happens
during a DROP CONSTRAINT, which as far as I saw this patch didn't
address.

regards, tom lane



Re: [HACKERS] patch for check constraints using multiple inheritance

2010-07-30 Thread Robert Haas
On Fri, Jul 30, 2010 at 11:39 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeb Havinga yebhavi...@gmail.com writes:
 Regard the following lattice (direction from top to bottom):

 1
 |\
 2 3
  \|\
   4 5
    \|
     6

 When adding a constraint to 1, the proper coninhcount for that
 constraint on relation 6 is 2. But the code currently counts to 3, since
 6 is reached by paths 1-2-4-6, 1-3-4-6, 1-3-5-6.

 Mph.  I'm not sure that 3 is wrong.  You have to consider what happens
 during a DROP CONSTRAINT, which as far as I saw this patch didn't
 address.

The behavior of DROP CONSTRAINT matches the fine documentation.  The
behavior of ADD CONSTRAINT does not.  Perhaps there's a definition of
coninhcount that would make 3 the right answer, but (a) I'm not sure
what that definition would be and (b) it's certainly not the one in
the manual.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] merge command - GSoC progress

2010-07-30 Thread Greg Smith

Boxuan Zhai wrote:

I create a clean patch file (no debug clauses). See the attachment.


Some general coding feedback for you on this.

It's not obvious to people who might want to try this out what exactly 
they are supposed to do.  Particularly for complicated patches like 
this, where only a subset of the final feature might actually be 
implemented, some sort of reviewer guide suggesting what should and 
shouldn't work would be extremely helpful.  I recall there was some sort 
of patch design guide in an earlier version of this patch; it doesn't 
seem to be there anymore.  Don't remember if that had any examples in it.


Ultimately, the examples of working behavior for this patch will need to 
be put into code.  The way that happens is by adding working examples 
into the regression tests for the database.  If you had those done for 
this patch, I wouldn't have to ask for code examples; I could just look 
at the source to the regression output and see how to use it against the 
standard database the regression samples create, and then execute 
against.  This at least lets you avoid having to generate some test 
data, because there will already be some in the regression database for 
you to use.  There is an intro to this topic at 
http://wiki.postgresql.org/wiki/Regression_test_authoring  Another 
common way to generate test data is to run pgbench which creates 4 
tables and populates them with data.


As far as the patch itself goes, you have some work left on cleaning 
that up still you'll need to do eventually.  What I would suggest is 
actually reading the patch itself; don't just generate it and send it, 
read through the whole thing like someone new to it would do.  One way 
you can improve what you've done already is to find places where you 
have introduced changes to the code structure just through editing.  
Here's an example of what I'm talking about, from line 499 of your patch:


-break;
+break;   

This happened because you added two invisible tabs to the end of this 
line.  This makes the patch larger for no good reason and tends to 
infuriate people who work on the code.  There's quite a bit of extra 
lines added in here too that should go.  You should consider reducing 
the ultimate size of the patch in terms of lines a worthwhile use of 
your time, even if it doesn't change how things work.  There's lots of 
examples in this one where you put two or three lines between two 
statements when a single one would match the look of the code in that 
file.  A round of reading the diff looking for that sort of problem 
would be useful.


Another thing you should do is limit how long each line is when 
practical.  You have lots of seriously wide comment lines here right now 
in particular.  While there are some longer lines in the PostgreSQL code 
right now, that's code, not comments.  And when you have a long line and 
a long comment, don't tack the comment onto the end.  Put it on the line 
above instead.  Also, don't use // in the middle of comments the way 
you've done in a few places here.


Getting some examples sorted out and starting on regression tests is 
more important than the coding style parts I think, just wanted to pass 
those along while I noticed them reading the patch, so you could start 
looking out for them more as you continue to work on it.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [HACKERS] patch for check constraints using multiple inheritance

2010-07-30 Thread Robert Haas
On Fri, Jul 30, 2010 at 11:35 AM, Robert Haas robertmh...@gmail.com wrote:
 In the case of coninhcount, I believe that the fix actually is quite
 simple, although of course it's possible that I'm missing something.
 Currently, ATAddConstraint() first calls AddRelationNewConstraints()
 to add the constraint, or possibly merge it into an existing,
 inherited constraint; it then adds the constraint to the phase-3 work
 queue so that it will be checked; and finally it recurses to its own
 children.  It seems to me that it should only recurse to its children
 if the constraint was really added, rather than merged into an
 existing constraint, because if it was merged into an existing
 constraint, then the children already have it.  I'm still trying to
 figure out why this bug doesn't show up in simpler cases, but I think
 that's the root of the issue.

OK, it looks like level_2_parent is actually irrelevant to this issue.
So here's a slightly simplified test case:

DROP SCHEMA IF EXISTS test_inheritance CASCADE;
CREATE SCHEMA test_inheritance;
SET search_path TO test_inheritance;

CREATE TABLE top (i int);
CREATE TABLE mid1 () INHERITS (top);
CREATE TABLE mid2 () INHERITS (top);
CREATE TABLE bottom () INHERITS (mid1, mid2);
CREATE TABLE basement () INHERITS (bottom);

ALTER TABLE top ADD CONSTRAINT a_check_constraint CHECK (i = 0);

You can also trigger it this way:

DROP SCHEMA IF EXISTS test_inheritance CASCADE;
CREATE SCHEMA test_inheritance;
SET search_path TO test_inheritance;

CREATE TABLE top1 (i int);
CREATE TABLE top2 (i int);
CREATE TABLE bottom () INHERITS (top1, top2);
CREATE TABLE basement () INHERITS (bottom);

ALTER TABLE top1 ADD CONSTRAINT a_check_constraint CHECK (i = 0);
ALTER TABLE top2 ADD CONSTRAINT a_check_constraint CHECK (i = 0);

In other words, the problem occurs when table A inherits a constraint
from multiple parents, and table B inherits from table A.  Most of the
code (including ALTER TABLE .. DROP CONSTRAINT and ALTER TABLE .. [NO]
INHERIT) maintains the invariant that coninhcount is the number of
direct parents from which the constraint is inherited, but the ADD
CONSTRAINT fails to do so in this particular case.  So at this point,
I'm fairly confident that this is the correct fix.  I think the reason
we haven't noticed this before is that it's most likely to occur when
you have a diamond inheritance pattern with another child hanging off
the bottom, which is not something most people probably do.

It appears that the coninhcount mechanism was introduced in 8.4; prior
to that, we didn't really make an effort to get this right.
Therefore, I believe the patch I posted upthread should be applied to
HEAD, REL9_0_STABLE, and REL8_4_STABLE.

This still leaves open the question of what to do about the similar
case involving attributes:

DROP SCHEMA IF EXISTS test_inheritance CASCADE;
CREATE SCHEMA test_inheritance;
SET search_path TO test_inheritance;

CREATE TABLE top (i int);
CREATE TABLE mid1 () INHERITS (top);
CREATE TABLE mid2 () INHERITS (top);
CREATE TABLE bottom () INHERITS (mid1, mid2);
CREATE TABLE basement () INHERITS (bottom);

ALTER TABLE top ADD COLUMN a_table_column integer;

That problem looks significantly more difficult to solve, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] merge command - GSoC progress

2010-07-30 Thread Robert Haas
On Fri, Jul 30, 2010 at 12:21 PM, Greg Smith g...@2ndquadrant.com wrote:
 If you had those done for this patch, I
 wouldn't have to ask for code examples; I could just look at the source to
 the regression output and see how to use it against the standard database
 the regression samples create, and then execute against.

I agree.  While not every feature needs regression tests, something
this complex certainly does.  Also, running the regression tests
(frequently!) can help you realize when you've broken the existing
code before too much time goes by and it's no longer easy to figure
out which change is responsible for the breakage.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] merge command - GSoC progress

2010-07-30 Thread Andres Freund
On Friday 30 July 2010 18:21:49 Greg Smith wrote:
 -break;
 +break;   
 
 This happened because you added two invisible tabs to the end of this 
 line.  This makes the patch larger for no good reason and tends to 
If you want to remove changes like this:
git checkout -p HEAD
is very useful if you're using git. It allows you to selectively revert hunks 
of not-checked-in changes...

Andres



Re: [HACKERS] gincostestimate

2010-07-30 Thread Jan Urbański
OK, here's a review, as much as I was able to do it without 
understanding deeply how GIN works.


The patch is in context format, applies cleanly to HEAD, compiles without warnings 
and passes regression tests.


Using the script from 
http://archives.postgresql.org/pgsql-performance/2009-10/msg00393.php I 
was able to get an index scan with commonterm40, while with the 
unpatched source I was getting an index scan only for commonterm20, so 
it indeed improves the situation as far as cost estimation is concerned.


Codewise I have one question: the patch changes a loop in 
ginvacuumcleanup from


   for (blkno = GIN_ROOT_BLKNO + 1; blkno < npages; blkno++)

to

   for (blkno = GIN_ROOT_BLKNO; blkno < npages; blkno++)

why should it now go through all blocks? I couldn't immediately see why 
it was not OK to do it before and why it is OK to do it now, but I don't 
really get how GIN works internally. I guess a comment would be good to 
have there in any case.


The patch has lots of statements like if ( GinPageIsLeaf(page) ), that 
is, with extra space between the outer parentheses and the condition, 
which AFAIK is not the project style. I guess pgindent fixes that, so 
it's no big deal.


There are lines with elog(NOTICE) that are #if 0, they probably should 
either become elog(DEBUGX) or get removed.


As for performance, I tried running the attached script a couple of 
times. I used the standard config file, only changed checkpoint_segments 
to 30 and shared_buffers to 512MB. The timings were:


HEAD

INSERT 0 50
Time: 13487.450 ms
VACUUM
Time: 337.673 ms

INSERT 0 50
Time: 13751.110 ms
VACUUM
Time: 315.812 ms

INSERT 0 50
Time: 12691.259 ms
VACUUM
Time: 312.320 ms

HEAD + gincostestimate

INSERT 0 50
Time: 13961.894 ms
VACUUM
Time: 355.798 ms

INSERT 0 50
Time: 14114.975 ms
VACUUM
Time: 341.822 ms

INSERT 0 50
Time: 13679.871 ms
VACUUM
Time: 340.576 ms

so there is no immediate slowdown for a quick test with one client.

Since there were no stability or performance issues and it solves the 
problem, I am marking this as ready for committer, although it might be 
beneficial if someone more acquainted with GIN takes another look at it 
before the committer review.


I will be travelling during the whole of August and will only have 
intermittent email access, so in case of any questions with regards to 
the review the response time can be a few days.


Cheers,
Jan
drop table if exists x;
create table x(f tsvector);
create index x_idx on x using gin(f);
insert into x (select 'test words taken at random but a few of them indeed' 
from generate_series(1, 50) g);
vacuum;



Re: [HACKERS] gincostestimate

2010-07-30 Thread Robert Haas
On Fri, Jul 30, 2010 at 1:19 PM, Jan Urbański wulc...@wulczer.org wrote:
 The patch has lots of statements like if ( GinPageIsLeaf(page) ), that is
 with extra space between the outer parenthesis and the condition, which
 AFAIK is not the project style. I guess pgindent fixes that, so it's no big
 deal.

It's better if these get cleaned up.  pgindent will fix it eventually,
but the less stuff pgindent has to touch, the less likelihood there is
of breaking outstanding patches when it's run.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] reducing NUMERIC size for 9.1

2010-07-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
   Maybe something like this,
 obviously with a suitable comment which I haven't written yet:

 numeric_digits = (precision + 6) / 4;
 return (numeric_digits * sizeof(int16)) + NUMERIC_HDRSZ;

This is OK for the base-10K case, but there's still code in there
for the base-10 and base-100 cases.  Can you express this logic in
terms of DEC_DIGITS and sizeof(NumericDigit) ?  I think you might
find it was actually clearer that way, cf Polya.
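The rewrite Tom is suggesting can be sketched numerically (an illustration of the arithmetic only; the actual C code would use sizeof(NumericDigit) and NUMERIC_HDRSZ):

```python
# With DEC_DIGITS decimal digits packed into each NumericDigit,
# "precision" decimal digits can leave up to DEC_DIGITS - 1 unused
# positions at each end of the digit buffer, hence:
def numeric_digits(precision, dec_digits):
    return (precision + 2 * (dec_digits - 1)) // dec_digits

# For the base-10000 case (DEC_DIGITS = 4) this reduces to the
# (precision + 6) / 4 formula quoted above.
for p in range(1, 1001):
    assert numeric_digits(p, 4) == (p + 6) // 4

# The base-10 (DEC_DIGITS = 1) and base-100 (DEC_DIGITS = 2) cases
# fall out of the same expression:
assert numeric_digits(10, 1) == 10
assert numeric_digits(10, 2) == 6
```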

regards, tom lane



Re: [HACKERS] reducing NUMERIC size for 9.1

2010-07-30 Thread Tom Lane
[ forgot to answer this part ]

Robert Haas robertmh...@gmail.com writes:
 Another question here is whether we should just fix this in CVS HEAD,
 or whether there's any reason to back-patch it.

Agreed, fixing it in HEAD seems sufficient.

regards, tom lane



Re: [HACKERS] On Scalability

2010-07-30 Thread Josh Berkus

 Is there anyone who knows whether those algorithms are linear or not?

Read the code?  It's really very accessible, and there's lots and lots
of comments.  While the person who wrote the code is around, isn't it
better to see the real implementation?

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] patch for check constraints using multiple inheritance

2010-07-30 Thread Yeb Havinga

Robert Haas wrote:

It seems to me that it should only recurse to its children
if the constraint was really added, rather than merged into an
existing constraint, because if it was merged into an existing
constraint, then the children already have it.
Yes ("then the children already have it" - already have it from the 
current parent). Now I understand how AddRelationNewConstraints is 
used: it effectively blocks propagating more than once to children from 
the same parent, and that is what our patch was written to do too.

OK, it looks like level_2_parent is actually irrelevant to this issue.
 So here's a slightly simplified test case:

CREATE TABLE top (i int);
CREATE TABLE mid1 () INHERITS (top);
CREATE TABLE mid2 () INHERITS (top);
CREATE TABLE bottom () INHERITS (mid1, mid2);
CREATE TABLE basement () INHERITS (bottom);
  
This is, probably provably, the smallest case where this problem can 
occur. Removing any table would mean that this bug is not hit anymore.

This still leaves open the question of what to do about the similar
case involving attributes:

That problem looks significantly more difficult to solve, though.
  
I'm looking at ATPrepAddColumn right now, where there is actually some 
comments about getting the right attinhcount in the case of multiple 
inherited children, but it's the first time I'm looking at this part of 
PostgreSQL and it needs some time to sink in. It looks a bit like at the 
place of the comment /* child should see column as singly inherited */, 
maybe the recursion could be stopped there as well, i.e. not only 
setting inhcount to 1, but actually adding the child relation one time 
to the wqueue.


regards,
Yeb Havinga




Re: [HACKERS] On Scalability

2010-07-30 Thread Vincenzo Romano
2010/7/30 Josh Berkus j...@agliodbs.com:

 Is there anyone who knows whether those algorithms are linear or not?

 Read the code?  It's really very accessible, and there's lots and lots
 of comments.  While the person who wrote the code is around, isn't it
 better to see the real implementation?

If the programmer(s) who wrote that part are around, a simple hint would suffice.
Even a hint about where to look in the code would be much appreciated: the query
planner is not as simple as the ls command (which is not that simple any
more, though).

It looks like I need to go the hard way ...
Starting from postgresql-8.4.4/src/backend/optimizer

-- 
Vincenzo Romano at NotOrAnd Information Technologies
Software Hardware Networking Training Support Security
--
cel +393398083886 fix +390823454163 fax +3902700506964
gtalk. vincenzo.rom...@notorand.it skype. notorand.it
--
NON QVIETIS MARIBVS NAVTA PERITVS



Re: [HACKERS] patch for check constraints using multiple inheritance

2010-07-30 Thread Yeb Havinga

Yeb Havinga wrote:

Robert Haas wrote:

This still leaves open the question of what to do about the similar
case involving attributes:

That problem looks significantly more difficult to solve, though.
  
I'm looking at ATPrepAddColumn right now, where there is actually some 
comments about getting the right attinhcount in the case of multiple 
inherited children, but it's the first time I'm looking at this part 
of PostgreSQL and it needs some time to sink in. It looks a bit like 
at the place of the comment /* child should see column as singly 
inherited */, maybe the recursion could be stopped there as well, i.e. 
not only setting inhcount to 1, but actually adding the child relation 
one time to the wqueue.
I believe the crux is to stop double recursion from a parent in 
ATOneLevelRecursion. I did a quick test by adding a globally defined


static List *visitedrels = NIL;

together by adding in front of ATOneLevelRecursion this:

    if (list_member_oid(visitedrels, relid))
        return;
    visitedrels = lappend_oid(visitedrels, relid);

Running Robert's example gives the correct attinhcount for the basement.

SELECT t.oid, t.relname, a.attinhcount
FROM pg_class t
JOIN pg_attribute a ON (a.attrelid = t.oid)
JOIN pg_namespace n ON (t.relnamespace = n.oid)
WHERE n.nspname = 'test_inheritance' AND a.attname = 'a_table_column'
ORDER BY oid;

  oid  | relname  | attinhcount
-------+----------+-------------
 16566 | top      |           0
 16569 | mid1     |           1
 16572 | mid2     |           1
 16575 | bottom   |           2
 16578 | basement |           1

regards,
Yeb Havinga

PS: forgot to say thanks for looking into this problem earlier in the 
thread. Thanks!





Re: [HACKERS] On Scalability

2010-07-30 Thread Greg Smith

Vincenzo Romano wrote:

By using PREPARE I run the query planned sooner and I should use
the plan with the later execution.
You can bet that some of the PREPAREd query variables will
pertain to either the child table's CHECK contraints (for table partitions)
or to the partial index's WHERE condition (for index partitioning).
  


Prepared statements are not necessarily a cure for long query planning 
time, because the sort of planning decisions made with partitioned child 
tables and index selection can need to know the parameter values to 
execute well; that's usually the situation rather than the exception 
with partitions.  You run the risk that the generic prepared plan will 
end up looking at all the partitions, because at plan-preparation time 
it can't figure out which of them can be excluded; for some types of 
queries that can only be determined once the actual values are known.


I think you aren't quite lined up with the people suggesting "test it" 
in terms of what that means.  The idea is not that you should build a 
full-on application test case yet, which can be very expensive.  The 
idea is that you might explore things like "when I partition this way, 
increasing the partitions from 1 to n, does query time go up linearly?" 
by measuring with fake data and a machine-generated schema.  What's 
happened in some of these cases is that, despite the theoretical, some 
constant or external overhead ends up dominating behavior for lower 
numbers.  As an example, it was recognized that the amount of statistics 
for a table collected with default_statistics_target had a quadratic 
impact on some aspects of performance.  But it turned out that for the 
range of interesting values to most people, the measured runtime did not 
go up with the square as feared.  Only way that was sorted out was to 
build a simple simulation.


Here's a full example from that discussion that shows the sort of tests 
you probably want to try, and comments on the perils of guessing based 
on theory rather than testing:


http://archives.postgresql.org/pgsql-hackers/2008-12/msg00601.php
http://archives.postgresql.org/pgsql-hackers/2008-12/msg00687.php

generate_series can be very helpful here, and you can even use that to 
generate timestamps if you need them in the data set.


That said, anecdotally everyone agrees that partitions don't scale well 
into even the very low hundreds for most people, and doing multi-level 
ones won't necessarily drop query planning time--just the cost 
of maintaining the underlying tables and indexes.  My opinion is that 
building a simple partitioned case and watching how the EXPLAIN plans 
change as you adjust things will be more instructive for you than either 
asking about it or reading the source.  Vary the parameters, watch the 
plans, measure things and graph them if you want to visualize the 
behavior better.  Same thing goes for large numbers of partial indexes, 
which have a similar query planning impact, but unlike partitions I 
haven't seen anyone analyze them via benchmarks.  I'm sure you could get 
help here (probably the performance list is a better spot though) with 
getting your test case right if you wanted to try and nail that down.
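A machine-generated schema of the kind described above can be produced with a few lines of scripting (a sketch only; the table and column names here are made up for the illustration):

```python
# Emit DDL for a parent table with n range-partitioned children, so
# that planning time can be measured as n grows.  Feed the output to
# psql for n = 10, 100, 200, ... and time something like
# EXPLAIN SELECT * FROM parent WHERE id = 42 at each size.
def partition_ddl(n):
    stmts = ["CREATE TABLE parent (id int, val text);"]
    step = 1000
    for i in range(n):
        lo, hi = i * step, (i + 1) * step
        stmts.append("CREATE TABLE child_%d () INHERITS (parent);" % i)
        stmts.append(
            "ALTER TABLE child_%d ADD CHECK (id >= %d AND id < %d);"
            % (i, lo, hi))
    return "\n".join(stmts)

ddl = partition_ddl(3)
assert ddl.count("INHERITS (parent)") == 3
assert "CHECK (id >= 2000 AND id < 3000)" in ddl
```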


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [HACKERS] On Scalability

2010-07-30 Thread Robert Haas
On Fri, Jul 30, 2010 at 3:50 PM, Vincenzo Romano
vincenzo.rom...@notorand.it wrote:
 2010/7/30 Josh Berkus j...@agliodbs.com:

 Is there anyone who knows whether those algorithms are linear or not?

 Read the code?  It's really very accessible, and there's lots and lots
 of comments.  While the person who wrote the code is around, isn't it
 better to see the real implementation?

 If the programmer(s) who wrote that part are around, a simple hint would 
 suffice.
 Even a hint as to where to look in the code would be much appreciated: the query
 planner is not as simple as the ls command (which is not that simple any
 more, though).

 It looks like I need to go the hard way ...
 Starting from postgresql-8.4.4/src/backend/optimizer

I think you're approaching this in the wrong way.  You've repeatedly
said you don't want to do all the work of setting up a test, but
trying to search the code for algorithms that might not be linear is
not going to be easier.  I've been reading this thread and I'm fairly
familiar with this code, and I even understand the algorithms pretty
well, and I don't know whether they're going to be linear for what you
want to do or not.  Certainly, the overall task of join planning is
exponential-time in the number of *tables*, but if you're just doing
SELECT statements on a single table, will that be linear?  Tough to
say.  Certainly, there are a lot of linked lists in there, so if we
have any place where we have two nested loops over the list of
indices, it won't be linear.  I can't think of a place where we do
that, but that doesn't mean there isn't one.  And even if they are
linear, or n log n or something, the coefficients could still be
lousy.  Theoretical computer science is one of my favorite subjects,
but it doesn't always tell you what you want to know about the real
world.

It doesn't seem like it should be very hard to figure this out
empirically.  Just create a big database full of random data.  Maybe
you could populate one of the columns with something like (random() *
1000)::int.  Then you could create partial indices ON
(some_other_column) WHERE that_column = blat for blat in 0..999.
Then you could run some test queries and see how you make out.
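That recipe is easy to script; here is an illustrative Python sketch (the table and index names are assumptions, only "blat" and the column names come from the suggestion above) that emits the DDL for those partial indices:

```python
# Illustrative sketch: generate CREATE INDEX statements for partial
# indices WHERE filter_col = blat, for blat in 0..n-1.  Table and index
# naming here is hypothetical.
def partial_index_ddl(table="t", indexed_col="some_other_column",
                      filter_col="that_column", n=1000):
    return [
        f"CREATE INDEX {table}_part_{blat} ON {table} ({indexed_col}) "
        f"WHERE {filter_col} = {blat};"
        for blat in range(n)
    ]

for stmt in partial_index_ddl(n=3):
    print(stmt)
```

Feed the output to psql, then time a batch of queries with varying WHERE clauses and watch how planning time moves as n grows.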

Or, I mean, you can read the source code.  That's fine, too.  It's
just... I've already read the source code quite a few times, and I
still don't know the answer.  Admittedly, I wasn't trying to answer
this specific question, but still - I don't think it's an easy
question to answer that way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] patch for check constraints using multiple inheritance

2010-07-30 Thread Alex Hunsaker
On Fri, Jul 30, 2010 at 10:22, Robert Haas robertmh...@gmail.com wrote:

 OK, it looks like level_2_parent is actually irrelevant to this issue.
  So here's a slightly simplified test case:

 DROP SCHEMA IF EXISTS test_inheritance CASCADE;
 CREATE SCHEMA test_inheritance;
 SET search_path TO test_inheritance;

 CREATE TABLE top (i int);
 CREATE TABLE mid1 () INHERITS (top);
 CREATE TABLE mid2 () INHERITS (top);
 CREATE TABLE bottom () INHERITS (mid1, mid2);
 CREATE TABLE basement () INHERITS (bottom);

 ALTER TABLE top ADD CONSTRAINT a_check_constraint CHECK (i = 0);

The other problem with the current code is it creates a dump that is
not consistent with \d.  The dump gets it right in the sense that
basement does not have the constraint.  We could debate what
coninhcount should be, but clearly there is a bug here. [ FYI with
your patch the dump is, of course, consistent again (no
a_check_constraint) ]



Re: [HACKERS] reducing NUMERIC size for 9.1

2010-07-30 Thread Robert Haas
On Fri, Jul 30, 2010 at 9:16 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Jul 30, 2010 at 1:13 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Perhaps, but I think you're confused on at least one point.
 numeric(2,1) has to be able to hold 2 decimal digits, not 2
 NumericDigits (which'd actually be 8 decimal digits given
 the current code).

 I get that.  The point is: if one of those 2 decimal digits is before
 the decimal point and the other is after it, then two NumericDigits
 will be used.

 Ah, I see.  Maybe we should allow for one more NumericDigit in the
 calculation for such cases.  I guess you could look at the scale too
 to detect if the case is possible, but not sure it's worth the trouble.

Since this just needs to be a reasonable upper bound, probably not.

Another small but related issue is this comment is out-of-date:

/* Numeric stores 2 decimal digits/byte, plus header */
return (precision + 1) / 2 + NUMERIC_HDRSZ;

Currently, numeric stores 4 decimal digits/2 bytes, which is not quite
the same thing.  I think that for clarity it would make sense to break
this calculation in two parts.  First, estimate the number of
NumericDigits that could be needed; then, estimate the amount of space
that could be needed to store them.  Maybe something like this,
obviously with a suitable comment which I haven't written yet:

numeric_digits = (precision + 6) / 4;
return (numeric_digits * sizeof(int16)) + NUMERIC_HDRSZ;

The first line would be incorrect for precision 0 (it would produce 1,
not 0) but precision 0 isn't allowed anyway, and overestimating just
that one case would be OK even if it did.  As far as I can see, it's
correct for all higher values.  If you have 1 decimal digit, you can't
need more than one NumericDigit, but if you have 2 decimal digits, you
can need one for each side of the decimal point.  But you can't manage
to get a third NumericDigit until you get up to 6 decimal digits,
because with only 5 you can split them 1-4 or 3-2 across the decimal
point, but getting to a second NumericDigit on either side of the
decimal point uses up all 5 digits, leaving nothing for the other
side.  Similarly, to get a fourth NumericDigit, you have to have
enough decimal digits to split them 1-4-4-1 (with the decimal point
somewhere in the middle), so 10 is the minimum.
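The split-counting argument above (1-4, 3-2, 1-4-4-1, and so on) can be checked by brute force.  An illustrative Python sketch, not PostgreSQL code, assuming DEC_DIGITS = 4:

```python
import math

DEC_DIGITS = 4  # decimal digits per NumericDigit in the base-10000 code

def max_numeric_digits(precision):
    # Try every split of the decimal digits across the decimal point and
    # count the base-10000 digits each side needs; take the worst case.
    return max(math.ceil(before / DEC_DIGITS)
               + math.ceil((precision - before) / DEC_DIGITS)
               for before in range(precision + 1))

def estimated_numeric_digits(precision):
    # The closed-form upper bound proposed in the thread.
    return (precision + 6) // 4

# The estimate matches the brute-force worst case for every precision
# in this range (precision 0 isn't allowed, as noted above).
assert all(max_numeric_digits(p) == estimated_numeric_digits(p)
           for p in range(1, 201))
```

In particular the sketch confirms the turning points mentioned above: precision 2 is the smallest needing two NumericDigits, 6 the smallest needing three, and 10 the smallest needing four.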

Another question here is whether we should just fix this in CVS HEAD,
or whether there's any reason to back-patch it.  I can't see much
reason to back-patch, unless there's third-party code depending on
this for something more than what we use it for.  The only callers of
type_maximum_size are get_typavgwidth(), which doesn't need to be
exact, and needs_toast_table().  So it seems that the worst thing that
could happen as a result of this problem is that we might fail to
create a toast table when one is needed, and even that seems extremely
unlikely.  First, you'd need to have a table containing no text or
unlimited-length varchar columns, because those will force toast table
creation anyway.  Second, numerics are typically going to be stored
with a short varlena header on disk, so a 2-byte underestimate of the
max size is going to be swamped by a 3-byte savings on the varlena
header.  Third, even if you can find a case where the above doesn't
matter, it's not that hard to force a toast table to get created.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



[HACKERS] reducing NUMERIC size for 9.1, take two

2010-07-30 Thread Robert Haas
Here's a second version of the main patch, in which I have attempted
to respond to Tom's concerns/suggestions.

(There is still a small, side issue with numeric_maximum_size() which
I plan to fix, but this patch is the good stuff.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


numeric_2b-v2.patch
Description: Binary data



Re: [HACKERS] lock_timeout GUC patch - Review

2010-07-30 Thread Alvaro Herrera
Excerpts from Boszormenyi Zoltan's message of Thu Jul 29 07:55:38 -0400 2010:
 Hi,
 
 Marc Cousin wrote:
  Hi, I've been reviewing this patch for the last few days. Here it is:

 ...
* Are there dangers?
  As it is a GUC, it could be set globally; is that a danger?

 
 I haven't added any new code covering supplemental (e.g. autovacuum)
 processes;
 the interactions are yet to be discovered.  Setting it globally is not
 recommended.

FWIW there is some code in autovacuum and other auxiliary processes that
forcibly sets statement_timeout to 0.  I think this patch should do
likewise.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] reducing NUMERIC size for 9.1

2010-07-30 Thread Robert Haas
On Fri, Jul 30, 2010 at 2:08 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
   Maybe something like this,
 obviously with a suitable comment which I haven't written yet:

     numeric_digits = (precision + 6) / 4;
     return (numeric_digits * sizeof(int16)) + NUMERIC_HDRSZ;

 This is OK for the base-10K case, but there's still code in there
 for the base-10 and base-100 cases.  Can you express this logic in
 terms of DEC_DIGITS and sizeof(NumericDigit) ?  I think you might
 find it was actually clearer that way, cf Polya.

It appears to work out to:

numeric_digits = (precision + 2 * (DEC_DIGITS - 1)) / DEC_DIGITS;
return (numeric_digits * sizeof(NumericDigit)) + NUMERIC_HDRSZ;

The smallest value for precision which requires 2 numeric_digits is
always 2; and the required number of numeric_digits increases by 1
each time the number of base-10 digits increases by DEC_DIGITS.
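As a quick cross-check of the generalized expression (illustrative Python, not the backend code), the formula matches a brute-force worst case for the base-10, base-100, and base-10000 settings of DEC_DIGITS, and reduces to the earlier (precision + 6) / 4 form when DEC_DIGITS is 4:

```python
import math

def worst_case_digits(precision, dec_digits):
    # Worst case over every split of the decimal digits across the point.
    return max(math.ceil(b / dec_digits)
               + math.ceil((precision - b) / dec_digits)
               for b in range(precision + 1))

def formula(precision, dec_digits):
    # The generalized expression from the message above.
    return (precision + 2 * (dec_digits - 1)) // dec_digits

for dec_digits in (1, 2, 4):          # base 10, base 100, base 10000
    for precision in range(1, 101):
        assert formula(precision, dec_digits) == \
               worst_case_digits(precision, dec_digits)

# With DEC_DIGITS = 4 this is the same as the (precision + 6) / 4 form.
assert all(formula(p, 4) == (p + 6) // 4 for p in range(1, 101))
```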

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



[HACKERS] ANALYZE versus expression indexes with nondefault opckeytype

2010-07-30 Thread Tom Lane
I've been poking at the FTS problem example recently submitted by
Artur Dabrowski.  It contains an index declared like so:

CREATE INDEX idx_keywords_ger ON search_tab
USING gin (to_tsvector('german'::regconfig, keywords));

which can be queried like so

select * from search_tab where
  to_tsvector('german', keywords) @@ to_tsquery('german', 'whatever');

which is all fine and matches suggestions in our documentation.  But I
was dismayed to discover that ANALYZE wasn't storing any statistics
for this expression index, which rather hobbles the planner's ability
to produce useful estimates for the selectivity of such WHERE clauses.

The reason for that turns out to be this bit in analyze.c:

/*
 * Can't analyze if the opclass uses a storage type
 * different from the expression result type. We'd get
 * confused because the type shown in pg_attribute for
 * the index column doesn't match what we are getting
 * from the expression. Perhaps this can be fixed
 * someday, but for now, punt.
 */
if (exprType(indexkey) !=
    Irel[ind]->rd_att->attrs[i]->atttypid)
    continue;

Since a tsvector index does have a storage type different from tsvector
(in fact, it's shown as TEXT), this code decides to punt and not analyze
the column.  I think it's past time we did something about this.

After a bit of study of the code, it appears to me that it's not too
difficult to fix: we just have to use the expression's result type
rather than the index column's atttypid in the subsequent processing.
ANALYZE never actually looks at the index column contents (indeed
cannot, since our index AMs provide no API for extracting the contents),
so atttypid isn't really all that relevant to it.  However, this'll
imply an API/ABI break for custom typanalyze functions: we'll need to
store the actual type OID in struct VacAttrStats, and use that rather
than looking at attr->atttypid or any of the other type-dependent
fields of the Form_pg_attribute.  So that seems to make it not practical
to back-patch.

I thought of an ugly hack that would avoid an API/ABI break: since
VacAttrStats.attr is a copy of the pg_attribute data, we could scribble
on it, changing atttypid, attlen, attbyval, etc to match the index
expression result type.  This seems pretty grotty, but it would allow
the fix to be back-patched into existing branches.

Comments?  I'm not sure which way to jump here.

regards, tom lane



[HACKERS] rbtree code breaks GIN's adherence to maintenance_work_mem

2010-07-30 Thread Tom Lane
Another misbehavior that I noted while playing with Artur's example is
that, while GIN index build seems to adhere pretty well to whatever
maintenance_work_mem limit you give it in 8.4, it blows right by that
limit in 9.0 and HEAD --- I observed it happily eating something north
of 128MB with a maintenance_work_mem of 70MB.  It looks to me like the
reason is the new rbtree.c code, which adds a whole lot of data
structure that's not being counted against the maintenance_work_mem
limit.

Now the first question that might be asked is what we'd need to do to
rbtree.c's API to allow its memory consumption to be tracked.  However,
I think there's a considerably more pressing question, which is what
is it that rbtree.c is doing that justifies a 2X bloat factor in GIN
index build --- which is pretty much what it's costing us, given the
observation above.  We know that the amount of memory available for
the build has an enormous effect on build time.  If we just do the
obvious thing of including the rbtree data structure in the
maintenance_work_mem calculation, what we're going to get is a very
substantial slowdown in build speed for the same maintenance_work_mem
setting, compared to the way 8.4 worked.

So, I would like somebody to show cause why that whole module shouldn't
be ripped out and the code reverted to where it was in 8.4.  My
recollection is that the argument for adding it was to speed things up
in corner cases, but what I think it's actually going to do is slow
things down in every case.

regards, tom lane
