Re: [HACKERS] patch: Allow \dd to show constraint comments

2011-06-19 Thread Kohei KaiGai
I think the v5 patch should be marked as 'Ready for Committer'

2011/6/18 Josh Kupershmidt schmi...@gmail.com:
 On Sat, Jun 18, 2011 at 10:53 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 In addition, this pg_comments system view supports the 'access method' class, but
 we cannot set a comment on access methods using the COMMENT ON statement.

 Well, there are comments for the built-in access methods, i.e.

Oops, I missed the comments created at initdb stage.

 Regarding the data type of objnamespace, how about the idea of defining a new
 data type such as 'regschema' and casting objnamespace to this type?
 If we had such a data type, users could reference the string form of the
 schema name, and OID joins would still be available.

 Are you suggesting we leave the structure of pg_comments unchanged,
 but introduce a new 'regschema' type so that if users want to easily
 display the schema name of an object, they can just do:

  SELECT objnamespace::regschema, ...
    FROM pg_comments WHERE ... ;

 That seems reasonable to me.

Yes; however, it should be discussed in another thread, as Robert said.
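
(For reference, the output side of such a type could be little more than a
syscache lookup. Below is a minimal sketch, modeled loosely on the existing
reg* output functions; regschemaout is hypothetical, not part of any patch:)

#include "postgres.h"
#include "fmgr.h"
#include "utils/lsyscache.h"

/*
 * Hypothetical output function for a 'regschema' type: turn a namespace
 * OID into its name, falling back to the raw numeric OID if the schema
 * has been dropped.
 */
Datum
regschemaout(PG_FUNCTION_ARGS)
{
	Oid		nspid = PG_GETARG_OID(0);
	char   *result = get_namespace_name(nspid);	/* NULL if not found */

	if (result == NULL)
	{
		result = palloc(NAMEDATALEN);
		snprintf(result, NAMEDATALEN, "%u", nspid);
	}
	PG_RETURN_CSTRING(result);
}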

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] DROP Reworks Part.0 - 'missing_ok' support of get_object_address

2011-06-19 Thread Kohei KaiGai
Thanks for your review.

2011/6/19 Robert Haas robertmh...@gmail.com:
 On Tue, Jun 14, 2011 at 8:06 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 The attached patch is a preparation to rework the implementation of the DROP
 statement using common code. The intent is to apply get_object_address() to
 resolve the names of the objects to be removed, and eventually to minimize
 the number of places that hold permission checks.

 Its first step is an enhancement of get_object_address: it accepts a
 'missing_ok' argument to handle the case when an IF EXISTS clause is supplied.
 If 'missing_ok' is true and the supplied name is not found, the patched
 get_object_address() returns an ObjectAddress with InvalidOid as objectId.
 If 'missing_ok' is false, its behavior is unchanged.

 Let's consistently make missing_ok the last argument to all of the
 functions to which we're adding it.

OK. I revised the position of the 'missing_ok' argument.
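
(For illustration, the call shape at a DROP site would then look roughly like
the sketch below; the helper function, lock level, and surrounding statement
handling are assumptions for the example, not part of the patch:)

#include "postgres.h"
#include "catalog/objectaddress.h"
#include "nodes/parsenodes.h"
#include "utils/rel.h"

/*
 * Illustrative caller pattern for the 'missing_ok' behavior: when
 * IF EXISTS was given and the object is absent, get_object_address()
 * hands back InvalidOid and we simply skip.
 */
static void
drop_one_object(DropStmt *stmt, List *objname, List *objargs)
{
	Relation		relation = NULL;
	ObjectAddress	address;

	address = get_object_address(stmt->removeType, objname, objargs,
								 &relation, AccessExclusiveLock,
								 stmt->missing_ok);
	if (!OidIsValid(address.objectId))
	{
		ereport(NOTICE,
				(errmsg("object does not exist, skipping")));
		return;
	}

	/* permission checks and performDeletion() would follow here */
}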

 I don't think it's a good idea for get_relation_by_qualified_name() to
 be second-guessing the error message that RangeVarGetRelid() feels
 like throwing.

OK. I revised the patch to pass 'true' to RangeVarGetRelid().
Its side effect is on the error messages when the user gives an undefined object name.

 I think that attempting to fetch the column foo.bar when foo doesn't
 exist should be an error even if missing_ok is true.  I believe that's
 consistent with what we do elsewhere.  (If it *were* necessary to open
 the relation without failing if it's not there, you could use
 try_relation_openrv instead of doing as you've done here.)

It was fixed. AlterTable() already opens the relation (without missing_ok)
in the case where we drop a column, so I don't think we need to acquire
relation locks if the supplied column is missing.

 There is certainly a more compact way of writing the logic in
 get_object_address_typeobj.  Also, I think that function should be
 called get_object_address_type(); the obj on the end seems
 redundant.

I renamed the function to get_object_address_type(), and
consolidated the initialization of the ObjectAddress variables.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


pgsql-v9.2-drop-reworks-part-0.2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types and extensions

2011-06-19 Thread Florian Pflug
On Jun19, 2011, at 00:23 , Jeff Davis wrote:
 On Sat, 2011-06-18 at 22:19 +0200, Florian Pflug wrote:
 Yes, that seems necessary for consistency. That leaves the question
 of what to do if someone tries to modify a textrange's collation with
 a COLLATE clause. For example, what's the result of
  'Ä' in '[A,Z]'::textrange_german COLLATE 'C'
 where 'Ä' is a German umlaut-A which sorts after 'A' but before 'B'
 in locale 'de_DE' but sorts after 'Z' in locale 'C'. (I'm assuming
 that textrange_german was defined with collation 'de_DE'.)
 
 With the set-based definition of ranges, the only sensible thing
 is to simply ignore the COLLATE clause, I think.
 
 I think rejecting it makes more sense, so a range would not be a
 collatable type; it just happens to use collations of the subtype
 internally.

Ah, crap, I put the COLLATE in the wrong place. What I actually
had in mind was
  ('Ä' COLLATE 'C') in '[A,Z]'::textrange_german

I was afraid that the 'in' operator cannot distinguish this case
from
  field in '[A,Z]'::textrange_german
where field is declared with COLLATE 'C'.

In the second case, throwing an error seems a bit harsh.

There's also this fun little case
  field in '[A,Z]'
(note the lack of an explicit cast). Here the input function would
probably need to verify that there's a range type corresponding
to the field's type *and* that the range type's collation matches
the field's collation. I wonder if that's possible - Tom said
somewhere that input functions don't receive collation information,
though I don't know if that restriction applies in this case.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Hugetables question

2011-06-19 Thread Radosław Smogura
I want to implement hugepages for shared memory; to make it transparent I want
to do it in this fashion:
1. Reserve memory M of size s.
2. Try to allocate hugepage memory of as big a size as possible (hs), and
attach it at M.
3. Allocate normal shared memory of size s - hs, and attach it at M+hs.
This solution should work for Linux and Windows, and makes no difference for
the usage of such shared memory in the application.

(...and this actually works)
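
(For the curious, here is a minimal standalone sketch of the scheme, assuming
Linux, SysV shared memory, and 2MB huge pages; error handling, cleanup, and
the unmap/attach race are glossed over, so this only illustrates the address
arithmetic:)

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/mman.h>

#define HUGE_SZ ((size_t) 2 * 1024 * 1024)	/* one 2MB huge page */

int
main(void)
{
	size_t	s = 16 * 1024 * 1024 + 8192;	/* requested size, not huge-aligned */
	size_t	hs = s & ~(HUGE_SZ - 1);		/* part coverable by huge pages */

	/* 1. Reserve an aligned range M of size s (over-reserve to align). */
	char   *raw = mmap(NULL, s + HUGE_SZ, PROT_NONE,
					   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	char   *M = (char *) (((uintptr_t) raw + HUGE_SZ - 1) & ~(uintptr_t) (HUGE_SZ - 1));

	/* 2. Attach as much huge-page memory as possible at M. */
	int		hugeid = shmget(IPC_PRIVATE, hs, IPC_CREAT | SHM_HUGETLB | 0600);
	munmap(raw, s + HUGE_SZ);			/* racy: give the range back... */
	if (shmat(hugeid, M, 0) == (void *) -1)
		exit(1);						/* e.g. no huge pages configured */

	/* 3. Cover the remainder (s - hs) with normal shared memory at M+hs. */
	int		tailid = shmget(IPC_PRIVATE, s - hs, IPC_CREAT | 0600);
	if (shmat(tailid, M + hs, 0) == (void *) -1)
		exit(1);

	printf("%zu bytes at %p, %zu of them on huge pages\n", s, (void *) M, hs);
	return 0;
}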

But in sysv_shmem.c I saw some checks for memory belonging to other (probably
failed) processes. Because I can't put a new header in step 3, I would like to
ask whether it would be sufficient to:
1. Check the creator PID via shmctl.
2. Remove the check of the shmem magic.
3. Or maybe, instead of this, would it be better to split the shared memory,
so that the header is stored under one key and contains the keys of the other
individually allocated blocks?

Of course, moving to POSIX shared memory may be much better, but according to
the commitfest thread it may be impossible.

Maybe you have some other ideas?

Regards,
Radek

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hugetables question

2011-06-19 Thread Martijn van Oosterhout
On Sun, Jun 19, 2011 at 11:56:15AM +0200, Radosław Smogura wrote:
 I want to implement hugepages for shared memory; to make it transparent I
 want to do it in this fashion:
 1. Reserve memory M of size s.
 2. Try to allocate hugepage memory of as big a size as possible (hs), and
 attach it at M.
 3. Allocate normal shared memory of size s - hs, and attach it at M+hs.
 This solution should work for Linux and Windows, and makes no difference for
 the usage of such shared memory in the application.

At least in Linux they're trying to make hugepages transparent, so I'm
wondering if this is going to make a difference for Linux in the long
term.

As for your other problem, perhaps you can put the shmem block first,
before the hugemem block? It would require some pointer fiddling, but
seems doable.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Patriotism is when love of your own people comes first; nationalism,
 when hate for people other than your own comes first. 
   - Charles de Gaulle


signature.asc
Description: Digital signature


Re: [HACKERS] Hugetables question

2011-06-19 Thread Radosław Smogura
Martijn van Oosterhout klep...@svana.org Sunday 19 of June 2011 12:35:18
 On Sun, Jun 19, 2011 at 11:56:15AM +0200, Radosław Smogura wrote:
  I want to implement hugepages for shared memory; to make it transparent I
  want to do it in this fashion:
  1. Reserve memory M of size s.
  2. Try to allocate hugepage memory of as big a size as possible (hs), and
  attach it at M.
  3. Allocate normal shared memory of size s - hs, and attach it at M+hs.
  This solution should work for Linux and Windows, and makes no difference
  for the usage of such shared memory in the application.
 
 At least in Linux they're trying to make hugepages transparent, so I'm
 wondering if this is going to make a difference for Linux in the long
 term.
 
 As for your other problem, Perhaps you can put the shmem block first,
 before the hugemem block? Would require some pointer fiddling, but
 seems doable.
 
 Have a nice day,
 
  Patriotism is when love of your own people comes first; nationalism,
  when hate for people other than your own comes first.
  
- Charles de Gaulle
Yes, hugepages will become transparent in Linux, but in any case that
currently applies only to anonymous memory, and it has some disadvantages
over explicit hugepages.

Regards,
Radek

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types and extensions

2011-06-19 Thread Martijn van Oosterhout
On Sun, Jun 19, 2011 at 11:21:28AM +0200, Florian Pflug wrote:
  I think rejecting it makes more sense, so a range would not be a
  collatable type; it just happens to use collations of the subtype
  internally.
 
 Ah, crap, I put the COLLATE in the wrong place. What I actually
 had in mind was
   ('Ä' COLLATE 'C') in '[A,Z]'::textrange_german

Operators don't have to be collation sensitive. If they're not, then the
COLLATE in the above statement is redundant.  You can decide that an
interval needs an implicit collation and you can just use that.

 I was afraid that the in operator cannot distinguish this case
 from
   field in '[A,Z]'::textrange_german
 where field is declared with COLLATE 'C'.

It should be able to, after all in the first case the collation is
explicit, in the latter implicit.

 There's also this fun little case
   field in '[A,Z]'
 (note lack of an explicit cast). Here the input function would
 probably need to verify that there's a range type corresponding
 to the field's type *and* that the range type's collation matches
 the field's collation. I wonder if that's possible - Tom said
 somewhere that input function don't receive collation information,
 though I don't know if that restriction applies in this case.

Collation checking is generally done by the planner. I don't see why
the input function should check; the result of an input function is by
definition DEFAULT. It's up to the 'in' operator to check.

Note that the whole idea is that a collation is not really supposed to be
assigned to an object for storage.  How that can be resolved, I'm not sure.

Mvg,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Patriotism is when love of your own people comes first; nationalism,
 when hate for people other than your own comes first. 
   - Charles de Gaulle


signature.asc
Description: Digital signature


Re: [HACKERS] patch for 9.2: enhanced errors

2011-06-19 Thread Pavel Stehule
Hello

2011/6/19 Steve Singer ssinger...@sympatico.ca:
 On 11-06-08 04:14 PM, Pavel Stehule wrote:

 Hello

 The attached patch implements new error fields that describe the table and
 columns related to an error. This enhanced info is limited to constraints
 and RI.


  ...

 I think that both the CONSTRAINT and TABLE name should be double quoted
 like A_pkey is above.  When doing this, make sure you don't break the
 quoting in the CSV format log.


I agree that quoting must be used in the CSV log - the result has to be
valid CSV and I'll ensure this. I am not sure about implicit quoting and
using some quote_ident operation early. This is the result of some
operation - not input. Quoting in messages is used not as SQL quoting,
but as plain-text quoting - it is just a border between human-readable
text and data. But fields like TABLE_NAME or COLUMN_NAME contain just
data - so quoting is useless.

Next argument - adding quoting is simpler than removing it. If
somebody needs quoting, they can use the quote_ident function, but
there is no inverse function - so I prefer names in raw format. It
is simpler and more usual to add quoting than to remove it.

What do you think about it?



 Performance Review
 -
 I don't see this patch impacting performance, I did not conduct any
 performance tests.


 Coding Review
 -


 In tupdesc.c

 line 202 the existing code is performing a deep copy of ConstrCheck.  Do you
 need to copy nkeys and conkey here as well?

 Then at line 250 ccname is freed but not conkey


I have to look at this


 postgres_ext.h line 55
 + #define PG_DIAG_SCHEMA_NAME    's'
 + #define PG_DIAG_TABLE_NAME    't'
 + #define PG_DIAG_COLUMN_NAMES    'c'
 + #define PG_DIAG_CONSTRAINT_NAME 'n'

 The assignment of letters to parameters seems arbitrary to me, I don't have
 a different non-arbitrary mapping in mind but if anyone else does they
 should speak up.  I think it will be difficult to change this after 9.2 goes
 out.
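
(For reference, a client would read these proposed fields through libpq's
PQresultErrorField(); a minimal sketch, assuming the patch's PG_DIAG_* codes
above are present in postgres_ext.h:)

#include <stdio.h>
#include <libpq-fe.h>

/* Report the proposed constraint-related error fields, when present. */
static void
report_constraint_error(const PGresult *res)
{
	const char *constraint = PQresultErrorField(res, PG_DIAG_CONSTRAINT_NAME);
	const char *table = PQresultErrorField(res, PG_DIAG_TABLE_NAME);
	const char *columns = PQresultErrorField(res, PG_DIAG_COLUMN_NAMES);

	if (constraint != NULL)
		fprintf(stderr, "violated constraint \"%s\" on %s(%s)\n",
				constraint,
				table ? table : "?",
				columns ? columns : "?");
}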


 elog.c:
 ***
 *** 2197,2202 
 --- 2299,2319 
      if (application_name)
          appendCSVLiteral(buf, application_name);

 +     /* constraint_name */
 +     appendCSVLiteral(buf, edata->constraint_name);
 +     appendStringInfoChar(buf, ',');
 +
 +     /* schema name */
 +     appendCSVLiteral(buf, edata->schema_name);
 +     appendStringInfoChar(buf, ',');

 You need to update config.sgml at the same time you update this format.
 You need to append a ',' after application name but before constraintName.
 As it stands the CSV log has something like:
 .nbtinsert.c:433,psqla_pkey,public,a,a


ok


 nbtinsert.c

 pg_get_indrelation is named differently than everything else in this file
 (i.e. _bt...).  My guess is that this function belongs somewhere else, but I
 don't know the code well enough to say where you should move it to.


I'll try to find a better name, but I would not use a public-looking name like _bt



 Everything I've mentioned above is a minor issue, I will move the patch to
 'waiting for author' and wait for you to release an updated patch.

 Steve Singer


ok

Thank you very much

Pavel Stehule

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for 9.2: enhanced errors

2011-06-19 Thread Pavel Stehule
2011/6/19 Steve Singer ssinger...@sympatico.ca:
 On 11-06-18 06:36 PM, Steve Singer wrote:

 On 11-06-08 04:14 PM, Pavel Stehule wrote:

 Here is my review of this patch

 Submission Review:
 
 The patch applies cleanly against master
 The patch does not include any documentation updates (see note below to
 update config.sgml)
 The patch does not include any unit tests. At a minimum it should add a
 few tests with verbosity set to verbose


 On second thought, tests might not work. Is the only way to get the
 constraint details in verbose mode, where line numbers from the C file
 are also included? If so then this won't work for the regression tests.
 Having the diff comparison fail every time someone makes an unrelated change
 to a source file isn't what we want.


That is the reason why the patch doesn't include any regression test
changes. I have to look into whether verbose mode is documented somewhere.

Regards

Pavel Stehule



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql performance - SearchCatCache issue

2011-06-19 Thread Pavel Stehule
2011/6/19 Robert Haas robertmh...@gmail.com:
 On Sat, Jun 18, 2011 at 9:21 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 Is this profile expected?

 I've certainly seen profiles before where the catcache overhead was
 significant.  I don't think that I've seen SearchCatCache() quite this
 high on any of the profiling I've done, but then again I don't tend to
 profile the same things you do, so maybe that's not surprising.  I
 think the interesting question is probably "where are all those calls
 coming from?" and "can we optimize any of them away?" rather than "how
 do we make SearchCatCache() run faster?".   I would be willing to bet
 money that the latter is largely an exercise in futility.

I would not attack SearchCatCache. This is a relatively new area for
me, so I just asked.

The suspect part should be inside exec_assign_value

                case PLPGSQL_DTYPE_ARRAYELEM:
                    {
                        ...

                        /* Fetch current value of array datum */
                        exec_eval_datum(estate, target,
                                        &arraytypeid, &arraytypmod,
                                        &oldarraydatum, &oldarrayisnull);

                        /* If target is domain over array, reduce to base type */
                        arraytypeid = getBaseTypeAndTypmod(arraytypeid,
                                                           &arraytypmod);

                        /* ... and identify the element type */
                        arrayelemtypeid = get_element_type(arraytypeid);
                        if (!OidIsValid(arrayelemtypeid))
                            ereport(ERROR,
                                    (errcode(ERRCODE_DATATYPE_MISMATCH),
                                     errmsg("subscripted object is not an array")));

                        get_typlenbyvalalign(arrayelemtypeid,
                                             &elemtyplen,
                                             &elemtypbyval,
                                             &elemtypalign);
                        arraytyplen = get_typlen(arraytypeid);


so any update of an array means an access to the CatCache.

These data should be cached in some referenced data-type info
structure, and accessed via a new exec_eval_array_datum()
function.
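
One possible shape for such a cache is sketched below; the struct and
function names are illustrative, not from any existing patch:

#include "postgres.h"
#include "utils/lsyscache.h"

/* Cache for the catalog lookups done on every array element assignment. */
typedef struct ArrayElemTypeCache
{
	Oid		arraytypeid;	/* array type last seen, InvalidOid at start */
	int32	arraytypmod;
	Oid		elemtypeid;		/* cached get_element_type() result */
	int16	elemtyplen;		/* cached get_typlenbyvalalign() results */
	bool	elemtypbyval;
	char	elemtypalign;
	int16	arraytyplen;	/* cached get_typlen() result */
} ArrayElemTypeCache;

static void
array_elem_cache_fill(ArrayElemTypeCache *cache, Oid arraytypeid, int32 arraytypmod)
{
	/* Hit the syscache only when the array type actually changed. */
	if (cache->arraytypeid == arraytypeid && cache->arraytypmod == arraytypmod)
		return;

	cache->elemtypeid = get_element_type(arraytypeid);
	get_typlenbyvalalign(cache->elemtypeid,
						 &cache->elemtyplen,
						 &cache->elemtypbyval,
						 &cache->elemtypalign);
	cache->arraytyplen = get_typlen(arraytypeid);
	cache->arraytypeid = arraytypeid;
	cache->arraytypmod = arraytypmod;
}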

Regards

Pavel Stehule



 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Identifying no-op length coercions

2011-06-19 Thread Noah Misch
On Sat, Jun 18, 2011 at 11:32:20PM -0400, Robert Haas wrote:
 On Sat, Jun 18, 2011 at 11:12 PM, Robert Haas robertmh...@gmail.com wrote:
  On Sat, Jun 18, 2011 at 11:06 PM, Noah Misch n...@leadboat.com wrote:
  On Sat, Jun 18, 2011 at 10:57:13PM -0400, Robert Haas wrote:
  On Sat, Jun 11, 2011 at 5:13 PM, Noah Misch n...@leadboat.com wrote:
   Sounds good.  Updated patch attached, incorporating your ideas.  Before
   applying it, run this command to update the uninvolved pg_proc.h DATA
   entries:
    perl -pi -e 's/PGUID(\s+\d+){4}/$& 0/' src/include/catalog/pg_proc.h
 
  This doesn't quite apply any more.  I think the pgindent run broke it
  slightly.
 
  Hmm, I just get two one-line offsets when applying it to current master.
  Note that you need to run the perl invocation before applying the patch.
  Could you provide full output of your `patch' invocation, along with any
  reject files?
 
  Ah, crap.  You're right.  I didn't follow your directions for how to
  apply the patch.  Sorry.
 
 I think you need to update the comment in simplify_function() to say
 that we have three strategies, rather than two.  I think it would also
 be appropriate to add a longish comment just before the test that
 calls protransform, explaining what the charter of that function is
 and why the mechanism exists.

Good idea.  See attached.

 Documentation issues aside, I see very little not to like about this.

Great!  Thanks for reviewing.
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 24d7d98..7a380ce 100644
*** a/doc/src/sgml/catalogs.sgml
--- b/doc/src/sgml/catalogs.sgml
***
*** 4338,4343 
--- 4338,4350 
   </row>
 
   <row>
+   <entry><structfield>protransform</structfield></entry>
+   <entry><type>regproc</type></entry>
+   <entry><literal><link linkend="catalog-pg-proc"><structname>pg_proc</structname></link>.oid</literal></entry>
+   <entry>Calls to function can be simplified by this other function</entry>
+  </row>
+ 
+  <row>
    <entry><structfield>proisagg</structfield></entry>
    <entry><type>bool</type></entry>
    <entry></entry>
diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index 6250b07..92be0a7 100644
*** a/src/backend/catalog/pg_proc.c
--- b/src/backend/catalog/pg_proc.c
***
*** 304,309  ProcedureCreate(const char *procedureName,
--- 304,310 
values[Anum_pg_proc_procost - 1] = Float4GetDatum(procost);
values[Anum_pg_proc_prorows - 1] = Float4GetDatum(prorows);
values[Anum_pg_proc_provariadic - 1] = ObjectIdGetDatum(variadicType);
+   values[Anum_pg_proc_protransform - 1] = ObjectIdGetDatum(InvalidOid);
values[Anum_pg_proc_proisagg - 1] = BoolGetDatum(isAgg);
values[Anum_pg_proc_proiswindow - 1] = BoolGetDatum(isWindowFunc);
values[Anum_pg_proc_prosecdef - 1] = BoolGetDatum(security_definer);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 912f45c..4dffd64 100644
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***
*** 56,61 
--- 56,62 
  #include "nodes/nodeFuncs.h"
  #include "nodes/parsenodes.h"
  #include "optimizer/clauses.h"
+ #include "optimizer/planner.h"
  #include "parser/parse_clause.h"
  #include "parser/parse_coerce.h"
  #include "parser/parse_collate.h"
***
*** 3495,3501  ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode)
{
NewColumnValue *ex = lfirst(l);
  
!   ex->exprstate = ExecPrepareExpr((Expr *) ex->expr, estate);
}
  
notnull_attrs = NIL;
--- 3496,3503 
{
NewColumnValue *ex = lfirst(l);
  
!   /* expr already planned */
!   ex->exprstate = ExecInitExpr((Expr *) ex->expr, NULL);
}
  
notnull_attrs = NIL;
***
*** 4398,4404  ATExecAddColumn(List **wqueue, AlteredTableInfo *tab, Relation rel,
  
		newval = (NewColumnValue *) palloc0(sizeof(NewColumnValue));
		newval->attnum = attribute.attnum;
!		newval->expr = defval;
  
		tab->newvals = lappend(tab->newvals, newval);
		tab->rewrite = true;
--- 4400,4406 
  
		newval = (NewColumnValue *) palloc0(sizeof(NewColumnValue));
		newval->attnum = attribute.attnum;
!		newval->expr = expression_planner(defval);
  
		tab->newvals = lappend(tab->newvals, newval);
		tab->rewrite = true;
***
*** 6707,6712  ATPrepAlterColumnType(List **wqueue,
--- 6709,6717 
/* Fix collations after all else */
assign_expr_collations(pstate, transform);
  
+	/* Plan the expr now so we can accurately assess the need to rewrite. */
+   transform = (Node *) expression_planner((Expr *) transform);
+ 
/*
 * Add a work queue item to make ATRewriteTable update the 

Re: [HACKERS] plpgsql performance - SearchCatCache issue

2011-06-19 Thread Pavel Stehule
2011/6/19 Pavel Stehule pavel.steh...@gmail.com:
 2011/6/19 Robert Haas robertmh...@gmail.com:
 On Sat, Jun 18, 2011 at 9:21 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 Is this profile expected?

 I've certainly seen profiles before where the catcache overhead was
 significant.  I don't think that I've seen SearchCatCache() quite this
 high on any of the profiling I've done, but then again I don't tend to
 profile the same things you do, so maybe that's not surprising.  I
 think the interesting question is probably "where are all those calls
 coming from?" and "can we optimize any of them away?" rather than "how
 do we make SearchCatCache() run faster?".   I would be willing to bet
 money that the latter is largely an exercise in futility.

 I would not attack SearchCatCache. This is a relatively new area for
 me, so I just asked.

 The suspect part should be inside exec_assign_value

                 case PLPGSQL_DTYPE_ARRAYELEM:
                     {
                         ...

                         /* Fetch current value of array datum */
                         exec_eval_datum(estate, target,
                                         &arraytypeid, &arraytypmod,
                                         &oldarraydatum, &oldarrayisnull);

                         /* If target is domain over array, reduce to base type */
                         arraytypeid = getBaseTypeAndTypmod(arraytypeid,
                                                            &arraytypmod);

                         /* ... and identify the element type */
                         arrayelemtypeid = get_element_type(arraytypeid);
                         if (!OidIsValid(arrayelemtypeid))
                             ereport(ERROR,
                                     (errcode(ERRCODE_DATATYPE_MISMATCH),
                                      errmsg("subscripted object is not an array")));

                         get_typlenbyvalalign(arrayelemtypeid,
                                              &elemtyplen,
                                              &elemtypbyval,
                                              &elemtypalign);
                         arraytyplen = get_typlen(arraytypeid);


 so any update of an array means an access to the CatCache.

 These data should be cached in some referenced data-type info
 structure, and accessed via a new exec_eval_array_datum()
 function.

Using a cache for these values increased speed by about 30% - I'll
prepare a patch for the next commitfest.

Regards

Pavel Stehule


 Regards

 Pavel Stehule



 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] DROP Reworks Part.0 - 'missing_ok' support of get_object_address

2011-06-19 Thread Kohei KaiGai
Sorry, the previous revision did not bring the regression test part
up to date with the rest of the patch.

2011/6/19 Kohei KaiGai kai...@kaigai.gr.jp:
 Thanks for your review.

 2011/6/19 Robert Haas robertmh...@gmail.com:
 On Tue, Jun 14, 2011 at 8:06 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 The attached patch is a preparation to rework the implementation of the DROP
 statement using common code. The intent is to apply get_object_address() to
 resolve the names of the objects to be removed, and eventually to minimize
 the number of places that hold permission checks.

 Its first step is an enhancement of get_object_address: it accepts a
 'missing_ok' argument to handle the case when an IF EXISTS clause is supplied.
 If 'missing_ok' is true and the supplied name is not found, the patched
 get_object_address() returns an ObjectAddress with InvalidOid as objectId.
 If 'missing_ok' is false, its behavior is unchanged.

 Let's consistently make missing_ok the last argument to all of the
 functions to which we're adding it.

 OK. I revised the position of the 'missing_ok' argument.

 I don't think it's a good idea for get_relation_by_qualified_name() to
 be second-guessing the error message that RangeVarGetRelid() feels
 like throwing.

 OK. I revised the patch to pass 'true' to RangeVarGetRelid().
 Its side effect is on the error messages when the user gives an undefined object name.

 I think that attempting to fetch the column foo.bar when foo doesn't
 exist should be an error even if missing_ok is true.  I believe that's
 consistent with what we do elsewhere.  (If it *were* necessary to open
 the relation without failing if it's not there, you could use
 try_relation_openrv instead of doing as you've done here.)

 It was fixed. AlterTable() already opens the relation (without missing_ok)
 in the case where we drop a column, so I don't think we need to acquire
 relation locks if the supplied column is missing.

 There is certainly a more compact way of writing the logic in
 get_object_address_typeobj.  Also, I think that function should be
 called get_object_address_type(); the obj on the end seems
 redundant.

 I renamed the function to get_object_address_type(), and
 consolidated the initialization of the ObjectAddress variables.

 Thanks,
 --
 KaiGai Kohei kai...@kaigai.gr.jp




-- 
KaiGai Kohei kai...@kaigai.gr.jp


pgsql-v9.2-drop-reworks-part-0.3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for 9.2: enhanced errors

2011-06-19 Thread Florian Pflug
On Jun19, 2011, at 05:10 , Steve Singer wrote:
 On 11-06-18 06:36 PM, Steve Singer wrote:
 On 11-06-08 04:14 PM, Pavel Stehule wrote:
 
 Here is my review of this patch
 
 Submission Review:
 
 The patch applies cleanly against master
 The patch does not include any documentation updates (see note below to 
 update config.sgml)
 The patch does not include any unit tests. At a minimum it should add a few 
 tests with verbosity set to verbose
 
 
 On second thought, tests might not work. Is the only way to get the constraint
 details in verbose mode, where line numbers from the C file are also
 included? If so then this won't work for the regression tests.   Having the
 diff comparison fail every time someone makes an unrelated change to a source
 file isn't what we want.

Speaking as someone who's wished for the feature that Pavel's patch provides
many times in the past - shouldn't there also be a field containing the
offending value? If we had that, it'd finally be possible to translate 
constraint-related error messages to informative messages for the user.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Don't use cp -i in the example WAL archive_command.

2011-06-19 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Perhaps we should recommend:
  cd /path && test ! -f %f && cp %p %f
  That is shorter and removes the duplicate problem.
 
 Um ... %p is relative to the database directory.

Oh, I see now.  I had thought it was an absolute path, but good thing it
isn't because of the possible need for quoting characters in the path
name.

The only other idea I have is:

NEW=/path && test ! -f $NEW/%f && cp %p $NEW/%f

but that is not going to work with csh-based shells, while I think the
original is fine.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for 9.2: enhanced errors

2011-06-19 Thread Pavel Stehule
2011/6/19 Florian Pflug f...@phlo.org:
 On Jun19, 2011, at 05:10 , Steve Singer wrote:
 On 11-06-18 06:36 PM, Steve Singer wrote:
 On 11-06-08 04:14 PM, Pavel Stehule wrote:

 Here is my review of this patch

 Submission Review:
 
 The patch applies cleanly against master
 The patch does not include any documentation updates (see note below to 
 update config.sgml)
 The patch does not include any unit tests. At a minimum it should add a few 
 tests with verbosity set to verbose


 On second thought, tests might not work. Is the only way to get the
 constraint details in verbose mode, where line numbers from the C file
 are also included? If so then this won't work for the regression tests.
 Having the diff comparison fail every time someone makes an unrelated change
 to a source file isn't what we want.

 Speaking as someone who's wished for the feature that Pavel's patch provides
 many times in the past - shouldn't there also be a field containing the
 offending value? If we had that, it'd finally be possible to translate
 constraint-related error messages to informative messages for the user.

The value is available in almost all cases. There is only one issue - it
should not be only one value - it could be a list of values - so the basic
question is about the format and the property name. PostgreSQL doesn't hold
a relation between a column and a column constraint - all column constraints
are transformed to table constraints. All column information is
derived from the constraint - so when the constraint is a < b and this
constraint is false, we have two values.

Maybe there is a second issue (a little bit of performance - you have to
call an output function). But I agree that this information is very
interesting and can help.

I am open for any ideas in this direction.

Regards

Pavel


 best regards,
 Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for 9.2: enhanced errors

2011-06-19 Thread Florian Pflug
On Jun19, 2011, at 14:03 , Pavel Stehule wrote:
 2011/6/19 Florian Pflug f...@phlo.org:
 Speaking as someone who's wished for the feature that Pavel's patch provides
 many times in the past - shouldn't there also be a field containing the
 offending value? If we had that, it'd finally be possible to translate
 constraint-related error messages to informative messages for the user.
 
 The value is available in almost all cases. There is only one issue - it
 should not be only one value - it could be a list of values - so the basic
 question is about the format and the property name. PostgreSQL doesn't hold
 a relation between a column and a column constraint - all column constraints
 are transformed to table constraints. All column information is
 derived from the constraint - so when the constraint is a < b and this
 constraint is false, we have two values.

Hm, you could rename COLUMN to VALUE, make it include the value,
and repeat it for every column in the constraint or index that caused
the error. For example, you'd get

VALUE: a:5
VALUE: b:3

if you violated a CHECK constraint asserting that a < b.

You could also use that in custom constraint enforcement triggers -
i.e. I'm maintaining an application that enforces foreign key
constraints for arrays. With VALUE fields available, I could emit
one value field for every offending array member.

If repeating the same field multiple times is undesirable, the
information could of course be packed into one field, giving

VALUES: (a:5, b:3)

for the example from above. My array FK constraint trigger would
then presumably report

VALUES: (array_field:42, array_field:23)

if array members 42 and 23 lacked a corresponding row in the
referenced table.

That'd also work for foreign keys and unique constraints. Exclusion
constraints are harder, because there the conflicting value might also
be of interest. (Hm, actually it might even be for unique indices if
some columns are NULL - not sure right now if there's a mode where we
treat NULL as a kind of wildcard...).

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Don't use cp -i in the example WAL archive_command.

2011-06-19 Thread Andrew Dunstan



On 06/19/2011 08:00 AM, Bruce Momjian wrote:

Tom Lane wrote:

Bruce Momjianbr...@momjian.us  writes:

Perhaps we should recommend:
cd /path && test ! -f %f && cp %p %f
That is shorter and removes the duplicate problem.

Um ... %p is relative to the database directory.

Oh, I see now.  I had thought it was an absolute path, but good thing it
isn't because of the possible need for quoting characters in the path
name.

The only other idea I have is:

NEW=/path && test ! -f $NEW/%f && cp %p $NEW/%f

but that is not going to work with csh-based shells, while I think the
original is fine.


Isn't this invoked via system()? AFAIK that should always invoke a 
Bourne shell or equivalent on Unix. But in any case, I think you're 
trying to solve a problem that doesn't really exist.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-19 Thread Greg Stark
On Tue, Jun 14, 2011 at 4:04 PM, Robert Haas robertmh...@gmail.com wrote:
 1. ANALYZE happens far too infrequently to believe that any data taken
 at ANALYZE time will still be relevant at execution time.
 2. Using data gathered by ANALYZE will make plans less stable, and our
 users complain not infrequently about the plan instability we already
 have, therefore we should not add more.
 3. Even if the data were accurate and did not cause plan instability, we
 have no evidence that using it will improve real-world performance.

I feel like this is all baseless FUD. ANALYZE isn't perfect, but it's
our interface for telling postgres to gather stats, and we generally
agree that having stats and modelling the system behaviour as
accurately as practical is the right direction. So we need a specific
reason why this stat and this bit of modeling is a bad idea before we
dismiss it.

I think the kernel of truth in these concerns is simply that
everything else ANALYZE looks at mutates only on DML. If you load the
same data into two databases and run ANALYZE you'll get (modulo random
sampling) the same stats. And if you never modify it and analyze it
again a week later you'll get the same stats again. So autovacuum can
guess when to run analyze based on the number of DML operations, it
can run it without regard to how busy the system is, and it can hold
off on running it if the data hasn't changed.

In the case of the filesystem buffer cache the cached percentage will
vary over time regardless of whether the data changes. Plain select
queries will change it, even other activity outside the database will
change it. There are a bunch of strategies for mitigating this
problem: we might want to look at the cache situation more frequently,
discount the results we see more aggressively, and possibly
maintain a kind of running average over time.
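
(One cheap shape for that running average is an exponentially weighted
average of successive observations; a sketch, where the weight is an
illustrative guess rather than a tested value:)

#include <stdbool.h>

/* Exponentially weighted running average of a cached-fraction estimate. */
typedef struct CachedFracEstimate
{
	double	avg;		/* smoothed cached fraction, 0..1 */
	bool	valid;		/* has any observation arrived yet? */
} CachedFracEstimate;

static void
cached_frac_observe(CachedFracEstimate *est, double observed)
{
	const double weight = 0.2;	/* how fast new observations dominate */

	if (!est->valid)
	{
		est->avg = observed;
		est->valid = true;
	}
	else
		est->avg = weight * observed + (1.0 - weight) * est->avg;
}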

There's another problem which I haven't seen mentioned. Because the
access method will affect the cache there's the possibility of
feedback loops. e.g. A freshly loaded system prefers sequential scans
for a given table because without the cache the seeks of random reads
are too expensive... causing it to never load that table into cache...
causing that table to never be cached and never switch to an index
method. It's possible there are mitigation strategies for this as well
such as keeping a running average over time and discounting the
estimates with some heuristic values.







-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [WIP] Support for ANY/ALL(array) op scalar (Was: Re: Boolean operators without commutators vs. ALL/ANY)

2011-06-19 Thread Greg Stark
On Thu, Jun 16, 2011 at 6:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  (I will refrain for
 the moment from speculating whether we'll ever have an index type that
 supports regexp match directly as an indexable operator...)

Fwiw I looked into this at one point and have some ideas if anyone is
keen to try it.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch for 9.2: enhanced errors

2011-06-19 Thread Pavel Stehule
2011/6/19 Florian Pflug f...@phlo.org:
 On Jun19, 2011, at 14:03 , Pavel Stehule wrote:
 2011/6/19 Florian Pflug f...@phlo.org:
 Speaking as someone who's wished for the feature that Pavel's patch provides
 many times in the past - shouldn't there also be a field containing the
 offending value? If we had that, it'd finally be possible to translate
 constraint-related error messages to informative messages for the user.

 The value is available in almost all cases. There is only one issue - it
 should not be only one value - it could be a list of values - so the basic
 question is about the format and the property name. PostgreSQL doesn't hold
 a relation between a column and a column constraint - all column constraints
 are transformed to table constraints. All column information is
 derived from the constraint - so when the constraint is a < b and this
 constraint is false, we have two values.

 Hm, you could rename COLUMN to VALUE, make it include the value,
 and repeat it for every column in the constraint or index that caused
 the error. For example, you'd get

 VALUE: a:5
 VALUE: b:3


I am not sure about this idea. These data should be available via the GET
DIAGNOSTICS statement, so you can't use repeated properties. I would
like to use simple access to column names because it is in ANSI SQL.


 if you violated a CHECK constraint asserting that a  b.

 You could also use that in custom constraint enforcement triggers -
 i.e. I'm maintaining an application that enforces foreign key
 constraints for arrays. With VALUE fields available, I could emit
 one value field for every offending array member.

 If repeating the same field multiple times is undesirable, the
 information could of course be packed into one field, giving

 VALUES: (a:5, b:3)

 for the example from above. My array FK constraint trigger would
 the presumably report

 VALUES: (array_field:42, array_field:23)


There could be something similar, but probably we need to have some
dictionary type in core first. If we are in too much of a hurry, then we
can have a problem with backward compatibility :(. Theoretically we have
known columns in the COLUMNS property, so we can serialize values in the
same order in a serialized array format.

COLUMNS: a, b, c
VALUES: some, else, some with \ or , 

Regards

Pavel


 if array members 42 and 23 lacked a corresponding row in the
 referenced table.

 That'd also work work for foreign keys and unique constraints. Exclusion
 constraints are harder, because there the conflicting value might also
 be of interest. (Hm, actually it might even be for unique indices if
 some columns are NULL - not sure right now if there's a mode where we
 treat NULL as a kind of wildcard...).

 best regards,
 Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Adding a distinct pattern type to resolve the ~ commutator stalemate

2011-06-19 Thread Florian Pflug
Hi

It looks like we've failed to reach an agreement on how to
proceed on the issue with missing commutators for the various
text matching operators (~, ~~, and their case-insensitive
variants). We do seem to have agreed, however, that adding
commutators for the non-deprecated operators which lack them
is generally a Good Idea.

Amidst the discussion, Alvaro suggested that we resolve the issue
by adding a distinct type for patterns as opposed to text. That'd
allow us to make ~ its own commutator by defining both
  text ~ pattern
and
  pattern ~ text.

We'd of course need to keep the operator
  text ~ text
and make it behave like
  text ~ pattern.
Thus, if someone wrote
  'a_pattern' ~ 'some_text'
(i.e. forgot to cast 'a_pattern' to type pattern), he wouldn't
get an error but instead unintended behaviour. If we want to avoid
that too, we'd have to name the new operators something other than
~.

There's also the question of how we deal with ~~ (the operator
behind LIKE). We could either re-use the type pattern for that,
meaning that values of type pattern would represent any kind of
text pattern, not necessarily a regular expression. Alternatively,
we could represent LIKE pattern by a type distinct from pattern,
say likepattern. Finally, we could handle LIKE like we handle
SIMILAR TO, i.e. define a function that transforms a LIKE pattern
into a regular expression, and deprecate the ~~ operator and friends.

The last option looks appealing from a code complexity point of view,
but might severely harm performance of LIKE and ILIKE comparisons.
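
(To make the trade-off concrete, here is a minimal standalone sketch of such
a LIKE-to-regex rewrite; the escaping rules are simplified, with no ESCAPE
clause support, and this is not the backend's actual algorithm:)

#include <stdio.h>
#include <string.h>

/* Rewrite a LIKE pattern as an anchored POSIX regular expression. */
static void
like_to_regex(const char *like, char *out, size_t outsz)
{
	size_t	n = 0;

	out[n++] = '^';
	for (; *like != '\0' && n + 3 < outsz; like++)
	{
		if (*like == '%')
		{
			out[n++] = '.';
			out[n++] = '*';
		}
		else if (*like == '_')
			out[n++] = '.';
		else if (strchr("\\^$.|?*+()[]{}", *like) != NULL)
		{
			out[n++] = '\\';	/* escape regex metacharacters */
			out[n++] = *like;
		}
		else
			out[n++] = *like;
	}
	out[n++] = '$';
	out[n] = '\0';
}

int
main(void)
{
	char	buf[128];

	like_to_regex("abc%d_f", buf, sizeof buf);
	printf("%s\n", buf);		/* prints ^abc.*d.f$ */
	return 0;
}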

Comments? Opinions?

best regards,
Florian Pflug





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-19 Thread Cédric Villemain
2011/6/19 Greg Stark st...@mit.edu:
 On Tue, Jun 14, 2011 at 4:04 PM, Robert Haas robertmh...@gmail.com wrote:
 1. ANALYZE happens far too infrequently to believe that any data taken
 at ANALYZE time will still be relevant at execution time.
 2. Using data gathered by ANALYZE will make plans less stable, and our
 users complain not infrequently about the plan instability we already
 have, therefore we should not add more.
 3. Even if the data were accurate and did not cause plan instability, we
 have no evidence that using it will improve real-world performance.

 I feel like this is all baseless FUD. ANALYZE isn't perfect, but it's
 our interface for telling postgres to gather stats, and we generally
 agree that having stats and modelling the system behaviour as
 accurately as practical is the right direction. So we need a specific
 reason why this stat and this bit of modeling is a bad idea before we
 dismiss it.

 I think the kernel of truth in these concerns is simply that
 everything else ANALYZE looks at mutates only on DML. If you load the
 same data into two databases and run ANALYZE you'll get (modulo random
 sampling) the same stats. And if you never modify it and analyze it
 again a week later you'll get the same stats again. So autovacuum can
 guess when to run analyze based on the number of DML operations, it
 can run it without regard to how busy the system is, and it can hold
 off on running it if the data hasn't changed.

 In the case of the filesystem buffer cache the cached percentage will
 vary over time regardless of whether the data changes. Plain select
 queries will change it, even other activity outside the database will
 change it. There are a bunch of strategies for mitigating this
 problem: we might want to look at the cache situation more frequently,
 discount the results we see more aggressively, and possibly
 maintain a kind of running average over time.

Yes.


 There's another problem which I haven't seen mentioned. Because the
 access method will affect the cache there's the possibility of
 feedback loops. e.g. A freshly loaded system prefers sequential scans
 for a given table because without the cache the seeks of random reads
 are too expensive... causing it to never load that table into cache...
 causing that table to never be cached and never switch to an index
 method. It's possible there are mitigation strategies for this as well

Yeah, that's one of the problems to solve. So far I've tried to keep a
planner which behaves as it currently does when rel_oscache == 0, so
that a fresh server will have the same planning as a server without
rel_oscache.

Those points are to be solved in the cost estimates (and selfuncs). For
this case, there is a balance between page-filtering cost and index access
cost. *And* once the table is in cache, the index costs less and can
be better because it needs less filtering (fewer rows, fewer pages, less
work). There is also a possible issue here (if using the index removes
the table from cache) but I am not too afraid of that right now.

 such as keeping a running average over time and discounting the
 estimates with some heuristic values.

Yes, definitely something to think about. My biggest fear here is
for shared servers (when there is competition between services for
the OS cache, defeating kernel cache strategies).

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: add GiST support for BOX @> POINT queries

2011-06-19 Thread Hitoshi Harada
2011/6/17 Andrew Tipton andrew.t.tip...@gmail.com:
 On Fri, Jun 10, 2011 at 22:16, Hitoshi Harada umi.tan...@gmail.com wrote:

 Isn't it worth adding a new consistent function for those purposes? The
 approach in the patch as it stands looks like a kludge to me.

 Thanks for your review.  Coming back to this patch after a few months'
 time, I have to say it looks pretty hackish to my eyes as well. :)

 I've attempted to add a new consistent function,
 gist_boxpoint_consistent(), but the GiST subsystem doesn't call it --
 it continues to call gist_box_consistent().  My very simple testcase
 is:

 CREATE TABLE test (key TEXT PRIMARY KEY, boundary BOX NOT NULL);
 CREATE INDEX ON test USING gist (boundary);
 INSERT INTO test VALUES ('a', '(2,2,5,5)'), ('b', '(4,4,8,8)'), ('c',
 '(7,7,11,11)');
 SELECT * FROM test WHERE boundary @> '(4,4)'::POINT;

 Prior to my patch, this query is executed as a straightforward seqscan.

 Once I add a new strategy to pg_amop.h:
 + DATA(insert ( 2593   603 600 7 s  433 783 0 ));

 (603 is the BOX oid, 600 is the POINT oid, and 433 is the @> operator oid):
 ...the plan switches to an index scan and gist_box_consistent() is
 called;  at this point, the query fails to return the correct results.

 But even after adding the new consistent proc to pg_proc.h:
 + DATA(insert OID = 8000 (  gist_boxpoint_consistent    PGNSP PGUID 12
 1 0 0 f f f t f i 5 0 16 "2281 600 23 26 2281" _null_ _null_ _null_
 _null_   gist_boxpoint_consistent _null_ _null_ _null_ ));

 And adding it as a new support function in pg_amproc.h:
 + DATA(insert ( 2593   603 600 1 8000 ));
 + DATA(insert ( 2593   603 600 2 2583 ));
 + DATA(insert ( 2593   603 600 3 2579 ));
 + DATA(insert ( 2593   603 600 4 2580 ));
 + DATA(insert ( 2593   603 600 5 2581 ));
 + DATA(insert ( 2593   603 600 6 2582 ));
 + DATA(insert ( 2593   603 600 7 2584 ));

 ...my gist_boxpoint_consistent() function still doesn't get called.

 At this point I'm a bit lost -- while pg_amop.h has plenty of examples
 of crosstype comparison operators for btree index methods, there are
 none for GiST.  Is GiST somehow a special case in this regard?

It was I who was lost. As Tom mentioned, GiST indexes have records in
pg_amop in their own specialized way. I found that gist_point_consistent
has some kind of hack for that, and the pg_amop records for point_ops have
multiple crosstypes for that. So, if I understand correctly, your first
approach of modifying gist_box_consistent was the right way, although the
trivial issues should be fixed. Also, you may want to follow point_ops
if you are worried about whether the commutator's counterpart operator
should be registered or not.

Looking around those mechanisms, it occurred to me that you mentioned
only box @> point. Why don't you add circle @> point and polygon @> point
as well? Is that hard?

Regards,

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] the big picture for index-only scans

2011-06-19 Thread Cédric Villemain
2011/5/14 Robert Haas robertmh...@gmail.com:
 On Fri, May 13, 2011 at 6:34 PM, Cédric Villemain
 cedric.villemain.deb...@gmail.com wrote:
 If reviewers agree it is safe and benchmarks make evidences that no
 basic performance  issue are raised, then let's see if next items have
 blockers or can be done.

 Sounds right to me.

and recent stuff on the VM will allow us to move forward, it seems!

-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] the big picture for index-only scans

2011-06-19 Thread Cédric Villemain
2011/5/11 Bruce Momjian br...@momjian.us:
 Cédric Villemain wrote:
 2011/5/10 Kevin Grittner kevin.gritt...@wicourts.gov:
  Simon Riggs si...@2ndquadrant.com wrote:
  The typical speed up for non-covered indexes will come when we
  access a very large table (not in cache) via an index scan that is
  smaller than a bitmapindex scan. Will we be able to gauge
  selectivities sufficiently accurately to be able to pinpoint that
  during optimization? How will we know that the table is not in
  cache? Or is this an optimisation in the executor for a bitmapheap
  scan?
 
  I would continue to object to using current cache contents for plan
  choice because of plan instability and the fact that an odd initial
  cache load could skew plans in a bad direction indefinitely. ?I do
  agree (and have already posted) that I think the hardest part of
  this might be developing a good cost model. ?I doubt that's an
  insoluble problem, especially since it is something we can refine
  over time as we gain experience with the edge cases.

 you will have the same possible instability in planning with the
 index(-only?) scan, because we may need to access the heap anyway and this
 need is based on estimation - or am I missing something? I understood the
 idea was just to bypass the heap access *if* we can for *this*
 heap page.

 In reality, I am not really scared by plan instability caused by a
 possible PG/OS cache estimation. The percentages remain stable in my
 observations ... I don't know yet how it will go for the vis map.

 And we already have plan instability currently, which is *good*: at
 some point a seq scan is better than a bitmap heap scan, because the
 relation size changes and because ANALYZE re-estimates the distribution
 of the data. I will be very happy to issue ANALYZE CACHE, as I have to
 ANALYZE temp tables for large queries, if it allows the planner to provide
 me the best plan in some scenario... but this is another topic, sorry
 for the digression.

 Good point --- we would be making plan decisions based on the visibility
 map coverage.  The big question is whether visibility map changes are
 more dynamic than the values we already plan against, like rows in the
 table, table size, and value distributions.  I don't know the answer.


Robert, I thought of a covered index as just a usage of the vis map:
don't fetch the heap block if it is not needed. This looks easier to do
and better in the long term (because a read-only DB may quickly turn into
a no-heap-access DB, for example). Thus this is not a real covered index.
Did you want to implement real covered indexes, and did you have ideas on
how to do that? Or just an optimization of the current
planner/executor on index usage?
___

I don't know the VM internals:

 * do we have a counter of pages with the ALL_VISIBLE flag set for a
relation? (this would be very good for the planner)
 * do we need a pg_class.rel_vmvisible?! (I have my hands up, don't
shoot pleeaase)
 * is it OK to parse the VM during planning (I believe it is not)?

Ideas?

-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: Allow \dd to show constraint comments

2011-06-19 Thread Josh Kupershmidt
On Sun, Jun 19, 2011 at 3:25 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 2011/6/18 Josh Kupershmidt schmi...@gmail.com:
 I think the v5 patch should be marked as 'Ready for Committer'

I think we still need to handle my "Still TODO" concerns noted
upthread. I don't have a lot of time this weekend due to a family
event, but I was mulling over putting an is_system boolean column
into pg_comments and then fixing psql's \dd[S] to use pg_comments. But
I am of course open to other suggestions.

Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Make external_pid_file world readable

2011-06-19 Thread David Fetter
On Sat, Jun 18, 2011 at 10:18:50PM +, Peter Eisentraut wrote:
 Make external_pid_file world readable

Should this be back-patched?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI tuning points

2011-06-19 Thread Kevin Grittner
Robert Haas  wrote:
 Kevin Grittner  wrote:
 
 I'm certainly open to suggestions for better wording.
 
 How about something like this:
 
 When the system is forced to combine multiple page-level predicate
 locks into a single relation-level predicate lock because the
 predicate lock table is short of memory, an increase in the rate of
 serialization failures may occur. You can avoid this by increasing
 max_pred_locks_per_transaction.
 
 A sequential scan will always necessitate a table-level predicate
 lock. This can result in an increased rate of serialization failures.
 It may be helpful to encourage the use of index scans by reducing
 random_page_cost or increasing cpu_tuple_cost. Be sure to 
 
That does seem better.  Thanks.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Libpq enhancement

2011-06-19 Thread Jeff Shanab
I am wondering if I am missing something obvious. If not, I have a suggestion 
for plpgsql.

Stored procedures can accept rows.
Libpq can receive rows (PQResult).

Wouldn't it be a great interface if PQResult was "bi-directional"? Create a 
result set on the client then call the database with a command.

Perhaps...
PQinsert(PQResult, "schema.table");  //iterate thru rows, inserting
PQupdate(PQResult, "schema.table");  //iterate thru rows, updating

PQexec(connection, "schema.function", PQResult)  //iterate thru rows, passing each row 
as an arg to the stored procedure.
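
Until something like that exists, the effect of the proposed PQinsert can be
approximated client-side with plain libpq calls (the proposal's PQResult is
libpq's PGresult). A rough sketch -- the table name and two-column shape are
invented for illustration; real code would build the statement from
PQnfields()/PQfname() and handle NULLs with PQgetisnull():

#include <stdio.h>
#include <libpq-fe.h>

/* Insert every row of "res" into a hypothetical two-column table. */
static int
insert_result_rows(PGconn *conn, PGresult *res)
{
	int			row;

	for (row = 0; row < PQntuples(res); row++)
	{
		const char *values[2];
		PGresult   *ins;

		values[0] = PQgetvalue(res, row, 0);
		values[1] = PQgetvalue(res, row, 1);

		ins = PQexecParams(conn,
						   "INSERT INTO myschema.mytable VALUES ($1, $2)",
						   2, NULL, values, NULL, NULL, 0);
		if (PQresultStatus(ins) != PGRES_COMMAND_OK)
		{
			fprintf(stderr, "insert failed: %s", PQerrorMessage(conn));
			PQclear(ins);
			return -1;
		}
		PQclear(ins);
	}
	return 0;
}

Wrapping the loop in a single transaction (or switching to COPY) would be
needed to make this perform acceptably for large result sets.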



Re: [HACKERS] Latch implementation that wakes on postmaster death on both win32 and Unix

2011-06-19 Thread Peter Geoghegan
I took another look at this this evening, and realised that my
comments could be a little clearer.

Attached revision cleans them up a bit.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index aa0b029..691ac42 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -10161,7 +10161,7 @@ retry:
 	/*
 	 * Wait for more WAL to arrive, or timeout to be reached
 	 */
-	WaitLatch(&XLogCtl->recoveryWakeupLatch, 5000000L);
+	WaitLatch(&XLogCtl->recoveryWakeupLatch, WL_LATCH_SET | WL_TIMEOUT, 5000000L);
 	ResetLatch(&XLogCtl->recoveryWakeupLatch);
 }
 else
diff --git a/src/backend/port/unix_latch.c b/src/backend/port/unix_latch.c
index 6dae7c9..f65d389 100644
--- a/src/backend/port/unix_latch.c
+++ b/src/backend/port/unix_latch.c
@@ -93,7 +93,9 @@
 #endif
 
 #include "miscadmin.h"
+#include "postmaster/postmaster.h"
 #include "storage/latch.h"
+#include "storage/pmsignal.h"
 #include "storage/shmem.h"
 
 /* Are we currently in WaitLatch? The signal handler would like to know. */
@@ -188,22 +190,25 @@ DisownLatch(volatile Latch *latch)
  * backend-local latch initialized with InitLatch, or a shared latch
  * associated with the current process by calling OwnLatch.
  *
- * Returns 'true' if the latch was set, or 'false' if timeout was reached.
+ * Returns bit field indicating which condition(s) caused the wake-up.
+ *
+ * Note that there is no guarantee that callers will have all wake-up conditions
+ * returned, but we will report at least one.
  */
-bool
-WaitLatch(volatile Latch *latch, long timeout)
+int
+WaitLatch(volatile Latch *latch, int wakeEvents, long timeout)
 {
-	return WaitLatchOrSocket(latch, PGINVALID_SOCKET, false, false, timeout) > 0;
+	return WaitLatchOrSocket(latch, wakeEvents, PGINVALID_SOCKET, timeout);
 }
 
 /*
  * Like WaitLatch, but will also return when there's data available in
- * 'sock' for reading or writing. Returns 0 if timeout was reached,
- * 1 if the latch was set, 2 if the socket became readable or writable.
+ * 'sock' for reading or writing.
+ *
+ * Returns same bit mask and makes same guarantees as WaitLatch.
  */
 int
-WaitLatchOrSocket(volatile Latch *latch, pgsocket sock, bool forRead,
-				  bool forWrite, long timeout)
+WaitLatchOrSocket(volatile Latch *latch, int wakeEvents, pgsocket sock, long timeout)
 {
 	struct timeval tv,
 			   *tvp = NULL;
@@ -211,12 +216,13 @@ WaitLatchOrSocket(volatile Latch *latch, pgsocket sock, bool forRead,
 	fd_set		output_mask;
 	int			rc;
 	int			result = 0;
+	bool		found = false;
 
 	if (latch->owner_pid != MyProcPid)
 		elog(ERROR, "cannot wait on a latch owned by another process");
 
 	/* Initialize timeout */
-	if (timeout >= 0)
+	if (timeout >= 0 && (wakeEvents & WL_TIMEOUT))
 	{
 		tv.tv_sec = timeout / 1000000L;
 		tv.tv_usec = timeout % 1000000L;
@@ -224,7 +230,7 @@ WaitLatchOrSocket(volatile Latch *latch, pgsocket sock, bool forRead,
 	}
 
 	waiting = true;
-	for (;;)
+	do
 	{
 		int			hifd;
 
@@ -235,16 +241,30 @@ WaitLatchOrSocket(volatile Latch *latch, pgsocket sock, bool forRead,
 		 * do that), and the select() will return immediately.
 		 */
 		drainSelfPipe();
-		if (latch->is_set)
+		if (latch->is_set && (wakeEvents & WL_LATCH_SET))
 		{
-			result = 1;
+			result |= WL_LATCH_SET;
+			found = true;
+			/* Leave loop immediately, avoid blocking again.
+			 *
+			 * Don't attempt to report any other reason
+			 * for returning to callers that may have
+			 * happened to coincide.
+			 */
 			break;
 		}
 
 		FD_ZERO(&input_mask);
 		FD_SET(selfpipe_readfd, &input_mask);
+
+		if (wakeEvents & WL_POSTMASTER_DEATH)
+		{
+			FD_SET(postmaster_alive_fds[POSTMASTER_FD_WATCH], &input_mask);
+			if (postmaster_alive_fds[POSTMASTER_FD_WATCH] > hifd)
+				hifd = postmaster_alive_fds[POSTMASTER_FD_WATCH];
+		}
 		hifd = selfpipe_readfd;
-		if (sock != PGINVALID_SOCKET && forRead)
+		if (sock != PGINVALID_SOCKET && (wakeEvents & WL_SOCKET_READABLE))
 		{
 			FD_SET(sock, &input_mask);
 			if (sock > hifd)
@@ -252,7 +272,7 @@ WaitLatchOrSocket(volatile Latch *latch, pgsocket sock, bool forRead,
 		}
 
 		FD_ZERO(&output_mask);
-		if (sock != PGINVALID_SOCKET && forWrite)
+		if (sock != PGINVALID_SOCKET && (wakeEvents & WL_SOCKET_WRITEABLE))
 		{
 			FD_SET(sock, &output_mask);
 			if (sock > hifd)
@@ -268,20 +288,35 @@ WaitLatchOrSocket(volatile Latch *latch, pgsocket sock, bool forRead,
 					(errcode_for_socket_access(),
 					 errmsg("select() failed: %m")));
 		}
-		if (rc == 0)
+		if (rc == 0 && (wakeEvents & WL_TIMEOUT))
 		{
 			/* timeout exceeded */
-			result = 0;
-			break;
+			result |= WL_TIMEOUT;
+			found = true;
 		}
-		if (sock != PGINVALID_SOCKET &&
-			((forRead && FD_ISSET(sock, &input_mask)) ||
-			 (forWrite && FD_ISSET(sock, &output_mask))))
+		if (sock != PGINVALID_SOCKET)
 		{
-			result = 2;
-			break;				/* data available in socket */
+			if ((wakeEvents & 
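
To make the new interface concrete, a minimal sketch of a caller using the
revised WaitLatch, assuming the WL_* flags and the microsecond timeout
convention from the hunks above ('latch' stands for a latch this process
owns; this is not part of the patch itself):

	int			rc;

	/* wait up to 5 seconds, also noticing latch sets and postmaster death */
	rc = WaitLatch(latch, WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
				   5000000L);

	if (rc & WL_POSTMASTER_DEATH)
		proc_exit(1);		/* postmaster is gone, give up */
	if (rc & WL_LATCH_SET)
		ResetLatch(latch);	/* clear it before rechecking the wait condition */
	/* on WL_TIMEOUT there is nothing to reset; just loop around */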

Re: [HACKERS] Libpq enhancement

2011-06-19 Thread Andrew Chernow

On 6/19/2011 11:04 AM, Jeff Shanab wrote:

I am wondering if I am missing something obvious. If not, I have a suggestion
for plpgsql.

Stored procedures can accept rows.

Libpq can receive rows (PQResult).

Wouldn’t it be a great interface if PQResult was “bi-directional”? Create a
result set on the client then call the database with a command.

Perhaps…

PQinsert(PQResult,”schema.table”); //iterate thru rows inserting

PQupdate(PQResult,”schema.table”); //iterate thru rows updating

PQexec(connection,”schema.function”,PQResult) //iterate thru rows passing row as
arg to stored procedure.



Have you looked into libpqtypes?  It allows you to pack nested structures/arrays 
and pass them as query/function parameters.


http://pgfoundry.org/projects/libpqtypes/
http://libpqtypes.esilo.com/ (docs)

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] heap_hot_search_buffer refactoring

2011-06-19 Thread Jeff Davis
On Mon, 2011-06-06 at 14:03 -0400, Robert Haas wrote:
 The attached patch refactors heap_hot_search_buffer() so that
 index_getnext() can use it, and modifies index_getnext() to do so.

Attached is a version of the patch that applies cleanly to master.

Regards,
Jeff Davis


heap-hot-search-buffer.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] heap_hot_search_buffer refactoring

2011-06-19 Thread Jeff Davis
On Sun, 2011-06-19 at 10:50 -0700, Jeff Davis wrote:
 On Mon, 2011-06-06 at 14:03 -0400, Robert Haas wrote:
  The attached patch refactors heap_hot_search_buffer() so that
  index_getnext() can use it, and modifies index_getnext() to do so.
 
 Attached is a version of the patch that applies cleanly to master.

In heap_hot_search_buffer:

  +   /* If this is not the first call, previous call returned
 a (live!) tuple */
  if (all_dead)
  -   *all_dead = true;
  +   *all_dead = !first_call;

I think that's a typo: it should be:

  +   *all_dead = first_call;

Right?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types and extensions

2011-06-19 Thread Jeff Davis
On Sun, 2011-06-19 at 12:24 +0200, Martijn van Oosterhout wrote:
 Collation checking is generally done by the planner. I don't see why
 the input function should check, the result of an input function is by
 definition DEFAULT. It's up to the 'in' operator to check.
 
 Note that the whole idea of collation is not really supposed to be
 assigned to object for storage.  How that can be resolved I'm not sure.

I think if we just say that it's a property of the range type
definition, then that's OK. It's similar to specifying a non-default
btree opclass for the range type -- it just changes which total order
the range type adheres to.

If you meant that the collation shouldn't be stored along with the value
itself, then I agree.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] heap_hot_search_buffer refactoring

2011-06-19 Thread Robert Haas
On Sun, Jun 19, 2011 at 2:01 PM, Jeff Davis pg...@j-davis.com wrote:
 On Sun, 2011-06-19 at 10:50 -0700, Jeff Davis wrote:
 On Mon, 2011-06-06 at 14:03 -0400, Robert Haas wrote:
  The attached patch refactors heap_hot_search_buffer() so that
  index_getnext() can use it, and modifies index_getnext() to do so.

 Attached is a version of the patch that applies cleanly to master.

 In heap_hot_search_buffer:

  +       /* If this is not the first call, previous call returned
             a (live!) tuple */
          if (all_dead)
  -               *all_dead = true;
  +               *all_dead = !first_call;

 I think that's a typo: it should be:

  +               *all_dead = first_call;

Yikes.  I think you are right.  It's kind of scary that the regression
tests passed with that mistake.

New patch attached, with that one-line change.
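
Spelled out, the corrected exit path reads like this (a sketch using the
names from the quoted hunk, not the full function):

	/*
	 * We fell off the end of the chain without returning a tuple.  If this
	 * was the first call for this index entry, no live tuple was ever
	 * returned, so the chain really is all dead.  On any later call, the
	 * previous call must have returned a live tuple, so the chain must not
	 * be reported as all dead.
	 */
	if (all_dead)
		*all_dead = first_call;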

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


heap-hot-search-refactoring-v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] the big picture for index-only scans

2011-06-19 Thread Robert Haas
On Sun, Jun 19, 2011 at 10:44 AM, Cédric Villemain
cedric.villemain.deb...@gmail.com wrote:
 and recent stuff on VM will allow us to move forward it seems !

Yep, looks promising.  The heap_hot_search_buffer refactoring patch is
related as well.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] heap_hot_search_buffer refactoring

2011-06-19 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Yikes.  I think you are right.  It's kind of scary that the regression
 tests passed with that mistake.

Can we add a test that exposes that mistake?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] the big picture for index-only scans

2011-06-19 Thread Robert Haas
On Sun, Jun 19, 2011 at 11:12 AM, Cédric Villemain
cedric.villemain.deb...@gmail.com wrote:
 Good point --- we would be making plan decisions based on the visibility
 map coverage.  The big question is whether visibility map changes are
 more dynamic than the values we already plan against, like rows in the
 table, table size, and value distributions.  I don't know the answer.

 Robert, I thought of Covered-Index as just a usage of the vis map:
 don't take the heap block if not needed. This looks easier to do and
 better in the long term (because a read-only DB may quickly turn into a
 no-heap-access DB, for example). Thus this is not a real covered index.
 Did you want to implement a real covered index, and did you have ideas on
 how to do that? Or just an optimization of the current
 planner/executor on index usage?

If by a real covered index you mean one that includes visibility
info in the index - I have no plans to work on anything like that.  If
we were to do that, the index would become much larger and less
efficient, whereas the approach of just optimizing the way our
existing indexes are used doesn't have that disadvantage.  It also
sounds like a lot of work. Now, if someone else wants to demonstrate
that it has advantages that are worth the costs and go do it, more
power to said person, but I'm unexcited about it.

 I don't know VM internals:

  * do we have a counter of ALL_VISIBLE flag set on a relation ? (this
 should be very good for planner)
  * do we need a pg_class.rel_vmvisible ?! (I have hands up, don't
 shoot pleeaase)

Evidently I'm developing a more frightening reputation than I would hope.  :-(

Anyway, yes, I do believe we need a table-level statistic for the
percentage of the visibility map bits that are believed to be set.
Having said that I think we need it, let me also say that I'm a bit
skeptical about how well it will work.  There are two problems:

1. Consider a query like SELECT a, b FROM foo WHERE a = 1.  To
accurately estimate the cost of executing this query via an index-only
scan (on an index over foo (a, b)), we need to know (i) the percentage
of rows in the table for which a = 1 and (ii) the percentage *of those
rows* which are on an all-visible page.  We can assume that if 80% of
the rows in the table are on all-visible pages, then 80% of the rows
returned by this query will be on all-visible pages also, but that
might be wildly wrong.  This is similar to the problem of costing
SELECT * FROM foo WHERE a = 1 AND b = 1 - we know the fraction of
rows where a = 1 and the fraction where b = 1, but there's no
certainty that multiplying those values will produce an accurate
estimate for the conjunction of those conditions.  The problem here is
not as bad as the general multi-column statistics problem because a
mistake will only bollix the cost, not the row count estimate, but
it's still not very nice.

2. Since VACUUM and ANALYZE often run together, we will be estimating
the percentage of rows on all-visible pages just at the time when that
percentage is highest.  This is not exactly wonderful, either...

I have a fair amount of hope that even with these problems we can come
up with some adjustment to the planner that is better than just
ignoring the problem, but I am not sure how difficult it will be.
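
To make problem 1 concrete, the independence assumption amounts to this
back-of-the-envelope arithmetic -- variable names invented for illustration,
not planner code:

	double		frac_match = 0.10;		/* rows with a = 1, per pg_statistic */
	double		frac_visible = 0.80;	/* heap fraction believed all-visible */

	/* matching rows servable from the index alone, if independence holds */
	double		est_index_only = frac_match * frac_visible;		/* 0.08 */

	/* matching rows that still need a heap fetch */
	double		est_heap_fetch = frac_match * (1.0 - frac_visible);	/* 0.02 */

If the a = 1 rows cluster on recently-modified pages, the true heap-fetch
fraction could approach 100% of the matches, which is exactly the hazard
described above.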

  * is it ok to parse VM for planning (I believe it is not) ?

It doesn't seem like a good idea to me, but I just work here.  I'm not
sure what that would buy us.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] heap_hot_search_buffer refactoring

2011-06-19 Thread Robert Haas
On Sun, Jun 19, 2011 at 2:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Yikes.  I think you are right.  It's kind of scary that the regression
 tests passed with that mistake.

 Can we add a test that exposes that mistake?

Not sure.  We'd have to figure out how to reliably tickle it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-19 Thread Robert Haas
On Sun, Jun 19, 2011 at 9:38 AM, Greg Stark st...@mit.edu wrote:
 On Tue, Jun 14, 2011 at 4:04 PM, Robert Haas robertmh...@gmail.com wrote:
 1. ANALYZE happens far too infrequently to believe that any data taken
 at ANALYZE time will still be relevant at execution time.
 2. Using data gathered by ANALYZE will make plans less stable, and our
 users complain not infrequently about the plan instability we already
 have, therefore we should not add more.
 3. Even if the data were accurate and did not cause plan instability, we
 have no evidence that using it will improve real-world performance.

 I feel like this is all baseless FUD. ANALYZE isn't perfect but it's
 our interface for telling postgres to gather stats and we generally
 agree that having stats and modelling the system behaviour as
 accurately as practical is the right direction so we need a specific
 reason why this stat and this bit of modeling is a bad idea before we
 dismiss it.

 I think the kernel of truth in these concerns is simply that
 everything else ANALYZE looks at mutates only on DML. If you load the
 same data into two databases and run ANALYZE you'll get (modulo random
 sampling) the same stats. And if you never modify it and analyze it
 again a week later you'll get the same stats again. So autovacuum can
 guess when to run analyze based on the number of DML operations, it
 can run it without regard to how busy the system is, and it can hold
 off on running it if the data hasn't changed.

 In the case of the filesystem buffer cache the cached percentage will
 vary over time regardless of whether the data changes. Plain select
 queries will change it, even other activity outside the database will
 change it. There are a bunch of strategies for mitigating this
 problem: we might want to look at the cache situation more frequently,
 discount the results we see more aggressively, and possibly
 maintain a kind of running average over time.

 There's another problem which I haven't seen mentioned. Because the
 access method will affect the cache there's the possibility of
 feedback loops. e.g. A freshly loaded system prefers sequential scans
 for a given table because without the cache the seeks of random reads
 are too expensive... causing it to never load that table into cache...
 causing that table to never be cached and never switch to an index
 method. It's possible there are mitigation strategies for this as well
 such as keeping a running average over time and discounting the
 estimates with some heuristic values.

*scratches head*

Well, yeah.  I completely agree with you that these are the things we
need to worry about.  Maybe I did a bad job explaining myself, because
ISTM you said my concerns were FUD and then went on to restate them in
different words.

I'm not bent out of shape about using ANALYZE to try to gather the
information.  That's probably a reasonable approach if it turns out we
actually need to do it at all.  I am not sure we do.  What I've argued
for in the past is that we start by estimating the percentage of the
relation that will be cached based on its size relative to
effective_cache_size, and allow the administrator to override the
percentage on a per-relation basis if it turns out to be wrong.  That
would avoid all of these concerns and allow us to focus on the issue
of how the caching percentages impact the choice of plan, and whether
the plans that pop out are in fact better when you provide information
on caching as input.  If we have that facility in core, then people
can write scripts or plug-in modules to do ALTER TABLE .. SET
(caching_percentage = XYZ) every hour or so based on the sorts of
statistics that Cedric is gathering here, and users will be able to
experiment with a variety of algorithms and determine which ones work
the best.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding a distinct pattern type to resolve the ~ commutator stalemate

2011-06-19 Thread Robert Haas
On Sun, Jun 19, 2011 at 9:53 AM, Florian Pflug f...@phlo.org wrote:
 Amidst the discussion, Alvaro suggested that we resolve the issue
 by adding a distinct type for patterns as opposed to text. That'd
 allow us to make ~ it's own commutator by defining both
  text ~ pattern
 and
  pattern ~ text.

That's kind of a neat idea.  There might be an efficiency benefit to
having a regex type that is precompiled by the input function.

 There's also the question of how we deal with ~~ (the operator
 behind LIKE). We could either re-use the type pattern for that,
 meaning that values of type pattern would represent any kind of
 text pattern, not necessarily a regular expression. Alternatively,
 we could represent LIKE pattern by a type distinct from pattern,
 say likepattern. Finally, we could handle LIKE like we handle
 SIMILAR TO, i.e. define a function that transforms a LIKE pattern
 into a regular expression, and deprecate the ~~ operator and friends.

 The last option looks appealing from a code complexity point of view,
 but might severely harm performance of LIKE and ILIKE comparisons.

I don't believe it would be a very good idea to try to shoehorn
multiple kinds of patterns into a single pattern type.

I do think this may be the long route to solving this problem, though.
 Is it really this hard to agree on a commutator name?  I mean, I'm
not in love with anything that's been suggested so far, but I could
live with any of them.  An unintuitive operator name for
matches-with-the-arguments-reversed is not going to be the worst wart
we have, by a long shot...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] the big picture for index-only scans

2011-06-19 Thread Cédric Villemain
2011/6/19 Robert Haas robertmh...@gmail.com:
 On Sun, Jun 19, 2011 at 11:12 AM, Cédric Villemain
 cedric.villemain.deb...@gmail.com wrote:
 Good point --- we would be making plan decisions based on the visibility
 map coverage.  The big question is whether visibility map changes are
 more dynamic than the values we already plan against, like rows in the
 table, table size, and value distributions.  I don't know the answer.

 Robert, I thought of Covered-Index as just a usage of the vis map:
 don't take the heap block if not needed. This looks easier to do and
 better in the long term (because a read-only DB may quickly turn into a
 no-heap-access DB, for example). Thus this is not a real covered index.
 Did you want to implement a real covered index, and did you have ideas on
 how to do that? Or just an optimization of the current
 planner/executor on index usage?

 If by a real covered index you mean one that includes visibility
 info in the index - I have no plans to work on anything like that.  If
 we were to do that, the index would become much larger and less
 efficient, whereas the approach of just optimizing the way our
 existing indexes are used doesn't have that disadvantage.  It also
 sounds like a lot of work. Now, if someone else wants to demonstrate
 that it has advantages that are worth the costs and go do it, more
 power to said person, but I'm unexcited about it.

Yes I was thinking of that, and agree with you.


 I don't know VM internals:

  * do we have a counter of ALL_VISIBLE flag set on a relation ? (this
 should be very good for planner)
  * do we need a pg_class.rel_vmvisible ?! (I have hands up, don't
 shoot pleeaase)

 Evidently I'm developing a more frightening reputation than I would hope.  :-(

Nah, I was joking :) don't worry !
Probably because I have already proposed 1 new GUC and at least one
new column to pg_class recently. (and we're not used to changing those
that frequently)


 Anyway, yes, I do believe we need a table-level statistic for the
 percentage of the visibility map bits that are believed to be set.
 Having said that I think we need it, let me also say that I'm a bit
 skeptical about how well it will work.  There are two problems:

 1. Consider a query like SELECT a, b FROM foo WHERE a = 1.  To
 accurately estimate the cost of executing this query via an index-only
 scan (on an index over foo (a, b)), we need to know (i) the percentage
 of rows in the table for which a = 1 and (ii) the percentage *of those
 rows* which are on an all-visible page.  We can assume that if 80% of
 the rows in the table are on all-visible pages, then 80% of the rows
 returned by this query will be on all-visible pages also, but that
 might be wildly wrong.  This is similar to the problem of costing
 SELECT * FROM foo WHERE a = 1 AND b = 1 - we know the fraction of
 rows where a = 1 and the fraction where b = 1, but there's no
 certainty that multiplying those values will produce an accurate
 estimate for the conjunction of those conditions.  The problem here is
 not as bad as the general multi-column statistics problem because a
 mistake will only bollix the cost, not the row count estimate, but
 it's still not very nice.

 2. Since VACUUM and ANALYZE often run together, we will be estimating
 the percentage of rows on all-visible pages just at the time when that
 percentage is highest.  This is not exactly wonderful, either...

 I have a fair amount of hope that even with these problems we can come
 up with some adjustment to the planner that is better than just
 ignoring the problem, but I am not sure how difficult it will be.

  * is it ok to parse VM for planning (I believe it is not) ?

 It doesn't seem like a good idea to me, but I just work here.  I'm not
 sure what that would buy us.

All true, and I won't be unhappy to have the feature as a bonus, not
expected by the planner(for the cost part) but handled by the
executor.

-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding a distinct pattern type to resolve the ~ commutator stalemate

2011-06-19 Thread Florian Pflug
On Jun19, 2011, at 20:56 , Robert Haas wrote:
 On Sun, Jun 19, 2011 at 9:53 AM, Florian Pflug f...@phlo.org wrote:
 Amidst the discussion, Alvaro suggested that we resolve the issue
 by adding a distinct type for patterns as opposed to text. That'd
 allow us to make ~ it's own commutator by defining both
  text ~ pattern
 and
  pattern ~ text.
 
 That's kind of a neat idea.  There might be an efficiency benefit to
 having a regex type that is precompiled by the input function.

Hm, yeah, that thought crossed my mind too. A distinct type is only
a first step in that direction though - we'd also need a way to
attach a parsed representation of a value to a varlena. If you have
an idea how to accomplish that, by all means, out with it! ;-)
The XML type would also benefit greatly...

 There's also the question of how we deal with ~~ (the operator
 behind LIKE). We could either re-use the type pattern for that,
 meaning that values of type pattern would represent any kind of
 text pattern, not necessarily a regular expression. Alternatively,
 we could represent LIKE pattern by a type distinct from pattern,
 say likepattern. Finally, we could handle LIKE like we handle
 SIMILAR TO, i.e. define a function that transforms a LIKE pattern
 into a regular expression, and deprecate the ~~ operator and friends.
 
 The last option looks appealing from a code complexity point of view,
 but might severely harm performance of LIKE and ILIKE comparisons.
 
 I don't believe it would be a very good idea to try to shoehorn
 multiple kinds of patterns into a single pattern type.

That depends on whether we expect to eventually make LIKE
use the regex matching machinery. If we do, then it's not really
shoehorning. If we don't, then yeah, using a single type seems
unwise, especially in the light of your idea of keeping a parsed
representation of regexp's around.

 I do think this may be the long route to solving this problem, though.

Yeah - but maybe also the one with the largest benefit in the long run.
We're also just at the beginning of a release cycle, so I think we
have time enough to figure this out...

 Is it really this hard to agree on a commutator name?

So far, every suggestion has been met with fierce opposition, so, um,
yeah it is I'd say...

 I mean, I'm
 not in love with anything that's been suggested so far, but I could
 live with any of them.  An unintuitive operator name for
 matches-with-the-arguments-reversed is not going to be the worst wart
 we have, by a long shot...

Maybe not. But then, if the name is unintuitive enough to impair
readability anyway, then people might just as well define a custom
operator in their database. Since we're capable of inlining SQL
functions, there won't even be a difference in performance. The only
real benefit of having this in core is that you don't have to
go search the catalog to find the meaning of such an operator if
you encounter it in an SQL statement.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hugetables question

2011-06-19 Thread Tom Lane
Martijn van Oosterhout klep...@svana.org writes:
 On Sun, Jun 19, 2011 at 11:56:15AM +0200, Radosław Smogura wrote:
 I want to implement hugepages for shared memory, to make it transparent I 
 want 

 At least in Linux they're trying to make hugepages transparent, so I'm
 wondering if this is going to make a difference for Linux in the long
 term.

It's already done, at least in Red Hat's kernels, and I would assume
everywhere pretty soon.  I see no percentage in our getting involved
in that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types and extensions

2011-06-19 Thread Florian Pflug
On Jun19, 2011, at 20:08 , Jeff Davis wrote:
 On Sun, 2011-06-19 at 12:24 +0200, Martijn van Oosterhout wrote:
 Collation checking is generally done by the planner. I don't see why
 the input function should check, the result of an input function is by
 definition DEFAULT. It's up to the 'in' operator to check.
 
 Note that the whole idea of collation is not really supposed to be
 assigned to object for storage.  How that can be resolved I'm not sure.
 
 I think if we just say that it's a property of the range type
 definition, then that's OK. It's similar to specifying a non-default
 btree opclass for the range type -- it just changes which total order
 the range type adheres to.

In fact, it's exactly the same, because what we *actually* need to
specify is not an opclass but a comparison operator. Which is only
well-defined if you know *both* an opclass *and* a collation.

That reminds me - the conclusion there was that we cannot have
two range types with the same base type but different opclasses,
wasn't it?

AFAIR precisely because otherwise there's no sensible way to handle
  'text' in '[lower,upper]'

If I'm not mistaken about this, that would imply that we also cannot
have two range types with the same base type, the same opclass,
but different collations. Which seems rather unfortunate... In fact,
if that's true, maybe restricting range types to the database collation
would be best...

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-19 Thread Cédric Villemain
2011/6/19 Robert Haas robertmh...@gmail.com:
 On Sun, Jun 19, 2011 at 9:38 AM, Greg Stark st...@mit.edu wrote:
 On Tue, Jun 14, 2011 at 4:04 PM, Robert Haas robertmh...@gmail.com wrote:
 1. ANALYZE happens far too infrequently to believe that any data taken
 at ANALYZE time will still be relevant at execution time.
 2. Using data gathered by ANALYZE will make plans less stable, and our
 users complain not infrequently about the plan instability we already
 have, therefore we should not add more.
 3. Even if the data were accurate and did not cause plan instability, we
 have no evidence that using it will improve real-world performance.

 I feel like this is all baseless FUD. ANALYZE isn't perfect but it's
 our interface for telling postgres to gather stats and we generally
 agree that having stats and modelling the system behaviour as
 accurately as practical is the right direction so we need a specific
 reason why this stat and this bit of modeling is a bad idea before we
 dismiss it.

 I think the kernel of truth in these concerns is simply that
 everything else ANALYZE looks at mutates only on DML. If you load the
 same data into two databases and run ANALYZE you'll get (modulo random
 sampling) the same stats. And if you never modify it and analyze it
 again a week later you'll get the same stats again. So autovacuum can
 guess when to run analyze based on the number of DML operations, it
 can run it without regard to how busy the system is, and it can hold
 off on running it if the data hasn't changed.

 In the case of the filesystem buffer cache the cached percentage will
 vary over time regardless of whether the data changes. Plain select
 queries will change it, even other activity outside the database will
 change it. There are a bunch of strategies for mitigating this
 problem: we might want to look at the cache situation more frequently,
 discount the results we see more aggressively, and possibly
 maintain a kind of running average over time.

 There's another problem which I haven't seen mentioned. Because the
 access method will affect the cache there's the possibility of
 feedback loops. e.g. A freshly loaded system prefers sequential scans
 for a given table because without the cache the seeks of random reads
 are too expensive... causing it to never load that table into cache...
 causing that table to never be cached and never switch to an index
 method. It's possible there are mitigation strategies for this as well
 such as keeping a running average over time and discounting the
 estimates with some heuristic values.

 *scratches head*

 Well, yeah.  I completely agree with you that these are the things we
 need to worry about.  Maybe I did a bad job explaining myself, because
 ISTM you said my concerns were FUD and then went on to restate them in
 different words.

 I'm not bent out of shape about using ANALYZE to try to gather the
 information.  That's probably a reasonable approach if it turns out we
 actually need to do it at all.  I am not sure we do.  What I've argued
 for in the past is that we start by estimating the percentage of the
 relation that will be cached based on its size relative to
 effective_cache_size, and allow the administrator to override the
 percentage on a per-relation basis if it turns out to be wrong.  That
 would avoid all of these concerns and allow us to focus on the issue
 of how the caching percentages impact the choice of plan, and whether
 the plans that pop out are in fact better when you provide information
 on caching as input.  If we have that facility in core, then people
 can write scripts or plug-in modules to do ALTER TABLE .. SET
 (caching_percentage = XYZ) every hour or so based on the sorts of
 statistics that Cedric is gathering here, and users will be able to
 experiment with a variety of algorithms and determine which ones work
 the best.

Robert, I am very surprised.
My patch does offer that.

1st, I used ANALYZE because it is the way to update pg_class I found.
You are suggesting ALTER TABLE instead, that is fine, but give me that
lock-free :) else we have the ahem.. Alvaro's pg_class_ng (I find this
one interesting because it will be a lot easier to have different values
on a standby server if we find a way to have pg_class_ng 'updatable' per
server)
So, as long as the value can be changed without problem, I don't care
where it resides.

2nd, I provided the patches on the last CF exactly to allow us to go to
the exciting part: the cost-estimates changes. (after all, we can work
on the cost estimate, and if later we find a way to use ALTER TABLE or
pg_class_ng, just do it instead of via the ANALYZE magic)

3rd, you can right now write a plugin to set the value of rel_oscache
(exactly like the one you'd write for an ALTER TABLE SET reloscache...)

RelationGetRelationOSCacheInFork(Relation relation, ForkNumber forkNum)
{
   float4 percent = 0;
   /* if a plugin is present, let it manage things */
   if (OSCache_hook)
   

[HACKERS] Re: [WIP] Support for ANY/ALL(array) op scalar (Was: Re: Boolean operators without commutators vs. ALL/ANY)

2011-06-19 Thread David Fetter
On Sun, Jun 19, 2011 at 02:48:58PM +0100, Greg Stark wrote:
 On Thu, Jun 16, 2011 at 6:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  (I will refrain for the moment from speculating whether we'll ever
  have an index type that supports regexp match directly as an
  indexable operator...)
 
 Fwiw I looked into this at one point and have some ideas if anyone
 is keen to try it.

Please post them :)

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FOR KEY LOCK foreign keys

2011-06-19 Thread Jesper Krogh
I hope this hasn't been forgotten. But I can't see that it has been committed
or moved into the commitfest process?

Jesper


On 2011-03-11 16:51, Noah Misch wrote:

On Fri, Feb 11, 2011 at 02:13:22AM -0500, Noah Misch wrote:

Automated tests would go a long way toward building confidence that this patch
does the right thing.  Thanks to the SSI patch, we now have an in-tree test
framework for testing interleaved transactions.  The only thing it needs to be
suitable for this work is a way to handle blocked commands.  If you like, I can
try to whip something up for that.

[off-list ACK followed]

Here's a patch implementing that.  It applies to master, with or without your
KEY LOCK patch also applied, though the expected outputs reflect the
improvements from your patch.  I add three isolation test specs:

   fk-contention: blocking-only test case from your blog post
   fk-deadlock: the deadlocking test case I used during patch review
   fk-deadlock2: Joel Jacobson's deadlocking test case

When a spec permutation would have us run a command in a currently-blocked
session, we cannot implement that permutation.  Such permutations represent
impossible real-world scenarios, anyway.  For now, I just explicitly name the
valid permutations in each spec file.  If the test harness detects this problem,
we abort the current test spec.  It might be nicer to instead cancel all
outstanding queries, issue rollbacks in all sessions, and continue with other
permutations.  I hesitated to do that, because we currently leave all
transaction control in the hands of the test spec.

I only support one waiting command at a time.  As long as one command continues
to wait, I run other commands to completion synchronously.  This decision has no
impact on the current test specs, which all have two sessions.  It avoided a
touchy policy decision concerning deadlock detection.  If two commands have
blocked, it may be that a third command needs to run before they will unblock,
or it may be that the two commands have formed a deadlock.  We won't know for
sure until deadlock_timeout elapses.  If it's possible to run the next step in
the permutation (i.e., it uses a different session from any blocked command), we
can either do so immediately or wait out the deadlock_timeout first.  The latter
slows the test suite, but it makes the output more natural -- more like what one
would typically see after running the commands by hand.  If anyone can think of a
sound general policy, that would be helpful.  For now, I've punted.

With a default postgresql.conf, deadlock_timeout constitutes most of the run
time.  Reduce it to 20ms to accelerate things when running the tests repeatedly.

Since timing dictates which query participating in a deadlock will be chosen for
cancellation, the expected outputs bearing deadlock errors are unstable.  I'm
not sure how much it will come up in practice, so I have not included expected
output variations to address this.

I think this will work on Windows as well as pgbench does, but I haven't
verified that.

Sorry for the delay on this.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Libpq enhancement

2011-06-19 Thread Dmitriy Igrishin
Hey Jeff,

2011/6/19 Jeff Shanab jsha...@smartwire.com

 I am wondering if I am missing something obvious. If not, I have a
 suggestion for plpgsql.

 Stored procedures can accept rows.

 Libpq can receive rows (PQResult).

 Wouldn’t it be a great interface if PQResult was “bi-directional”? Create a
 result set on the client then call the database with a command.

 Perhaps…

 PQinsert(PQResult,”schema.table”);  //iterate thru rows inserting

 PQupdate(PQResult,”schema.table”); //iterate thru rows updating


IMO, mapping C functions to SQL operators is a bad idea.
If I understood you correctly, you want to make libpq an ORM. But
without implementing functionality like C++ virtual functions on
the _backend_ side, it is impossible or ugly.

-- 
// Dmitriy.


Re: [HACKERS] Adding a distinct pattern type to resolve the ~ commutator stalemate

2011-06-19 Thread Andrew Dunstan



On 06/19/2011 02:56 PM, Robert Haas wrote:

On Sun, Jun 19, 2011 at 9:53 AM, Florian Pflug f...@phlo.org wrote:

Amidst the discussion, Alvaro suggested that we resolve the issue
by adding a distinct type for patterns as opposed to text. That'd
allow us to make ~ it's own commutator by defining both
  text ~ pattern
and
  pattern ~ text.

That's kind of a neat idea.  There might be an efficiency benefit to
having a regex type that is precompiled by the input function.



What do we do when we get text or unknown in place of pattern? How are 
we going to know if the pattern is supposed to be the left or right operand?


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding a distinct pattern type to resolve the ~ commutator stalemate

2011-06-19 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 06/19/2011 02:56 PM, Robert Haas wrote:
 On Sun, Jun 19, 2011 at 9:53 AM, Florian Pflug f...@phlo.org wrote:
 Amidst the discussion, Alvaro suggested that we resolve the issue
 by adding a distinct type for patterns as opposed to text. That'd
 allow us to make ~ it's own commutator by defining both
 text ~ pattern
 and
 pattern ~ text.

 That's kind of a neat idea.  There might be an efficiency benefit to
 having a regex type that is precompiled by the input function.

 What do we do when we get text or unknown in place of pattern? How are 
 we going to know if the pattern is supposed to be the left or right operand?

Yeah, this would result in
SELECT 'something' ~ 'something';
failing outright.  I don't think it's a good substitute for biting
the bullet and choosing distinct operator names.

(I do think a distinct regex datatype might be a good idea, but it
doesn't eliminate this particular problem.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Make external_pid_file world readable

2011-06-19 Thread Peter Eisentraut
On sön, 2011-06-19 at 08:59 -0700, David Fetter wrote:
 On Sat, Jun 18, 2011 at 10:18:50PM +, Peter Eisentraut wrote:
  Make external_pid_file world readable
 
 Should this be back-patched?

I wasn't planning to.  It's a new feature.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] heap_hot_search_buffer refactoring

2011-06-19 Thread Robert Haas
On Sun, Jun 19, 2011 at 2:41 PM, Robert Haas robertmh...@gmail.com wrote:
 On Sun, Jun 19, 2011 at 2:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Yikes.  I think you are right.  It's kind of scary that the regression
 tests passed with that mistake.

 Can we add a test that exposes that mistake?

 Not sure.  We'd have to figure out how to reliably tickle it.

*thinks a bit*

When using an MVCC snapshot, we always have first_call = true, so the
effect of this mistake was just to disable the opportunistic killing
of dead tuples, which doesn't affect correctness.

When using a non-MVCC snapshot, we call heap_hot_search_buffer()
repeatedly until it returns false.  For so long as it returns true, it
does not matter how all_dead is set, because index_getnext() will
return the tuple without examining all_dead.  So only the final call
matters.  If the final call happens to also be the first call, then
all_dead might end up being false when it really ought to be true, but
that will once again just miss killing a dead tuple.  If the final
call isn't the first call, then we've got a problem, because now
all_dead will be true when it really ought to be false, and we'll nuke
an index tuple that we shouldn't nuke.

But if this is happening in the context of CLUSTER, then there might
still be no user-visible failure, because we're going to rebuild the
indexes anyway.  There could be a problem if CLUSTER aborts part-way
though.

A system catalog might get scanned with SnapshotNow, but to exercise
the bug you'd need to HOT update a system catalog and then have the
updating transaction commit between the time it sees the first row and
the time it sees the second one.

So I don't quite see how to construct a test case, ATM.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding a distinct pattern type to resolve the ~ commutator stalemate

2011-06-19 Thread Florian Pflug
On Jun19, 2011, at 22:10 , Tom Lane wrote:
 Andrew Dunstan and...@dunslane.net writes:
 On 06/19/2011 02:56 PM, Robert Haas wrote:
 On Sun, Jun 19, 2011 at 9:53 AM, Florian Pflug f...@phlo.org wrote:
 Amidst the discussion, Alvaro suggested that we resolve the issue
 by adding a distinct type for patterns as opposed to text. That'd
 allow us to make ~ it's own commutator by defining both
 text ~ pattern
 and
 pattern ~ text.
 
 That's kind of a neat idea.  There might be an efficiency benefit to
 having a regex type that is precompiled by the input function.
 
 What do we do when we get text or unknown in place of pattern? How are 
 we going to know if the pattern is supposed to be the left or right operand?
 
 Yeah, this would result in
   SELECT 'something' ~ 'something';
 failing outright.  I don't think it's a good substitute for biting
 the bullet and choosing distinct operator names.

Yeah, well, the complaint (put forward mainly by Alvaro) that led to
this approach in the first place was precisely that
  'something' ~ 'anything'
*doesn't* give any indication of what constitutes the pattern and
what the text.

So I consider that to be a feature, not a bug.

BTW, arithmetical operators currently show exactly the same behaviour:
postgres# select '1' + '1'
ERROR:  operator is not unique: unknown + unknown at character 12

The only argument against that I can see is that it poses
a compatibility problem if ~ remains the pattern matching
operator. I do believe, however, that the chance of
  unknown ~ unknown
appearing in actual applications is rather small - that'd only
happen if people used postgresql's regexp engine together with
purely external data.

best regards,
Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] the big picture for index-only scans

2011-06-19 Thread Florian Pflug
On Jun19, 2011, at 20:40 , Robert Haas wrote:
 2. Since VACUUM and ANALYZE often run together, we will be estimating
 the percentage of rows on all-visible pages just at the time when that
 percentage is highest.  This is not exactly wonderful, either...

Hm, doesn't autovacuum run ANALYZE quite a bit more frequently than
VACUUM by default?

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe

2011-06-19 Thread Simon Riggs
On Fri, Jun 17, 2011 at 11:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Thu, Jun 16, 2011 at 6:54 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 4. Backend #2 visits the new, about-to-be-committed version of
 pgbench_accounts' pg_class row just before backend #3 commits.
 It sees the row as not good and keeps scanning.  By the time it
 reaches the previous version of the row, however, backend #3
 *has* committed.  So that version isn't good according to SnapshotNow
 either.

 thinks some more

 Why isn't this a danger for every pg_class update?  For example, it
 would seem that if VACUUM updates relpages/reltuples, it would be
 prone to this same hazard.

 VACUUM does that with an in-place, nontransactional update.  But yes,
 this is a risk for every transactional catalog update.

Well, after various efforts to fix the problem, I notice that there
are two distinct problems brought out by your test case.

One of them is caused by my patch, one of them was already there in
the code - this latter one is actually the harder of the two to fix.

It took me about an hour to fix the first bug, but it's taken a while
of thinking about the second before I realised it was a pre-existing
bug.

The core problem is, as you observed, that a pg_class update can cause
rows to be lost with concurrent scans.
We scan pg_class in two ways: to rebuild a relcache entry based on a
relation's oid (easy fix). We also scan pg_class to resolve the name
to oid mapping. The name to oid mapping is performed *without* a lock
on the relation, since we don't know which relation to lock. So the
name lookup can fail if we are in the middle of a pg_class update.
This is an existing potential bug in Postgres unrelated to my patch.
Ref: SearchCatCache()

I've been looking at ways to lock the relation name and namespace
prior to the lookup (or more precisely the hash), but it's worth
discussing whether we want that at all?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] the big picture for index-only scans

2011-06-19 Thread Robert Haas
On Sun, Jun 19, 2011 at 5:10 PM, Florian Pflug f...@phlo.org wrote:
 On Jun19, 2011, at 20:40 , Robert Haas wrote:
 2. Since VACUUM and ANALYZE often run together, we will be estimating
 the percentage of rows on all-visible pages just at the time when that
 percentage is highest.  This is not exactly wonderful, either...

 Hm, doesn't autovacuum run ANALYZE quite a bit more frequently than
 VACUUM by default?

The autoanalyze threshold is, by default, 10%; and the autovacuum
threshold, 20%.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-19 Thread Greg Smith

On 06/19/2011 09:38 AM, Greg Stark wrote:

There's another problem which I haven't seen mentioned. Because the
access method will affect the cache there's the possibility of
feedback loops. e.g. A freshly loaded system prefers sequential scans
for a given table because without the cache the seeks of random reads
are too expensive...


Not sure if it's been mentioned in this thread yet, but the feedback 
issue has popped up in regard to this area plenty of times.  I think 
everyone who's producing regular input into this is aware of it, even if 
it's not mentioned regularly.  I'm not too concerned about the specific 
case you warned about because I don't see how sequential scan vs. index 
costing will be any different on a fresh system than it is now.  But 
there are plenty of cases like it to be mapped out here, and many are 
not solvable--they're just something that needs to be documented as a risk.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-19 Thread Kevin Grittner
Greg Smith  wrote:
 
 I'm not too concerned about the specific case you warned about
 because I don't see how sequential scan vs. index costing will be
 any different on a fresh system than it is now.
 
I think the point is that if, on a fresh system, the first access to
a table is something which uses a table scan -- like select count(*)
-- that all indexed access would then tend to be suppressed for that
table.  After all, for each individual query, selfishly looking at
its own needs in isolation, it likely *would* be faster to use the
cached heap data.
 
I see two ways out of that -- one hard and one easy.  One way would
be to somehow look at the impact on the cache of potential plans and
the resulting impact on overall throughput of the queries being run
with various cache contents.  That's the hard one, in case anyone
wasn't clear.  ;-)  The other way would be to run some percentage of
the queries *without* considering current cache contents, so that the
cache can eventually adapt to the demands.
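
That easy way could be as crude as a plan-time coin flip.  A hedged
sketch with made-up names (plain C, not actual planner code):

	#include <stdbool.h>
	#include <stdlib.h>

	/* hypothetical knob: fraction of plans built ignoring cache stats */
	static const double cache_blind_fraction = 0.05;

	static bool
	plan_with_cache_contents(void)
	{
		/*
		 * Occasionally plan "blind" so cold data still gets read and the
		 * cache contents can adapt to the real workload.
		 */
		return ((double) rand() / RAND_MAX) >= cache_blind_fraction;
	}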
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench--new transaction type

2011-06-19 Thread Greg Smith
I applied Jeff's patch but changed this to address concerns about the 
program getting stuck running for too long in the function:


#define plpgsql_loops   512

This would be better named plpgsql_batch_size or something similar 
instead; the current name suggests it's how many loops to run, which is 
confusing.


My main performance concern here was whether this change really matters 
so much once a larger number of clients were involved.  Some of the 
other things you can do to optimize single-client performance aren't as 
useful with lots of them.  Here's how the improvements in this mode 
worked for me on a server with 4 Hyper-Threaded cores (i870); 
shared_buffers=256MB, scale=100:


1 client:
-S: 11533
-S -M prepared: 19498
-P: 49547

12 clients, 4 workers:
-S:  56052
-S -M prepared: 82043
-P: 159443

96 clients, 8 workers:
-S: 49940
-S -M prepared: 76099
-P: 137942

I think this is a really nice new workload to demonstrate.  One of the 
things we tell people is that code works much faster when moved 
server-side, but how much faster isn't always easy to show.  Having this 
mode available lets them see how dramatic that can be quite easily.  I 
know I'd like to be able to run performance tests for clients of new 
hardware using PostgreSQL and tell them something like this:  "With
simple clients executing a statement at a time, this server reaches 56K
SELECTs/second.  But using server-side functions to execute them in
larger batches it can do 159K."
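
For a sense of what -P executes, here is a rough sketch of the kind of
server-side function this mode relies on (illustrative only, not the
patch's exact definition; pgbench_selects and its parameters are
hypothetical names):

CREATE FUNCTION pgbench_selects(naccounts int, loops int) RETURNS int AS $$
DECLARE
    bal int;
BEGIN
    -- run a batch of simple primary-key SELECTs entirely server-side,
    -- saving one client/server round trip per statement
    FOR i IN 1..loops LOOP
        SELECT abalance INTO bal FROM pgbench_accounts
            WHERE aid = 1 + (random() * (naccounts - 1))::int;
    END LOOP;
    RETURN bal;
END;
$$ LANGUAGE plpgsql;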


The value this provides as an alternate source of benchmark load
generation, with a very different profile for how it exercises the
server, is good too.


Things to fix in the patch before it would be a commit candidate:

-Adjust the loop size/name, per above
-Reformat some of the longer lines to try and respect the implied right 
margin in the code formatting

-Don't include the "plpgsql function created." line unless in debugging mode.
-Add the docs.  Focus on how this measures how fast the database can 
execute SELECT statements using server-side code.  An explanation that 
the transaction block size is 512 is important to share.  It also 
needs a warning that time-based runs (-T) may have to wait for a block 
to finish and go beyond their normally expected end time.
-The word "via" in the transaction type output description is probably 
not the best choice.  Changing to "SELECT only using PL/pgSQL" would 
translate better, and follow the standard casing for the name of that 
language.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us





Re: [HACKERS] Make relation_openrv atomic wrt DDL

2011-06-19 Thread Greg Stark
So I was the victim assigned to review this patch.

The code is pretty much impeccable as usual from Noah. But I have
questions about the semantics of it.

Firstly this bit makes me wonder:

+   /* Not-found is always final. */
+   if (!OidIsValid(relOid1))
+   return relOid1;

If someone does

BEGIN; DROP TABLE foo; CREATE TABLE foo; COMMIT;

Then what prevents this logic from finding the doomed relation,
blocking until the transaction commits, then finding it's deleted and
returning InvalidOid?
RangeVarGetRelid is just going to complete its index scan of pg_class
and may not come across the newly inserted row.
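
To spell out the interleaving I'm worried about (sketch only):

-- session 1:
BEGIN;
DROP TABLE foo;
CREATE TABLE foo ();
-- session 2 now runs: SELECT 1 FROM foo;
--   its lookup finds the doomed foo and blocks on that relation's lock
-- session 1:
COMMIT;
-- session 2 wakes: the foo it found is deleted; without retrying the
-- lookup, doesn't it return InvalidOid even though a new foo exists?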

Am I missing something? I would have expected to have to loop around
and retry if no valid record is found. But this raises the question --
if no lock was acquired then what would have triggered an
AcceptInvalidationMessages and how would we know we waited long
enough to find out about the newly created table?

As a side note, if there are a long stream of such concurrent DDL then
this code will leave all the old versions locked. This is consistent
with our "hold locks until end of transaction" semantics but it seems
weird for tables that we locked accidentally and didn't really end
up using at all. I'm not sure it's really bad though.



Re: [HACKERS] Adding a distinct pattern type to resolve the ~ commutator stalemate

2011-06-19 Thread Andrew Dunstan



On 06/19/2011 05:02 PM, Florian Pflug wrote:

On Jun19, 2011, at 22:10 , Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

On 06/19/2011 02:56 PM, Robert Haas wrote:

On Sun, Jun 19, 2011 at 9:53 AM, Florian Pflug f...@phlo.org wrote:

Amidst the discussion, Alvaro suggested that we resolve the issue
by adding a distinct type for patterns as opposed to text. That'd
allow us to make ~ its own commutator by defining both
text ~ pattern
and
pattern ~ text.

That's kind of a neat idea.  There might be an efficiency benefit to
having a regex type that is precompiled by the input function.

What do we do when we get text or unknown in place of pattern? How are
we going to know if the pattern is supposed to be the left or right operand?

Yeah, this would result in
SELECT 'something' ~ 'something';
failing outright.  I don't think it's a good substitute for biting
the bullet and choosing distinct operator names.

Yeah, well, the complaint (put forward mainly by Alvaro) that led to
this approach in the first place was precisely that
   'something' ~ 'anything'
*doesn't* give any indication of what constitutes the pattern and
what the text.

So I consider that to be a feature, not a bug.

BTW, arithmetical operators currently show exactly the same behaviour
postgres# select '1' + '1'
ERROR:  operator is not unique: unknown + unknown at character 12

The only argument against that I can see is that it poses
a compatibility problem if ~ remains the pattern matching
operator. I do believe, however, that the chance of
   unknown ~ unknown
appearing in actual applications is rather small - that'd only
happen if people used postgresql's regexp engine together with
purely external data.




People can store regular expressions in text fields now, and do, let me 
assure you. So the chances you'll encounter text ~ unknown or unknown ~ 
text or text ~ text are 100%.


cheers

andrew



Re: [HACKERS] Small SSI issues

2011-06-19 Thread Kevin Grittner
Heikki Linnakangas wrote:

 * The oldserxid code is broken for non-default BLCKSZ.
 o The warning will come either too early or too late
 o There is no safeguard against actually wrapping around the
 SLRU, just the warning
 o I'm suspicious of the OldSerXidPagePrecedesLogically() logic
 with 32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is larger
 than necessary to cover the whole range of 2^32 transactions,
 so at high XIDs, say 2^32-1, doesn't it incorrectly think that
 low XIDs, say 10, are in the past, not the future?

It took me a while to see these problems because somehow I had
forgotten that SLRU_PAGES_PER_SEGMENT was a literal of 32 rather than
being based on BLCKSZ. After I rediscovered that, your concern was
clear enough.

I think the attached patch addresses the problem with the
OldSerXidPagePrecedesLogically() function, which strikes me as the
most serious issue.

Based on the calculation from the attached patch, it would be easy to
adjust the warning threshold, but I'm wondering if we should just rip
it out instead. As I mentioned in a recent post based on reviewing
your concerns, with an 8KB BLCKSZ the SLRU system will start thinking
we're into wraparound at one billion transactions, and refuse to
truncate segment files until we get down to less than that, but we
won't actually overwrite anything and cause SSI misbehaviors until we
eat through two billion (2^31 really) transactions while holding open
a single read-write transaction. At that point I think PostgreSQL
has other defenses which come into play. With the attached patch I
don't think we can have any such problems with a *larger* BLCKSZ, so
the only point of the warning would be for a BLCKSZ of 4KB or less.
Is it worth carrying the warning code (with an appropriate adjustment
to the thresholds) or should we drop it?
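
For a sense of the magnitudes (a back-of-the-envelope model only, assuming
one 8-byte commit sequence number is stored per transaction ID):

SELECT 8192 / 8                     AS entries_per_8k_page,   -- 1024
       (2^32)::bigint / (8192 / 8)  AS pages_for_8k_blcksz,   -- 4194304
       32768 / 8                    AS entries_per_32k_page,  -- 4096
       (2^32)::bigint / (32768 / 8) AS pages_for_32k_blcksz;  -- 1048576

If the maximum page number is sized for the smaller block sizes, then with
32KB blocks only a quarter of the page range is ever needed, and a
wraparound comparison that assumes the whole range maps onto the XID space
can misjudge which pages are past and which are future.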

-Kevin


ssi-slru-maxpage.patch
Description: Binary data



Re: [HACKERS] Adding a distinct pattern type to resolve the ~ commutator stalemate

2011-06-19 Thread Florian Pflug
On Jun20, 2011, at 00:56 , Andrew Dunstan wrote:
 On 06/19/2011 05:02 PM, Florian Pflug wrote:
 The only argument against that I can see is that it poses
 a compatibility problem if ~ remains the pattern matching
 operator. I do believe, however, that the chance of
   unknown ~ unknown
 appearing in actual applications is rather small - that'd only
 happen if people used postgresql's regexp engine together with
 purely external data.
 
 People can store regular expressions in text fields now, and do,
 let me assure you. So the chances you'll encounter text ~ unknown
 or unknown ~ text  or text ~ text are 100%

Hm, it seems we either all have different ideas about how such
a pattern type would be defined, or have grown so accustomed to
pg's type system that we've forgotten how powerful it really
is ;-) (For me, the latter is surely true...).

I've now created a primitive prototype that uses a composite
type for pattern. That changes the input syntax for patterns
(you need to enclose them in brackets), but should model all
the implicit and explicit casting rules and operator selection
correctly. It also uses ~~~ in place of ~, for obvious
reasons and again without changing the casting and overloading
rules.

The prototype defines
  text ~~~ text
  text ~~~ pattern
  pattern ~~~ text
and can be found at end of this mail.

With that prototype, *all* the cases (even unknown ~~~ unknown)
work as today, i.e. all of the statements below return true

postgres=# select 'abc' ~~~ '^ab+c$';
postgres=# select 'abc'::text ~~~ '^ab+c$';
postgres=# select 'abc' ~~~ '^ab+c$'::text;
postgres=# select 'abc' ~~~ '(^ab+c$)'::pattern;
postgres=# select '(^ab+c$)'::pattern ~~~ 'abc';

(The same happens with and without setting pattern's typcategory
to 'S'. Not really sure if the category has any effect here
at all).

That's not exactly what I had in mind - I'd have preferred
  unknown ~~~ unknown
to return an error but
  text ~~~ unknown
and
  unknown ~~~ text
to work, but it looks like that's not easily done.

Still, I believe the behaviour of the prototype is acceptable.

BTW, the reason that 'unknown ~~~ unknown' works is, I believe,
the following comment in func_select_candidate, together with the
fact that 'text' is the preferred type in the string category.

  If any candidate has an input datatype of STRING category,
  use STRING category (this bias towards STRING is appropriate
  since unknown-type literals look like strings).

best regards,
Florian Pflug

create type pattern as (p text);

create function match_right(l text, r text) returns boolean as $$
  select $1 ~ $2
$$ language sql strict immutable;

create operator ~~~ (
  procedure = match_right,
  leftarg = text, rightarg = text
);

create function match_right(l text, r pattern) returns boolean as $$
  select $1 ~ $2.p
$$ language sql strict immutable;

create operator ~~~ (
  procedure = match_right, commutator = '~~~',
  leftarg = text, rightarg = pattern
);

create function match_left(l pattern, r text) returns boolean as $$
  select $2 ~ $1.p
$$ language sql strict immutable;

create operator ~~~ (
  procedure = match_left, commutator = '~~~',
  leftarg = pattern, rightarg = text
);

update pg_type set typcategory = 'S' where oid = 'pattern'::regtype;





Re: [HACKERS] the big picture for index-only scans

2011-06-19 Thread Florian Pflug
On Jun19, 2011, at 23:16 , Robert Haas wrote:
 On Sun, Jun 19, 2011 at 5:10 PM, Florian Pflug f...@phlo.org wrote:
 On Jun19, 2011, at 20:40 , Robert Haas wrote:
 2. Since VACUUM and ANALYZE often run together, we will be estimating
 the percentage of rows on all-visible pages just at the time when that
 percentage is highest.  This is not exactly wonderful, either...
 
 Hm, doesn't autovacuum run ANALYZE quite a bit more frequently than
 VACUUM by default?
 
 The autoanalyze threshold is, by default, 10%; and the autovacuum
 threshold, 20%.

Hm, so you could ignore (or rather dampen) the results of 
VACUUM+ANALYZE and rely on the ANALYZE-only runs to keep
the estimate correct. Still doesn't sound that bad...

best regards,
Florian Pflug




Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-19 Thread Greg Smith

On 06/19/2011 06:15 PM, Kevin Grittner wrote:

I think the point is that if, on a fresh system, the first access to
a table is something which uses a table scan -- like select count(*)
-- then all indexed access would tend to be suppressed for that
table.  After all, for each individual query, selfishly looking at
its own needs in isolation, it likely *would* be faster to use the
cached heap data.
   


If those accesses can compete with other activity, such that the data 
really does stay in the cache rather than being evicted, then what's 
wrong with that?  We regularly have people stop by asking how to pin 
particular relations to the cache, to support exactly this sort of scenario.


What I would expect on any mixed workload is that the table would 
slowly get holes shot in it, as individual sections were evicted for 
more popular index data.  And eventually there'd be little enough left 
for it to win over an index scan.  But if people keep using the copy of 
the table in memory instead, enough so that it never really falls out of 
cache, well that's not necessarily even a problem--it could be 
considered a solution for some.


The possibility that people can fit their entire table into RAM and it 
never leaves there is turning downright probable in some use cases now.  
A good example are cloud instances using EC2, where people often 
architect their systems such that the data set put onto any one node 
fits into RAM.  As soon as that's not true you suffer too much from disk 
issues, so breaking the databases into RAM sized pieces turns out to be 
very good practice.  It's possible to tune fairly well for this case 
right now--just make the page costs all low.  The harder case that I see 
a lot is where all the hot data fits into cache, but there's a table or 
two of history/archives that don't.  And that would be easier to do the 
right thing with given this bit of "what's in the cache?" percentages.
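
(For reference, the "page costs all low" tuning is simply something like
this -- values are illustrative, not recommendations:)

-- tell the planner both sequential and random page fetches are cheap,
-- as they effectively are when the working set is fully cached
SET seq_page_cost = 0.1;
SET random_page_cost = 0.1;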


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us





Re: [HACKERS] the big picture for index-only scans

2011-06-19 Thread Robert Haas
On Sun, Jun 19, 2011 at 7:59 PM, Florian Pflug f...@phlo.org wrote:
 On Jun19, 2011, at 23:16 , Robert Haas wrote:
 On Sun, Jun 19, 2011 at 5:10 PM, Florian Pflug f...@phlo.org wrote:
 On Jun19, 2011, at 20:40 , Robert Haas wrote:
 2. Since VACUUM and ANALYZE often run together, we will be estimating
 the percentage of rows on all-visible pages just at the time when that
 percentage is highest.  This is not exactly wonderful, either...

 Hm, doesn't autovacuum run ANALYZE quite a bit more frequently than
 VACUUM by default?

 The autoanalyze threshold is, by default, 10%; and the autovacuum
 threshold, 20%.

 Hm, so you could ignore (or rather dampen) the results of
 VACUUM+ANALYZE and rely on the ANALYZE-only runs to keep
 the estimate correct. Still doesn't sound that bad...

Yeah, there are lots of possible approaches.  You could try to keep
a count of how many visibility map bits had been cleared since the
last run...  and either adjust the estimate directly or use it to
trigger an ANALYZE (or some limited ANALYZE that only looks at
visibility map bits).  You could gather statistics on how often the
queries that are actually running are finding the relevant visibility
map bits set, and use that to plan future queries.  You could do what
you're suggesting... and there are probably other options as well.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [WIP] cache estimates, cache access cost

2011-06-19 Thread Robert Haas
On Sun, Jun 19, 2011 at 3:32 PM, Cédric Villemain
cedric.villemain.deb...@gmail.com wrote:
 2nd, I provided the patches on the last CF, exactly to allow to go to
 the exciting part: the cost-estimates changes. (after all, we can work
 on the cost estimate, and if later we find a way to use ALTER TABLE or
 pg_class_ng, just do it instead of via the ANALYZE magic)

We're talking past each other here, somehow.  The cost-estimating part
does not require this patch in order to do something useful, but this
patch, AFAICT, absolutely does require the cost-estimating part to do
something useful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Libpq enhancement

2011-06-19 Thread Robert Haas
On Sun, Jun 19, 2011 at 11:04 AM, Jeff Shanab jsha...@smartwire.com wrote:
 I am wondering if I am missing something obvious. If not, I have a
 suggestion for plpgsql.

 Stored procedures can accept rows.

 Libpq can receive rows (PGresult).

 Wouldn’t it be a great interface if PGresult was “bi-directional”? Create a
 result set on the client then call the database with a command.

For insert, we have something like this already - this is what copy is for.

For update, it's a bit more complex - we don't have a "replace into" operator...
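
To illustrate the first point, client-built rows can already be streamed
to the server in one shot (table and columns are placeholders):

COPY target_table (id, val) FROM STDIN WITH CSV;
1,one
2,two
\.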

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pika buildfarm member failure on isolationCheck tests

2011-06-19 Thread Robert Haas
On Sat, Jun 18, 2011 at 11:57 AM, Rémi Zara remi_z...@mac.com wrote:
 Pika failed at the isolationCheck phase, hitting an assertion:

 TRAP: FailedAssertion(!(!((oldSerXidControl->tailXid) != ((TransactionId) 
 0)) || TransactionIdFollows(xid, oldSerXidControl->tailXid)), File: 
 "predicate.c", Line: 991)

 see 
 http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=pika&dt=2011-06-17%2015%3A45%3A30

 It seems that for this stage, the server log (which contains the failed 
 assertion) is not sent back to the buildfarm server. Maybe that should be 
 fixed?

Is this an open item for 9.1beta3?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Boolean operators without commutators vs. ALL/ANY

2011-06-19 Thread Greg Stark
On Fri, Jun 17, 2011 at 3:49 PM, Florian Pflug f...@phlo.org wrote:
 The regex is always to the right of the operator.

 Which is something you have to remember... It's not in any
 way deducible from foo ~ bar alone.

Except that it's always been this way, going back to perl4 or tcl or
their predecessors. The regexp binding operator always has the regexp
on the right.


 How is that worse than the situation with =~ and ~=?

 With =~ it is to the right, with ~= it is to the left.

 It's always where the tilde is. Yeah, you have to remember that.

And when you get it wrong it will fail silently. No errors, just wrong results.

While I've never accidentally written /foo/ =~ $_ in perl I have
*frequently* forgotten whether the operator is ~= or =~. Actually I
forget that pretty much every time I start writing some perl. I just
put whichever comes first and if I get an error I reverse it.

I can see the temptation to make it symmetric but it's going to cause
an awful lot of confusion.

Perhaps we could name the operators ~~= and =~~ and then have a =~
short-cut for compatibility? (and ~ too I guess?)
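
Something along these lines would do it, I think (a sketch; the function
names are placeholders):

create function regex_matches_right(l text, r text) returns boolean as $$
  select $1 ~ $2
$$ language sql strict immutable;

create function regex_matches_left(l text, r text) returns boolean as $$
  select $2 ~ $1
$$ language sql strict immutable;

-- pattern on the right / pattern on the left, commutators of each other
create operator =~~ (
  procedure = regex_matches_right, commutator = '~~=',
  leftarg = text, rightarg = text
);

create operator ~~= (
  procedure = regex_matches_left, commutator = '=~~',
  leftarg = text, rightarg = text
);

-- compatibility short-cut, pattern on the right
create operator =~ (
  procedure = regex_matches_right,
  leftarg = text, rightarg = text
);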


-- 
greg



Re: [HACKERS] Patch - Debug builds without optimization

2011-06-19 Thread Greg Stark
On Thu, Jun 16, 2011 at 9:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 BTW, if you're hacking Postgres code and don't already have a
 reinstall script, you need one.  Mine is basically

        pg_ctl stop
        cd $PGBLDROOT/src/backend
        make install-bin
        pg_ctl start

I've always wondered what other people do to iterate quickly. It's a
bit of a pain that you can't just run the binary out of the build
tree. This looks a lot safer than some of the things I was considering
doing with symlinks.

-- 
greg



Re: [HACKERS] Re: [COMMITTERS] pgsql: Don't use cp -i in the example WAL archive_command.

2011-06-19 Thread Fujii Masao
On Sat, Jun 18, 2011 at 10:19 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Thom Brown t...@linux.com writes:
 On 18 June 2011 04:13, Bruce Momjian br...@momjian.us wrote:
 I tested on FreeBSD 7.4 and saw a 1 error return:

 And on a Mac (so through Darwin 10.7.0 a BSD version too):

 Yeah, see yesterday's discussion on pgsql-admin.  I think the behavior
 with the error return may be a BSD-ism.  In any case, GNU cp does *not*
 do what we want, and that accounts for a sufficiently large fraction of
 machines in the field that I think it's just unsafe to suggest using
 cp -i so prominently.  There are too many people who'll just copy and
 paste the first example provided, especially if the warning to test it
 is buried several paragraphs later.

Anyway, you seem to have forgotten to fix the example of archive_command
in "24.3.5.1. Standalone Hot Backups".

 archive_command = 'test ! -f /var/lib/pgsql/backup_in_progress ||
cp -i %p /var/lib/pgsql/archive/%f < /dev/null'

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] time-delayed standbys

2011-06-19 Thread Fujii Masao
On Fri, Jun 17, 2011 at 11:34 AM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Jun 16, 2011 at 10:10 PM, Fujii Masao masao.fu...@gmail.com wrote:
 According to the above page, one purpose of time-delayed replication is to
 protect against user mistakes on master. But, when a user notices his
 wrong operation on master, what should he do next? The WAL records of
 his wrong operation might have already arrived at the standby, so
 neither promote nor restart cancels that wrong operation. Instead,
 probably he should shut down the standby, investigate the timestamp of
 the XID of the operation he'd like to cancel, set recovery_target_time
 and restart the standby. Something like this procedure should be
 documented? Or, should we implement a new promote mode which finishes
 recovery as soon as promote is requested (i.e., does not replay all
 the available WAL records)?

 I like the idea of a new promote mode;

 Are you going to implement that mode in this CF? or next one?

 I wasn't really planning on it - I thought you might want to take a
 crack at it.  The feature is usable without that, just maybe a bit
 less cool.

Right.

 Certainly, it's too late for any more formal submissions
 to this CF, but I wouldn't mind reviewing a patch if you want to write
 one.

Okay, I'll add that to my TODO list. But I might not have enough time
to develop it.
So, everyone, please feel free to implement that if you want!

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] patch for 9.2: enhanced errors

2011-06-19 Thread Steve Singer

On Sun, 19 Jun 2011, Pavel Stehule wrote:


Maybe there is a second issue (a little bit of performance - you have
to call an output function). But I agree, this information is very
interesting and can help.


I am concerned about the performance impact of doing that.  Not all 
constraints are on int4 columns.  Some constraints might be on a geometry 
type that is megabytes in size, taking a substantial chunk of CPU and 
bandwidth to convert it into a text representation and then send it back 
to the client.

I am open for any ideas in this direction.

Regards

Pavel



Re: [HACKERS] Make relation_openrv atomic wrt DDL

2011-06-19 Thread Noah Misch
Hi Greg,

On Sun, Jun 19, 2011 at 11:44:35PM +0100, Greg Stark wrote:
 So I was the victim assigned to review this patch.

Thanks for doing so.

 The code is pretty much impeccable as usual from Noah. But I have
 questions about the semantics of it.
 
 Firstly this bit makes me wonder:
 
 +   /* Not-found is always final. */
 +   if (!OidIsValid(relOid1))
 +   return relOid1;
 
 If someone does
 
 BEGIN; DROP TABLE foo; CREATE TABLE foo; COMMIT;
 
 Then what prevents this logic from finding the doomed relation,
 blocking until the transaction commits, then finding it's deleted and
 returning InvalidOid?
 RangeVarGetRelid is just going to complete its index scan of pg_class
 and may not come across the newly inserted row.

RangeVarGetRelid() always runs its index scan to completion, and the blocking
happens in LockRelationOid().  You will get a sequence like this:

RangeVarGetRelid(foo) = 2
LockRelationOid(2) [... blocks ...]
AcceptInvalidationMessages() [process a message]
RangeVarGetRelid(foo) = 20001
[restart loop]
LockRelationOid(20001)
AcceptInvalidationMessages() [no new messages - done]

RangeVarGetRelid() *is* vulnerable to the problem Simon just reported in the
"ALTER TABLE lock strength reduction patch is unsafe" thread, which arises
when the DDL transaction actually commits in the middle of a concurrent system
table scan.  I don't think this patch makes things better or worse in that
regard, but I haven't thought it through in great depth.

 Am I missing something? I would have expected to have to loop around
 and retry if no valid record is found. But this raises the question --
 if no lock was acquired then what would have triggered an
 AcceptInvalidatationMessages and how would we know we waited long
 enough to find out about the newly created table?

Good question.  I think characterizing "long enough" quickly leads to defining
one or more sequence points after which all backends must recognize a new
table as existing.  My greenfield definition would be "a command should see
precisely the tables visible to its MVCC snapshot", but that has practical
problems.  Let's see what implementation concerns would suggest...

This leads to a case I had not considered explicitly: CREATE TABLE on a name
that has not recently mapped to any table.  If the catcache has a negative
entry on the key in question, we will rely on that and miss the new table
until we call AcceptInvalidationMessages() somehow.  To hit this, you need a
command that dynamically chooses to query a table that has been created since
the command started running.  DROP/CREATE of the same name in a single
transaction can't hit the problem.  Consider this test script:

psql -X <<\_EOSQL &
-- Cleanup from last run
DROP TABLE IF EXISTS public.foo;

BEGIN;

-- Create the neg catcache entry.
SAVEPOINT q;
SELECT 1 FROM public.foo;
ROLLBACK to q;

--SET client_min_messages = debug5; -- use with CACHEDEBUG for insight
DO $$
BEGIN
EXECUTE 'SELECT 1 FROM pg_am'; -- prime basic catcache entries
PERFORM pg_sleep(11);
EXECUTE 'SELECT 1 FROM public.foo';
END
$$;
_EOSQL

sleep 1
psql -Xc 'CREATE TABLE public.foo ()'

wait

The first backend fails to see the new table despite its creating transaction
having committed ~10s ago.  Starting a transaction, beginning to process a new
client-issued command, or successfully locking any relation prevents the miss.
We could narrow the window in most cases by re-adding a call to
AcceptInvalidationMessages() before RangeVarLockRelid()'s first call to
RangeVarGetRelid().  My current thinking is that it's not worth adding that
cost to every RangeVarLockRelid().  Thus, specify that, minimally, each
client-issued command will see all tables whose names were occupied at the
time the command started.  I would add a comment to that effect.  Thoughts?

 As a side note, if there are a long stream of such concurrent DDL then
 this code will leave all the old versions locked. This is consistent
 with our hold locks until end of transaction semantics but it seems
 weird for tables that we locked accidentally and didn't really end
 up using at all. I'm not sure it's really bad though.

Yes.  If that outcome were more common, this would be a good place to try
relaxing the rule.

nm



Re: [HACKERS] patch for 9.2: enhanced errors

2011-06-19 Thread Alvaro Herrera
Excerpts from Pavel Stehule's message of dom jun 19 06:51:13 -0400 2011:
 Hello
 
 2011/6/19 Steve Singer ssinger...@sympatico.ca:
  On 11-06-08 04:14 PM, Pavel Stehule wrote:

  nbtinsert.c
 
  pg_get_indrelation is named differently than everything else in this file
  (ie _bt...).  My guess is that this function belongs somewhere else but I
  don't know the code well enough to say where you should move it to.
 
 
 I'll try to find a better name, but I would not use a public name like _bt

lsyscache.c?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Patch - Debug builds without optimization

2011-06-19 Thread Greg Smith

Greg Stark wrote:

I've always wondered what other people do to iterate quickly.


I'd have bet money you had an elisp program for this by now!

The peg utility script I use makes a reinstall as simple as:

stop
peg build

The UI for peg is still a little rough around switching to another 
project when using git, and the PGDATA handling could be better.  Being 
able to give each patch I want to play with its own binary+data tree 
with a couple of simple commands is the time-consuming part of setup I 
wanted to automate completely, and for that it works great:  
https://github.com/gregs1104/peg


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us





Re: [HACKERS] pgbench--new transaction type

2011-06-19 Thread Itagaki Takahiro
On Mon, Jun 20, 2011 at 07:30, Greg Smith g...@2ndquadrant.com wrote:
 I applied Jeff's patch but changed this to address concerns about the
 program getting stuck running for too long in the function:

 #define plpgsql_loops   512

Is it OK to define the value as a constant?

Also, it executes many more queries if the -t option (transactions) is
specified; of course it runs the specified number of transactions, but
each transaction actually runs plpgsql_loops times more queries than in
the other modes.

 I think this is a really nice new workload to demonstrate.  One of the
 things we tell people is that code works much faster when moved server-side,

What is the most important part of the changes? The proposal includes
3 improvements, and bundling them might obscure the most valuable tuning point.

 #1 Execute multiple queries in one transaction.
 #2 Run multiple queries in the server with stored procedure.
 #3 Return only one value instead of 512.

Anyway, I'm not sure we need to include the query mode in pgbench's
code. Instead, how about providing a sample script as a separate sql
file? pgbench can execute any script file with the -f option.
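
For example, assuming the server-side function is installed under some
name (pgbench_selects() here, hypothetically), the equivalent custom
script is just this, run with pgbench -f selects.sql (with :scale
supplied via -s):

\set loops 512
SELECT pgbench_selects(100000 * :scale, :loops);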

-- 
Itagaki Takahiro



Re: [HACKERS] Latch implementation that wakes on postmaster death on both win32 and Unix

2011-06-19 Thread Fujii Masao
On Mon, Jun 20, 2011 at 1:00 AM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 I took another look at this this evening, and realised that my
 comments could be a little clearer.

 Attached revision cleans them up a bit.

Since I'm not familiar with Windows, I haven't read the code related
to Windows. But the following are my comments on your patch.

+   if (wakeEvents & WL_POSTMASTER_DEATH)
+   {
+       FD_SET(postmaster_alive_fds[POSTMASTER_FD_WATCH], &input_mask);
+       if (postmaster_alive_fds[POSTMASTER_FD_WATCH] > hifd)
+           hifd = postmaster_alive_fds[POSTMASTER_FD_WATCH];
+   }
hifd = selfpipe_readfd;

'hifd' should be initialized to 'selfpipe_readfd' before the above
'if' block. Otherwise,
'hifd = postmaster_alive_fds[POSTMASTER_FD_WATCH]' might have no effect.

+   time_t       curtime = time(NULL);
+   unsigned int timeout_secs = (unsigned int) PGARCH_AUTOWAKE_INTERVAL -
+                               (unsigned int) (curtime - last_copy_time);
+   WaitLatch(mainloop_latch, WL_LATCH_SET | WL_TIMEOUT |
+             WL_POSTMASTER_DEATH, timeout_secs * 100L);

Why does the archiver still need to wake up periodically?

+   flags |= FNONBLOCK;
+   if (fcntl(postmaster_alive_fds[POSTMASTER_FD_WATCH], F_SETFL, FNONBLOCK))

Is the variable 'flags' really required? It's not used by fcntl() to
set the fd nonblocking.

Is FNONBLOCK equal to O_NONBLOCK? If yes, shouldn't we use O_NONBLOCK
for the sake of consistency? In other code (e.g., noblock.c), O_NONBLOCK is used
rather than FNONBLOCK.

+   WaitLatchOrSocket(MyWalSnd->latch,
+                     WL_LATCH_SET | WL_SOCKET_READABLE |
+                     (pq_is_send_pending() ? WL_SOCKET_WRITEABLE : 0) | WL_TIMEOUT,
+                     MyProcPort->sock,

I think it's worth having walsender check the postmaster death event too, no?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
