Re: [HACKERS] review: CHECK FUNCTION statement

2012-03-02 Thread Pavel Stehule
Hello

2012/3/1 Alvaro Herrera alvhe...@commandprompt.com:


 Why does CollectCheckedFunctions skip trigger functions?  My only guess
 is that at one point the checker was not supposed to know how to check
 them, and a later version learned about it and this bit wasn't updated;
 but maybe there's another reason?

You cannot check a trigger function without an assigned relation -
a TupleDescription must be assigned to the NEW and OLD variables.
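
For illustration, a minimal sketch (table, column, and function names
are hypothetical): the same function body can be valid for one table
and invalid for another, so the checker has to know which relation the
trigger is attached to.

CREATE TABLE t1 (a int, b text);

CREATE FUNCTION trg() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
    -- NEW.b resolves only once the trigger is bound to a table having b
    NEW.b := upper(NEW.b);
    RETURN NEW;
END;
$$;

CREATE TRIGGER t1_trg BEFORE INSERT ON t1
    FOR EACH ROW EXECUTE PROCEDURE trg();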

Regards

Pavel




 --
 Álvaro Herrera alvhe...@commandprompt.com
 The PostgreSQL Company - Command Prompt, Inc.
 PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] review: CHECK FUNCTION statement

2012-03-02 Thread Pavel Stehule
2012/3/2 Alvaro Herrera alvhe...@commandprompt.com:
 I've cleaned up the backend code a bit -- see attached.  More yet to go
 through; I'm mainly sending it out for you (and everyone, really) to
 give your opinion on my changes so far.

 (I split out the plpgsql checker for the time being into a separate
 branch; I'll get on it after I get this part committed.)

it looks good

Regards

Pavel Stěhule


 --
 Álvaro Herrera alvhe...@commandprompt.com
 The PostgreSQL Company - Command Prompt, Inc.
 PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] COPY with hints, rebirth

2012-03-02 Thread Simon Riggs
On Thu, Mar 1, 2012 at 8:49 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 01.03.2012 18:40, Simon Riggs wrote:

 On Sun, Feb 26, 2012 at 7:16 PM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com  wrote:

 On 24.02.2012 22:55, Simon Riggs wrote:


 What exactly does it do? Previously, we optimised COPY when it was
 loading data into a newly created table or a freshly truncated table.
 This patch extends that and actually sets the tuple header flag as
 HEAP_XMIN_COMMITTED during the load. Doing so is simple 2 lines of
 code. The patch also adds some tests for corner cases that would make
 that action break MVCC - though those cases are minor and typical data
 loads will benefit fully from this.


 This doesn't work with subtransactions:

 ...

 The query should return the row copied in the same subtransaction.


 Thanks for pointing that out.

 New patch with corrected logic and test case attached.


 It's still broken:

Agreed. Thanks for your detailed input.

Performance test results show that playing with XidInMVCCSnapshot()
causes a small but growing performance regression that I find
unacceptable, so while I might fix the above, I doubt if I can improve
this as well.

So this approach isn't the one...

The COPY FREEZE patch provides a way for the user to say explicitly
that they don't really care about these MVCC corner cases and as a
result allows us to avoid touching XidInMVCCSnapshot() at all. So
there is still a patch on the table.
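
As a sketch of the intended usage (assuming an option spelled FREEZE and
a table created or truncated in the same transaction; the exact syntax
is whatever the patch settles on, and the table/file names are made up):

BEGIN;
TRUNCATE big_table;
COPY big_table FROM '/tmp/big_table.dat' (FREEZE);
COMMIT;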

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Allowing multi -t and adding -n to vacuumdb ?

2012-03-02 Thread Jehan-Guillaume (ioguix) de Rorthais
On 01/03/2012 23:13, Tom Lane wrote:
 Jehan-Guillaume (ioguix) de Rorthais j...@dalibo.com writes:
 One of our customer send us a patch he wrote for his needs (on
 src/bin/scripts/vacuumdb.c, no doc were included).
 
 He's using one schema per application and would like to be able to run
 vacuumdb on each of them independently so he added the --schema|-n
 option and send us the patch.
 
 Reviewing his patch, I thought it would be more useful to allow multi
 --table|-t options on the command line first. It might be possible to
 pass an array of tables to vacuum_one_database function instead of
 just one.
 
 Then, we could add the --schema|-n option which would fetch and build
 the table list and call vacuum_one_database.
 
 But before I start writing this patch, I would like some opinion, pros /
 cons. Do you think such a feature could be accepted in official
 PostgreSQL code or should we keep this as an external script ?
 
 I think most of us see vacuumdb as a historical leftover.  We keep it
 around in case anyone is still relying on it, but improving it seems
 like misdirected effort.
 
 Why isn't your customer using autovacuum?  If there are concrete
 reasons why that doesn't get the job done for him, it would be more
 useful in the long run to work on fixing that.

I usually advise using both of them: vacuumdb and autovacuum (with
tuning if needed).

Autovacuum launches its workers whenever tables need some maintenance.
It is usually pretty discreet in most cases, but not all of them (cf.
Kevin Grittner's email), and you cannot control when these workers are
launched.

As a DBA, you know when you can afford some maintenance during the
day/week/month WRT your db activity, and you can plan it accordingly.
Hence, running vacuumdb from crontab just does autovacuum's job in some
situations, avoiding a random worker during the day.
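
Coming back to the --schema idea: the table list such an option needs
could be built with a simple catalog query; a sketch (schema name
hypothetical):

SELECT c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'app_schema'
  AND c.relkind = 'r';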


-- 
Jehan-Guillaume (ioguix) de Rorthais
http://www.dalibo.com



[HACKERS] autovacuum locks

2012-03-02 Thread Gregg Jaskiewicz
Folks,

I've got a system here (8.3.7) that is locked up. A few queries are
waiting for autovacuum-acquired locks on a table or two.
But it looks like autovacuum is also waiting on a semaphore:


#0  0x00f07410 in __kernel_vsyscall ()
#1  0x00252d2b in semop () from /lib/libc.so.6
#2  0x081ca1ce in PGSemaphoreLock ()
#3  0x081f590e in ProcWaitForSignal ()
#4  0x081e8dea in LockBufferForCleanup ()
#5  0x08160e68 in ?? ()
#6  0x08161482 in lazy_vacuum_rel ()
#7  0x081603b5 in ?? ()
#8  0x081605c4 in vacuum ()
#9  0x081cd154 in ?? ()
#10 0x081cd7f1 in ?? ()
#11 0x081cd911 in StartAutoVacWorker ()
#12 0x081d6d47 in ?? ()
#13 signal handler called
#14 0x00f07410 in __kernel_vsyscall ()
#15 0x00249fcd in ___newselect_nocancel () from /lib/libc.so.6
#16 0x081d3964 in ?? ()
#17 0x081d5105 in PostmasterMain ()
#18 0x0818be60 in main ()

There are still other transactions going on and working fine - but two
tables seem to be cut off by autovacuum, with other queries stuck
waiting for autovacuum to finish.

Any ideas? Is there a possible fix up the 8.3 stream that would target
this? (I need leverage to upgrade to 8.3.18, and if that's the case -
happy days.)


-- 
GJ



Re: [HACKERS] autovacuum locks

2012-03-02 Thread Alvaro Herrera

Excerpts from Gregg Jaskiewicz's message of vie mar 02 07:44:07 -0300 2012:
 Folks,
 
 I've got a system here (8.3.7) that is locked up. A few queries are
 waiting for autovacuum-acquired locks on a table or two.
 But it looks like autovacuum is also waiting on a semaphore:
 
 
 #0  0x00f07410 in __kernel_vsyscall ()
 #1  0x00252d2b in semop () from /lib/libc.so.6
 #2  0x081ca1ce in PGSemaphoreLock ()
 #3  0x081f590e in ProcWaitForSignal ()
 #4  0x081e8dea in LockBufferForCleanup ()
 #5  0x08160e68 in ?? ()
 #6  0x08161482 in lazy_vacuum_rel ()
 #7  0x081603b5 in ?? ()
 #8  0x081605c4 in vacuum ()
 #9  0x081cd154 in ?? ()
 #10 0x081cd7f1 in ?? ()
 #11 0x081cd911 in StartAutoVacWorker ()
 #12 0x081d6d47 in ?? ()
 #13 signal handler called
 #14 0x00f07410 in __kernel_vsyscall ()
 #15 0x00249fcd in ___newselect_nocancel () from /lib/libc.so.6
 #16 0x081d3964 in ?? ()
 #17 0x081d5105 in PostmasterMain ()
 #18 0x0818be60 in main ()
 
 There are still other transactions going on and working fine - but two
 tables seem to be cut off by autovacuum, with other queries stuck
 waiting for autovacuum to finish.

This vacuum is waiting for some other process to release the buffer
lock.  These locks are usually very short-lived, but maybe you have some
other backend doing something weird that won't release it.  Do you have
idle processes with transactions open, and if so, what are they doing?
Do you have prepared transactions open? (see the pg_prepared_xacts
view)
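
A sketch of what to look at (pg_prepared_xacts exists in 8.3, and the
pg_stat_activity column names below are the 8.3 ones):

SELECT gid, prepared, owner, database FROM pg_prepared_xacts;

SELECT procpid, xact_start, current_query
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction';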

 Any ideas ?
 Is there a possible fix up the 8.3 stream, that would target that ?

I don't think so, though I am not certain.

 (I need leverage  to upgrade to 8.3.18, and if that's the case - happy
 days).

Sorry, can't help you there :-)

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] autovacuum locks

2012-03-02 Thread Gregg Jaskiewicz
On 2 March 2012 11:03, Alvaro Herrera alvhe...@commandprompt.com wrote:

 Excerpts from Gregg Jaskiewicz's message of vie mar 02 07:44:07 -0300 2012:
 Folks,

 I've got a system here (8.3.7) that is locked up. A few queries are
 waiting for autovacuum-acquired locks on a table or two.
 But it looks like autovacuum is also waiting on a semaphore:


 #0  0x00f07410 in __kernel_vsyscall ()
 #1  0x00252d2b in semop () from /lib/libc.so.6
 #2  0x081ca1ce in PGSemaphoreLock ()
 #3  0x081f590e in ProcWaitForSignal ()
 #4  0x081e8dea in LockBufferForCleanup ()
 #5  0x08160e68 in ?? ()
 #6  0x08161482 in lazy_vacuum_rel ()
 #7  0x081603b5 in ?? ()
 #8  0x081605c4 in vacuum ()
 #9  0x081cd154 in ?? ()
 #10 0x081cd7f1 in ?? ()
 #11 0x081cd911 in StartAutoVacWorker ()
 #12 0x081d6d47 in ?? ()
 #13 signal handler called
 #14 0x00f07410 in __kernel_vsyscall ()
 #15 0x00249fcd in ___newselect_nocancel () from /lib/libc.so.6
 #16 0x081d3964 in ?? ()
 #17 0x081d5105 in PostmasterMain ()
 #18 0x0818be60 in main ()

 There are still other transactions going on and working fine - but two
 tables seem to be cut off by autovacuum, with other queries stuck
 waiting for autovacuum to finish.

 This vacuum is waiting for some other process to release the buffer
 lock.  These locks are usually very short-lived, but maybe you have some
 other backend doing something weird that won't release it.  Do you have
 idle processes with transactions open, and if so, what are they doing?
 Do you have prepared transactions open? (see the pg_prepared_xacts
 view)

That view is empty, but I know that the queries waiting for locks to be
released are in the middle of a two-phase commit transaction.

It could be that the system is doing something but is just so slow that
those locks are starved - and no longer so short-lived :)

Is there any particular info I could gather to make it easier for you
guys to look at?

Here's a backtrace from one of the autovacuums (this time with line # info):
#0  0x00f07410 in __kernel_vsyscall ()
#1  0x00252d2b in semop () from /lib/libc.so.6
#2  0x081ca1ce in PGSemaphoreLock (sema=0xb792bdf8, interruptOK=1
'\001') at pg_sema.c:420
#3  0x081f590e in ProcWaitForSignal () at proc.c:1269
#4  0x081e8dea in LockBufferForCleanup (buffer=24516) at bufmgr.c:2304
#5  0x08160e68 in lazy_vacuum_heap (onerel=0x87573c0,
vacrelstats=0x86a4768) at vacuumlazy.c:655
#6  0x08161482 in lazy_scan_heap (onerel=0x87573c0,
vacstmt=0xbfbcfb50, bstrategy=0x86fbe98) at vacuumlazy.c:591
#7  lazy_vacuum_rel (onerel=0x87573c0, vacstmt=0xbfbcfb50,
bstrategy=0x86fbe98) at vacuumlazy.c:191
#8  0x081603b5 in vacuum_rel (relid=value optimized out,
vacstmt=0xbfbcfb50, expected_relkind=114 'r', for_wraparound=0 '\000')
at vacuum.c:1126
#9  0x081605c4 in vacuum (vacstmt=0xbfbcfb50, relids=0x86fd310,
bstrategy=0x86fbe98, for_wraparound=0 '\000', isTopLevel=1 '\001') at
vacuum.c:427
#10 0x081cd154 in do_autovacuum () at autovacuum.c:2638
#11 0x081cd7f1 in AutoVacWorkerMain (argc=value optimized out,
argv=value optimized out) at autovacuum.c:1613
#12 0x081cd911 in StartAutoVacWorker () at autovacuum.c:1416
#13 0x081d6d47 in StartAutovacuumWorker (postgres_signal_arg=10) at
postmaster.c:4109
#14 sigusr1_handler (postgres_signal_arg=10) at postmaster.c:3845
#15 signal handler called
#16 0x00f07410 in __kernel_vsyscall ()
#17 0x00249fcd in ___newselect_nocancel () from /lib/libc.so.6
#18 0x081d3964 in ServerLoop () at postmaster.c:1234
#19 0x081d5105 in PostmasterMain (argc=5, argv=0x8641828) at postmaster.c:1029
#20 0x0818be60 in main (argc=5, argv=0x0) at main.c:188


And the connection that is waiting for locks to be released (the
locked up connection). That query is hanging there since ~18:00 BST:

#0  0x00f07410 in __kernel_vsyscall ()
#1  0x00252d2b in semop () from /lib/libc.so.6
#2  0x081ca1ce in PGSemaphoreLock (sema=0xb7929618, interruptOK=1
'\001') at pg_sema.c:420
#3  0x081f5c53 in ProcSleep (locallock=0x86578f0,
lockMethodTable=0x834e0f4) at proc.c:880
#4  0x081f4a5d in WaitOnLock (locallock=0x86578f0, owner=0x88676d8) at
lock.c:1147
#5  0x081f50af in LockAcquire (locktag=0xbfbcf2b8, lockmode=7,
sessionLock=0 '\000', dontWait=0 '\000') at lock.c:792
#6  0x081f27f9 in LockTuple (relation=0xa72f53a8, tid=0xbfbcf334,
lockmode=7) at lmgr.c:378
#7  0x0809b19c in heap_update (relation=0xa72f53a8, otid=0xbfbcf3da,
newtup=0x86d3be0, ctid=0xbfbcf3d4, update_xmax=0xbfbcf3e0, cid=2,
crosscheck=0x0, wait=1 '\001') at heapam.c:2376
#8  0x08167f8b in ExecutePlan (queryDesc=0x870d010,
direction=ForwardScanDirection, count=0) at execMain.c:1880
#9  ExecutorRun (queryDesc=0x870d010, direction=ForwardScanDirection,
count=0) at execMain.c:270
#10 0x0818051e in _SPI_execute_plan (plan=0x87b8948, Values=0x8e960f8,
Nulls=0x8e960e8   , snapshot=0x0, crosscheck_snapshot=0x0,
read_only=0 '\000', fire_triggers=1 '\001', tcount=0) at spi.c:1845
#11 0x08180796 in SPI_execute_plan (plan=0x87b8948, Values=0x8e960f8,
Nulls=0x8e960e8   , read_only=0 '\000', 

Re: [HACKERS] autovacuum locks

2012-03-02 Thread Gregg Jaskiewicz
Looking at the system a bit more now, it looks like the 'waiting' states
are changing for both the query and autovacuum in pg_stat_activity -
but very slowly. It looks like they both got into the sort of state
that keeps them on the edge of starvation.

So this isn't really a deadlock, but rather very poor performance in
this corner case.

This is a test system, so I can harvest some more data from it. I've
already got core files for the autovacuum and query connections.
I'm hoping for more suggestions as to how to gather more intel that
could be useful to you guys.

Unfortunately my knowledge of the locking mechanisms in PostgreSQL is
rather poor, beyond the basics.



Re: [HACKERS] Bugs/slowness inserting and indexing cubes

2012-03-02 Thread Heikki Linnakangas

On 20.02.2012 10:54, Alexander Korotkov wrote:

On Wed, Feb 15, 2012 at 7:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:


Alexander Korotkov aekorot...@gmail.com writes:

On Wed, Feb 15, 2012 at 4:26 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

So, I think we should go with your original fix and simply do nothing in
gistRelocateBuildBuffersOnSplit() if the page doesn't have a buffer.



I agree.


OK, I won't object.


So, I think we can just commit first version of fix now.


Thanks, committed. Sorry for the delay.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Website stylesheet for local docs

2012-03-02 Thread Magnus Hagander
On Mon, Feb 27, 2012 at 17:26, Magnus Hagander mag...@hagander.net wrote:
 On Mon, Feb 27, 2012 at 16:20, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Mon, Feb 27, 2012 at 04:37, Robert Haas robertmh...@gmail.com wrote:
 Why not change the default?  Does anyone really prefer the bare bones
 doc output?

 Yes, Peter made a point about preferring that back when we changed the
 developer docs to be on the main website (about how it got worse, but
 at least he could still work on his local build).

 FWIW, I don't especially like the website style either --- it's too busy
 calling attention to itself with colored backgrounds etc.

 There we go, at least two people, and people who do a lot of builds
 and checks of the docs, like the current format. So I think that's a
 good argument to keep the current format the default, and just add a
 target like my suggestion as an *option* :-)

OK, I've applied the current version of this patch on the basis that
it's useful for many, but kept the default value. I've also gone with
not putting the CSS into the tree at this point, since it's multiple
files with lots of interactions. We can enhance it with that later.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Reducing bgwriter wakeups

2012-03-02 Thread Heikki Linnakangas

On 20.02.2012 00:18, Robert Haas wrote:

On Sun, Feb 19, 2012 at 4:11 PM, Simon Riggs si...@2ndquadrant.com wrote:

On Sun, Feb 19, 2012 at 8:15 PM, Robert Haas robertmh...@gmail.com wrote:

On Sun, Feb 19, 2012 at 1:53 PM, Simon Riggs si...@2ndquadrant.com wrote:

Recent changes for power reduction mean that we now issue a wakeup
call to the bgwriter every time we set a hint bit.

However cheap that is, its still overkill.

My proposal is that we wakeup the bgwriter whenever a backend is
forced to write a dirty buffer, a job the bgwriter should have been
doing.

This significantly reduces the number of wakeup calls and allows the
bgwriter to stay asleep even when very light traffic happens, which is
good because the bgwriter is often the last process to sleep.


That seems like swinging the pendulum too much in the other direction, 
as others have noted. A simple thing you could do, however, is to only 
wake up bgwriter every 10 dirtied pages in the backend or something like 
that. That would reduce the wakeups by a factor of 10. Would that be 
useful? It's not actually clear to me what the problem you're trying to 
solve is.



Seems useful to have an explicit discussion on this point, especially
in view of recent performance results.


I don't see what this has to do with recent performance results, so
please elaborate.  Off-hand, I don't see any point in getting cheap.
It seems far more important to me that the background writer become
active when needed than that we save some trivial amount of power by
waiting longer before activating it.


Then you misunderstand, since I am advocating waking it when needed.


Well, I guess that depends on when it's actually needed.  You haven't
presented any evidence one way or the other.

I mean, let's suppose that a sudden spike of activity hits a
previously-idle system.  If we wait until all of shared_buffers is
dirty before waking up the background writer, it seems possible that
the background writer is going to have a hard time catching up.  If we
wake it immediately, we don't have that problem.


Well, as long as the OS has some clean buffers, as it presumably does if 
the system has been idle for a while, bgwriter will catch up very 
quickly by simply dumping a large number of dirty pages to the OS. Also, 
as the code stands, bgwriter still wakes up every 10 seconds even when 
no-one signals it, which makes this much less likely to happen.


Nevertheless, I also feel that it would be better for bgwriter to be a 
bit more proactive than that.



Also, in general, I think that it's not a good idea to let dirty data
sit in shared_buffers forever.  I'm unhappy about the change this
release cycle to skip checkpoints if we've written less than a full
WAL segment, and this seems like another step in that direction.  It's
exposing us to needless risk of data loss.  In 9.1, if you process a
transaction and, an hour later, the disk where pg_xlog is written
melts into a heap of molten slag, your transaction will be there, even
if you end up having to run pg_resetxlog.  In 9.2, it may well be that
xlog contains the only record of that transaction, and you're hosed.
The more work we do to postpone writing the data until the absolutely
last possible moment, the more likely it is that it won't be on disk
when we need it.


True. (but as noted above, bgwriter still wakes up every 10 seconds so 
this isn't really an issue at the moment)


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] pg_basebackup -x stream from the standby gets stuck

2012-03-02 Thread Magnus Hagander
On Tue, Feb 28, 2012 at 09:22, Fujii Masao masao.fu...@gmail.com wrote:
 On Thu, Feb 23, 2012 at 1:02 AM, Magnus Hagander mag...@hagander.net wrote:
 On Tue, Feb 7, 2012 at 12:30, Fujii Masao masao.fu...@gmail.com wrote:
 Hi,

 http://www.depesz.com/2012/02/03/waiting-for-9-2-pg_basebackup-from-slave/
 =$ time pg_basebackup -D /home/pgdba/slave2/ -F p -x stream -c fast -P -v 
 -h 127.0.0.1 -p 5921 -U replication
 xlog start point: 2/AC4E2600
 pg_basebackup: starting background WAL receiver
 692447/692447 kB (100%), 1/1 tablespace
 xlog end point: 2/AC4E2600
 pg_basebackup: waiting for background process to finish streaming...
 pg_basebackup: base backup completed

 real    3m56.237s
 user    0m0.224s
 sys     0m0.936s

 (the time is long because this is only a test database with no traffic,
 so I had to make some inserts for it to finish)

 The above article points out the problem of pg_basebackup from the standby:
 when -x stream is specified, pg_basebackup from the standby gets stuck if
 there is no traffic in the database.

 When -x stream is specified, pg_basebackup forks the background process
 for receiving WAL records during backup, takes an online backup and
 waits for the background process to end. The forked background process
 keeps receiving WAL records, and whenever it reaches the end of a WAL
 file, it checks whether it has already received all WAL files required
 for the backup, and exits if yes. This means that at least one WAL
 segment switch is required for pg_basebackup with -x stream to end.

 In the backup from the master, a WAL file switch always occurs at both
 start and end of backup (i.e., in do_pg_start_backup() and
 do_pg_stop_backup()), so the above logic works fine even if there is no
 traffic. OTOH, in the backup from the standby, while there is no
 traffic, no WAL file switch is performed at all. So in that case, there
 is no chance for the background process to reach the end of a WAL file,
 check whether all required WAL has arrived, and exit. In the end,
 pg_basebackup gets stuck.

 To fix the problem, I'd propose to change the background process so
 that it checks whether all required WAL has arrived every time data is
 received, even if the end of the WAL file is not reached. Patch
 attached. Comments?

 This seems like a good thing in general.

 Why does it need to modify pg_receivexlog, though? I thought only
 pg_basebackup had this issue?

 I guess it is because of the change of the API to
 stream_continue_callback only?

 Yes, that's the reason why I changed continue_streaming() in pg_receivexlog.c.

 But the reason why I changed segment_callback() in pg_receivexlog.c is
 not the same. I did that because previously segment_finish_callback was
 called only at the end of a WAL segment, but with the patch it can be
 called in the middle of a segment. OTOH, segment_callback() must emit a
 verbose message only when the current WAL segment is complete. So I had
 to add a check of whether the current WAL segment is partial or
 complete into segment_callback().

Yeah, I caught that.


 Looking at it after your patch,
 stream_continue_callback and segment_finish_callback are the same.
 Should we perhaps just fold them into a single
 stream_continue_callback, since you had to move the segment-end
 detection to the caller anyway?

 No. I think we cannot do that because in pg_receivexlog they are not the same.

But couldn't they be made the same by making the same check as you put
in for the verbose message above?


 Another question related to this - since we clearly don't need the
 xlog switch in this case, should we make it conditional on the master
 as well, so we don't switch unnecessarily there as well?

 Maybe. At the end of backup, we force a WAL segment switch to ensure
 all required WAL files have been archived. So theoretically, if WAL
 archiving is not enabled, we can skip the WAL segment switch. But some
 backup tools might depend on this behavior.

I was thinking we could keep doing it in pg_stop_backup(), but avoid
doing it when using pg_basebackup only...


 In standby-only backup, we always skip the WAL segment switch. So there
 is no guarantee that all WAL files required for the backup are archived
 at the end of backup. This limitation is documented.

Right.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] autovacuum locks

2012-03-02 Thread Alvaro Herrera

Excerpts from Gregg Jaskiewicz's message of vie mar 02 08:22:22 -0300 2012:
 
 Looking at the system a bit more now, it looks like the 'waiting'
 states are changing for both the query and autovacuum in
 pg_stat_activity - but very slowly. It looks like they both got into
 the sort of state that keeps them on the edge of starvation.

Ouch.

 So this isn't really a deadlock, but rather very poor performance in
 this corner case.

Right.  I think I can explain how this locking works: autovacuum needs a
cleanup lock on the page being processed, which is a special exclusive
lock which also requires that no one is holding a pin on the buffer.
Any process running a query holds a pin on the buffer while inspecting
tuples on it; when it's done with tuples on that page it should move on
to the next page in the table -- same as autovac.  So what seems to be
happening here is that the autovac and the scan are in sync walking the
table, stepping on each other's toes.

What I don't know is why they are so closely in sync.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] review: CHECK FUNCTION statement

2012-03-02 Thread Alvaro Herrera

Excerpts from Pavel Stehule's message of vie mar 02 05:29:26 -0300 2012:

 you cannot to check trigger function without assigned relation -
 TupleDescription should be assigned to NEW and OLD variables.

Oh, I see, that makes sense.

After mulling over this a bit, I'm dubious about having two separate
commands, one which checks triggers and another that checks non-trigger
functions.  Wouldn't it make more sense to have some options in CHECK
FUNCTION so that it receives the trigger and corresponding relation name
to check?  For example "check function foo() trigger on tab" or
something like that?

I also wonder if it would make sense to have grammar for "check all
triggers on table xyz" or some such, and even "check all triggers on
all functions".

Another thing is that "CHECK FUNCTION ALL FOR ROLE foo" seems a bit
strange to me.  What about "CHECK FUNCTION ALL OWNED BY foo" instead?
("CHECK FUNCTION ALL" seems strange as a whole, but I'm not sure that we
can improve that ... still, if anyone has ideas I'm sure we can discuss)

As a reminder: we also have
CHECK FUNCTION ALL IN SCHEMA f
and
CHECK FUNCTION ALL IN LANGUAGE f
(and combinations thereof)
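
Putting the candidate forms side by side (all proposed syntax, still
under discussion; names are placeholders):

CHECK FUNCTION foo(int);
CHECK FUNCTION foo() TRIGGER ON tab;      -- as suggested above
CHECK FUNCTION ALL IN SCHEMA f;
CHECK FUNCTION ALL IN LANGUAGE f;
CHECK FUNCTION ALL FOR ROLE foo;          -- or perhaps: OWNED BY foo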

Thoughts?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Patch pg_is_in_backup()

2012-03-02 Thread Gilles Darold
On 03/02/2012 10:52, Magnus Hagander wrote:
 On Fri, Feb 3, 2012 at 10:47, Fujii Masao masao.fu...@gmail.com wrote:
 On Fri, Feb 3, 2012 at 6:10 PM, Bernd Helmle maili...@oopsware.de wrote:

 --On 3. Februar 2012 13:21:11 +0900 Fujii Masao masao.fu...@gmail.com
 wrote:

 It seems to be more user-friendly to introduce a view like pg_stat_backup
 rather than the function returning an array.

 I like this idea. A use case I saw for monitoring backup_labels in the
 past was mainly to discover a forgotten exclusive pg_stop_backup() (e.g.
 due to broken backup scripts). If the view were able to distinguish
 both exclusive and non-exclusive backups, this would be great.
 Agreed. Monitoring an exclusive backup is very helpful. But I wonder
 why we want to monitor non-exclusive backup. Is there any use case?
 Actually, we can already monitor much of the non-exclusive one through
 pg_stat_replication. Including the info on when it was started (at
 least in almost every case, that will be more or less the
 backend_start time for that one)

 If we want to monitor non-exclusive backup, why not pg_dump backup?
 .. which we can also monitor through pg_stat_activity by looking at
 application_name (which can be faked of course, but still)

 If there is no use case, it seems sufficient to implement the function
 which reports the information only about exclusive backup.
 Yeah, thinking more about it, I think I agree. But the function should
 then probably be named in such a way that it's clear that we're
 talking about exclusive backups, e.g. not pg_is_in_backup() but
 instead pg_is_in_exclusive_backup() (renamed if we change it to return
 the timestamp instead, of course, but you get the idea)

Agreed, and sorry for the delayed response. I've attached 2 patches
here. The first one is the same as before, with just the renaming of the
function to pg_is_in_exclusive_backup(). Maybe this patch should be
abandoned in favor of the second one, which introduces a new function
called pg_exclusive_backup_start_time() that returns the backup_label
START TIME information as a timestamp with time zone.

Sample usage/result:

postgres=# select pg_start_backup('Online backup');
 pg_start_backup
-----------------
 0/220
(1 row)

postgres=# select pg_exclusive_backup_start_time();
 pg_exclusive_backup_start_time
--------------------------------
 2012-03-02 14:52:49+01
(1 row)

postgres=# select now() - pg_exclusive_backup_start_time() as backup_started_since;
 backup_started_since
----------------------
 00:12:24.569226
(1 row)

postgres=# select pg_stop_backup();
 pg_stop_backup
----------------
 0/298
(1 row)

postgres=# select pg_exclusive_backup_start_time();
 pg_exclusive_backup_start_time
--------------------------------

(1 row)

Regards,

-- 
Gilles Darold
http://dalibo.com - http://dalibo.org

diff -ru postgresql/doc/src/sgml/func.sgml postgresql-is_in_exclusive_backup-patch//doc/src/sgml/func.sgml
--- postgresql/doc/src/sgml/func.sgml	2012-01-26 23:01:31.956613398 +0100
+++ postgresql-is_in_exclusive_backup-patch//doc/src/sgml/func.sgml	2012-03-01 10:58:20.556472776 +0100
@@ -14371,6 +14371,9 @@
     <primary>pg_current_xlog_location</primary>
    </indexterm>
    <indexterm>
+    <primary>pg_is_in_exclusive_backup</primary>
+   </indexterm>
+   <indexterm>
     <primary>pg_start_backup</primary>
    </indexterm>
    <indexterm>
@@ -14424,6 +14427,14 @@
   </row>
   <row>
    <entry>
+    <literal><function>pg_is_in_exclusive_backup()</function></literal>
+    </entry>
+   <entry><type>bool</type></entry>
+   <entry>True if an on-line exclusive backup is still in progress.
+   </entry>
+  </row>
+  <row>
+   <entry>
     <literal><function>pg_start_backup(<parameter>label</> <type>text</> <optional>, <parameter>fast</> <type>boolean</> </optional>)</function></literal>
    </entry>
    <entry><type>text</type></entry>
diff -ru postgresql/src/backend/access/transam/xlogfuncs.c postgresql-is_in_exclusive_backup-patch//src/backend/access/transam/xlogfuncs.c
--- postgresql/src/backend/access/transam/xlogfuncs.c	2012-01-26 23:01:32.032613398 +0100
+++ postgresql-is_in_exclusive_backup-patch//src/backend/access/transam/xlogfuncs.c	2012-03-01 10:48:48.056473056 +0100
@@ -465,3 +465,12 @@
 {
 	PG_RETURN_BOOL(RecoveryInProgress());
 }
+
+/*
+ * Returns bool with current on-line backup mode, a global state.
+ */
+Datum
+pg_is_in_exclusive_backup(PG_FUNCTION_ARGS)
+{
+	PG_RETURN_BOOL(BackupInProgress());
+}
diff -ru postgresql/src/include/access/xlog_internal.h postgresql-is_in_exclusive_backup-patch//src/include/access/xlog_internal.h
--- postgresql/src/include/access/xlog_internal.h	2012-01-26 23:01:32.332613398 +0100
+++ postgresql-is_in_exclusive_backup-patch//src/include/access/xlog_internal.h	2012-03-01 10:48:48.076473056 +0100
@@ -281,5 +281,6 @@
 extern Datum pg_xlog_replay_pause(PG_FUNCTION_ARGS);
 extern Datum 

Re: [HACKERS] review: CHECK FUNCTION statement

2012-03-02 Thread Pavel Stehule
Hello

2012/3/2 Alvaro Herrera alvhe...@commandprompt.com:

 Excerpts from Pavel Stehule's message of vie mar 02 05:29:26 -0300 2012:

 you cannot to check trigger function without assigned relation -
 TupleDescription should be assigned to NEW and OLD variables.

 Oh, I see, that makes sense.

 After mulling over this a bit, I'm dubious about having two separate
 commands, one which checks triggers and another that checks non-trigger
 functions.  Wouldn't it make more sense to have some options into CHECK
 FUNCTION so that it receives the trigger and corresponding relation name
 to check?  For example check function foo() trigger on tab or
 something like that?


I don't like it - CHECK TRIGGER is simple and consistent.

 I also wonder if it would make sense to have grammar for check all
 triggers on table xyz or some such, and even check all triggers on all
 functions.

this is a good idea - I like it

CHECK TRIGGER ALL ON TABLE X

Some combinations are possible, like CHECK TRIGGER ALL IN SCHEMA ...;
and similar. But I am not sure whether this is necessary - for more
complex usage a developer can use a direct PL check function.


 Another thing is that CHECK FUNCTION ALL FOR ROLE foo seems a bit
 strange to me.  What about CHECK FUNCTION ALL OWNED BY foo instead?
 (CHECK FUNCTION ALL seems strange as a whole, but I'm not sure that we
 can improve that ... still, if anyone has ideas I'm sure we can discuss)

This may not be nice from a language point of view, but it doesn't need
new keywords.

The patterns FOR ROLE and IN SCHEMA are already used:

ALTER DEFAULT PRIVILEGES FOR ROLE

but OWNED BY is used too (SeqOptElem:).

I agree that OWNED BY sounds better (it can be changed).


 As a reminder: we also have
 CHECK FUNCTION ALL IN SCHEMA f

it is ok

 and
 CHECK FUNCTION ALL IN LANGUAGE f
 (and combinations thereof)


it is ok too

Regards

Pavel

 Thoughts?

 --
 Álvaro Herrera alvhe...@commandprompt.com
 The PostgreSQL Company - Command Prompt, Inc.
 PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] autovacuum locks

2012-03-02 Thread Robert Haas
On Fri, Mar 2, 2012 at 6:22 AM, Gregg Jaskiewicz gryz...@gmail.com wrote:
 Looking at the system a bit more now, it looks like the 'waiting'
 states are changing for both the query and autovacuum in
 pg_stat_activity - but very slowly. It looks like they both got into
 the sort of state that keeps them on the edge of starvation.

 So this isn't really a deadlock, but rather very poor performance in
 this corner case.

Are you making any use of cursors?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] autovacuum locks

2012-03-02 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Right.  I think I can explain how this locking works: autovacuum needs a
 cleanup lock on the page being processed, which is a special exclusive
 lock which also requires that no one is holding a pin on the buffer.
 Any process running a query holds a pin on the buffer while inspecting
 tuples on it; when it's done with tuples on that page it should move on
 to the next page in the table -- same as autovac.  So what seems to be
 happening here is that the autovac and the scan are in sync walking the
 table, stepping on each other's toes.

I don't believe that.  The trace shows the other process is waiting for
a tuple lock, which is not something that autovacuum would take.

Given the reference to prepared transactions, what seems likely is that
the UPDATE command is blocked by a prepared transaction (IOW, one that
already updated the same tuple) and VACUUM is stuck behind the UPDATE.
So the real problem is slow removal of prepared transactions, which
most likely is an application logic problem.  It's certainly not
autovac's fault.

regards, tom lane



Re: [HACKERS] autovacuum locks

2012-03-02 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 So the real problem is slow removal of prepared transactions,
 which most likely is an application logic problem.  It's certainly
 not autovac's fault.
 
Yeah, I've seen way too much Java code lately which fails to close
ResultSet or Statement (which includes PreparedStatement) objects,
leaving it to the vagaries of the Java garbage collector to close
these objects right before freeing their memory to the Java heap. 
People should never count on the finalize() method to do this for
them -- the close() method should be invoked as soon as the object
is no longer needed, and definitely before it goes out of scope.
 
-Kevin



Re: [HACKERS] ECPG FETCH readahead

2012-03-02 Thread Noah Misch
On Thu, Dec 29, 2011 at 10:46:23AM +0100, Boszormenyi Zoltan wrote:
 On 2011-11-16 20:51, Boszormenyi Zoltan wrote:
  On 2010-10-14 11:56, Boszormenyi Zoltan wrote:
  On Thu, Jun 24, 2010 at 8:19 AM, Michael Meskes mes...@postgresql.org 
  wrote:
  On Thu, Jun 24, 2010 at 12:04:30PM +0300, Heikki Linnakangas wrote:
  Is there a reason not to enable it by default? I'm a bit worried
  that it will receive no testing if it's not always on.

  Yes, there is a reason, namely that you don't need it in native mode at
  all. ECPG can read as many records as you want in one go, something
  ESQL/C apparently cannot do. This patch is a workaround for that
  restriction. I still do not really see that this feature gives us an
  advantage in native mode though.

We yet lack a consensus on whether native ECPG apps should have access to the
feature.  My 2c is to make it available, because it's useful syntactic sugar.
If your program independently processes each row of an arbitrary-length result
set, current facilities force you to add an extra outer loop to batch the
FETCHes for every such code site.  Applications could define macros to
abstract that pattern, but this seems common-enough to justify bespoke
handling.  Besides, minimalists already use libpq directly.

I suggest enabling the feature by default but drastically reducing the default
readahead chunk size from 256 to, say, 5.  That still reduces the FETCH round
trip overhead by 80%, but it's small enough not to attract pathological
behavior on a workload where each row is a 10 MiB document.  I would not offer
an ecpg-time option to disable the feature per se.  Instead, let the user set
the default chunk size at ecpg time.  A setting of 1 effectively disables the
feature, though one could later re-enable it with ECPGFETCHSZ.

  The ASAP took a little long. The attached patch is in git diff format,
  because (1) the regression test intentionally doesn't do ECPGdebug()
  so the patch isn't dominated by a 2MB stderr file, so this file is empty
  and (2) regular diff cannot cope with empty new files.

Avoid the empty file with fprintf(stderr, "Dummy non-empty error output\n");

  - *NEW FEATURE* Readahead can be individually enabled or disabled
by ECPG-side grammar:
  DECLARE curname [ [ NO ] READAHEAD ] CURSOR FOR ...
Without [ NO ] READAHEAD, the default behaviour is used for cursors.

Likewise, this may as well take a chunk size rather than a yes/no.


The patch adds warnings:
error.c: In function `ecpg_raise':
error.c:281: warning: format not a string literal and no format arguments
error.c:281: warning: format not a string literal and no format arguments

The patch adds few comments and no larger comments explaining its higher-level
ideas.  That makes it much harder to review.  In this regard it follows the
tradition of the ECPG code, but let's depart from that tradition for new work.
I mention a few cases below where the need for commentary is acute.

I tested full reads of various "SELECT * FROM generate_series(1, $1)" commands
over a 50ms link, and the patch gives a sound performance improvement.  With
no readahead, a mere N=100 takes 5s to drain.  With readahead enabled, N=1
takes only 3s.  Performance was quite similar to that of implementing my own
batching with "FETCH 256 FROM cur".  When I kicked up ECPGFETCHSZ (after
fixing its implementation -- see below) past the result set size, performance
was comparable to that of simply passing the query through psql.
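
For clarity, the hand-rolled batching referred to above is essentially
this, in plain SQL (a sketch, with a literal row count standing in for
the $1 parameter):

BEGIN;
DECLARE cur CURSOR FOR SELECT * FROM generate_series(1, 100000);
FETCH 256 FROM cur;   -- repeated until it returns fewer than 256 rows
CLOSE cur;
COMMIT;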

 --- a/doc/src/sgml/ecpg.sgml
 +++ b/doc/src/sgml/ecpg.sgml

 @@ -5289,6 +5315,17 @@ while (1)
 </varlistentry>
 
 <varlistentry>
 + <term>-231 (<symbol>ECPG_INVALID_CURSOR</symbol>)</term>
 + <listitem>
 +  <para>
 +   The cursor you are trying to use with readahead has not been opened
 +   yet (SQLSTATE 34000), invalid values were passed to libecpg
 +   (SQLSTATE 42P11) hor not in prerequisite state, i.e.

Typo.

 --- /dev/null
 +++ b/src/interfaces/ecpg/ecpglib/cursor.c
 @@ -0,0 +1,730 @@

cursor.c contains various >78-col lines.  pgindent has limited ability to
improve those, so please split them.

 +static struct cursor_descriptor *
 +add_cursor(int lineno, struct connection *con, const char *name, bool scrollable, int64 n_tuples, bool *existing)
 +{
 +	struct cursor_descriptor *desc,
 +			*ptr, *prev = NULL;
 +	bool		found = false;
 +
 +	if (!name || name[0] == '\0')
 +	{
 +		if (existing)
 +			*existing = false;
 +		return NULL;
 +	}
 +
 +	ptr = con->cursor_desc;
 +	while (ptr)
 +	{
 +		int			ret = strcasecmp(ptr->name, name);
 +
 +		if (ret == 0)
 +		{
 +			found = true;
 +			break;
 +		}
 +		if (ret < 0)
 +			break;
 +
 +		prev = ptr;
 +		ptr = ptr->next;
 +	}

Any reason not to use find_cursor() here?

 

Re: [HACKERS] Collect frequency statistics for arrays

2012-03-02 Thread Tom Lane
Still working through this patch ... there are some things that bother
me about the entries being made in pg_statistic:

1. You re-used STATISTIC_KIND_MCELEM for something that, while similar
to tsvector's usage, is not the same.  In particular, tsvector adds two
extra elements to the stanumbers array, but this patch adds four (and
doesn't bother documenting that in pg_statistic.h).  I don't think it's
acceptable to re-use the same stakind value for something that's not
following the same specification.  I see Nathan complained of this way,
way upthread, but nothing was done about it.

I think we should either assign a different stakind code for this
definition, or change things around so that tsvector actually is using
the same stats kind definition as arrays are.  (We could get away with
redefining the tsvector stats format, because pg_upgrade doesn't try to
copy pg_statistic rows to the new database.)  Now, of the two new
elements added by the patch, it seems to me to be perfectly reasonable
to add a null-element frequency to the kind specification; the fact that
it wasn't there to start with is kind of an oversight born of the fact
that tsvectors don't contain any null lexemes.  But the other new
element is the average distinct element count, which really does not
belong here at all, as it is *entirely* unrelated to element
frequencies.  It seems to me that that more nearly belongs in the
element-count histogram slot.  So my preference is to align the two
definitions of STATISTIC_KIND_MCELEM by adding a null-element frequency
to tsvector's usage (where it'll always be zero) and getting rid of the
average distinct element count here.

2. I think STATISTIC_KIND_LENGTH_HISTOGRAM is badly named and
confusingly documented.  The stats are not about anything I would call a
length --- rather we're considering the counts of numbers of distinct
element values present in each array value.  An ideal name perhaps would
be STATISTIC_KIND_DISTINCT_ELEMENTS_COUNT_HISTOGRAM, but of course
that's unreasonably long.  Considering the way that the existing stats
kind names are abbreviated, maybe STATISTIC_KIND_DECHIST would do.
Anybody have a better idea?

3. I also find it a bit odd that you chose to store the length (count)
histogram as an integer array in stavalues.  Usually we've put such data
in stanumbers.  That would make the entries float4 not integer, but that
doesn't seem unreasonable to me --- values would still be exact up to
2^24 or so on typical machines, and if we ever do have values larger
than that, it seems to me that having headroom to go above 2^32 would
be a good thing.  In any case, if we're going to move the average
distinct-element count over here, that would have to go into stanumbers.
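
For reference, a sketch of how the slots in question can be inspected
(hypothetical table name):

SELECT staattnum, stakind1, stanumbers1, stavalues1
FROM pg_statistic
WHERE starelid = 'some_array_table'::regclass;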

Comments?

regards, tom lane



Re: [HACKERS] 16-bit page checksums for 9.2

2012-03-02 Thread Peter Geoghegan
On 23 February 2012 01:39, Robert Haas robertmh...@gmail.com wrote:
 Thanks for testing this.  The graph obscures a bit how much percentage
 change we're talking about here - could you post the raw tps numbers?

Sorry for not following up on this until now. The report is available from:

http://pagechecksum.staticloud.com/

Note that detailed latency figures for each test are not available in
this report - only the main page is available - but that gets you the
raw tps numbers.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

2012-03-02 Thread Josh Berkus

 It would probably be prudent to concentrate on getting the core
 infrastructure committed first. That way, we at least know that if
 this doesn't get into 9.2, we can work on getting it into 9.3 knowing
 that once committed, people won't have to wait over a year at the very

I don't see why we can't commit the whole thing.  This is way more ready
for prime-time than checksums.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

2012-03-02 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 It would probably be prudent to concentrate on getting the core
 infrastructure committed first. That way, we at least know that if
 this doesn't get into 9.2, we can work on getting it into 9.3 knowing
 that once committed, people won't have to wait over a year at the very

 I don't see why we can't commit the whole thing.  This is way more ready
 for prime-time than checksums.

We'll get to it in due time.  In case you haven't noticed, there's a lot
of stuff in this commitfest.  And I don't follow the logic that says
that because Simon is trying to push through a not-ready-for-commit
patch we should drop our standards for other patches.

regards, tom lane



Re: [HACKERS] Runtime SHAREDIR for testing CREATE EXTENSION

2012-03-02 Thread Peter Eisentraut
On tis, 2012-02-28 at 11:00 -0800, Daniel Farina wrote:
 I'd really like to support libraries (C or otherwise) of multiple
 versions at the same time, when the underlying library permits.

What's preventing you from doing that now?  You need to name all the
symbols differently, of course.




Re: [HACKERS] Runtime SHAREDIR for testing CREATE EXTENSION

2012-03-02 Thread Daniel Farina
On Fri, Mar 2, 2012 at 10:37 AM, Peter Eisentraut pete...@gmx.net wrote:
 On tis, 2012-02-28 at 11:00 -0800, Daniel Farina wrote:
 I'd really like to support libraries (C or otherwise) of multiple
 versions at the same time, when the underlying library permits.

 What's preventing you from doing that now?  You need to name all the
 symbols differently, of course.

That's the problem: it's not really practical in a wider ecosystem of C
libraries, especially if the library produces multiple versions.  (Or,
not practical unless someone writes some credible
symbol-version-mangling magic.)

But is it insurmountable? -- dlsym returns a function pointer, and one
would build up the operator table for the version of the extension at
hand, so one might have ltree version 1.01 and ltree version 2.3
fields in the same database.

-- 
fdr



Re: [HACKERS] review of: collation for (expr)

2012-03-02 Thread Peter Eisentraut
On tor, 2012-03-01 at 20:30 -0500, Jaime Casanova wrote:
 besides a clash in the oid and the value of leakproof missing in the
 pg_proc entry, everything works fine.

Fixed.

 The only thing is that I don't see a reason for these includes in
 src/backend/utils/adt/misc.c:
 
 + #include "nodes/nodeFuncs.h"
 + #include "utils/lsyscache.h"

lsyscache.h is necessary for type_is_collatable(), but the other was
probably left over from a previous attempt.

Committed with these changes.





Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

2012-03-02 Thread Robert Haas
On Fri, Mar 2, 2012 at 12:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Josh Berkus j...@agliodbs.com writes:
 It would probably be prudent to concentrate on getting the core
 infrastructure committed first. That way, we at least know that if
 this doesn't get into 9.2, we can work on getting it into 9.3 knowing
 that once committed, people won't have to wait over a year at the very

 I don't see why we can't commit the whole thing.  This is way more ready
 for prime-time than checksums.

 We'll get to it in due time.  In case you haven't noticed, there's a lot
 of stuff in this commitfest.  And I don't follow the logic that says
 that because Simon is trying to push through a not-ready-for-commit
 patch we should drop our standards for other patches.

I don't follow that logic either, but I also feel like this CommitFest
is dragging on and on.  Unless you -- or someone -- are prepared to
devote a lot more time to this, "due time" is not going to arrive any
time in the foreseeable future.  We're currently making progress at a
rate of maybe 4 patches a week, at which rate we're going to finish
this CommitFest in May.  And that might be generous, because we've
been disproportionately knocking off the easy ones.  Do we have any
kind of a plan for, I don't know, bringing this to closure on some
reasonable time frame?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] 16-bit page checksums for 9.2

2012-03-02 Thread Simon Riggs
On Fri, Mar 2, 2012 at 2:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 One thing I'm not too sure about is how to extend the page format to
 handle optional features.  For example, suppose we want to add 2 bytes
 to the page header for a checksum (or 4 bytes, or any other number).
 Ideally, I'd like to not use up those 2 bytes on pages that aren't
 checksummed.  What do we do about that?

 I'm having a hard time getting excited about adding that requirement on
 top of all the others that we have for this feature.  In particular, if
 we insist on that, there is exactly zero chance of turning checksums on
 on-the-fly, because you won't be able to do it if the page is full.

 The scheme that seems to me to make the most sense for checksums,
 if we grant Simon's postulate that a 2-byte checksum is enough, is
 (1) repurpose the top N bits of pd_flags as a page version number,
for some N;
 (2) repurpose pd_pagesize_version as the checksum, with the
understanding that you can't have a checksum in version-0 pages.

I'd say N = 1, for now, and N > 1 in future, as needed. We can document
the intention to reserve high-end bits for that purpose. We can defer
the decision about which bits are used until they are needed.


 (Simon's current patch seems to be an overengineered version of that.

Happy to make changes. Having 3 bits instead of 1 is just for robustness,
but I would accept the opinion that this is not worth having.

One suggestion from Heikki is that we don't change the bits at all;
we just have a database-level setting that says whether checksums are
enabled. We use VACUUM FREEZE to enable checksumming, rewrite the blocks
and then enable at db level. Once fully enabled, we check the checksums
on read. If we do that, we don't need to record that the page format
has changed and we could dispense with page-level flags entirely, but
of course that then means you can't inspect a block and know whether
it's actually got a checksum on it or maybe it's just a version field.
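
Roughly, that flow would look like this (a sketch only; the db-level
switch does not exist, so the second statement is entirely hypothetical,
borrowing the patch's page_checksums name):

VACUUM FREEZE;                                 -- rewrite every block with a checksum
ALTER DATABASE mydb SET page_checksums = on;   -- then verify on every read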

If we want to know the difference between page formats, we need to use
flag bits to indicate that - which in some ways could be called fragile.

Personally, I don't mind what we do there.


 Possibly we should also ask ourselves how much we really need pd_tli
 and whether that couldn't be commandeered as the checksum field.)

pd_tli has slightly more utility than pd_pagesize_version, but we could
easily live without either.

The question is whether a 32-bit value, split into 2 pieces, is a
faster and better way than the current proposal. Splitting the value in two
doesn't sound like it would lead to good performance or sensible code,
but someone may have a reason why 32-bit checksums are important.

IMHO we have a viable mechanism for checksums, its all down to what user
interface we would prefer and related details.

I'm happy to implement whatever we decide.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] 16-bit page checksums for 9.2

2012-03-02 Thread Jeff Janes
On Sun, Feb 19, 2012 at 1:49 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Thu, Feb 16, 2012 at 11:16 AM, Simon Riggs si...@2ndquadrant.com wrote:

 v8 attached

 v10 attached.

 This patch covers all the valid concerns discussed and has been
 extensively tested.

If I turn wal_level to anything other than minimal, this fails to pass
make installcheck with multiple instances of

ERROR:  invalid page header in block 0 of relation

I've looked around in the patch docs and this thread, and this doesn't
seem to be either expected or already known.

This occurs even when page_checksums is off.

Cheers,

Jeff



Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

2012-03-02 Thread Simon Riggs
On Fri, Mar 2, 2012 at 5:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Josh Berkus j...@agliodbs.com writes:
 It would probably be prudent to concentrate on getting the core
 infrastructure committed first. That way, we at least know that if
 this doesn't get into 9.2, we can work on getting it into 9.3 knowing
 that once committed, people won't have to wait over a year at the very

 I don't see why we can't commit the whole thing.  This is way more ready
 for prime-time than checksums.

 We'll get to it in due time.  In case you haven't noticed, there's a lot
 of stuff in this commitfest.  And I don't follow the logic that says
 that because Simon is trying to push through a not-ready-for-commit
 patch we should drop our standards for other patches.

Hmm, I'm not deaf, you know. I would never push through a patch that isn't
ready for commit. If I back something, it is because it is ready for
use in production by PostgreSQL users, in my opinion. I get burned
just as much, if not more, than others if that's a bad decision, so
it's not given lightly.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

2012-03-02 Thread Josh Berkus

 We'll get to it in due time.  In case you haven't noticed, there's a lot
 of stuff in this commitfest.  And I don't follow the logic that says
 that because Simon is trying to push through a not-ready-for-commit
 patch we should drop our standards for other patches.

What I'm pointing out is that Peter shouldn't even be talking about
cutting functionality from an apparently-ready-for-committer patch in
order to give way to a patch whose specification people are still
arguing about.

This is exactly why I'm not keen on checksums for 9.2.  We've reached
the point where the attention on the checksum patch is pushing aside
other patches which are more ready and have had more work.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

2012-03-02 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Mar 2, 2012 at 12:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 We'll get to it in due time.  In case you haven't noticed, there's a lot
 of stuff in this commitfest.  And I don't follow the logic that says
 that because Simon is trying to push through a not-ready-for-commit
 patch we should drop our standards for other patches.

 I don't follow that logic either, but I also feel like this CommitFest
 is dragging on and on.  Unless you -- or someone -- are prepared to
 devote a lot more time to this, due time is not going to arrive any
time in the foreseeable future.  We're currently making progress at a
 rate of maybe 4 patches a week, at which rate we're going to finish
 this CommitFest in May.  And that might be generous, because we've
 been disproportionately knocking off the easy ones.  Do we have any
 kind of a plan for, I don't know, bringing this to closure on some
 reasonable time frame?

Well, personally I was paying approximately zero attention to the
commitfest for most of February, because I was occupied with trying to
get back-branch releases out, as well as some non-Postgres matters.
CF items are now back to the head of my to-do queue; you may have
noticed that I'm busy with Korotkov's array stats patch.  I do intend to
take this one up in due course (although considering it's not marked
Ready For Committer yet, I don't see that it deserves time ahead of
those that are).

As for when we'll be done with the CF, I dunno, but since it's the last
one for this release cycle I didn't think that we'd be arbitrarily
closing it on any particular schedule.  It'll be done when it's done.

regards, tom lane



Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

2012-03-02 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 This is exactly why I'm not keen on checksums for 9.2.  We've reached
 the point where the attention on the checksum patch is pushing aside
 other patches which are more ready and have had more work.

IMO the reason why it's sucking so much attention is precisely that it's
not close to being ready to commit.  But this is well off topic for the
thread we're on.  If you want to propose booting checksums from
consideration for 9.2, let's have that discussion on the checksum
thread.

regards, tom lane



Re: [HACKERS] index-only quals vs. security_barrier views

2012-03-02 Thread Noah Misch
On Thu, Feb 09, 2012 at 12:02:29PM -0500, Robert Haas wrote:
 When Heikki worked up his original index-only scan patches (which
 didn't end up looking much like what eventually got committed), he had
 the notion of an index-only qual.  That is, given a query like this:
 
 select sum(1) from foo where substring(a,1,3) = 'abc';
 
 We could evaluate the substring qual before performing the heap fetch,
 and fetch the tuple from the heap only if the qual passes.

 Now, there's a fly in the ointment here, which is that applying
 arbitrary user-defined functions to tuples that might not be visible
 doesn't sound very safe.  The user-defined function in question might
 perform some action based on those invisible tuples that has side
 effects, which would be bad, because now we're violating MVCC
 semantics.  Or it might throw an error, killing the scan dead on the
 basis of the contents of some tuple that the scan shouldn't even see.
  However, there is certainly a class of functions for which this type
 of optimization would be safe, and it's an awful lot like the set of
 functions that can be safely pushed down through a security_barrier
 view - namely, things that don't have side effects or throw errors.
 So it's possible that the proleakproof flag KaiGai is proposing to add
 to pg_proc could do double duty, serving also to identify when it's
 safe to apply a qual to an index tuple when the corresponding heap
 tuple might be invisible.  However, I have some reservations about
 assuming that the two concepts are exactly the same.  For one thing,
 people are inevitably going to want to cheat a little bit more here
 than is appropriate for security views, and encouraging people to mark
 things LEAKPROOF when they're really not is a security disaster
 waiting to happen.

The similarity is indeed tempting, but I find the concepts sufficiently
distinct to not make one device serve both.  Adding to the reservations you
offer, LEAKPROOF is superuser-only.  This other marker would not entail any
special privilege.

 For another thing, there are some important cases
 that this doesn't cover, like:
 
 select * from foo where substring(a,1,3) like '%def%';
 
 The like operator doesn't seem to be leakproof in the security sense,
 because it can throw an error if the pattern is something like a
 single backslash (ERROR:  LIKE pattern must not end with escape
 character) and indeed it doesn't seem like it would be safe here
 either if the pattern were stored in the table.  But if the pattern
 were constant, it'd be OK, or almost OK: there's still the edge case
 where the table contains invisible rows but no visible ones - whether
 or not we complain about the pattern there ought to be the same as
 whether or not we complain about it on a completely empty table.  If
 we got to that point, then we might as well consider the qual
 leakproof for security purposes under the same set of circumstances
 we'd consider it OK to apply to possibly-invisible tuples.

This sort of thing implicates substring(), too, when you call it as
substring(a, 1, b); b < 0 produces an error.

To handle these, I think we'd need a facility along the lines of protransform:
have a function inspect call nodes for a particular other function and
determine whether each is ok-for-index-only-quals.  You could even force
protransform itself into that role.  Create an additional pg_proc entry
identical to the ordinary substring() but with a different name and the
ok-for-index-only-quals flag.  Add a protransform to the main pg_proc entry
that inspects the argument nodes and, when they're safe, replaces the call
with a call to that errorfree_substring() at plan time.
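A minimal sketch of that idea, under the assumption that such an
errorfree_substring() pg_proc entry exists (both that function and the
F_ERRORFREE_SUBSTRING constant below are invented for illustration):

#include "postgres.h"
#include "fmgr.h"
#include "nodes/pg_list.h"
#include "nodes/primnodes.h"

#define F_ERRORFREE_SUBSTRING	InvalidOid	/* placeholder for the real OID */

Datum
text_substr_transform(PG_FUNCTION_ARGS)
{
	FuncExpr   *expr = (FuncExpr *) PG_GETARG_POINTER(0);

	/* substring(a, start, count): only the three-argument form can raise
	 * the negative-substring-length error, and only when count < 0 */
	if (list_length(expr->args) == 3)
	{
		Node	   *count = (Node *) lthird(expr->args);

		if (IsA(count, Const) &&
			!((Const *) count)->constisnull &&
			DatumGetInt32(((Const *) count)->constvalue) >= 0)
		{
			/* provably error-free: redirect to the flagged twin entry */
			FuncExpr   *safe = (FuncExpr *) copyObject(expr);

			safe->funcid = F_ERRORFREE_SUBSTRING;
			PG_RETURN_POINTER(safe);
		}
	}

	/* not provably safe: return NULL, meaning "no transformation" */
	PG_RETURN_POINTER(NULL);
}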



Re: [HACKERS] Collect frequency statistics for arrays

2012-03-02 Thread Tom Lane
I wrote:
 ... So my preference is to align the two
 definitions of STATISTIC_KIND_MCELEM by adding a null-element frequency
 to tsvector's usage (where it'll always be zero) and getting rid of the
 average distinct element count here.

Actually, there's a way we can do this without code changes in the
tsvector stuff.  Since the number of MCELEM stanumber items that provide
frequencies of stavalue items is obviously equal to the length of
stavalues, we could define stanumbers as containing those matching
entries, then two min/max entries, then an *optional* entry for the
frequency of null elements (with the frequency presumed to be zero if
omitted).  This'd be non-ambiguous given access to stavalues.  I'm not
sure though if making the null frequency optional wouldn't introduce
complexity elsewhere that outweighs not having to touch the tsvector
code.
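For illustration only, decoding that layout could look like this sketch
(the helper name and arguments are invented; in the backend the inputs
would come from get_attstatsslot()):

#include "postgres.h"

static float4
mcelem_null_frequency(const float4 *numbers, int nnumbers, int nvalues)
{
	/* numbers[0 .. nvalues-1] are the frequencies paired with stavalues;
	 * numbers[nvalues] and numbers[nvalues + 1] are the min and max */
	if (nnumbers > nvalues + 2)
		return numbers[nvalues + 2];	/* explicit null-element frequency */

	return 0.0f;						/* entry omitted: presumed zero */
}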

regards, tom lane



[HACKERS] sortsupport for text

2012-03-02 Thread Robert Haas
I decided to investigate the possible virtues of allowing text to
use the sortsupport infrastructure, since strings are something people
often want to sort.  I generated 100,000 random alphanumeric strings,
each 30 characters in length, and loaded them into a single-column
table, froze it, ran SELECT SUM(1) FROM tablename on it until it was
fully cached, and then repeatedly quicksorted the table contents using
my default locale (en_US.UTF8).  I repeated this test a number of
times, removing and recreating the data directory via initdb each
time.  The test was performed on my home desktop, which is running
Fedora 14 (yeah, I know I should reinstall) and equipped with an AMD
Athlon 5000 Dual-Core Processor.  Here's the exact test query:

SELECT SUM(1) FROM (SELECT * FROM randomtext ORDER BY t) x;

On unpatched master, this takes about 416 ms (plus or minus a few).
With the attached patch, it takes about 389 ms (plus or minus a very
few), a speedup of about 7%.

I repeated the experiment using the C locale, like this:

SELECT SUM(1) FROM (SELECT * FROM randomtext ORDER BY t COLLATE C) x;

Here, it takes about 202 ms with the patch, and about 231 ms on
unpatched master, a savings of about 13%.

I also tried on a larger data set of 5 million strings, with a heap
sort using work_mem=256MB.  Unfortunately, there was so much noise
there that it was hard to get any useful measurements: the exact same
code base, using the exact same test script (that started with an
initdb) could perform quite differently on consecutive runs, perhaps
because the choice of blocks chosen to contain the database itself
affected the efficiency of reading and writing temporary files.  I
think it may be faster, and the results on the smaller data set argue
that it should be faster, but I was unable to gather reproducible
numbers.  I did observe the following oprofile results on a run on
this larger data set, on master:

12789 28.2686  libc-2.13.so strcoll_l
6802 15.0350  postgres text_cmp
5081 11.2310  postgres comparetup_heap
3510  7.7584  postgres comparison_shim
2892  6.3924  postgres lc_collate_is_c
2722  6.0167  no-vmlinux   /no-vmlinux
2596  5.7382  postgres varstr_cmp
2517  5.5635  libc-2.13.so __strlen_sse2
2515  5.5591  libc-2.13.so __memcpy_sse2
968   2.1397  postgres tuplesort_heap_siftup
710   1.5694  postgres bttextcmp
664   1.4677  postgres pg_detoast_datum_packed

Clearly, a lot of that is unnecessary.  Doing lc_collate_is_c for
every tuple is a complete waste; as is translating the collation OID
to collate_t; this patch arranges to do those things just once per
sort.  The comparison_shim is also a waste.  Considering all that, I
had hoped for more like a 15-20% gain from this approach, but it
didn't happen, I suppose because some of the instructions saved just
resulted in more processor stalls.  All the same, I'm inclined to
think it's still worth doing.
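For reference, the shape of the approach is roughly the following sketch.
This is not the attached patch itself: bttext_cmp_c is an invented name,
and for simplicity the non-C-locale case just falls back to the shim here,
whereas the patch avoids that too.

#include "postgres.h"
#include "fmgr.h"
#include "utils/fmgroids.h"
#include "utils/pg_locale.h"
#include "utils/sortsupport.h"

/* C-locale comparator installed once per sort: per-tuple work is just
 * detoast + memcmp, with no collation lookup and no fmgr shim.  (A real
 * implementation would also free any detoasted copies.) */
static int
bttext_cmp_c(Datum x, Datum y, SortSupport ssup)
{
	text	   *tx = DatumGetTextPP(x);
	text	   *ty = DatumGetTextPP(y);
	int			lx = VARSIZE_ANY_EXHDR(tx);
	int			ly = VARSIZE_ANY_EXHDR(ty);
	int			cmp = memcmp(VARDATA_ANY(tx), VARDATA_ANY(ty), Min(lx, ly));

	return (cmp != 0) ? cmp : (lx - ly);
}

Datum
bttextsortsupport(PG_FUNCTION_ARGS)
{
	SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0);

	/* decide once per sort, not once per comparison */
	if (lc_collate_is_c(ssup->ssup_collation))
		ssup->comparator = bttext_cmp_c;
	else
		PrepareSortSupportComparisonShim(F_BTTEXTCMP, ssup);

	PG_RETURN_VOID();
}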

I didn't attempt to handle the weirdness that is UTF-8 on Windows,
since I don't develop on Windows.  I thought when I wrote this code
that I could just leave the comparator uninitialized and let the
caller default to the shim implementation if the sort-support function
didn't do anything.  But I see now that
PrepareSortSupportFromOrderingOp() feels that it's entitled to assume
that the sort-support function will always fill in a comparator.
Either that assumption needs to be changed, or the corresponding
Windows code needs to be written, or the sort support function needs
to call PrepareSortSupportComparisonShim() in this case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


sortsupport-text-v1.patch
Description: Binary data



Re: [HACKERS] COPY with hints, rebirth

2012-03-02 Thread Noah Misch
On Fri, Mar 02, 2012 at 08:46:45AM +, Simon Riggs wrote:
 On Thu, Mar 1, 2012 at 8:49 PM, Heikki Linnakangas 
 heikki.linnakan...@enterprisedb.com wrote:
  It's still broken:

[BEGIN;TRUNCATE;SAVEPOINT;COPY;ROLLBACK TO]

 So this approach isn't the one...
 
 The COPY FREEZE patch provides a way for the user to say explicitly
 that they don't really care about these MVCC corner cases and as a
 result allows us to avoid touching XidInMVCCSnapshot() at all. So
 there is still a patch on the table.

You can salvage the optimization by tightening its prerequisite: use it when
the current subtransaction or a child thereof created or truncated the table.
A parent subtransaction having done so is acceptable for the WAL avoidance
optimization but not for this.
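Expressed as code, the narrow version of that test might look like this
sketch (copy_may_prehint is an invented helper; accepting a child of the
creating subtransaction would additionally require walking the subxact
ancestry):

#include "postgres.h"
#include "access/xact.h"
#include "utils/rel.h"

/* True only when the current subtransaction itself created the relation
 * or assigned its current relfilenode; a parent having done so is enough
 * for WAL avoidance but not for writing pre-hinted tuples. */
static bool
copy_may_prehint(Relation rel)
{
	SubTransactionId mySubid = GetCurrentSubTransactionId();

	return rel->rd_createSubid == mySubid ||
		   rel->rd_newRelfilenodeSubid == mySubid;
}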

A COPY FREEZE ignoring that stronger restriction would be an interesting
feature in its own right, but it brings up other problems.  For example,
suppose you write a tuple, then fail while writing its index entries.  The
tuple is already frozen and visible, but it lacks a full set of live index
entries.  The subtransaction aborts, but the top transaction commits and makes
the situation permanent.

Incidentally, I contend that we should write frozen tuples to new/truncated
tables unconditionally.  The current behavior of making old snapshots see the
table as empty violates atomicity at least as badly as letting those snapshots
see the future-snapshot contents.  But Marti has a sound proposal that would
interact with your efforts here to avoid violating atomicity at all:
http://archives.postgresql.org/message-id/cabrt9rbrmdsoz8kxgehfb4lg-ev9u67-6dlqvoiibpkkhtl...@mail.gmail.com

Thanks,
nm



Re: [HACKERS] COPY with hints, rebirth

2012-03-02 Thread Kevin Grittner
Noah Misch n...@leadboat.com wrote:
 
 Incidentally, I contend that we should write frozen tuples to
 new/truncated tables unconditionally.
 
+1
 
 The current behavior of making old snapshots see the table as
 empty violates atomicity at least as badly as letting those 
 snapshots see the future-snapshot contents.
 
Right, there was no point where the table existed as empty at the
end of a transaction, so it is quite broken as things stand now.
 
 But Marti has a sound proposal that would interact with your
 efforts here to avoid violating atomicity at all
 
Well, getting it right is certainly better than moving from a slow
non-conforming behavior to a fast non-conforming behavior.  ;-)
 
-Kevin



Re: [HACKERS] Runtime SHAREDIR for testing CREATE EXTENSION

2012-03-02 Thread Robert Haas
On Fri, Mar 2, 2012 at 1:53 PM, Daniel Farina dan...@heroku.com wrote:
 On Fri, Mar 2, 2012 at 10:37 AM, Peter Eisentraut pete...@gmx.net wrote:
 On tis, 2012-02-28 at 11:00 -0800, Daniel Farina wrote:
 I'd really like to support libraries (C or otherwise) of multiple
 versions at the same time, when the underlying library permits.

 What's preventing you from doing that now?  You need to name all the
 symbols differently, of course.

 That's the problem: not really practical in a wider ecosystem of C
 libraries, especially if the library produces multiple versions.  (Or,
 not practical unless someone writes some credible
 symbol-version-mangling-magic)

 But is it unsurmountable? -- dlsym returns a function pointer, and one
 would build up the operator table for the version of the extension at
 hand, so one might have ltree version 1.01 and ltree version 2.3
 fields in the same database.

That might be possible, but it seems largely unrelated to the topic of
this thread, unless I am missing something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

2012-03-02 Thread Simon Riggs
On Fri, Mar 2, 2012 at 8:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Josh Berkus j...@agliodbs.com writes:
 This is exactly why I'm not keen on checksums for 9.2.  We've reached
 the point where the attention on the checksum patch is pushing aside
 other patches which are more ready and have had more work.

 IMO the reason why it's sucking so much attention is precisely that it's
 not close to being ready to commit.  But this is well off topic for the
 thread we're on.  If you want to propose booting checksums from
 consideration for 9.2, let's have that discussion on the checksum
 thread.

The checksums patch isn't sucking much attention at all; admittedly
there are some people opposed to the patch who want to draw out the
conversation until the patch is rejected, but that's not the same
thing. The main elements of the patch have been working for around 7
weeks now.

I'm not sure how this topic is even raised here, since the patches are
wholly and completely separate, apart from the minor and irrelevant
point that the patch authors both work for 2ndQuadrant. If that
matters at all, I'll be asking how and why.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



Re: [HACKERS] Runtime SHAREDIR for testing CREATE EXTENSION

2012-03-02 Thread Daniel Farina
On Fri, Mar 2, 2012 at 1:50 PM, Robert Haas robertmh...@gmail.com wrote:
 But is it unsurmountable? -- dlsym returns a function pointer, and one
 would build up the operator table for the version of the extension at
 hand, so one might have ltree version 1.01 and ltree version 2.3
 fields in the same database.

 That might be possible, but it seems largely unrelated to the topic of
 this thread, unless I am missing something.

It might be, and might not be, because if we start getting into where
to put shared objects for functionality linked to the database (rather
than just letting the dynamic linker do whatever) it could crash head-on
into data field versioning and replica issues. If someone thinks
about it for a few minutes and says nope, orthogonal, I'm without
objection, but I did want to raise this spectre to other minds because
it seems to be on the edge of that maelstrom.

I also look at the inability of pg_upgrade to help when certain
contribs are used between versions, and can only say phew, I'm glad
that doesn't affect most people (a sad flip side: people aren't using
contribs and database extensions as much as I'd like, but things are
moving towards more of that and not less), because then the expected
duration of the upgrade process becomes a very ugly compatibility
matrix that nobody wants to see...somewhat dissatisfying for a service
provider.  Yet, formats can and will change, requiring a lot of
gradual I/O, and a version byte is not always practical, so the best
option I can think of is to support multiple versions of formats,
operators and types at the same time, and then gradually rewrite
things.

I'm not intending to open discussion about any of that right now
(unless someone else is interested and wants to start a new thread),
but I wanted to provide insight as to to why I'm especially
preoccupied about this.

-- 
fdr



Re: [HACKERS] possible new option for wal_sync_method

2012-03-02 Thread Dan Scales
Hi,

 Got any result so far?

I measured the results with barrier=0, and yes, you are correct -- it seems 
that most of the benefit of the open_direct wal_sync_method is probably from 
not doing the barrier operation at the end of fsync():

                          wal_sync_method
                        fdatasync  open_direct  open_sync
no archive, barrier=1:  17309      18507        17138
no archive, barrier=0:  17771      18369        18045

archive, barrier=1:     15789      16592        15645
archive, barrier=0:     16616      16785        16547


It took me a while to look through Linux, and understand why barrier=1 had such 
an effect, even for disks with battery-backed caches.  As you
pointed out, the barrier operation not only flushes the disk cache, but also 
has some queue implications, particularly for Linux releases below
2.6.37.  I've been using 2.6.32, and in that case, the barrier at the end of 
fsync requires that all previously-queued operations be finished before the 
barrier occurs and flushes the disk cache.  This means that each fsync of the 
WAL log is likely waiting for completely unrelated in-flight operations of the 
data files.  That is why getting rid of the fsync of the WAL log has such a 
good performance win, even for disks that don't have a disk cache flush 
(because the cache is battery backed).  This option will probably have less 
benefit for Linux 2.6.37 and above, where
barriers are eliminated, and operations are written more specifically in terms 
of disk cache flushes.

fsync() on ext3 (even for Linux 2.6.37 and above) does still wait for any 
outstanding meta-data transaction to commit.  So, there is still another
reason to put the WAL log and data files on different logical disks (even if 
backed by the same physical disk).

It does still seem to me that sync_file_range() is unsafe in the case of
non-battery-backed disk write caches, since it doesn't sync the disk
cache.  However, if sync_file_range() were being used to optimize checkpoint
fsyncs, then one final fsync() to an unused file on the same block
device would do the trick of flushing the disk cache.
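As a sketch, that trick could be as simple as the following
(flush_device_cache is a hypothetical helper, not an existing API):

#include <fcntl.h>
#include <unistd.h>

/* sync_file_range() does not flush the drive's write cache, so force one
 * cache flush for the whole block device by fsync()ing a scratch file
 * that lives on the same filesystem. */
static int
flush_device_cache(const char *scratch_path)
{
	int		fd = open(scratch_path, O_WRONLY | O_CREAT, 0600);
	int		rc;

	if (fd < 0)
		return -1;
	rc = fsync(fd);			/* ext3/ext4 fsync issues a device cache flush */
	close(fd);
	return rc;
}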

Dan

- Original Message -
From: Andres Freund and...@anarazel.de
To: pgsql-hackers@postgresql.org
Cc: Dan Scales sca...@vmware.com
Sent: Monday, February 27, 2012 12:43:49 PM
Subject: Re: [HACKERS] possible new option for wal_sync_method

Hi,

On Friday, February 17, 2012 01:17:27 AM Dan Scales wrote:
 Good point, thanks.  From the ext3 source code, it looks like
 ext3_sync_file() does a blkdev_issue_flush(), which issues a flush to the
 block device, whereas simple direct IO does not.  So, that would make
 this wal_sync_method option less useful, since, as you say, the user
 would have to know if the block device is doing write caching.
The experiments I know of that played with disabling write caches nearly
always found that write caching was worth the overhead of syncing.

 For the numbers I reported, I don't think the performance gain is from
 not doing the block device flush.  The system being measured is a Fibre
 Channel disk which should have a fully-nonvolatile disk array.  And
 measurements using systemtap show that blkdev_issue_flush() always takes
 only in the microsecond range.
Well, I think it has some I/O queue implications which could explain some of
the difference. In that regard I think it heavily depends on the kernel
version, as that's an area which has had loads of pretty radical changes in
nearly every release since 2.6.32.

 I think the overhead is still from the fact that ext3_sync_file() waits
 for the current in-flight transaction if there is one (and does an
 explicit device flush if there is no transaction to wait for.)  I do
 think there are lots of meta-data operations happening on the data files
 (especially for a growing database), so the WAL log commit is waiting for
 unrelated data operations.  It would be nice if there a simple file
 system operation that just flushed the cache of the block device
 containing the filesystem (i.e. just does the blkdev_issue_flush() and
 not the other things in ext3_sync_file()).
I think you are right there. I think the metadata issue could be relieved a
lot by growing files in much larger increments than we do currently. I have
seen profiles which indicated that lots of time was spent on increasing the
file size. I would be very interested in seeing how much changes in that
area would benefit real-world benchmarks.

 The ext4_sync_file() code looks fairly similar, so I think it may have
 the same problem, though I can't be positive.  In that case, this
 wal_sync_method option might help ext4 as well.
The journaling code for ext4 is significantly different, so I think it very
well might play a role here - although you're probably right and it won't be
in *_sync_file.

 With respect to sync_file_range(), the Linux code that I'm looking at
 doesn't really seem to indicate that there is a device flush (since it
 never calls a 

Re: [HACKERS] COPY with hints, rebirth

2012-03-02 Thread Simon Riggs
On Fri, Mar 2, 2012 at 8:58 PM, Noah Misch n...@leadboat.com wrote:
 On Fri, Mar 02, 2012 at 08:46:45AM +, Simon Riggs wrote:
 On Thu, Mar 1, 2012 at 8:49 PM, Heikki Linnakangas 
 heikki.linnakan...@enterprisedb.com wrote:
  It's still broken:

 [BEGIN;TRUNCATE;SAVEPOINT;COPY;ROLLBACK TO]

 So this approach isn't the one...

 The COPY FREEZE patch provides a way for the user to say explicitly
 that they don't really care about these MVCC corner cases and as a
 result allows us to avoid touching XidInMVCCSnapshot() at all. So
 there is still a patch on the table.

 You can salvage the optimization by tightening its prerequisite: use it when
 the current subtransaction or a child thereof created or truncated the table.
 A parent subtransaction having done so is acceptable for the WAL avoidance
 optimization but not for this.

That's already what it does. The problem is what happens after the COPY.

If we said you can't see the rows you just loaded, the problem would
be solved, but in my opinion that needs an explicit request from
the user to show they accept that.

 A COPY FREEZE ignoring that stronger restriction would be an interesting
 feature in its own right, but it brings up other problems.  For example,
 suppose you write a tuple, then fail while writing its index entries.  The
 tuple is already frozen and visible, but it lacks a full set of live index
 entries.  The subtransaction aborts, but the top transaction commits and makes
 the situation permanent.

The optimisation only works in the subtransaction that created the
relfilenode so that isn't an issue.

Thanks for your input.

 Incidentally, I contend that we should write frozen tuples to new/truncated
 tables unconditionally.  The current behavior of making old snapshots see the
 table as empty violates atomicity at least as badly as letting those snapshots
 see the future-snapshot contents.  But Marti has a sound proposal that would
 interact with your efforts here to avoid violating atomicity at all:
 http://archives.postgresql.org/message-id/cabrt9rbrmdsoz8kxgehfb4lg-ev9u67-6dlqvoiibpkkhtl...@mail.gmail.com

Personally, that seems a pretty reasonable thing.

I like Marti's idea. At present, making his idea work could easily
mean checksums sink, so not sure whether to attempt to make that work
in detail.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



Re: [HACKERS] COPY with hints, rebirth

2012-03-02 Thread Kevin Grittner
Simon Riggs si...@2ndquadrant.com wrote:
 
 I like Marti's idea. At present, making his idea work could easily
 mean checksums sink, so not sure whether to attempt to make that
 work in detail.
 
For my part, improving bulk load performance and TRUNCATE
transactional semantics would trump checksums.  If we have to defer
one or the other to a later release, I would prefer that we bump
checksums.
 
While I understand the desire for checksums, and understand others
have had situations where they would have helped, so far I have yet
to run into a situation where I feel they would have helped me.  The
hint bit and freeze issues require ongoing attention and have real
performance consequences on a regular basis.  And closing the window
for odd transactional semantics on TRUNCATE versus DELETE of all
rows in a table would be one less thing to worry about, in my world.
 
-Kevin



Re: [HACKERS] Command Triggers, patch v11

2012-03-02 Thread Thom Brown
On 2 March 2012 22:32, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Hi,

 Please find attached v13 of the command trigger patch, fixing most of
 known items and rebased against master. Two important items remain to be
 done, but I figured I should keep you posted in the meantime.

Thanks Dimitri.  I'll give it a spin this weekend.

 Tom Lane t...@sss.pgh.pa.us writes:
 This seems over-complicated.  Triggers on tables do not have alterable
 properties, why should command triggers?  I vote for

       CREATE COMMAND TRIGGER name ... properties ...;

       DROP COMMAND TRIGGER name;

 full stop.  If you want to run the same trigger function on some more
 commands, add another trigger name.

 I reworked ALTER COMMAND TRIGGER and DROP COMMAND TRIGGER in the
 attached, and the catalog too so that the command trigger's name is now
 unique there.  I added separate index on the command name.

I see you now only allow a single command to be attached to a trigger
at creation time.  I was actually fine with how it worked before, just
not with the DROP COMMAND.  So, CREATE COMMAND TRIGGER could add a
trigger against several commands, but DROP COMMAND TRIGGER would drop
the whole lot as a single trigger rather than having the ability to
drop separate commands.  This is how regular triggers work, in that
you can attach to several types of SQL statement, but you have to drop
the trigger as a whole rather than dropping individual statements.

Tom, could you clarify your suggestion for this?  By properties, do
you mean possibly several commands?

 Thom Brown t...@linux.com writes:
 Just so it's easy to scan.  If someone is looking for CREATE CAST,
 they'd kind of expect it near the top of the CREATE list, but it's
 actually toward the bottom.  It just looks random at the moment.

 I did M-x sort-lines in the documentation.  Still have to add entries
 for the new catalog though.

 test=# CREATE TABLE badname (id int, a int, b text);
 ERROR:  invalid relation name: badname
 test=# CREATE TABLE badname AS SELECT 1::int id, 1::int a, ''::text b;
 SELECT 1

 This doesn't even get picked up by ANY COMMAND.

 I think Andres should add an entry for his patch on the commitfest.  Is
 it ok?

I'll try Andres' patch this weekend while I test yours.

 Tom Lane t...@sss.pgh.pa.us writes:
 FWIW, I agree with Thom on this.  If we do it as you suggest, I
 confidently predict that it will be less than a year before we seriously
 regret it.  Given all the discussion around this, it's borderline insane
 to believe that the set of parameters to be passed to command triggers
 is nailed down and won't need to change in the future.

 That too will need to wait for the next revision, it's just about
 finding enough cycles, which is definitely happening very soon.

Could you give your thoughts on the design?

-- 
Thom



Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

2012-03-02 Thread Robert Haas
On Fri, Mar 2, 2012 at 4:56 PM, Simon Riggs si...@2ndquadrant.com wrote:
 Checksums patch isn't sucking much attention at all but admittedly
 there are some people opposed to the patch that want to draw out the
 conversation until the patch is rejected,

Wow.  Sounds like a really shitty thing for those people to do -
torpedoing a perfectly good patch for no reason.

I have an alternative theory, though: they have sincere objections and
don't accept your reasons for discounting those objections.

 I'm not sure how this topic is even raised here, since the patches are
 wholly and completely separate, apart from the minor and irrelevant
 point that the patch authors both work for 2ndQuadrant. If that
 matters at all, I'll be asking how and why.

It came up because Josh pointed out that this patch is, in his
opinion, in better shape than the checksum patch.  I don't believe
anyone's employment situation comes into it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Command Triggers, patch v11

2012-03-02 Thread Thom Brown
On 2 March 2012 23:33, Thom Brown t...@linux.com wrote:
 On 2 March 2012 22:32, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 test=# CREATE TABLE badname (id int, a int, b text);
 ERROR:  invalid relation name: badname
 test=# CREATE TABLE badname AS SELECT 1::int id, 1::int a, ''::text b;
 SELECT 1

 This doesn't even get picked up by ANY COMMAND.

 I think Andres should add an entry for his patch on the commitfest.  Is
 it ok?

 I'll try Andres' patch this weekend while I test yours.

Actually, no matter which patch I apply first, it causes the other to
fail to apply.

-- 
Thom



Re: [HACKERS] Command Triggers, patch v11

2012-03-02 Thread Robert Haas
On Tue, Feb 28, 2012 at 10:09 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
 This seems over-complicated.  Triggers on tables do not have
 alterable properties, why should command triggers?  I vote for

       CREATE COMMAND TRIGGER name ... properties ...;

       DROP COMMAND TRIGGER name;

 full stop.  If you want to run the same trigger function on some
 more commands, add another trigger name.

 +1

+1.  I suggested the same thing a while back.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Command Triggers, patch v11

2012-03-02 Thread anara...@anarazel.de


anara...@anarazel.de and...@anarazel.de wrote:



Thom Brown t...@linux.com wrote:

On 2 March 2012 23:33, Thom Brown t...@linux.com wrote:
 On 2 March 2012 22:32, Dimitri Fontaine dimi...@2ndquadrant.fr
wrote:
 test=# CREATE TABLE badname (id int, a int, b text);
 ERROR:  invalid relation name: badname
 test=# CREATE TABLE badname AS SELECT 1::int id, 1::int a,
''::text b;
 SELECT 1

 This doesn't even get picked up by ANY COMMAND.

 I think Andres should add an entry for his patch on the commitfest.
 Is
 it ok?

 I'll try Andres' patch this weekend while I test yours.

Actually, no matter which patch I apply first, it causes the other to
fail to apply.
I will try to fix that on the plane tomorrow (to NYC) but I am not yet
sure when/where I will have internet access again.
One more try: To the list.

Andres

Please excuse the brevity and formatting - I am writing this on my mobile phone.



Re: [HACKERS] Command Triggers, patch v11

2012-03-02 Thread Thom Brown
On 3 March 2012 00:08, Thom Brown t...@linux.com wrote:
 On 2 March 2012 23:33, Thom Brown t...@linux.com wrote:
 On 2 March 2012 22:32, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 test=# CREATE TABLE badname (id int, a int, b text);
 ERROR:  invalid relation name: badname
 test=# CREATE TABLE badname AS SELECT 1::int id, 1::int a, ''::text b;
 SELECT 1

 This doesn't even get picked up by ANY COMMAND.

 I think Andres should add an entry for his patch on the commitfest.  Is
 it ok?

 I'll try Andres' patch this weekend while I test yours.

 Actually, no matter which patch I apply first, it causes the other to
 fail to apply.

And having tried building it, it appears to fail.

gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -fexcess-precision=standard -g -I../../../src/include
-D_GNU_SOURCE -I/usr/include/libxml2   -c -o alter.o alter.c -MMD -MP
-MF .deps/alter.Po
alter.c: In function ‘ExecRenameStmt’:
alter.c:69:4: warning: passing argument 1 of ‘RenameCmdTrigger’ from
incompatible pointer type [enabled by default]
../../../src/include/commands/cmdtrigger.h:45:13: note: expected
‘const char *’ but argument is of type ‘struct List *’
alter.c:69:4: error: too many arguments to function ‘RenameCmdTrigger’
../../../src/include/commands/cmdtrigger.h:45:13: note: declared here

You've changed the signature of RenameCmdTrigger but still pass in the
old arguments in alter.c.  If I fix this locally, I then get:

gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -fexcess-precision=standard -g -I../../../src/include
-D_GNU_SOURCE -I/usr/include/libxml2   -c -o copyfuncs.o copyfuncs.c
-MMD -MP -MF .deps/copyfuncs.Po
copyfuncs.c: In function ‘_copyAlterCmdTrigStmt’:
copyfuncs.c:3479:2: error: ‘AlterCmdTrigStmt’ has no member named ‘command’
copyfuncs.c:3479:2: error: ‘AlterCmdTrigStmt’ has no member named ‘command’
copyfuncs.c:3479:2: error: ‘AlterCmdTrigStmt’ has no member named ‘command’

There's an erroneous COPY_STRING_FIELD(command) in there, as well as
in equalfuncs.c.  After removing both those instances it builds.

Are you sure you don't have any uncommitted changes?

-- 
Thom



Re: [HACKERS] review: CHECK FUNCTION statement

2012-03-02 Thread Alvaro Herrera

Excerpts from Pavel Stehule's message of Tue Feb 28 16:30:58 -0300 2012:
 Hello
 
 On 28 February 2012 17:48 Alvaro Herrera alvhe...@commandprompt.com
 wrote:
 
 
  I have a few comments about this patch:
 
  I didn't like the fact that the checker calling infrastructure uses
  SPI instead of just a FunctionCallN to call the checker function.  I
  think this should be easily avoidable.
 
 It is not possible - or at least it has no simple solution (I don't know
 how to do it). PLpgSQL_checker is an SRF function, and SPI is used for
 processing the returned result set. I looked at the pg source code and
 didn't find any pattern other than using SPI for an SRF function call.
 It is probably possible, but it would mean some code duplication too. I
 invite any ideas.

It wasn't all that difficult -- see below.  While at this, I have a
question: how attached you are to the current return format for CHECK
FUNCTION?

check function f1();
                       CHECK FUNCTION
--------------------------------------------------------------
 In function: 'f1()'
 error:42804:5:assignment:subscripted object is not an array
(2 rows)

It seems to me that it'd be trivial to make it look like this instead:

check function f1();
 function | lineno | statement  | sqlstate |               message              | detail | hint | level | position | query 
----------+--------+------------+----------+------------------------------------+--------+------+-------+----------+-------
 f1()     |      5 | assignment | 42804    | subscripted object is not an array |        |      | error |          | 
(1 row)

This looks much nicer to me.

One thing we lose is the caret marking the position of the error -- but
I'm wondering if that really works well.  I didn't test it but from the
code it looks to me like it'd misbehave if you had a multiline statement.

Opinions?


/*
 * Search and execute the checker function.
 *
 *   returns true, when checked function is valid
 */
static bool
CheckFunctionById(Oid funcOid, Oid relid, ArrayType *options,
                  bool fatal_errors, TupOutputState *tstate)
{
    HeapTuple   tup;
    Form_pg_proc proc;
    HeapTuple   languageTuple;
    Form_pg_language lanForm;
    Oid         languageChecker;
    char       *funcname;
    int         result;

    tup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcOid));
    if (!HeapTupleIsValid(tup)) /* should not happen */
        elog(ERROR, "cache lookup failed for function %u", funcOid);

    proc = (Form_pg_proc) GETSTRUCT(tup);

    languageTuple = SearchSysCache1(LANGOID, ObjectIdGetDatum(proc->prolang));
    Assert(HeapTupleIsValid(languageTuple));

    lanForm = (Form_pg_language) GETSTRUCT(languageTuple);
    languageChecker = lanForm->lanchecker;

    funcname = format_procedure(funcOid);

    /* We're all set to call the checker */
    if (OidIsValid(languageChecker))
    {
        TupleDesc   tupdesc;
        Datum       checkret;
        FmgrInfo    flinfo;
        ReturnSetInfo rsinfo;
        FunctionCallInfoData fcinfo;

        /* create the tuple descriptor that the checker is supposed to return */
        tupdesc = CreateTemplateTupleDesc(10, false);
        TupleDescInitEntry(tupdesc, (AttrNumber) 1, "functionid", REGPROCOID, -1, 0);
        TupleDescInitEntry(tupdesc, (AttrNumber) 2, "lineno", INT4OID, -1, 0);
        TupleDescInitEntry(tupdesc, (AttrNumber) 3, "statement", TEXTOID, -1, 0);
        TupleDescInitEntry(tupdesc, (AttrNumber) 4, "sqlstate", TEXTOID, -1, 0);
        TupleDescInitEntry(tupdesc, (AttrNumber) 5, "message", TEXTOID, -1, 0);
        TupleDescInitEntry(tupdesc, (AttrNumber) 6, "detail", TEXTOID, -1, 0);
        TupleDescInitEntry(tupdesc, (AttrNumber) 7, "hint", TEXTOID, -1, 0);
        TupleDescInitEntry(tupdesc, (AttrNumber) 8, "level", TEXTOID, -1, 0);
        TupleDescInitEntry(tupdesc, (AttrNumber) 9, "position", INT4OID, -1, 0);
        TupleDescInitEntry(tupdesc, (AttrNumber) 10, "query", TEXTOID, -1, 0);

        fmgr_info(languageChecker, &flinfo);

        rsinfo.type = T_ReturnSetInfo;
        rsinfo.econtext = CreateStandaloneExprContext();
        rsinfo.expectedDesc = tupdesc;
        rsinfo.allowedModes = (int) SFRM_Materialize;
        /* returnMode is set by the checker, hopefully ... */
        /* isDone is not relevant, since not using ValuePerCall */
        rsinfo.setResult = NULL;
        rsinfo.setDesc = NULL;

        InitFunctionCallInfoData(fcinfo, &flinfo, 4, InvalidOid, NULL,
                                 (Node *) &rsinfo);
        fcinfo.arg[0] = ObjectIdGetDatum(funcOid);
        fcinfo.arg[1] = ObjectIdGetDatum(relid);
        fcinfo.arg[2] = PointerGetDatum(options);
        fcinfo.arg[3] = BoolGetDatum(fatal_errors);

Re: [HACKERS] review: CHECK FUNCTION statement

2012-03-02 Thread Pavel Stehule
Hello


 It wasn't all that difficult -- see below.  While at this, I have a
 question: how attached you are to the current return format for CHECK
 FUNCTION?

The TupleDesc is created by the language author. This ensures exactly the
expected format, because it is not possible to register a check function
with a different output tuple descriptor.


 check function f1();
                       CHECK FUNCTION
 --------------------------------------------------------------
  In function: 'f1()'
  error:42804:5:assignment:subscripted object is not an array
 (2 rows)

 It seems to me that it'd be trivial to make it look like this instead:

 check function f1();
 function | lineno | statement  | sqlstate |               message              | detail | hint | level | position | query
 ----------+--------+------------+----------+------------------------------------+--------+------+-------+----------+-------
 f1()     |      5 | assignment | 42804    | subscripted object is not an array |        |      | error |          |
 (1 row)

 This looks much nicer to me.

I strongly disagree:

1. This format is not consistent with other commands.
2. This format is difficult to copy/paste.
3. THERE IS NO CARET - this is really important.
4. This form is bad for terminals - the rows are long, and with \x
output there is a lot of garbage when checking multiple commands.
5. If you want this form, you can just call the SRF PL check
functions directly.


 One thing we lose is the caret marking the position of the error -- but
 I'm wondering if that really works well.  I didn't test it but from the
 code it looks to me like it'd misbehave if you had a multiline statement.

 Opinions?

-1




Re: [HACKERS] review: CHECK FUNCTION statement

2012-03-02 Thread Pavel Stehule
2012/3/3 Pavel Stehule pavel.steh...@gmail.com:
 Hello


 It wasn't all that difficult -- see below.  While at this, I have a
 question: how attached you are to the current return format for CHECK
 FUNCTION?

 The TupleDesc is created by the language author. This ensures exactly the
 expected format, because it is not possible to register a check function
 with a different output tuple descriptor.


 check function f1();
                       CHECK FUNCTION
 --------------------------------------------------------------
  In function: 'f1()'
  error:42804:5:assignment:subscripted object is not an array
 (2 rows)

 It seems to me that it'd be trivial to make it look like this instead:

 check function f1();
 function | lineno | statement  | sqlstate |               message              | detail | hint | level | position | query
 ----------+--------+------------+----------+------------------------------------+--------+------+-------+----------+-------
 f1()     |      5 | assignment | 42804    | subscripted object is not an array |        |      | error |          |
 (1 row)

 This looks much nicer to me.

This is similar to the first design - it is close to the result of a
direct PL check function call. But Tom and Albe had a different opinion
- see the thread from three months ago - and I had to agree with them:
they proposed better behaviour for use in the psql console, and the
result is more similar to the usual output when an exception is raised.



Re: [HACKERS] review: CHECK FUNCTION statement

2012-03-02 Thread Alvaro Herrera

Excerpts from Pavel Stehule's message of Sat Mar 03 02:45:06 -0300 2012:

  Without correct registration you cannot easily call the PL check function
  directly. I don't think this is a good price for removing a
  few SPI lines. I don't understand why you don't like SPI.

I don't dislike SPI in general.  I just dislike using it internally in
the backend.  Other than RI, it's not used anywhere.

  It is used several times in the code for similar purposes.
 
 this disallows calling the PL check function directly - so more complex
 situations cannot be solved in SQL, but must be solved in PL/pgSQL with
 dynamic SQL

Nonsense.  Where did you get this idea?  I did not touch the plpgsql
code at all, it'd still work exactly as in your original patch.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] review: CHECK FUNCTION statement

2012-03-02 Thread Pavel Stehule
2012/3/3 Alvaro Herrera alvhe...@commandprompt.com:

 Excerpts from Pavel Stehule's message of Sat Mar 03 02:45:06 -0300 2012:

  Without correct registration you cannot easily call the PL check function
  directly. I don't think this is a good price for removing a
  few SPI lines. I don't understand why you don't like SPI.

 I don't dislike SPI in general.  I just dislike using it internally in
 the backend.  Other than RI, it's not used anywhere.


  It is used several times in the code for similar purposes.

 this disallows calling the PL check function directly - so more complex
 situations cannot be solved in SQL, but must be solved in PL/pgSQL with
 dynamic SQL

 Nonsense.  Where did you get this idea?  I did not touch the plpgsql
 code at all, it'd still work exactly as in your original patch.


ok

I am sorry

Pavel


 --
 Álvaro Herrera alvhe...@commandprompt.com
 The PostgreSQL Company - Command Prompt, Inc.
 PostgreSQL Replication, Consulting, Custom Development, 24x7 support
