Re: [HACKERS] WITH CHECK OPTION for auto-updatable views

2013-06-21 Thread Stephen Frost
Dean,

* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
> Here's an updated version --- I missed the necessary update to the
> check_option column of information_schema.views.

Thanks!  This is really looking quite good, but it's a bit late and I'm
going on vacation tomorrow, so I didn't quite want to commit it yet. :)
Instead, here are a few things that I'd like to see fixed up:

I could word-smith the docs all day, most likely, but at least the
following would be nice to have cleaned up:

 - 'This is parameter may be either'

 - I don't like "This allows an existing view's ...".  The option can be
   used on CREATE VIEW as well as ALTER VIEW.  I'd say something like:

   This parameter may be either local or
   cascaded, and is equivalent to specifying WITH [
   CASCADED | LOCAL ] CHECK OPTION (see below).  This option can be
   changed on existing views using ALTER VIEW.

 - wrt what shows up in '\h create view' and '\h alter view', I think we
   should go ahead and add in what the options are, a la EXPLAIN.  That
   avoids having to guess at it (I was trying 'with_check_option'
   initially :).

 - Supposedly, this option isn't available for RECURSIVE views, but it's
   happily accepted: 

=*# create recursive view qq (a) with (check_option = local) as select z from q;
CREATE VIEW

(same is true of ALTER VIEW on a RECURSIVE view)

 - pg_dump support is there, but it outputs the definition using the PG
   syntax instead of the SQL syntax; is there any particular reason for
   this..?  imv, we should be dumping SQL spec where we can trivially
   do so (the two spellings are sketched just after this list).

 - Why check_option_offset instead of simply check_option..?  We don't
   have security_barrier_offset and it seems like we should be
   consistent there.
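
As a concrete illustration of the pg_dump point above, here is a hedged sketch
of the two spellings (the view and table names are made up); the first is the
SQL-standard form, the second the PostgreSQL reloption form the patch
currently emits:

-- SQL-standard spelling:
CREATE VIEW active_accounts AS
    SELECT * FROM accounts WHERE active
    WITH LOCAL CHECK OPTION;

-- PostgreSQL reloption spelling:
CREATE VIEW active_accounts WITH (check_option = local) AS
    SELECT * FROM accounts WHERE active;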

The rest looks pretty good to me.  If you can fix the above, I'll review
again and would be happy to commit it. :)

Thanks!

Stephen




Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]

2013-06-21 Thread Amit kapila
On Friday, June 21, 2013 11:43 PM Robert Haas wrote:
On Fri, Jun 21, 2013 at 12:56 PM, Fujii Masao  wrote:
> On Sat, Jun 22, 2013 at 12:11 AM, Robert Haas  wrote:
>> On Wed, Jun 19, 2013 at 1:59 PM, Peter Eisentraut  wrote:
>>> On 6/7/13 12:14 AM, Amit Kapila wrote:
> I will change the patch as per below syntax if there are no objections:
>
>  ALTER SYSTEM SET configuration_parameter {TO | =} {value | 'value'};

 I do like using ALTER SYSTEM in general, but now that I think about it,
 the 9.3 feature to create global settings in pg_db_role_setting would
 also have been a candidate for exactly that same syntax if it had been
 available.  In fact, if we do add ALTER SYSTEM, it might make sense to
 recast that feature into that syntax.

 It might be clearer to do something like ALTER SYSTEM SET EXTERNAL FILE
 or something like that.  It's only a small syntax change, so don't worry
 about it too much, but let's keep thinking about it.
>>>
>>> I think that anything we want to add should either go before SET or
>>> after the value, such as:
>>>
>>> ALTER SYSTEM SET configuration_parameter = 'value' IN FILE 'myconf.conf';
>>> ALTER SYSTEM CONFIGURATION FILE SET configuration_parameter = 'file';
>>> ALTER SYSTEM CONFIGURATION FILE 'myconf.conf' SET
>>> configuration_parameter = 'file';
>>>
>>> I agree we shouldn't back ourselves into a syntactic corner.
>>
>> Maybe this idea has already been discussed, but why don't we just
>> add a function like update_config_file(parameter, value)? We can
>> commit the core of the patch with that function API first, and then
>> we can discuss the syntax and add another API like ALTER SYSTEM
>> later.
>>
>> We now have set_config() function to set the parameter,
>> so adding the function for this feature should not be a surprise.
>> If the name of the function is, for example, update_conf_file,
>> most users would think that it's intuitive even if SIGHUP is not
>> automatically sent immediately. We don't need to emit
>> the message like 'This setting change will not be loaded until SIGHUP'.

> I could certainly support that plan.  I'm satisfied with the ALTER
> SYSTEM syntax and feel that might find other plausible uses, but
> functional notation would be OK with me, too.


I can update the patch to have a function as suggested by Fujii-san, if it can
be decided that that will be sufficient for the first committed version.
OTOH we already have consensus on the ALTER SYSTEM syntax, apart from a few
extra keywords to make it more meaningful/extensible.
I think we can go with the current syntax (ALTER SYSTEM SET ...) and make that
behavior the default for writing to the auto file.
In future we can extend it with other keywords depending on usage.
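
If the function-based route were taken, usage would presumably look something
like the following (a sketch only: update_config_file is the name floated
above and does not exist today, whereas pg_reload_conf() does):

-- Hypothetical function-based API from the discussion above:
SELECT update_config_file('checkpoint_segments', '32');
-- The new value is only picked up once the configuration is reloaded:
SELECT pg_reload_conf();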

With Regards,
Amit Kapila.



Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])

2013-06-21 Thread Amit kapila
On Friday, June 21, 2013 11:48 PM Robert Haas wrote:
On Thu, Jun 20, 2013 at 12:18 AM, Amit Kapila  wrote:
>> Auto.conf                - 1 Vote (Josh)
>> System.auto.conf         - 1 Vote (Josh)
>> Postgresql.auto.conf     - 2 Votes (Zoltan, Amit)
>> Persistent.auto.conf     - 0 Votes
>> generated_by_server.conf - 1 Vote (Peter E)
>> System.conf              - 1 Vote (Magnus)
>> alter_system.conf        - 1 Vote (Alvaro)
>
>> I had consolidated the list, so that it would be helpful for committer to
>> decide among proposed names for this file.

> I'll also vote for postgresql.auto.conf.

  Thanks to all of you for suggesting meaningful names. I will change the name
of the file to postgresql.auto.conf.
  Kindly let me know if there is any objection to it.

With Regards,
Amit Kapila.



Re: [HACKERS] Support for REINDEX CONCURRENTLY

2013-06-21 Thread Michael Paquier
On Fri, Jun 21, 2013 at 10:47 PM, Andres Freund  wrote:
> Hm. Looking at how this is currently used - I am afraid it's not
> correct... the reason RelationGetIndexList() returns a copy is that
> cache invalidations will throw away that list. And you do index_open()
> while iterating over it which will accept invalidation messages.
> Maybe it's better to try using RelationGetIndexList directly and measure
> whether that has a measurable impact?
Looking at the comments of RelationGetIndexList() in relcache.c, the method
used in the patch is actually correct: in the event of a shared cache
invalidation, rd_indexvalid is set to 0 when the index list is reset, so the
index list gets recomputed even in the case of shared-memory invalidation.
--
Michael




Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER

2013-06-21 Thread Craig Ringer
On 06/22/2013 03:30 AM, ian link wrote:
> Forgive my ignorance, but I don't entirely understand the problem. What
> does '+' and '-' refer to exactly?

Consider "RANGE 4.5 PRECEDING'.

You need to be able to test whether, for the current row 'b', any given
row 'a' is within the range (b - 4.5) < a <= b.  Not 100% sure about the
< vs <= boundaries, but that's irrelevant for the example.

To test that, you have to be able to do two things: you have to be able
to test whether one value is greater than another, and you have to be
able to add or subtract a constant from one of the values.

Right now, the b-tree access method provides information on the ordering
operators < <= = > >= <>, which gives half the answer. But these
don't give any concept of *distance* - you can test ordinality but not
cardinality.

To implement the "different by 4.5" part, you have to be able to add 4.5
to one value or subtract it from the other.

The obvious way to do that is to look up the function that implements
the '+' or '-' operator, and do:

((OPERATOR(+))(a, 4.5)) > b AND (a <= b)

or

((OPERATOR(-))(b, 4.5)) < a AND (a <= b);
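
To make that concrete, here is a hedged SQL sketch (the table and column names
are made up, and the boundary conditions simply mirror the ones above): first
the RANGE-offset window the feature is meant to support, then the same frame
test spelled out with the explicit subtraction the planner would have to look
up for the sort key's type.

-- The RANGE-offset window under discussion (not supported today):
SELECT val,
       sum(val) OVER (ORDER BY val RANGE 4.5 PRECEDING) AS frame_sum
FROM samples;

-- The same frame written out by hand:
SELECT b.val,
       (SELECT sum(a.val)
          FROM samples a
         WHERE a.val >  b.val - 4.5   -- needs a '-' (or '+') with known semantics
           AND a.val <= b.val) AS frame_sum
  FROM samples b;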

The problem outlined by Tom in prior discussion about this is that
PostgreSQL tries really hard not to assume that particular operator
names mean particular things. Rather than "knowing" that "+" is always
"an operator that adds two values together, and is commutative and
associative", PostgreSQL requires that you define an *operator class* that
names the operator that has those properties.

Or at least, it does for less-than, less-than-or-equals, equals,
greater-than-or-equals, greater-than, and not-equals as part of the
b-tree operator class, which *usually* defines these operators as < <= =
>= > <>, but you could use any operator names you wanted if you really
liked.

Right now (as far as I know) there's no operator class that lets you
identify operators for addition and subtraction in a similar way. So
it's necessary to either add such an operator class (in which case
support has to be added for it for every type), extend the existing
b-tree operator class to provide the info, or blindly assume that "+"
and "-" are always addition and subtraction.

For an example of why such assumptions are a bad idea, consider matrix
multiplication. Normally, "a * b" = "b * a", but this isn't true for
multiplication of matrices. Similarly, if someone defined a "+" operator
as an alias for string concatenation (||), we'd be totally wrong to
assume we could use that for doing range-offset windowing.

So. Yeah. Operator classes required, unless we're going to change the
rules and make certain operator names "special" in PostgreSQL, so that
if you implement them they *must* have certain properties. This seems
like a pretty poor reason to add such a big change.

I hope this explanation (a) is actually correct and (b) is helpful.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)

2013-06-21 Thread Tom Lane
Robert Haas  writes:
> On Fri, Jun 21, 2013 at 5:44 PM, Tom Lane  wrote:
>> The traditional theory has been that that would be less robust, not
>> more so.  Child backends are (mostly) able to carry out queries whether
>> or not the postmaster is around.

> I think that's the Tom Lane theory.  The Robert Haas theory is that if
> the postmaster has died, there's no reason to suppose that it hasn't
> corrupted shared memory on the way down, or that the system isn't
> otherwise heavily fuxxored in some way.

Eh?  The postmaster does its level best never to touch shared memory
(after initialization anyway).

>> True, you can't make new connections,
>> but how does killing the existing children make that better?

> It allows you to start a new postmaster in a timely fashion, instead
> of waiting for an idle connection that may not ever terminate without
> operator intervention.

There may be something in that argument, but I find the other one
completely bogus.

regards, tom lane




Re: [HACKERS] problem with commitfest redirection

2013-06-21 Thread Martín Marqués

On 21/06/13 23:47, Jaime Casanova wrote:

On Fri, Jun 21, 2013 at 8:56 PM, Martín Marqués  wrote:

Whenever I try to see the patch from this commit, it never loads:

https://commitfest.postgresql.org/action/patch_view?id=1129

Some problem there? I can see other patches, from other commits.



Yes, the URL is wrong. The right URL is
http://www.postgresql.org/message-id/CAFjNrYuh=4Vwnv=2n7cj0jjuwc4hool1epxsoflj6s19u02...@mail.gmail.com


Yes, I found that out later. Could there be other URLs like that?

--
Martín Marqués        http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] problem with commitfest redirection

2013-06-21 Thread Jaime Casanova
On Fri, Jun 21, 2013 at 8:56 PM, Martín Marqués  wrote:
> Whenever I try to see the patch from this commit, it never loads:
>
> https://commitfest.postgresql.org/action/patch_view?id=1129
>
> Some problem there? I can see other patches, from other commits.
>

Yes, the URL is wrong. The right URL is
http://www.postgresql.org/message-id/CAFjNrYuh=4Vwnv=2n7cj0jjuwc4hool1epxsoflj6s19u02...@mail.gmail.com


--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157




Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread MauMau

From: "Robert Haas" 

On Fri, Jun 21, 2013 at 2:55 PM, Tom Lane  wrote:

Robert Haas  writes:

More generally, what do we think the point is of sending SIGQUIT
rather than SIGKILL in the first place, and why does that point cease
to be valid after 5 seconds?


Well, mostly it's about telling the client we're committing hara-kiri.
Without that, there's no very good reason to run quickdie() at all.


That's what I thought, too.  It seems to me that if we think that's
important, then it's important even if it takes more than 5 seconds
for some reason.


I guess Tom-san is saying "we should be as kind as possible to the client, 
so try to notify the client of the shutdown".  Not complete kindness: 
because the DBA requested an immediate shutdown by running "pg_ctl stop -mi", 
the top priority is to shut down the database server as quickly as possible.  
The idea here is to be friendly to the client for as long as the DBA can 
stand.  That's the 5 seconds.




A practical issue with starting to send SIGKILL ourselves is that we
will no longer be able to reflexively diagnose "server process died
on signal 9" as "the linux OOM killer got you".  I'm not at all
convinced that the cases where SIGQUIT doesn't work are sufficiently
common to justify losing that property.


I'm not, either.  Maybe this question will provoke many indignant
responses, but who in their right mind even uses immediate shutdown on
a production server with any regularity?  The shutdown checkpoint is
sometimes painfully long, but do you really want to run recovery just
to avoid it?  And in the rare case where an immediate shutdown fails
to work, what's wrong with "killall -9 postgres"?


Checkpoint is irrelevant here because we are discussing immediate shutdown. 
Some problems with "killall -9 postgres" are:


1. It's not available on Windows.
2. It may kill other database server instances running on the same machine.

Regards
MauMau






Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread MauMau

From: "Robert Haas" 
On Thu, Jun 20, 2013 at 12:33 PM, Alvaro Herrera

 wrote:

I will go with 5 seconds, then.


I'm uncomfortable with this whole concept, and particularly with such
a short timeout.  On a very busy system, things can take a LOT longer
than we think they should; it can take 30 seconds or more just to get
a prompt back from a shell command.  5 seconds is the blink of an eye.


I'm comfortable with 5 seconds.  We are talking about the interval between 
sending SIGQUIT to the children and then sending SIGKILL to them.  In most 
situations, the backends should terminate immediately.  However, as I said a 
few months ago, the ereport() call in quickdie() can deadlock indefinitely. 
That is a PostgreSQL bug.  In addition, Tom-san was concerned that some 
PLs (PL/Perl or PL/Python?) block SIGQUIT while executing a UDF, so they 
may not be able to respond to the immediate shutdown request.


What DBAs want from "pg_ctl stop -mi" is to shut down the database server as 
quickly as possible.  So I think 5 seconds is reasonable.


Regards
MauMau





[HACKERS] problem with commitfest redirection

2013-06-21 Thread Martín Marqués

Whenever I try to see the patch from this commit, it never loads:

https://commitfest.postgresql.org/action/patch_view?id=1129

Some problem there? I can see other patches, from other commits.

--
Martín Marqués        http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Support for REINDEX CONCURRENTLY

2013-06-21 Thread Michael Paquier
OK, let's finalize this patch first. I'll try to send an updated patch
later today.

On Fri, Jun 21, 2013 at 10:47 PM, Andres Freund  wrote:
> On 2013-06-21 20:54:34 +0900, Michael Paquier wrote:
>> On Fri, Jun 21, 2013 at 6:19 PM, Andres Freund  
>> wrote:
>> > On 2013-06-19 09:55:24 +0900, Michael Paquier wrote:
>> >> >> @@ -1529,12 +1570,13 @@ finish_heap_swap(Oid OIDOldHeap, Oid 
>> >> >> OIDNewHeap,
>> >> > Is it actually possible to get here with multiple toast indexes?
>> >> Actually it is possible. finish_heap_swap is also called for example
>> >> in ALTER TABLE where rewriting the table (phase 3), so I think it is
>> >> better to protect this code path this way.
>> >
>> > But why would we copy invalid toast indexes over to the new relation?
>> > Shouldn't the new relation have been freshly built in the previous
>> > steps?
>> What do you think about that? Using only the first valid index would be 
>> enough?
>
> What I am thinking about is the following: When we rewrite a relation,
> we build a completely new toast relation. Which will only have one
> index, right? So I don't see how this could be correct if we deal
> with multiple indexes. In fact, the current patch's swap_relation_files
> throws an error if there are multiple ones around.
Yes, OK. Let me have a look at the CLUSTER code in more detail
before giving a precise answer, but I'll try to remove that renaming
part.  Btw, I'd like to add an assertion in the code, at least to
prevent wrong use of this code path.

>> >> >> diff --git a/src/include/utils/relcache.h 
>> >> >> b/src/include/utils/relcache.h
>> >> >> index 8ac2549..31309ed 100644
>> >> >> --- a/src/include/utils/relcache.h
>> >> >> +++ b/src/include/utils/relcache.h
>> >> >> @@ -29,6 +29,16 @@ typedef struct RelationData *Relation;
>> >> >>  typedef Relation *RelationPtr;
>> >> >>
>> >> >>  /*
>> >> >> + * RelationGetIndexListIfValid
>> >> >> + * Get index list of relation without recomputing it.
>> >> >> + */
>> >> >> +#define RelationGetIndexListIfValid(rel) \
>> >> >> +do { \
>> >> >> + if (rel->rd_indexvalid == 0) \
>> >> >> + RelationGetIndexList(rel); \
>> >> >> +} while(0)
>> >> >
>> >> > Isn't this function misnamed and should be
>> >> > RelationGetIndexListIfInValid?
>> >> When naming that; I had more in mind: "get the list of indexes if it
>> >> is already there". It looks more intuitive to my mind.
>> >
>> > I can't follow. RelationGetIndexListIfValid() doesn't return
>> > anything. And it doesn't do anything if the list is already valid. It
>> > only does something iff the list currently is invalid.
>> In this case RelationGetIndexListIfInvalid?
>
> Yep. Suggested that above ;). Maybe RelationFetchIndexListIfInvalid()?
>
> Hm. Looking at how this is currently used - I am afraid it's not
> correct... the reason RelationGetIndexList() returns a copy is that
> cache invalidations will throw away that list. And you do index_open()
> while iterating over it which will accept invalidation messages.
> Maybe it's better to try using RelationGetIndexList directly and measure
> whether that has a measurable impact?
Yes, I was wondering about the potential memory leak that list_copy could
introduce in tuptoaster.c when doing a bulk insert; that's the only
reason I added this macro.
--
Michael




Re: [HACKERS] Implementing incremental backup

2013-06-21 Thread Jehan-Guillaume (ioguix) de Rorthais
On 20/06/2013 03:25, Tatsuo Ishii wrote:
>> On Wed, Jun 19, 2013 at 8:40 PM, Tatsuo Ishii  wrote:
 On Wed, Jun 19, 2013 at 6:20 PM, Stephen Frost  wrote:
> * Claudio Freire (klaussfre...@gmail.com) wrote:
[...]
>>
>> The only bottleneck here, is WAL archiving. This assumes you can
>> afford WAL archiving at least to a local filesystem, and that the WAL
>> compressor is able to cope with WAL bandwidth. But I have no reason to
>> think you'd be able to cope with dirty-map updates anyway if you were
>> saturating the WAL compressor, as the compressor is more efficient on
>> amortized cost per transaction than the dirty-map approach.
> 
> Thank you for detailed explanation. I will think more about this.

Just for the record, I have been mulling over this idea for a bunch of
months. I even talked about it with Dimitri Fontaine some weeks ago
over some beers :)

My idea came from a customer during a training session who explained to me the
difference between differential and incremental backups in Oracle.

My approach would have been to create a standalone tool (say
pg_walaggregate) which takes a bunch of WAL files from the archives and merges
them into a single big file, keeping only the very last version of each page
after aggregating all their changes. The resulting file, aggregating all
the changes from the given WAL files, would be the "differential backup".

A differential backup built from the WAL between W1 and Wn would allow
recovering to the time of Wn much faster than replaying all the WAL
between W1 and Wn, and would save a lot of space.

I was hoping to find some time to dig into this idea, but since the
subject came up here, here are my 2¢!

Cheers,
-- 
Jehan-Guillaume (ioguix) de Rorthais




Re: [HACKERS] Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)

2013-06-21 Thread Robert Haas
On Fri, Jun 21, 2013 at 5:44 PM, Tom Lane  wrote:
> Alvaro Herrera  writes:
>> Andres Freund wrote:
>>> What we could do to improve the robustness a bit - at least on linux -
>>> is to prctl(PR_SET_PDEATHSIG, SIGKILL) which would cause children to be
>>> killed if the postmaster goes away...
>
>> This is an interesting idea (even if it has no relationship to the patch
>> at hand).
>
> The traditional theory has been that that would be less robust, not
> more so.  Child backends are (mostly) able to carry out queries whether
> or not the postmaster is around.

I think that's the Tom Lane theory.  The Robert Haas theory is that if
the postmaster has died, there's no reason to suppose that it hasn't
corrupted shared memory on the way down, or that the system isn't
otherwise heavily fuxxored in some way.

> True, you can't make new connections,
> but how does killing the existing children make that better?

It allows you to start a new postmaster in a timely fashion, instead
of waiting for an idle connection that may not ever terminate without
operator intervention.

Even if it were possible to start a new postmaster that attached to
the existing shared memory segment and began spawning new children, I
think I'd be heavily in favor of killing the old ones off first and
doing a full system reset just for safety.  But it isn't, so what you
get is a crippled system that never recovers without operator
intervention.  And note that I'm not talking about "pg_ctl restart";
that fails because the children have the shmem segment still attached
and the postmaster, which is the only thing listed in the PID file, is
already dead.  I'm talking about "killall -9 postgres", at least.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Fixed Cardinality estimation with equality predicates between column of the same table

2013-06-21 Thread Josh Berkus
On 06/21/2013 02:33 PM, desmodemone wrote:
> Hi all,
>   I see a strange behavior ( for me ) on 9.2 (but seems the same on
> 9.1 and 9.3)  of the optimizer on query like that :
> 

Matteo, I just posted this on -performance.  See Tom's answer.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Request for Patch Feedback: Lag & Lead Window Functions Can Ignore Nulls

2013-06-21 Thread Troels Nielsen
Hello all

I've been examining PostgreSQL to gain a greater understanding
of RDBMS. (Thanks for a nice, very educational system!)

In the process I've been looking into a few problems, and the
complications around this patch appeared relatively tractable, so I
tried to look for a solution.

I found the following:

The grammar conflict appears to be because of ambiguities in:
1. table_ref (used exclusively in FROM clauses)
2. index_elem (used exclusively in INDEX creation statements).

Now, this doesn't seem to make much sense, as AFAICT window functions
are explicitly disallowed in these contexts (transformWindowFuncCall
will yield errors, and I can't really wrap my head around what a
window function call would mean there).

I therefore propose a simple rearrangement of the grammar,
syntactically disallowing window functions in the outer part of those
contexts (a_expr's inside can't and shouldn't be done much about)
which will allow both RESPECT and IGNORE to become unreserved
keywords, without doing any lexer hacking or abusing the grammar.

I've attached a patch which will add RESPECT NULLS and IGNORE NULLS to
the grammar in the right place. Also the window frame options are set
but nothing more, so this patch needs to be merged with Nicholas White's
original patch.

One problem I see with this approach to the grammar is that the
error messages will change when putting window functions in any of the
forbidden places. The new error messages are I think worse and less
specific than the old ones. I suppose that can be fixed though, or
maybe the problem isn't so severe.

Example of old error message:
window functions are not allowed in functions in FROM

New error message:
syntax error at or near "OVER"



in addition I think the original patch as it stands has a few
problems that I haven't seen discussed:

1. The result of the current patch using lead

create table test_table (
   id serial,
   val integer);
insert into test_table (val) select * from unnest(ARRAY[1,2,3,4,NULL, NULL,
NULL, 5, 6, 7]);

select val, lead(val, 2) ignore nulls over (order by id) from test_table;
 val | lead
-----+------
   1 |    3
   2 |    4
   3 |    4
   4 |    4
     |    4
     |    5
     |    6
   5 |    7
   6 |    7
   7 |    7
(10 rows)

I would expect it to output:

select val, lead(val, 2) ignore nulls over (order by id) from test_table;
 val | lead
-----+------
   1 |    3
   2 |    4
   3 |    5
   4 |    6
     |    6
     |    6
     |    6
   5 |    7
   6 |
   7 |
(10 rows)

That is: skip two rows forward not counting null rows.
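
For reference, one hedged way to sanity-check the expected semantics with
stock SQL (using the test_table above) is a correlated subquery that picks the
second non-null value following the current row:

select t1.val,
       (select t2.val
          from test_table t2
         where t2.id > t1.id
           and t2.val is not null
         order by t2.id
        offset 1 limit 1) as lead_2_ignore_nulls
  from test_table t1
 order by t1.id;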

The lag behavior works well as far as I can see.

2. It would be nice if an error was given when ignore nulls was used
on a function for which it has no effect. Perhaps this should be up to
the individual window functions themselves to do, though.

Apart from those points I think the original patch is nice and provides
functionality that's definitely nice to have.

Kind Regards
Troels Nielsen


On Fri, Jun 21, 2013 at 8:11 PM, Robert Haas  wrote:

> On Fri, Jun 21, 2013 at 11:33 AM, Jeff Davis  wrote:
> >> Regardless of what syntax we settle on, we should also make sure that
> >> the conflict is intrinsic to the grammar and can't be factored out, as
> >> Tom suggested upthread.  It's not obvious to me what the actual
> >> ambiguity is here.  If you've seen "select lag(num,0)" and the
> >> lookahead token is "respect", what's the problem?  It sort of looks
> >> like it could be a column label, but not even unreserved keywords can
> >> be column labels, so that's not it.  Probably deserves a bit more
> >> investigation...
> >
> > I think the problem is when the function is used as a table function;
> > e.g.:
> >
> >   SELECT * FROM generate_series(1,10) respect;
>
> Ah ha.  Well, there's probably not much help for that.  Disallowing
> keywords as table aliases would be a cure worse than the disease, I
> think.  I suppose the good news is that there probably aren't many
> people using RESPECT as a column name, but I have a feeling that we're
> almost certain to get complaints about reserving IGNORE.  I think that
> will have to be quoted as a PL/pgsql variable name as well.  :-(
>
> >> We could just add additional, optional Boolean argument to the
> >> existing functions.  It's non-standard, but we avoid adding keywords.
> >
> > I thought about that, but it is awkward because the argument would have
> > to be constant (or, if not, it would be quite strange).
>
> True... but e.g. string_agg() has the same issue.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
>


respect_nulls_and_ignore_nulls_grammar.patch
Description: Binary data


Re: [HACKERS] Unaccent performance

2013-06-21 Thread Thom Brown
On 21 June 2013 19:04, Thom Brown  wrote:

> Hi,
>
> The unaccent extension is great, especially with its customisability, but
> it's not always easy to recommend.  I witnessed a customer using no less
> than 56 nested replace functions in an SQL function.  I looked to see how
> much this can be mitigated by unaccent.  It turns out that not all the
> characters they were replacing can be replaced by unaccent, either because
> they replace more than 1 character at a time, or the character they're
> replacing, for some reason, isn't processed by unaccent, even with a custom
> rules file.
>
> So there were 20 characters I could identify that they were replacing.  I
> made a custom rules file and compared its performance to the
> difficult-to-manage set of nested replace calls.
>
> CREATE OR REPLACE FUNCTION public.myunaccent(sometext text)
>  RETURNS text
>  LANGUAGE sql
>  IMMUTABLE
> AS $function$
> SELECT
> replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(sometext,'ą','a'),'Ą','A'),'ă','a'),'Ă','A'),'ā','a'),'Ā','A'),'æ','a'),'å','a'),'ä','a'),'ã','a'),'â','a'),'á','a'),'à','a'),'Æ','A'),'Å','A'),'Ä','A'),'Ã','A'),'Â','A'),'Á','A'),'À','A')
> ;
> $function$
>
> postgres=# SELECT myunaccent(sometext::text) FROM (SELECT
> 'ÀÁÂÃÄÅÆàáâãäåæĀāĂ㥹' sometext FROM generate_series(1,100)) x OFFSET
> 99 LIMIT 1;
>   myunaccent
> --
>  AAAaaaAaAaAa
> (1 row)
>
> Time: 726.282 ms
> postgres=# SELECT unaccent(sometext::text) FROM (SELECT
> 'ÀÁÂÃÄÅÆàáâãäåæĀāĂ㥹' sometext FROM generate_series(1,100)) x OFFSET
> 99 LIMIT 1;
>unaccent
> --
>  AAAaaaAaAaAa
> (1 row)
>
> Time: 3305.252 ms
>
> The timings are actually pretty much the same even if I introduce 187
> nested replace calls for every line in the unaccent.rules file for 187
> characters.  But the same character set with unaccent increases to 7418.526
> ms with the same type of query as above.  That's 10 times more expensive.
>
> Is there a way to boost the performance to make its adoption more
> palatable?
>

Another test passing in a string of 10 characters gives the following
timings:

unaccent: 240619.395 ms
myunaccent: 785.505 ms

I guess this must indicate that unaccent is processing all rows, and
myunaccent is only being run on the 1 select row?  I can't account for
myunaccent always being almost the same duration regardless of string
length otherwise.  This is probably an incorrect assessment of performance.

Another test inserting long text strings into a text column of a table
100,000 times, then updating another column to have that unaccented value
using both methods:

unaccent: 3867.306 ms
myunaccent: 43611.732 ms

So I guess this complaint about performance is all just noise.

However, pushing that pointless complaint to one side, I would like to have
the ability to have unaccent support more characters that it doesn't
currently seem to support, such as bullet points, ellipses etc., and also
more than 1 character being replaced.  Naturally these aren't appropriate
to fall under the unaccent function itself, but the rules file is a good
starting point.  It would be a bit like translate, except it would use a
rules file instead of providing strings of single characters to convert.

So say we wanted "(trademark)" to be converted into "™" just as an example,
or ";" to ".".  We can't do that with unaccent, but in order to avoid a
huge list of replace functions, a function like unaccent, with a few
adaptations, would solve the problem.

e.g.:

SELECT transform(my_custom_dictionary, 'Commodore Amiga(trademark);')

would return

Commodore Amiga™.

This would ideally somehow cater for replacing tabs and spaces too.
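
For reference, the customisability mentioned above is wired up through a
dictionary built on the unaccent template; a minimal sketch (the rules file
name is hypothetical, and the file has to live in the tsearch_data directory)
looks like this.  A transform()-style function as suggested above could
plausibly reuse the same rules-file mechanism:

CREATE EXTENSION unaccent;

CREATE TEXT SEARCH DICTIONARY my_unaccent (
    TEMPLATE = unaccent,
    RULES    = 'my_unaccent'    -- loads share/tsearch_data/my_unaccent.rules
);

SELECT unaccent('my_unaccent', 'Hôtel Crème');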

-- 
Thom


Re: [HACKERS] Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)

2013-06-21 Thread Tom Lane
Alvaro Herrera  writes:
> Andres Freund wrote:
>> What we could do to improve the robustness a bit - at least on linux -
>> is to prctl(PR_SET_PDEATHSIG, SIGKILL) which would cause children to be
>> killed if the postmaster goes away...

> This is an interesting idea (even if it has no relationship to the patch
> at hand).

The traditional theory has been that that would be less robust, not
more so.  Child backends are (mostly) able to carry out queries whether
or not the postmaster is around.  True, you can't make new connections,
but how does killing the existing children make that better?

regards, tom lane




[HACKERS] Fixed Cardinality estimation with equality predicates between column of the same table

2013-06-21 Thread desmodemone
Hi all,
  I see a strange behavior (to me) of the optimizer on 9.2 (but it seems the
same on 9.1 and 9.3) with queries like these:

/* create a table with random data and 20,000 rows */

create table test1 ( id int not null primary key, state1 int not null
default 0, state2 int not null default 0, state3 int not null default 0 );

 insert into test1 (id, state1, state2, state3) select i,
(random()*3)::int, (random())::int, (random()*100)::int from
generate_series (1, 20000) as gs(i) ;

analyze test1 ;

/* between  same columns  */

explain  select * from test1 where state1=state1 ;
QUERY PLAN
--
 Seq Scan on test1  (cost=0.00..359.00 rows=100 width=16)
   Filter: (state1 = state1)
(2 rows)

test3=# explain  select * from test1 where state2=state2 ;
QUERY PLAN
--
 Seq Scan on test1  (cost=0.00..359.00 rows=100 width=16)
   Filter: (state2 = state2)
(2 rows)

/* between different columns of same table  */

test3=# explain  select * from test1 where state1=state2 ;
QUERY PLAN
--
 Seq Scan on test1  (cost=0.00..359.00 rows=100 width=16)
   Filter: (state1 = state2)
(2 rows)

===

/* create a table with random data and 100,000 rows to verify  */

create table test2 ( id int not null primary key, state1 int not null
default 0, state2 int not null default 0, state3 int not null default 0 );

 insert into test2 (id, state1, state2, state3) select i,
(random()*3)::int, (random())::int, (random()*100)::int from
generate_series (1, 100000) as gs(i) ;

test3=#  analyze  test2 ;
ANALYZE
test3=# explain  select * from test2 where state1=state3;
                         QUERY PLAN
------------------------------------------------------------
 Seq Scan on test2  (cost=0.00..1791.00 rows=500 width=16)
   Filter: (state1 = state3)
(2 rows)

test3=# explain  select * from test2 where state1=state2;
QUERY PLAN
---
 Seq Scan on test2  (cost=0.00..1791.00 rows=500 width=16)
   Filter: (state1 = state2)
(2 rows)

test3=# explain  select * from test2 where state1=state1;
QUERY PLAN
---
 Seq Scan on test2  (cost=0.00..1791.00 rows=500 width=16)
   Filter: (state1 = state1)
(2 rows)



It seems to always be 0.5% of the rows, and it seems independent of the type
of data you have in the row:

/* add a column with a constant value, named c3 */

 alter table test1 add c3 int default 1 ;
ALTER TABLE

analyze test1 ;
ANALYZE

explain  select * from test1  where state1=c3;
QUERY PLAN
--
 Seq Scan on test1  (cost=0.00..378.00 rows=100 width=20)
   Filter: (state1 = c3)
(2 rows)

/* add a column with a constant value, named c3 */

 alter table test2 add c3 int default 1 ;
ALTER TABLE
 analyze test2 ;
ANALYZE
 explain  select * from test2  where state1=c3;
QUERY PLAN
---
 Seq Scan on test2  (cost=0.00..1887.00 rows=500 width=20)
   Filter: (state1 = c3)
(2 rows)

/* add another constant column */

test3=# alter table test2 add c4 int default 1 ;
ALTER TABLE
test3=# analyze test2 ;
ANALYZE
test3=# explain  select * from test2  where c3=c4 ;
QUERY PLAN
---
 Seq Scan on test2  (cost=0.00..1887.00 rows=500 width=24)
   Filter: (c3 = c4)

Obviously the statistics are ok:



Always 0.5%.

Greetings

Matteo


Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread Christopher Browne
The case where I wanted "routine" shutdown immediate (and I'm not sure I
ever actually got it) was when we were using IBM HA/CMP, where I wanted a
"terminate with a fair bit of prejudice".

If we know we want to "switch right away now", immediate seemed pretty much
right.  I was fine with interrupting user sessions, and there wasn't as
much going on in the way of system background stuff back then.

I wasn't keen on waiting on much of anything.  The background writer ought
to be keeping things from being too desperately out of date.

If there's stuff worth waiting a few seconds for, I'm all ears.

But if I have to wait arbitrarily long, colour me unhappy.

If I have to distinguish, myself, between a checkpoint nearly done flushing
and a backend that's stuck waiting forlornly for filesystem access, I'm
inclined to "kill -9" and hope recovery doesn't take *too* long on the next
node...

If shutting a server down in an emergency situation requires a DBA to look
in, as opposed to init.d doing its thing, I think that's pretty much the
same problem too.


[HACKERS] Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)

2013-06-21 Thread Alvaro Herrera
Andres Freund wrote:
> On 2013-06-20 22:36:45 -0400, Alvaro Herrera wrote:

> > If we leave postmaster running after SIGKILLing its children, the only
> > thing we can do is have it continue to SIGKILL processes continuously
> > every few seconds until they die (or just sit around doing nothing until
> > they all die).  I don't think this will have a different effect than
> > postmaster going away trusting the first SIGKILL to do its job
> > eventually.
> 
> I think it should just wait till all its child processes are dead. No
> need to repeat sending the signals - as you say, that won't help.

OK, I can buy that.  So postmaster stays around waiting in ServerLoop
until all children are gone; and if any persists for whatever reason,
well, tough.

> What we could do to improve the robustness a bit - at least on linux -
> is to prctl(PR_SET_PDEATHSIG, SIGKILL) which would cause children to be
> killed if the postmaster goes away...

This is an interesting idea (even if it has no relationship to the patch
at hand).

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Hardware donation

2013-06-21 Thread Josh Berkus

>> Who can be point of contact from the community to arrange shipping, etc?
> 
> I can be.

And I'll handle the tax credit once the servers are received.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread Robert Haas
On Fri, Jun 21, 2013 at 2:55 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> More generally, what do we think the point is of sending SIGQUIT
>> rather than SIGKILL in the first place, and why does that point cease
>> to be valid after 5 seconds?
>
> Well, mostly it's about telling the client we're committing hara-kiri.
> Without that, there's no very good reason to run quickdie() at all.

That's what I thought, too.  It seems to me that if we think that's
important, then it's important even if it takes more than 5 seconds
for some reason.

> A practical issue with starting to send SIGKILL ourselves is that we
> will no longer be able to reflexively diagnose "server process died
> on signal 9" as "the linux OOM killer got you".  I'm not at all
> convinced that the cases where SIGQUIT doesn't work are sufficiently
> common to justify losing that property.

I'm not, either.  Maybe this question will provoke many indignant
responses, but who in their right mind even uses immediate shutdown on
a production server with any regularity?  The shutdown checkpoint is
sometimes painfully long, but do you really want to run recovery just
to avoid it?  And in the rare case where an immediate shutdown fails
to work, what's wrong with "killall -9 postgres"?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Review [was Re: [HACKERS] MD5 aggregate]

2013-06-21 Thread David Fetter
On Fri, Jun 21, 2013 at 10:48:35AM -0700, David Fetter wrote:
> On Mon, Jun 17, 2013 at 11:34:52AM +0100, Dean Rasheed wrote:
> > On 15 June 2013 10:22, Dean Rasheed  wrote:
> > > There seem to be 2 separate directions that this could go, which
> > > really meet different requirements:
> > >
> > > 1). Produce an unordered sum for SQL to compare 2 tables regardless of
> > > the order in which they are scanned. A possible approach to this might
> > > be something like an aggregate
> > >
> > > md5_total(text/bytea) returns text
> > >
> > > that returns the sum of the md5 values of each input value, treating
> > > each md5 value as an unsigned 128-bit integer, and then producing the
> > > hexadecimal representation of the final sum. This should out-perform a
> > > solution based on numeric addition, and in typical cases, the result
> > > wouldn't be much longer than a regular md5 sum, and so would be easy
> > > to eyeball for differences.
> > >
> > 
> > I've been playing around with the idea of an aggregate that computes
> > the sum of the md5 hashes of each of its inputs, which I've called
> > md5_total() for now, although I'm not particularly wedded to that
> > name. Comparing it with md5_agg() on a 100M row table (see attached
> > test script) produces interesting results:
> > 
> > SELECT md5_agg(foo.*::text)
> >   FROM (SELECT * FROM foo ORDER BY id) foo;
> > 
> >  50bc42127fb9b028c9708248f835ed8f
> > 
> > Time: 92960.021 ms
> > 
> > SELECT md5_total(foo.*::text) FROM foo;
> > 
> >  02faea7fafee4d253fc94cfae031afc43c03479c
> > 
> > Time: 96190.343 ms
> > 
> > Unlike md5_agg(), it is no longer a true MD5 sum (for one thing, its
> > result is longer) but it seems like it would be very useful for
> > quickly comparing data in SQL, since its value is not dependent on the
> > row-order making it easier to use and better performing if there is no
> > usable index for ordering.
> > 
> > Note, however, that if there is an index that can be used for
> > ordering, the performance is not necessarily better than md5_agg(), as
> > this example shows. There is a small additional overhead per row for
> > initialising the MD5 sums, and adding the results to the total, but I
> > think the biggest factor is that md5_total() is processing more data.
> > The reason is that MD5 works on 64-byte blocks, so the total amount of
> > data going through the core MD5 algorithm is each row's size is
> > rounded up to a multiple of 64. In this simple case it ends up
> > processing around 1.5 times as much data:
> > 
> > SELECT sum(length(foo.*::text)) AS md5_agg,
> >sum(((length(foo.*::text)+63)/64)*64) AS md5_total FROM foo;
> > 
> >   md5_agg   |  md5_total
> > ------------+-------------
> >  8103815438 | 12799909248
> > 
> > although of course that overhead won't be as large on wider tables,
> > and even in this case the overall performance is still on a par with
> > md5_agg().
> > 
> > ISTM that both aggregates are potentially useful in different
> > situations. I would probably typically use md5_total() because of its
> > simplicity/order-independence and consistent performance, but
> > md5_agg() might also be useful when comparing with external data.
> > 
> > Regards,
> > Dean

> Performance review (skills needed: ability to time performance)
> 
> Does the patch slow down simple tests?
> 
> Yes.  For an MD5 checksum of some random data, here are
> results from master:
> 
> shackle@postgres:5493=# WITH t1 AS (SELECT 
> string_agg(chr(floor(255*random()+1)::int),'') AS a FROM 
> generate_series(1,1)),
> postgres-# t2 AS (SELECT a FROM t1 CROSS JOIN 
> generate_series(1,1))
> postgres-# select md5(a::text) FROM t2;
> shackle@postgres:5493=# \timing 
> Timing is on.
> shackle@postgres:5493=# \g
> Time: 955.393 ms
> shackle@postgres:5493=# \g
> Time: 950.909 ms
> shackle@postgres:5493=# \g
> Time: 947.955 ms
> shackle@postgres:5493=# \g
> Time: 946.193 ms
> shackle@postgres:5493=# \g
> Time: 950.591 ms
> shackle@postgres:5493=# \g
> Time: 952.346 ms
> shackle@postgres:5493=# \g
> Time: 948.623 ms
> shackle@postgres:5493=# \g
> Time: 939.819 ms
> 
> and here from master + the patch:
> 
> WITH t1 AS (SELECT string_agg(chr(floor(255*random()+1)::int),'') 
> AS a FROM generate_series(1,1)),
> t2 AS (SELECT a FROM t1 CROSS JOIN generate_series(1,1))
> select md5(a::text) FROM t2;
> Time: 1129.178 ms
> shackle@postgres:5494=# \g
> Time: 1134.013 ms
> shackle@postgres:5494=# \g
> Time: 1124.387 ms
> shackle@postgres:5494=# \g
> Time: 1119.733 ms
> shackle@postgres:5494=# \g
> Time: 1104.906 ms
>   

Re: [HACKERS] GIN improvements part2: fast scan

2013-06-21 Thread Heikki Linnakangas

On 19.06.2013 11:56, Alexander Korotkov wrote:

On Wed, Jun 19, 2013 at 12:49 PM, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:

On 19.06.2013 11:30, Alexander Korotkov wrote:

On Wed, Jun 19, 2013 at 11:48 AM, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:

On 18.06.2013 23:59, Alexander Korotkov wrote:

I would like to illustrate that with an example. Imagine you have the
fulltext query "rare_term & frequent_term". The frequent term has a large
posting tree while the rare term has only a small posting list containing
iptr1, iptr2 and iptr3. At first we get iptr1 from the posting list of the
rare term, then we would like to check whether we have to scan the part of
the frequent term's posting tree where iptr

Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER

2013-06-21 Thread ian link
Forgive my ignorance, but I don't entirely understand the problem. What
does '+' and '-' refer to exactly?
Thanks!


On Fri, Jun 21, 2013 at 4:35 AM, Hitoshi Harada wrote:

>
>
>
> On Fri, Jun 21, 2013 at 3:20 AM, Craig Ringer wrote:
>
>> On 06/21/2013 05:32 PM, Hitoshi Harada wrote:
>>
>> > I also later found that we are missing not only notion of '+' or '-',
>> > but also notion of 'zero value' in our catalog.  Per spec, RANGE BETWEEN
>> > needs to detect ERROR if the offset value is negative, but it is not
>> > always easy if you think about interval, numeric types as opposed to
>> > int64 used in ROWS BETWEEN.
>>
>> Zero can be tested for with `val = (@ val)` ie `val = abs(val)`. That
>> should make sense for any type in which the concept of zero makes sense.
>>
>>
> Yeah, I mean, it needs to know if offset is negative or not by testing
> with zero.  So we need "zero value" or "is_negative function" for each type.
>
> Thanks,
> --
> Hitoshi Harada
>


Re: [HACKERS] Hardware donation

2013-06-21 Thread Mark Wong
On Fri, Jun 21, 2013 at 12:03 PM, Jim Nasby  wrote:
> On 6/21/13 1:45 PM, Josh Berkus wrote:
>>
>> On 06/21/2013 09:48 AM, Jim Nasby wrote:
>>>
>>> We've got some recently decommissioned servers and Enova is willing to
>>> donate 2 of them to the community.
>>>
>>> There's nothing terribly spectacular about the servers except for
>>> memory. We have one 512G server available and the other would be either
>>> 192G or 96G. I know that folks already have access to machines with a
>>> lot of cores, but I haven't seen reports of large memory machines.
>>>
>>> CPU details vary but we're only looking at 20ish cores (though AFAIK
>>> they're all 4 socket servers if that matters).
>>>
>>> Local drives are nothing fancy (though some might possibly be SSD).
>>
>>
>> I'm sure we could use these for the performance test farm.  If we need
>> to replace some of the drives, the community has money for that.
>
>
> We might actually have some spare SSDs floating around; I'm checking. We're
> also thinking we might be able to get at least one of these up to 256G by
> swapping memory around.
>
> Am I correct that the most valuable thing to the community is the large memory
> size?

Yeah, I believe it's memory and storage.

> Who can be point of contact from the community to arrange shipping, etc?

I can be.

Regards,
Mark




Re: [HACKERS] Hardware donation

2013-06-21 Thread Jim Nasby

On 6/21/13 1:45 PM, Josh Berkus wrote:

On 06/21/2013 09:48 AM, Jim Nasby wrote:

We've got some recently decommissioned servers and Enova is willing to
donate 2 of them to the community.

There's nothing terribly spectacular about the servers except for
memory. We have one 512G server available and the other would be either
192G or 96G. I know that folks already have access to machines with a
lot of cores, but I haven't seen reports of large memory machines.

CPU details vary but we're only looking at 20ish cores (though AFAIK
they're all 4 socket servers if that matters).

Local drives are nothing fancy (though some might possibly be SSD).


I'm sure we could use these for the performance test farm.  If we need
to replace some of the drives, the community has money for that.


We might actually have some spare SSDs floating around; I'm checking. We're 
also thinking we might be able to get at least one of these up to 256G by 
swapping memory around.

Am I correct that the most valuable thing to the community is the large memory 
size?

Who can be point of contact from the community to arrange shipping, etc?
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread Tom Lane
Robert Haas  writes:
> More generally, what do we think the point is of sending SIGQUIT
> rather than SIGKILL in the first place, and why does that point cease
> to be valid after 5 seconds?

Well, mostly it's about telling the client we're committing hara-kiri.
Without that, there's no very good reason to run quickdie() at all.

A practical issue with starting to send SIGKILL ourselves is that we
will no longer be able to reflexively diagnose "server process died
on signal 9" as "the linux OOM killer got you".  I'm not at all
convinced that the cases where SIGQUIT doesn't work are sufficiently
common to justify losing that property.

regards, tom lane




Re: [HACKERS] XLogInsert scaling, revisited

2013-06-21 Thread Jeff Janes
On Tue, Jun 18, 2013 at 11:28 PM, Heikki Linnakangas <
hlinnakan...@vmware.com> wrote:

> On 18.06.2013 21:17, Jeff Janes wrote:
>
>> Hi Heikki,
>>
>> I am getting conflicts applying version 22 of this patch to 9.4dev.  Could
>> you rebase?
>>
>
> Here you go.


I think I'm getting an undetected deadlock between the checkpointer and a
user process running a TRUNCATE command.

This is the checkpointer:

#0  0x003a73eeaf37 in semop () from /lib64/libc.so.6
#1  0x005ff847 in PGSemaphoreLock (sema=0x7f8c0a4eb730,
interruptOK=0 '\000') at pg_sema.c:415
#2  0x004b0abf in WaitOnSlot (upto=416178159648) at xlog.c:1775
#3  WaitXLogInsertionsToFinish (upto=416178159648) at xlog.c:2086
#4  0x004b657a in CopyXLogRecordToWAL (write_len=32, isLogSwitch=1
'\001', rdata=0x0, StartPos=, EndPos=416192397312)
at xlog.c:1389
#5  0x004b6fb2 in XLogInsert (rmid=0 '\000', info=, rdata=0x7fff0020) at xlog.c:1209
#6  0x004b7644 in RequestXLogSwitch () at xlog.c:8748
#7  0x00611be3 in CheckArchiveTimeout () at checkpointer.c:622
#8  0x00611d97 in CheckpointWriteDelay (flags=, progress=) at checkpointer.c:705
#9  0x0062ec5a in BufferSync (flags=64) at bufmgr.c:1322
#10 CheckPointBuffers (flags=64) at bufmgr.c:1828
#11 0x004b1393 in CheckPointGuts (checkPointRedo=416178159592,
flags=64) at xlog.c:8365
#12 0x004b8ff3 in CreateCheckPoint (flags=64) at xlog.c:8148
#13 0x006121c3 in CheckpointerMain () at checkpointer.c:502
#14 0x004c4c4a in AuxiliaryProcessMain (argc=2,
argv=0x7fff21c4a5d0) at bootstrap.c:439
#15 0x0060a68c in StartChildProcess (type=CheckpointerProcess) at
postmaster.c:4954
#16 0x0060d1ea in reaper (postgres_signal_arg=) at postmaster.c:2571
#17 
#18 0x003a73ee14d3 in __select_nocancel () from /lib64/libc.so.6
#19 0x0060efee in ServerLoop (argc=,
argv=) at postmaster.c:1537
#20 PostmasterMain (argc=, argv=)
at postmaster.c:1246
#21 0x005ad4e0 in main (argc=3, argv=0x179fd00) at main.c:196


And this is the TRUNCATE command.

#0  0x003a73eeaf37 in semop () from /lib64/libc.so.6
#1  0x005ff847 in PGSemaphoreLock (sema=0x7f8c0a4ea8d0,
interruptOK=0 '\000') at pg_sema.c:415
#2  0x004b002d in WALInsertSlotAcquireOne (slotno=-1) at xlog.c:1667
#3  0x004b6d5d in XLogInsert (rmid=0 '\000', info=, rdata=0x7fff21c4a5e0) at xlog.c:1115
#4  0x004b8abc in XLogPutNextOid (nextOid=67198981) at xlog.c:8704
#5  0x004a3bc2 in GetNewObjectId () at varsup.c:495
#6  0x004c5195 in GetNewRelFileNode (reltablespace=, pg_class=0x0, relpersistence=) at catalog.c:437
#7  0x0070f52d in RelationSetNewRelfilenode
(relation=0x7f8c019cb2a0, freezeXid=2144367079, minmulti=1) at
relcache.c:2758
#8  0x0055de61 in ExecuteTruncate (stmt=) at
tablecmds.c:1163
#9  0x00656080 in standard_ProcessUtility (parsetree=0x2058900,
queryString=, context=,
params=0x0,
dest=, completionTag=) at
utility.c:552
#10 0x00652a87 in PortalRunUtility (portal=0x17bf510,
utilityStmt=0x2058900, isTopLevel=1 '\001', dest=0x2058c40,
completionTag=0x7fff21c4a9a0 "")
at pquery.c:1187
#11 0x006539fd in PortalRunMulti (portal=0x17bf510, isTopLevel=1
'\001', dest=0x2058c40, altdest=0x2058c40, completionTag=0x7fff21c4a9a0 "")
at pquery.c:1318
#12 0x006540b3 in PortalRun (portal=0x17bf510,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x2058c40,
altdest=0x2058c40,
completionTag=0x7fff21c4a9a0 "") at pquery.c:816
#13 0x00650944 in exec_simple_query (query_string=0x2057e90
"truncate foo") at postgres.c:1048
#14 0x00651fe9 in PostgresMain (argc=,
argv=, dbname=0x1fc9e98 "jjanes", username=)
at postgres.c:3985
#15 0x0060f80b in BackendRun (argc=,
argv=) at postmaster.c:3987
#16 BackendStartup (argc=, argv=)
at postmaster.c:3676
#17 ServerLoop (argc=, argv=) at
postmaster.c:1577
#18 PostmasterMain (argc=, argv=)
at postmaster.c:1246
#19 0x005ad4e0 in main (argc=3, argv=0x179fd00) at main.c:196

This is using the same testing harness as in the last round of this patch.

Is there a way for me to dump the list of held/waiting lwlocks from gdb?

Cheers,

Jeff


Re: [HACKERS] Hardware donation

2013-06-21 Thread Joshua D. Drake


On 06/21/2013 09:48 AM, Jim Nasby wrote:


We've got some recently decommissioned servers and Enova is willing to
donate 2 of them to the community.

There's nothing terribly spectacular about the servers except for
memory. We have one 512G server available and the other would be either
192G or 96G. I know that folks already have access to machines with a
lot of cores, but I haven't seen reports of large memory machines.

CPU details vary but we're only looking at 20ish cores (though AFAIK
they're all 4 socket servers if that matters).

Local drives are nothing fancy (though some might possibly be SSD).


A couple 192G machines to put in the performance lab would be nice, 
especially if they have SSD.


JD

--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
   a rose in the deeps of my heart. - W.B. Yeats


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hardware donation

2013-06-21 Thread Josh Berkus
On 06/21/2013 09:48 AM, Jim Nasby wrote:
> We've got some recently decommissioned servers and Enova is willing to
> donate 2 of them to the community.
> 
> There's nothing terribly spectacular about the servers except for
> memory. We have one 512G server available and the other would be either
> 192G or 96G. I know that folks already have access to machines with a
> lot of cores, but I haven't seen reports of large memory machines.
> 
> CPU details vary but we're only looking at 20ish cores (though AFAIK
> they're all 4 socket servers if that matters).
> 
> Local drives are nothing fancy (though some might possibly be SSD).

I'm sure we could use these for the performance test farm.  If we need
to replace some of the drives, the community has money for that.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])

2013-06-21 Thread Robert Haas
On Thu, Jun 20, 2013 at 12:18 AM, Amit Kapila  wrote:
> Auto.conf- 1 Vote (Josh)
> System.auto.conf - 1 Vote (Josh)
> Postgresql.auto.conf - 2 Votes (Zoltan, Amit)
> Persistent.auto.conf - 0 Vote
> generated_by_server.conf - 1 Vote (Peter E)
> System.conf  - 1 Vote (Magnus)
> alter_system.conf- 1 Vote (Alvaro)
>
> I had consolidated the list, so that it would be helpful for committer to
> decide among proposed names for this file.

I'll also vote for postgresql.auto.conf.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]

2013-06-21 Thread Robert Haas
On Fri, Jun 21, 2013 at 12:56 PM, Fujii Masao  wrote:
> On Sat, Jun 22, 2013 at 12:11 AM, Robert Haas  wrote:
>> On Wed, Jun 19, 2013 at 1:59 PM, Peter Eisentraut  wrote:
>>> On 6/7/13 12:14 AM, Amit Kapila wrote:
 I will change the patch as per below syntax if there are no objections:

  ALTER SYSTEM SET configuration_parameter {TO | =} {value, | 'value'};
>>>
>>> I do like using ALTER SYSTEM in general, but now that I think about it,
>>> the 9.3 feature to create global settings in pg_db_role_setting would
>>> also have been a candidate for exactly that same syntax if it had been
>>> available.  In fact, if we do add ALTER SYSTEM, it might make sense to
>>> recast that feature into that syntax.
>>>
>>> It might be clearer to do something like ALTER SYSTEM SET EXTERNAL FILE
>>> or something like that.  It's only a small syntax change, so don't worry
>>> about it too much, but let's keep thinking about it.
>>
>> I think that anything we want to add should either go before SET or
>> after the value, such as:
>>
>> ALTER SYSTEM SET configuration_parameter = 'value' IN FILE 'myconf.conf';
>> ALTER SYSTEM CONFIGURATION FILE SET configuration_parameter = 'file';
>> ALTER SYSTEM CONFIGURATION FILE 'myconf.conf' SET
>> configuration_parameter = 'file';
>>
>> I agree we shouldn't back ourselves into a syntactic corner.
>
> Maybe this idea has been already discussed, but why don't we just
> add the function like update_config_file(parameter, value)? We can
> commit the core of the patch with that function API first, and then
> we can discuss the syntax and add another API like ALTER SYSTEM
> later.
>
> We now have set_config() function to set the parameter,
> so adding the function for this feature should not be a surprise.
> If the name of the function is, for example, update_conf_file,
> most users would think that it's intuitive even if SIGHUP is not
> automatically sent immediately. We don't need to emit
> the message like 'This setting change will not be loaded until SIGHUP'.

I could certainly support that plan.  I'm satisfied with the ALTER
SYSTEM syntax and feel that might find other plausible uses, but
functional notation would be OK with me, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag & Lead Window Functions Can Ignore Nulls

2013-06-21 Thread Robert Haas
On Fri, Jun 21, 2013 at 11:33 AM, Jeff Davis  wrote:
>> Regardless of what syntax we settle on, we should also make sure that
>> the conflict is intrinsic to the grammar and can't be factored out, as
>> Tom suggested upthread.  It's not obvious to me what the actual
>> ambiguity is here.  If you've seen "select lag(num,0)" and the
>> lookahead token is "respect", what's the problem?  It sort of looks
>> like it could be a column label, but not even unreserved keywords can
>> be column labels, so that's not it.  Probably deserves a bit more
>> investigation...
>
> I think the problem is when the function is used as a table function;
> e.g.:
>
>   SELECT * FROM generate_series(1,10) respect;

Ah ha.  Well, there's probably not much help for that.  Disallowing
keywords as table aliases would be a cure worse than the disease, I
think.  I suppose the good news is that there probably aren't many
people using RESPECT as a column name, but I have a feeling that we're
almost certain to get complaints about reserving IGNORE.  I think that
will have to be quoted as a PL/pgsql variable name as well.  :-(

>> We could just add additional, optional Boolean argument to the
>> existing functions.  It's non-standard, but we avoid adding keywords.
>
> I thought about that, but it is awkward because the argument would have
> to be constant (or, if not, it would be quite strange).

True... but e.g. string_agg() has the same issue.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Unaccent performance

2013-06-21 Thread Thom Brown
Hi,

The unaccent extension is great, especially with its customisability, but
it's not always easy to recommend.  I witnessed a customer using no less
than 56 nested replace functions in an SQL function.  I looked to see how
much this can be mitigated by unaccent.  It turns out that not all the
characters they were replacing can be replaced by unaccent, either because
they replace more than one character at a time, or because the character they're
replacing, for some reason, isn't processed by unaccent, even with a custom
rules file.

So there were 20 characters I could identify that they were replacing.  I
made a custom rules file and compared its performance to the
difficult-to-manage set of nested replace calls.

CREATE OR REPLACE FUNCTION public.myunaccent(sometext text)
 RETURNS text
 LANGUAGE sql
 IMMUTABLE
AS $function$
SELECT
replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(sometext,'ą','a'),'Ą','A'),'ă','a'),'Ă','A'),'ā','a'),'Ā','A'),'æ','a'),'å','a'),'ä','a'),'ã','a'),'â','a'),'á','a'),'à','a'),'Æ','A'),'Å','A'),'Ä','A'),'Ã','A'),'Â','A'),'Á','A'),'À','A')
;
$function$

postgres=# SELECT myunaccent(sometext::text) FROM (SELECT
'ÀÁÂÃÄÅÆàáâãäåæĀāĂ㥹' sometext FROM generate_series(1,100)) x OFFSET
99 LIMIT 1;
  myunaccent
--
 AAAaaaAaAaAa
(1 row)

Time: 726.282 ms
postgres=# SELECT unaccent(sometext::text) FROM (SELECT
'ÀÁÂÃÄÅÆàáâãäåæĀāĂ㥹' sometext FROM generate_series(1,100)) x OFFSET
99 LIMIT 1;
   unaccent
--
 AAAaaaAaAaAa
(1 row)

Time: 3305.252 ms

The timings are actually pretty much the same even if I introduce 187
nested replace calls, one for every line in the unaccent.rules file (187
characters).  But the same character set with unaccent increases to 7418.526
ms with the same type of query as above.  That's 10 times more expensive.
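
(For anyone reproducing this, a custom rules file is wired up roughly as
follows -- a sketch with illustrative names; the .rules file itself has to be
installed in the server's tsearch_data directory:)

CREATE EXTENSION unaccent;
-- 'my_unaccent' refers to SHAREDIR/tsearch_data/my_unaccent.rules
CREATE TEXT SEARCH DICTIONARY my_unaccent
    (TEMPLATE = unaccent, RULES = 'my_unaccent');
-- call unaccent() with the custom dictionary explicitly
SELECT unaccent('my_unaccent', 'ÀÁÂÃÄÅÆ');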

Is there a way to boost the performance to make its adoption more palatable?

-- 
Thom


Review [was Re: [HACKERS] MD5 aggregate]

2013-06-21 Thread David Fetter
On Mon, Jun 17, 2013 at 11:34:52AM +0100, Dean Rasheed wrote:
> On 15 June 2013 10:22, Dean Rasheed  wrote:
> > There seem to be 2 separate directions that this could go, which
> > really meet different requirements:
> >
> > 1). Produce an unordered sum for SQL to compare 2 tables regardless of
> > the order in which they are scanned. A possible approach to this might
> > be something like an aggregate
> >
> > md5_total(text/bytea) returns text
> >
> > that returns the sum of the md5 values of each input value, treating
> > each md5 value as an unsigned 128-bit integer, and then producing the
> > hexadecimal representation of the final sum. This should out-perform a
> > solution based on numeric addition, and in typical cases, the result
> > wouldn't be much longer than a regular md5 sum, and so would be easy
> > to eyeball for differences.
> >
> 
> I've been playing around with the idea of an aggregate that computes
> the sum of the md5 hashes of each of its inputs, which I've called
> md5_total() for now, although I'm not particularly wedded to that
> name. Comparing it with md5_agg() on a 100M row table (see attached
> test script) produces interesting results:
> 
> SELECT md5_agg(foo.*::text)
>   FROM (SELECT * FROM foo ORDER BY id) foo;
> 
>  50bc42127fb9b028c9708248f835ed8f
> 
> Time: 92960.021 ms
> 
> SELECT md5_total(foo.*::text) FROM foo;
> 
>  02faea7fafee4d253fc94cfae031afc43c03479c
> 
> Time: 96190.343 ms
> 
> Unlike md5_agg(), it is no longer a true MD5 sum (for one thing, its
> result is longer) but it seems like it would be very useful for
> quickly comparing data in SQL, since its value is not dependent on the
> row-order making it easier to use and better performing if there is no
> usable index for ordering.
> 
> Note, however, that if there is an index that can be used for
> ordering, the performance is not necessarily better than md5_agg(), as
> this example shows. There is a small additional overhead per row for
> initialising the MD5 sums, and adding the results to the total, but I
> think the biggest factor is that md5_total() is processing more data.
> The reason is that MD5 works on 64-byte blocks, so the total amount of
> data going through the core MD5 algorithm is each row's size
> rounded up to a multiple of 64. In this simple case it ends up
> processing around 1.5 times as much data:
> 
> SELECT sum(length(foo.*::text)) AS md5_agg,
>sum(((length(foo.*::text)+63)/64)*64) AS md5_total FROM foo;
> 
>   md5_agg   |  md5_total
> +-
>  8103815438 | 12799909248
> 
> although of course that overhead won't be as large on wider tables,
> and even in this case the overall performance is still on a par with
> md5_agg().
> 
> ISTM that both aggregates are potentially useful in different
> situations. I would probably typically use md5_total() because of its
> simplicity/order-independence and consistent performance, but
> md5_agg() might also be useful when comparing with external data.
> 
> Regards,
> Dean

Submission review:

Is the patch in a patch format which has context? (eg: context diff format)

Yes.

Does it apply cleanly to the current git master?

Yes.

Does it include reasonable tests, necessary doc patches, etc? 

Yes.

Usability review:

Does the patch actually implement that?

Yes.

Do we want that?

Enough do.

Do we already have it?

No.

Does it follow SQL spec, or the community-agreed behavior?

No, and yes, respectively.

Does it include pg_dump support (if applicable)?

N/A

Are there dangers?

Patch changes the MD5 implementation, which could
theoretically result in backward incompatibility if the
changes are not 100% backward-compatible.

Have all the bases been covered? 

Yes.

Feature test:

Does the feature work as advertised?

Yes.

Are there corner cases the author has failed to consider?

Not that I've found so far.

Are there any assertion failures or crashes?

No.

Performance review (skills needed: ability to time performance)

Does the patch slow down simple tests?

Yes.  For an MD5 checksum of some random data, here are
results from master:

shackle@postgres:5493=# WITH t1 AS (SELECT 
string_agg(chr(floor(255*random()+1)::int),'') AS a FROM 
generate_series(1,1)),
postgres-# t2 AS (SELECT a FROM t1 CROSS JOIN 
generate_series(1,1))
postgres-# select md5(a::text) FROM t2;
shackle@postgres:5493=# \timing 
Timing is on.
shackle@postgres:5493=# \g
Time: 955.393 ms
shackle@postgres:5493=# \g
Time: 950.909 ms
shackle@postgres:5493=# \g
Time: 947.955 ms
shackle@postgres:5493=# \g
Time: 946.193 ms
shackle@postgres:5493=# \g
Time: 950.591 ms
   

Re: [HACKERS] Why can't I use windowing functions over ordered aggregates?

2013-06-21 Thread Josh Berkus
Cedric,

> See this example:
> # create table foo (i int, t timestamptz);
> # insert into foo select n, now() from generate_series(1,10) g(n);
> # select i, first_value(i) over (order by t desc) from foo;
> # select i, first_value(i) over (order by t desc ROWS between 0 PRECEDING and 
> UNBOUNDED FOLLOWING) from foo;
> 
> What do you expect "SELECT first(val order by ts desc)" to output ?
> 

Ah, right, I see what you mean.  Yeah, I was doing queries without peer
rows, so it looked the same to me, and it uses some of the same
machinery.  But of course it's not completely the same.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why can't I use windowing functions over ordered aggregates?

2013-06-21 Thread David Johnston
Cédric Villemain-2 wrote
> And also, first_value is a *window* function, not a simple aggregate 
> function...

Per the documentation, any aggregate function can be used with a WINDOW
declaration.  The logical question is: why are window aggregates special, so
that the reverse cannot be true?  In other words, why isn't every function
simply defined as a normal aggregate that can be used in both contexts?


> See this example:
> # create table foo (i int, t timestamptz);
> # insert into foo select n, now() from generate_series(1,10) g(n);
> # select i, first_value(i) over (order by t desc) from foo;
> # select i, first_value(i) over (order by t desc ROWS between 0 PRECEDING
> and 
> UNBOUNDED FOLLOWING) from foo;
> 
> What do you expect "SELECT first(val order by ts desc)" to output ? 

Undefined due to incorrect specificity of the ORDER BY definition.  The
window version has the same issue.

The window aggregates should simply treat the entire input set as the
relevant frame - basically the same output as would result from
(simplistically):

SELECT window_agg(...)
FROM (
SELECT id, window_agg(...) OVER (ORDER BY id ASC)  ORDER BY id ASC
) agg
ORDER BY id DESC LIMIT 1

Admittedly this really only makes sense for first_value, last_value, and
nth_value; the other window aggregates can return valid values but to have
meaning they really need to be output in a windowing context.
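
A concrete form of that sketch, using the foo(i, t) table from the quoted
example (purely an illustration of the intended semantics, not a proposal for
new syntax):

SELECT first_i
FROM (
    SELECT i, t,
           first_value(i) OVER (ORDER BY t ASC) AS first_i
    FROM foo
) agg
ORDER BY t DESC
LIMIT 1;

(For last_value and nth_value, an explicit frame of ROWS BETWEEN UNBOUNDED
PRECEDING AND UNBOUNDED FOLLOWING is what makes the whole partition the frame;
first_value already behaves that way with the default frame.)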

David J.





--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Why-can-t-I-use-windowing-functions-over-ordered-aggregates-tp5760233p5760358.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]

2013-06-21 Thread Fujii Masao
On Sat, Jun 22, 2013 at 12:11 AM, Robert Haas  wrote:
> On Wed, Jun 19, 2013 at 1:59 PM, Peter Eisentraut  wrote:
>> On 6/7/13 12:14 AM, Amit Kapila wrote:
>>> I will change the patch as per below syntax if there are no objections:
>>>
>>>  ALTER SYSTEM SET configuration_parameter {TO | =} {value, | 'value'};
>>
>> I do like using ALTER SYSTEM in general, but now that I think about it,
>> the 9.3 feature to create global settings in pg_db_role_setting would
>> also have been a candidate for exactly that same syntax if it had been
>> available.  In fact, if we do add ALTER SYSTEM, it might make sense to
>> recast that feature into that syntax.
>>
>> It might be clearer to do something like ALTER SYSTEM SET EXTERNAL FILE
>> or something like that.  It's only a small syntax change, so don't worry
>> about it too much, but let's keep thinking about it.
>
> I think that anything we want to add should either go before SET or
> after the value, such as:
>
> ALTER SYSTEM SET configuration_parameter = 'value' IN FILE 'myconf.conf';
> ALTER SYSTEM CONFIGURATION FILE SET configuration_parameter = 'file';
> ALTER SYSTEM CONFIGURATION FILE 'myconf.conf' SET
> configuration_parameter = 'file';
>
> I agree we shouldn't back ourselves into a syntactic corner.

Maybe this idea has already been discussed, but why don't we just
add a function like update_config_file(parameter, value)? We can
commit the core of the patch with that function API first, and then
we can discuss the syntax and add another API like ALTER SYSTEM
later.

We already have the set_config() function to set a parameter,
so adding a function for this feature should not be a surprise.
If the name of the function is, for example, update_conf_file,
most users would find it intuitive even if SIGHUP is not
automatically sent immediately. We don't need to emit
a message like 'This setting change will not be loaded until SIGHUP'.
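
For illustration, usage under that proposal might look like this (hypothetical
-- the function does not exist yet, and the name simply follows the suggestion
above):

-- write the new value into the auto-generated configuration file
SELECT update_config_file('work_mem', '64MB');
-- the change is picked up only when the user reloads explicitly
SELECT pg_reload_conf();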

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hardware donation

2013-06-21 Thread Jim Nasby

I stand corrected... we don't have a 512G server available. We do have plenty 
of 192G and 96G servers though if 2 of those would be of use.

Sorry for the noise.

On 6/21/13 11:48 AM, Jim Nasby wrote:

We've got some recently decommissioned servers and Enova is willing to donate 2 
of them to the community.

There's nothing terribly spectacular about the servers except for memory. We 
have one 512G server available and the other would be either 192G or 96G. I 
know that folks already have access to machines with a lot of cores, but I 
haven't seen reports of large memory machines.

CPU details vary but we're only looking at 20ish cores (though AFAIK they're 
all 4 socket servers if that matters).

Local drives are nothing fancy (though some might possibly be SSD).


--
Jim Nasby, Lead Data Architect
(512) 569-9461 (primary) (512) 579-9024 (backup)


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hardware donation

2013-06-21 Thread Atri Sharma
On Fri, Jun 21, 2013 at 10:18 PM, Jim Nasby  wrote:
> We've got some recently decommissioned servers and Enova is willing to
> donate 2 of them to the community.
>
> There's nothing terribly spectacular about the servers except for memory. We
> have one 512G server available and the other would be either 192G or 96G. I
> know that folks already have access to machines with a lot of cores, but I
> haven't seen reports of large memory machines.
>
> CPU details vary but we're only looking at 20ish cores (though AFAIK they're
> all 4 socket servers if that matters).
>
> Local drives are nothing fancy (though some might possibly be SSD).

Woot!

--
Regards,

Atri
l'apprenant


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Hardware donation

2013-06-21 Thread Jim Nasby

We've got some recently decommissioned servers and Enova is willing to donate 2 
of them to the community.

There's nothing terribly spectacular about the servers except for memory. We 
have one 512G server available and the other would be either 192G or 96G. I 
know that folks already have access to machines with a lot of cores, but I 
haven't seen reports of large memory machines.

CPU details vary but we're only looking at 20ish cores (though AFAIK they're 
all 4 socket servers if that matters).

Local drives are nothing fancy (though some might possibly be SSD).
--
Jim Nasby, Lead Data Architect
(512) 569-9461 (primary) (512) 579-9024 (backup)


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why can't I use windowing functions over ordered aggregates?

2013-06-21 Thread Cédric Villemain
Le vendredi 21 juin 2013 03:32:33, Josh Berkus a écrit :
> Hackers,
> 
> So, I can create a custom aggregate "first" and do this:
> 
> SELECT first(val order by ts desc) ...
> 
> And I can do this:
> 
> SELECT first_value(val) OVER (order by ts desc)
> 
> ... but I can't do this:
> 
> SELECT first_value(val order by ts desc)
> 
> ... even though under the hood, it's the exact same operation.

First I'm not sure it is the same, in a window frame you have the notion of 
peer-rows (when you use ORDER BY).

And also, first_value is a *window* function, not a simple aggregate 
function...

See this example:
# create table foo (i int, t timestamptz);
# insert into foo select n, now() from generate_series(1,10) g(n);
# select i, first_value(i) over (order by t desc) from foo;
# select i, first_value(i) over (order by t desc ROWS between 0 PRECEDING and 
UNBOUNDED FOLLOWING) from foo;

What do you expect "SELECT first(val order by ts desc)" to output ?

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation


signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] Request for Patch Feedback: Lag & Lead Window Functions Can Ignore Nulls

2013-06-21 Thread Jeff Davis
On Fri, 2013-06-21 at 09:21 -0400, Robert Haas wrote:
> The other question here is - do we actually have the grammar right?
> As in, is this actually the syntax we're supposed to be implementing?
> It looks different from what's shown here, where the IGNORE NULLS is
> inside the function's parentheses, rather than afterwards:
> 
> http://rwijk.blogspot.com/2010/06/simulating-laglead-ignore-nulls.html
> 
> IBM seems to think it's legal either inside or outside the parentheses:
> 
> http://pic.dhe.ibm.com/infocenter/informix/v121/index.jsp?topic=%2Fcom.ibm.sqls.doc%2Fids_sqs_2594.htm

The spec seems pretty clear that it falls outside of the parentheses,
unless I am missing something.

> Regardless of what syntax we settle on, we should also make sure that
> the conflict is intrinsic to the grammar and can't be factored out, as
> Tom suggested upthread.  It's not obvious to me what the actual
> ambiguity is here.  If you've seen "select lag(num,0)" and the
> lookahead token is "respect", what's the problem?  It sort of looks
> like it could be a column label, but not even unreserved keywords can
> be column labels, so that's not it.  Probably deserves a bit more
> investigation...

I think the problem is when the function is used as a table function;
e.g.:

  SELECT * FROM generate_series(1,10) respect;

> We could just add additional, optional Boolean argument to the
> existing functions.  It's non-standard, but we avoid adding keywords.

I thought about that, but it is awkward because the argument would have
to be constant (or, if not, it would be quite strange).

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread Robert Haas
On Thu, Jun 20, 2013 at 12:33 PM, Alvaro Herrera
 wrote:
> I will go with 5 seconds, then.

I'm uncomfortable with this whole concept, and particularly with such
a short timeout.  On a very busy system, things can take a LOT longer
than we think they should; it can take 30 seconds or more just to get
a prompt back from a shell command.  5 seconds is the blink of an eye.

More generally, what do we think the point is of sending SIGQUIT
rather than SIGKILL in the first place, and why does that point cease
to be valid after 5 seconds?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add visibility map information to pg_freespace.

2013-06-21 Thread Robert Haas
On Wed, Jun 19, 2013 at 11:26 PM, Satoshi Nagayasu  wrote:
> - pageinspect provides several functions for debugging purposes.
> - pg_freespace provides a view for monitoring purposes.
> - pgstattuple provides several functions for collecting
>   specific table/index statistics.

I think we should be careful to think about upgrade considerations
when adding this functionality.  Bumping the module version number to
add new functions is less likely to break things for users than
changing the return value of an existing SRF.  Maybe that's too far
down in the weeds to worry about, but it's a thought - especially for
pg_freespace, where there's no real efficiency benefit to have the
same function look at the FSM and the VM anyway.
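
To illustrate the upgrade-friendly path (a sketch; the new function name and
the version number are made up), the VM lookup could arrive as a separate
function in a bumped extension version rather than as a changed return type:

ALTER EXTENSION pg_freespacemap UPDATE TO '1.1';  -- extension/version illustrative
SELECT * FROM pg_freespace('foo');        -- existing SRF, return type unchanged
SELECT * FROM pg_visibility_map('foo');   -- hypothetical new SRF added in 1.1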

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]

2013-06-21 Thread Robert Haas
On Wed, Jun 19, 2013 at 1:59 PM, Peter Eisentraut  wrote:
> On 6/7/13 12:14 AM, Amit Kapila wrote:
>> I will change the patch as per below syntax if there are no objections:
>>
>>  ALTER SYSTEM SET configuration_parameter {TO | =} {value, | 'value'};
>
> I do like using ALTER SYSTEM in general, but now that I think about it,
> the 9.3 feature to create global settings in pg_db_role_setting would
> also have been a candidate for exactly that same syntax if it had been
> available.  In fact, if we do add ALTER SYSTEM, it might make sense to
> recast that feature into that syntax.
>
> It might be clearer to do something like ALTER SYSTEM SET EXTERNAL FILE
> or something like that.  It's only a small syntax change, so don't worry
> about it too much, but let's keep thinking about it.

I think that anything we want to add should either go before SET or
after the value, such as:

ALTER SYSTEM SET configuration_parameter = 'value' IN FILE 'myconf.conf';
ALTER SYSTEM CONFIGURATION FILE SET configuration_parameter = 'file';
ALTER SYSTEM CONFIGURATION FILE 'myconf.conf' SET
configuration_parameter = 'file';

I agree we shouldn't back ourselves into a syntactic corner.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Noah Misch
On Fri, Jun 21, 2013 at 04:12:32PM +0200, Andres Freund wrote:
> On 2013-06-21 09:51:05 -0400, Noah Misch wrote:
> > On Fri, Jun 21, 2013 at 09:20:21AM +, Albe Laurenz wrote:

> > Even if you could skip it, queries with expensive
> > constant expressions would notice the performance loss.  The situations helped
> > by a change like this are too marginal to accept that cost.
> 
> I have to say, that argument doesn't excite me much. It's not like we
> don't want to do the constant expression evaluation at all anymore. Just
> not inside CASE WHEN blocks which already are barring some optimizations
> anyway...

Sure, it's a narrow loss.  Before introducing a new narrow loss to fix an
existing one, we should consider which loss hurts more.  Offhand, I sympathize
with the folks who would lose performance more than with the folks who want to
write the sorts of expressions under consideration.

> > Would it work to run eval_const_expressions() lazily on THEN clauses?  The
> > plan-time eval_const_expressions() would not descend to them.  The first
> > ExecEvalCase() to use a given THEN clause would run eval_const_expressions()
> > before proceeding.
> 
> Ugh. Doesn't sound nice.

Would you elaborate?

> > I question whether real applications care.  It's important to have CASE usable
> > for avoiding data-dependent errors, but what's the use of including in your
> > query an expression that can do nothing but throw an error?  Does anyone have
> > a real-world example?  Perhaps some generated-query scenario.
> 
> It doesn't need to be an actual constant. Something that evaluates to
> the value at plan time is enough:
> PREPARE foo AS SELECT CASE WHEN (SELECT $1::int)=0 THEN 0 ELSE 1/$1::int END;
> EXECUTE foo(0);

> Now, that example only crashes because one place uses (SELECT $1) and
> the other doesn't, but...

Not the "real-world" I was hoping for, but fair enough.

"Crash" in this context means "raise an error", right?

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Pavel Stehule
2013/6/21 Andres Freund :
> On 2013-06-21 09:51:05 -0400, Noah Misch wrote:
>> On Fri, Jun 21, 2013 at 09:20:21AM +, Albe Laurenz wrote:
>> > Andres Freund wrote:
>> > > Yes, I think it's pretty clearly a bug - Tom doesn't seem think so
>> > > though. If we can agree it is, the fix outlined over on -bugs seems to
>> > > be easily enough implemented...
>>
>> If you refer to this:
>>
>> On Tue, Jun 18, 2013 at 03:31:32PM +0200, Andres Freund wrote:
>> > So it seems we need to stop processing after finding a single WHEN
>> > that's not const? Does anybody have a better idea?
>>
>> eval_const_expressions() is not just an optimization: it performs mandatory
>> transformations such as the conversion of named-argument function calls to
>> positional function calls.
>
> Ah yes. Forgot about that... Scrap that. Although it surely isn't nice
> that all that is done in a function called eval_const_expressions()...
>
>> Even if you could skip it, queries with expensive
>> constant expressions would notice the performance loss.  The situations helped
>> by a change like this are too marginal to accept that cost.

yes, I dislike it too - then we could have inconsistent behaviour of
constants between CASE and other statements.

We should do this without any performance loss, if we make changes in
this area.

Regards

Pavel

>
> I have to say, that argument doesn't excite me much. It's not like we
> don't want to do the constant expression evaluation at all anymore. Just
> not inside CASE WHEN blocks which already are barring some optimizations
> anyway...
>
>> Would it work to run eval_const_expressions() lazily on THEN clauses?  The
>> plan-time eval_const_expressions() would not descend to them.  The first
>> ExecEvalCase() to use a given THEN clause would run eval_const_expressions()
>> before proceeding.
>
> Ugh. Doesn't sound nice. I don't have any better ideas than to actually
> split eval_const_expressions into one function that does the necessary
> things like canonicalization of AND/OR and one that actually evaluates
> expressions inside though.
> So maybe that's the way to go :/
>
>> > I think that it is surprising behaviour.
>>
>> That's about how I characterize it, too.
>>
>> I question whether real applications care.  It's important to have CASE usable
>> for avoiding data-dependent errors, but what's the use of including in your
>> query an expression that can do nothing but throw an error?  Does anyone have
>> a real-world example?  Perhaps some generated-query scenario.
>
> It doesn't need to be an actual constant. Something that evaluates to
> the value at plan time is enough:
> PREPARE foo AS SELECT CASE WHEN (SELECT $1::int)=0 THEN 0 ELSE 1/$1::int END;
> EXECUTE foo(0);
>
> That example will most likely only crashes in 9.2+ because it will
> replan it with the actual parameter values in place. But you could have
> the same in earlier versions e.g. using PQExecParams(), but that's
> harder to demonstrate.
>
> Now, that example only crashes because one place uses (SELECT $1) and
> the other doesn't, but...
>
> Greetings,
>
> Andres Freund
>
> --
>  Andres Freund http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Andres Freund
On 2013-06-21 09:51:05 -0400, Noah Misch wrote:
> That being said, if we discover a simple-enough fix that performs well, we may
> as well incorporate it.

What about passing another parameter down eval_const_expressions_mutator
(which is static, so changing the API isn't a problem) that basically
tells us whether we actually should evaluate expressions or just perform
the transformation?
There seem to be basically a couple of places where we call dangerous
things:
* simplify_function (via ->evaluate_function->evaluate_expr) which is
  called in a bunch of places
* evaluate_expr which is directly called in T_ArrayExpr
  T_ArrayCoerceExpr

All places I've inspected so far need to deal with simplify_function
returning NULL anyway, so that seems like a viable fix.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Andres Freund
On 2013-06-21 09:51:05 -0400, Noah Misch wrote:
> On Fri, Jun 21, 2013 at 09:20:21AM +, Albe Laurenz wrote:
> > Andres Freund wrote:
> > > Yes, I think it's pretty clearly a bug - Tom doesn't seem think so
> > > though. If we can agree it is, the fix outlined over on -bugs seems to
> > > be easily enough implemented...
> 
> If you refer to this:
> 
> On Tue, Jun 18, 2013 at 03:31:32PM +0200, Andres Freund wrote:
> > So it seems we need to stop processing after finding a single WHEN
> > that's not const? Does anybody have a better idea?
> 
> eval_const_expressions() is not just an optimization: it performs mandatory
> transformations such as the conversion of named-argument function calls to
> positional function calls.

Ah yes. Forgot about that... Scrap that. Although it surely isn't nice
that all that is done in a function called eval_const_expressions()...

> Even if you could skip it, queries with expensive
> constant expressions would notice the performance loss.  The situations helped
> by a change like this are too marginal to accept that cost.

I have to say, that argument doesn't excite me much. It's not like we
don't want to do the constant expression evaluation at all anymore. Just
not inside CASE WHEN blocks which already are barring some optimizations
anyway...

> Would it work to run eval_const_expressions() lazily on THEN clauses?  The
> plan-time eval_const_expressions() would not descend to them.  The first
> ExecEvalCase() to use a given THEN clause would run eval_const_expressions()
> before proceeding.

Ugh. Doesn't sound nice. I don't have any better ideas than to actually
split eval_const_expressions into one function that does the necessary
things like canonicalization of AND/OR and one that actually evaluates
expressions inside though.
So maybe that's the way to go :/

> > I think that it is surprising behaviour.
>
> That's about how I characterize it, too.
>
> I question whether real applications care.  It's important to have CASE usable
> for avoiding data-dependent errors, but what's the use of including in your
> query an expression that can do nothing but throw an error?  Does anyone have
> a real-world example?  Perhaps some generated-query scenario.

It doesn't need to be an actual constant. Something that evaluates to
the value at plan time is enough:
PREPARE foo AS SELECT CASE WHEN (SELECT $1::int)=0 THEN 0 ELSE 1/$1::int END;
EXECUTE foo(0);

That example will most likely only crashes in 9.2+ because it will
replan it with the actual parameter values in place. But you could have
the same in earlier versions e.g. using PQExecParams(), but that's
harder to demonstrate.

Now, that example only crashes because one place uses (SELECT $1) and
the other doesn't, but...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread MauMau

From: "Alvaro Herrera" 

Actually, I think it would be cleaner to have a new state in pmState,
namely PM_IMMED_SHUTDOWN which is entered when we send SIGQUIT.  When
we're in this state, postmaster is only waiting for the timeout to
expire; and when it does, it sends SIGKILL and exits.  Pretty much the
same you have, except that instead of checking AbortStartTime we check
the pmState variable.


Are you suggesting simplifying the following part in ServerLoop()?  I 
welcome the idea if this condition becomes simpler.  However, I cannot 
imagine how.


 if (AbortStartTime > 0 &&  /* SIGKILL only once */
  (Shutdown == ImmediateShutdown || (FatalError && !SendStop)) &&
  now - AbortStartTime >= 10)
 {
  SignalAllChildren(SIGKILL);
  AbortStartTime = 0;
 }

I thought of adding a new pmState state for this purpose (which might be
the same as your idea).
But I refrained from doing that, because pmState already has many states.  I
was afraid adding a new pmState value for this bug fix would complicate the 
state management (e.g. state transition in PostmasterStateMachine()).  In 
addition, I felt PM_WAIT_BACKENDS was appropriate because postmaster is 
actually waiting for backends to terminate after sending SIGQUIT.  The state 
name is natural.


I don't have strong objection to your idea if it makes the code cleaner and 
more understandable.  Thank you very much.


Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Noah Misch
On Fri, Jun 21, 2013 at 09:20:21AM +, Albe Laurenz wrote:
> Andres Freund wrote:
> > Yes, I think it's pretty clearly a bug - Tom doesn't seem think so
> > though. If we can agree it is, the fix outlined over on -bugs seems to
> > be easily enough implemented...

If you refer to this:

On Tue, Jun 18, 2013 at 03:31:32PM +0200, Andres Freund wrote:
> So it seems we need to stop processing after finding a single WHEN
> that's not const? Does anybody have a better idea?

eval_const_expressions() is not just an optimization: it performs mandatory
transformations such as the conversion of named-argument function calls to
positional function calls.  Even if you could skip it, queries with expensive
constant expressions would notice the performance loss.  The situations helped
by a change like this are too marginal to accept that cost.

Would it work to run eval_const_expressions() lazily on THEN clauses?  The
plan-time eval_const_expressions() would not descend to them.  The first
ExecEvalCase() to use a given THEN clause would run eval_const_expressions()
before proceeding.

> I think that it is surprising behaviour.

That's about how I characterize it, too.
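
Concretely, the behaviour under discussion looks like this (a minimal sketch;
assume tab has an integer column x containing no zeroes):

-- Fails with "division by zero" at planning time even though no row would
-- ever reach the ELSE arm, because 1/0 is a constant subexpression and gets
-- evaluated during constant folding.
SELECT CASE WHEN x <> 0 THEN x ELSE 1/0 END FROM tab;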

I question whether real applications care.  It's important to have CASE usable
for avoiding data-dependent errors, but what's the use of including in your
query an expression that can do nothing but throw an error?  Does anyone have
a real-world example?  Perhaps some generated-query scenario.

That being said, if we discover a simple-enough fix that performs well, we may
as well incorporate it.

> If fixing the behaviour is undesirable, at least the documentation
> should be fixed.

A brief documentation mention sounds fine.  Perhaps add a paragraph on
constant folding in general and reference that from the CASE page.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2013-06-21 Thread Andres Freund
On 2013-06-21 20:54:34 +0900, Michael Paquier wrote:
> On Fri, Jun 21, 2013 at 6:19 PM, Andres Freund  wrote:
> > On 2013-06-19 09:55:24 +0900, Michael Paquier wrote:
> >> >> @@ -1529,12 +1570,13 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
> >> > Is it actually possible to get here with multiple toast indexes?
> >> Actually it is possible. finish_heap_swap is also called for example
> >> in ALTER TABLE where rewriting the table (phase 3), so I think it is
> >> better to protect this code path this way.
> >
> > But why would we copy invalid toast indexes over to the new relation?
> > Shouldn't the new relation have been freshly built in the previous
> > steps?
> What do you think about that? Using only the first valid index would be 
> enough?

What I am thinking about is the following: When we rewrite a relation,
we build a completely new toast relation. Which will only have one
index, right? So I don't see how this could could be correct if we deal
with multiple indexes. In fact, the current patch's swap_relation_files
throws an error if there are multiple ones around.


> >> >> diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
> >> >> index 8ac2549..31309ed 100644
> >> >> --- a/src/include/utils/relcache.h
> >> >> +++ b/src/include/utils/relcache.h
> >> >> @@ -29,6 +29,16 @@ typedef struct RelationData *Relation;
> >> >>  typedef Relation *RelationPtr;
> >> >>
> >> >>  /*
> >> >> + * RelationGetIndexListIfValid
> >> >> + * Get index list of relation without recomputing it.
> >> >> + */
> >> >> +#define RelationGetIndexListIfValid(rel) \
> >> >> +do { \
> >> >> + if (rel->rd_indexvalid == 0) \
> >> >> + RelationGetIndexList(rel); \
> >> >> +} while(0)
> >> >
> >> > Isn't this function misnamed and should be
> >> > RelationGetIndexListIfInValid?
> >> When naming that; I had more in mind: "get the list of indexes if it
> >> is already there". It looks more intuitive to my mind.
> >
> > I can't follow. RelationGetIndexListIfValid() doesn't return
> > anything. And it doesn't do anything if the list is already valid. It
> > only does something iff the list currently is invalid.
> In this case RelationGetIndexListIfInvalid?

Yep. Suggested that above ;). Maybe RelationFetchIndexListIfInvalid()?

Hm. Looking at how this is currently used - I am afraid it's not
correct... the reason RelationGetIndexList() returns a copy is that
cache invalidations will throw away that list. And you do index_open()
while iterating over it, which will accept invalidation messages.
Maybe it's better to try using RelationGetIndexList directly and measure
whether that has a measurable impact.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] trgm regex index peculiarity

2013-06-21 Thread Erik Rijkers
On Fri, June 21, 2013 15:11, Alexander Korotkov wrote:
> On Fri, Jun 21, 2013 at 2:40 PM, Erik Rijkers  wrote:
>
>> On Fri, June 21, 2013 05:25, Tom Lane wrote:
>> > "Erik Rijkers"  writes:
>> >> In a 112 MB test table (containing random generated text) with a trgm
>> index (gin_trgm_ops), I consistently get these
>> >> timings:
>> >> select txt from azjunk6 where txt ~ '^abcd';
>> >>130 ms
>> >> select txt from azjunk6
>> >> where txt ~ 'abcd' and substr(txt,1,4) = 'abcd';
>> >>3 ms
>> >
>
> Regex '^abcd' will be expanded into trigrams '__a', '_ab', 'abc' and 'bcd'.
> However, trigram '__a' is much more frequent than '_ab', which in turn is
> much more frequent than 'abc' and 'bcd'. Omitting the ^ leads to omitting
> '__a' and '_ab', and that gives such a significant speedup.

> [trgm_regex_optimize.1.patch ]

Yes, that fixes the problem, thanks.

Erik Rijkers




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-21 Thread Robert Haas
On Wed, Jun 19, 2013 at 2:42 PM, Fabien COELHO  wrote:
>> number of transactions actually processed: 301921
> Just a thought before spending too much time on this subtle issue.
>
> The patch worked reasonnably for 301900 transactions in your above run, and
> the few last ones, less than the number of clients, show strange latency
> figures which suggest that something is amiss in some corner case when
> pgbench is stopping. However, the point of pgbench is to test a steady
> state, not to achieve the cleanest stop at the end of a run.
>
> So my question is: should this issue be a blocker wrt to the feature?

I think so.  If it doesn't get fixed now, it's not likely to get fixed
later.  And the fact that nobody understands why it's happening is
kinda worrisome...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread MauMau

From: "Alvaro Herrera" 

MauMau escribió:


One concern is that umount would fail in such a situation because
postgres has some open files on the filesystem, which is on the
shared disk in case of traditional HA cluster.


See my reply to Noah.  If postmaster stays around, would this be any
different?  I don't think so.


I'll consider this a bit and respond as a reply to Andres-san's mail.



Actually, in further testing I noticed that the fast-path you introduced
in BackendCleanup (or was it HandleChildCrash?) in the immediate
shutdown case caused postmaster to fail to clean up properly after
sending the SIGKILL signal, so I had to remove that as well.  Was there
any reason for that?


You are talking about the code below at the beginning of HandleChildCrash(), 
aren't you?


/* Do nothing if the child terminated due to immediate shutdown */
if (Shutdown == ImmediateShutdown)
 return;

If my memory is correct, this was necessary; without this, 
HandleChildCrash() or LogChildExit() left extra messages in the server log.


LOG:  %s (PID %d) exited with exit code %d
LOG:  terminating any other active server processes

These messages are output because postmaster sends SIGQUIT to its children 
and waits for them to terminate.  The children exit with non-zero status when
they receive SIGQUIT.



Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Request for Patch Feedback: Lag & Lead Window Functions Can Ignore Nulls

2013-06-21 Thread Robert Haas
On Fri, Jun 21, 2013 at 12:18 AM, Jeff Davis  wrote:
> On Thu, 2013-06-20 at 10:03 -0400, Robert Haas wrote:
>> I think the question is whether this feature is really worth adding
>> new reserved keywords for.  I have a hard time saying we shouldn't
>> support something that's part of the SQL standard, but personally,
>> it's not something I've seen come up prior to this thread.
>
> What's the next step here?

Well, ideally, some other people weigh in on the value of the feature
vs. the pain of reserving the keywords.

> The feature sounds useful to me.

...and there's one person with an opinion now!   :-)

The other question here is - do we actually have the grammar right?
As in, is this actually the syntax we're supposed to be implementing?
It looks different from what's shown here, where the IGNORE NULLS is
inside the function's parentheses, rather than afterwards:

http://rwijk.blogspot.com/2010/06/simulating-laglead-ignore-nulls.html

IBM seems to think it's legal either inside or outside the parentheses:

http://pic.dhe.ibm.com/infocenter/informix/v121/index.jsp?topic=%2Fcom.ibm.sqls.doc%2Fids_sqs_2594.htm
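
For concreteness, the two placements being compared look roughly like this (a
sketch; PostgreSQL accepts neither today):

-- placement after the closing parenthesis, which the standard appears to require:
SELECT lag(x, 1) IGNORE NULLS OVER (ORDER BY ts) FROM t;
-- placement inside the parentheses, which some implementations also accept
-- (shown after the value expression; details vary by product):
SELECT lag(x IGNORE NULLS, 1) OVER (ORDER BY ts) FROM t;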

Regardless of what syntax we settle on, we should also make sure that
the conflict is intrinsic to the grammar and can't be factored out, as
Tom suggested upthread.  It's not obvious to me what the actual
ambiguity is here.  If you've seen "select lag(num,0)" and the
lookahead token is "respect", what's the problem?  It sort of looks
like it could be a column label, but not even unreserved keywords can
be column labels, so that's not it.  Probably deserves a bit more
investigation...

> If the grammar is unacceptable, does
> someone have an alternative idea, like using new function names instead
> of grammar? If so, what are reasonable names to use?

We could just add additional, optional Boolean argument to the
existing functions.  It's non-standard, but we avoid adding keywords.

> Also, I think someone mentioned this already, but what about
> first_value() and last_value()? Shouldn't we do those at the same time?

Not sure.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] trgm regex index peculiarity

2013-06-21 Thread Alexander Korotkov
On Fri, Jun 21, 2013 at 2:40 PM, Erik Rijkers  wrote:

> On Fri, June 21, 2013 05:25, Tom Lane wrote:
> > "Erik Rijkers"  writes:
> >> In a 112 MB test table (containing random generated text) with a trgm
> index (gin_trgm_ops), I consistently get these
> >> timings:
> >> select txt from azjunk6 where txt ~ '^abcd';
> >>130 ms
> >> select txt from azjunk6
> >> where txt ~ 'abcd' and substr(txt,1,4) = 'abcd';
> >>3 ms
> >
> > Hm, could you provide a self-contained test case?
> >
>
> yes, sorry.   I tested on a 1M row table:
>
> #!/bin/sh
>
> # create table:
> for power in 6;
> do
>   table=azjunk${power}
>   index=${table}_trgm_re_idx
>   perl -E'
>   sub ss{ join"",@_[ map{rand @_} 1 .. shift ] };
>   say(ss(80,"a".."g"," ","h".."m"," ","n".."s"," ","t".."z")) for 1 ..
> 1e'"${power};" \
>   | psql -aqXc "
> drop table if exists $table;
> create table $table(txt text);
> copy $table from stdin;";
>   echo "set session maintenance_work_mem='1GB';
> create index $index on $table using gin (txt gin_trgm_ops);
> analyze $table;" | psql -qtAX;
> done
>
> # test:
> echo "
> \\timing on
> explain analyze select txt from azjunk6 where txt ~ '^abcd';  -- slow (140
> ms)
> explain analyze select txt from azjunk6 where txt ~ 'abcd' and
> substr(txt,1,4) = 'abcd'; -- fast (5 ms)
> " | psql -Xqa


Regex '^abcd' will be expanded into trigrams '__a', '_ab', 'abc' and 'bcd'.
However, trigram '__a' is much more frequent than '_ab', which in turn is
much more frequent than 'abc' and 'bcd'. Omitting the ^ leads to omitting
'__a' and '_ab', and that gives such a significant speedup.
The problem is that the trigram regex engine doesn't know that '__a' and '_ab'
are more frequent than the other trigrams, because we don't know their
distribution. However, the simple assumption that the blank character (in the
pg_trgm sense) is much more frequent than the others seems to hold for most cases.
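
(The blank padding is easy to see with pg_trgm's show_trgm(); the first two
entries below correspond to the '__a' and '_ab' trigrams discussed above --
output approximate:)

SELECT show_trgm('abcd');
--           show_trgm
-- -----------------------------
--  {"  a"," ab",abc,bcd,"cd "}
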
Attached is a patch implementing this assumption. It introduces a BLANK_COLOR_SIZE
macro, which is used in selectColorTrigrams as the assumed count of characters in
COLOR_BLANK. Another change in this patch is a split of MAX_TRGM_COUNT into
WISH_TRGM_SIZE and MAX_TRGM_SIZE. The idea is to keep trying to remove
color trigrams from the graph even when it already fits into MAX_TRGM_SIZE, because
we intend to scan fewer posting lists/trees.
Comments are not fixed yet; coming soon.

--
With best regards,
Alexander Korotkov.


trgm_regex_optimize.1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2013-06-21 Thread Michael Paquier
On Fri, Jun 21, 2013 at 6:19 PM, Andres Freund  wrote:
> On 2013-06-19 09:55:24 +0900, Michael Paquier wrote:
>> >>   /* Clean up. */
>> >>   heap_freetuple(reltup1);
>> >> @@ -1529,12 +1570,13 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
>> >>   if (OidIsValid(newrel->rd_rel->reltoastrelid))
>> >>   {
>> >>   Relationtoastrel;
>> >> - Oid toastidx;
>> >>   charNewToastName[NAMEDATALEN];
>> >> + ListCell   *lc;
>> >> + int count = 0;
>> >>
>> >>   toastrel = 
>> >> relation_open(newrel->rd_rel->reltoastrelid,
>> >>
>> >> AccessShareLock);
>> >> - toastidx = toastrel->rd_rel->reltoastidxid;
>> >> + RelationGetIndexList(toastrel);
>> >>   relation_close(toastrel, AccessShareLock);
>> >>
>> >>   /* rename the toast table ... */
>> >> @@ -1543,11 +1585,23 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
>> >>   
>> >> RenameRelationInternal(newrel->rd_rel->reltoastrelid,
>> >>  
>> >> NewToastName, true);
>> >>
>> >> - /* ... and its index too */
>> >> - snprintf(NewToastName, NAMEDATALEN, 
>> >> "pg_toast_%u_index",
>> >> -  OIDOldHeap);
>> >> - RenameRelationInternal(toastidx,
>> >> -
>> >> NewToastName, true);
>> >> + /* ... and its indexes too */
>> >> + foreach(lc, toastrel->rd_indexlist)
>> >> + {
>> >> + /*
>> >> +  * The first index keeps the former toast 
>> >> name and the
>> >> +  * following entries have a suffix appended.
>> >> +  */
>> >> + if (count == 0)
>> >> + snprintf(NewToastName, NAMEDATALEN, 
>> >> "pg_toast_%u_index",
>> >> +  OIDOldHeap);
>> >> + else
>> >> + snprintf(NewToastName, NAMEDATALEN, 
>> >> "pg_toast_%u_index_%d",
>> >> +  OIDOldHeap, count);
>> >> + RenameRelationInternal(lfirst_oid(lc),
>> >> +
>> >> NewToastName, true);
>> >> + count++;
>> >> + }
>> >>   }
>> >>   relation_close(newrel, NoLock);
>> >>   }
>> >
>> > Is it actually possible to get here with multiple toast indexes?
>> Actually it is possible. finish_heap_swap is also called for example
>> in ALTER TABLE where rewriting the table (phase 3), so I think it is
>> better to protect this code path this way.
>
> But why would we copy invalid toast indexes over to the new relation?
> Shouldn't the new relation have been freshly built in the previous
> steps?
What do you think about that? Would using only the first valid index be enough?

>
>> >> diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
>> >> index 8ac2549..31309ed 100644
>> >> --- a/src/include/utils/relcache.h
>> >> +++ b/src/include/utils/relcache.h
>> >> @@ -29,6 +29,16 @@ typedef struct RelationData *Relation;
>> >>  typedef Relation *RelationPtr;
>> >>
>> >>  /*
>> >> + * RelationGetIndexListIfValid
>> >> + * Get index list of relation without recomputing it.
>> >> + */
>> >> +#define RelationGetIndexListIfValid(rel) \
>> >> +do { \
>> >> + if (rel->rd_indexvalid == 0) \
>> >> + RelationGetIndexList(rel); \
>> >> +} while(0)
>> >
>> > Isn't this function misnamed and should be
>> > RelationGetIndexListIfInValid?
>> When naming that; I had more in mind: "get the list of indexes if it
>> is already there". It looks more intuitive to my mind.
>
> I can't follow. RelationGetIndexListIfValid() doesn't return
> anything. And it doesn't do anything if the list is already valid. It
> only does something iff the list currently is invalid.
In this case RelationGetIndexListIfInvalid?
--
Michael




Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER

2013-06-21 Thread Hitoshi Harada
On Fri, Jun 21, 2013 at 3:20 AM, Craig Ringer  wrote:

> On 06/21/2013 05:32 PM, Hitoshi Harada wrote:
>
> > I also later found that we are missing not only notion of '+' or '-',
> > but also notion of 'zero value' in our catalog.  Per spec, RANGE BETWEEN
> > needs to detect ERROR if the offset value is negative, but it is not
> > always easy if you think about interval, numeric types as opposed to
> > int64 used in ROWS BETWEEN.
>
> Zero can be tested for with `val = (@ val)` ie `val = abs(val)`. That
> should make sense for any type in which the concept of zero makes sense.
>
>

Yeah, I mean, it needs to know whether the offset is negative or not by testing
it against zero.  So we need a "zero value" or an "is_negative" function for each type.

Thanks,
-- 
Hitoshi Harada


Re: [HACKERS] Patch for removng unused targets

2013-06-21 Thread Etsuro Fujita
> From: Hitoshi Harada [mailto:umi.tan...@gmail.com]

> I tried several ways but I couldn't find big problems.  Small typo:
> s/rejunk/resjunk/

Thank you for the review.  Attached is an updated version of the patch.

Thanks,

Best regards,
Etsuro Fujita


unused-targets-20130621.patch
Description: Binary data



Re: [HACKERS] trgm regex index peculiarity

2013-06-21 Thread Erik Rijkers
On Fri, June 21, 2013 05:25, Tom Lane wrote:
> "Erik Rijkers"  writes:
>> In a 112 MB test table (containing random generated text) with a trgm index 
>> (gin_trgm_ops), I consistently get these
>> timings:
>> select txt from azjunk6 where txt ~ '^abcd';
>>130 ms
>> select txt from azjunk6
>> where txt ~ 'abcd' and substr(txt,1,4) = 'abcd';
>>3 ms
>
> Hm, could you provide a self-contained test case?
>

yes, sorry.   I tested on a 1M row table:

#!/bin/sh

# create table:
for power in 6;
do
  table=azjunk${power}
  index=${table}_trgm_re_idx
  perl -E'
  sub ss{ join"",@_[ map{rand @_} 1 .. shift ] };
  say(ss(80,"a".."g"," ","h".."m"," ","n".."s"," ","t".."z")) for 1 .. 
1e'"${power};" \
  | psql -aqXc "
drop table if exists $table;
create table $table(txt text);
copy $table from stdin;";
  echo "set session maintenance_work_mem='1GB';
create index $index on $table using gin (txt gin_trgm_ops);
analyze $table;" | psql -qtAX;
done

# test:
echo "
\\timing on
explain analyze select txt from azjunk6 where txt ~ '^abcd';  -- slow (140 ms)
explain analyze select txt from azjunk6 where txt ~ 'abcd' and substr(txt,1,4) 
= 'abcd'; -- fast (5 ms)
" | psql -Xqa







Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER

2013-06-21 Thread Craig Ringer
On 06/21/2013 05:32 PM, Hitoshi Harada wrote:

> I also later found that we are missing not only notion of '+' or '-',
> but also notion of 'zero value' in our catalog.  Per spec, RANGE BETWEEN
> needs to detect ERROR if the offset value is negative, but it is not
> always easy if you think about interval, numeric types as opposed to
> int64 used in ROWS BETWEEN.

Zero can be tested for with `val = (@ val)` ie `val = abs(val)`. That
should make sense for any type in which the concept of zero makes sense.
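
At the SQL level that test is a one-liner for numeric (a minimal sketch;
interval is the harder case, since as far as I know it has no @ / abs in core):

SELECT -3.5 = @ -3.5;   -- false, i.e. the value is negative
SELECT  3.5 = @  3.5;   -- true, i.e. non-negative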

Thanks for the warning on that issue.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Patch for removng unused targets

2013-06-21 Thread Hitoshi Harada
On Thu, Jun 20, 2013 at 12:19 AM, Etsuro Fujita  wrote:

> > From: Hitoshi Harada [mailto:umi.tan...@gmail.com]
>
> > I guess the patch works fine, but what I'm saying is it might be limited to
> > small use cases.  Another instance of this that I can think of is ORDER BY
> > clause of window specifications, which you may want to remove from the target
> > list as well, in addition to ORDER BY of query.  It will just not be removed
> > by this approach, simply because it is looking at only parse->sortClause.
> > Certainly you can add more rules to the new function to look at the window
> > specification, but then I'm not sure what we are missing.
>
> Yeah, I thought about the extension to the window ORDER BY case, too.  But I'm
> not sure it's worth complicating the code, considering that the objective of
> this optimization is to improve full-text search related things, if I understand
> correctly, though general solutions would be desirable, as you mentioned.
>
>
Ah, I see the use case now.  Makes sense.


> > So, as it stands it doesn't have
> > critical issue, but more generalized approach would be desirable.  That
> said,
> > I don't have strong objection to the current patch, and just posting one
> thought
> > to see if others may have the same opinion.
>
> OK.  I'll also wait for others' comments.  For review, an updated version
> of the
> patch is attached, which fixed the bug using the approach that directly
> uses the
> clause information in the parse tree.
>
>
>
I tried several ways but I couldn't find big problems.  Small typo:
s/rejunk/resjunk/

I defer to the committer.


Thanks,
-- 
Hitoshi Harada


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Pavel Stehule
2013/6/21 Andres Freund :
> Hi,
>
>
> On 2013-06-21 08:16:22 +, Albe Laurenz wrote:
>> I want to draw attention to this thread on -general:
>> camq5dgq4sujpbht2-9xlapasvknul2-bb0cpyci2fp+pfsf...@mail.gmail.com
>
> There's also a bug reported for it:
> #8237: e1uovmc-0007ft...@wrigleys.postgresql.org
>
>> Would you concur that this is a bug?
>
> Yes, I think it's pretty clearly a bug - Tom doesn't seem think so
> though. If we can agree it is, the fix outlined over on -bugs seems to
> be easily enough implemented...

The fix is not easy :-( You would have to catch any possible exception, which
means doing this kind of optimization under subtransactions - or not doing
this kind of optimization at all.
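
For reference, a minimal sketch of what catching such an error in a
subtransaction looks like (a plain DO block, just to illustrate the cost
involved):

DO $$
BEGIN
  PERFORM 1/0;              -- the expression that would need guarding
EXCEPTION WHEN division_by_zero THEN
  RAISE NOTICE 'caught in a subtransaction';
END $$;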

Pavel

>
> Greetings,
>
> Andres Freund
>
> --
>  Andres Freund http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
>




Re: [HACKERS] [PATCH] Add session_preload_libraries configuration parameter

2013-06-21 Thread Dimitri Fontaine
Hi,

Peter Eisentraut  writes:
> This is like shared_preload_libraries except that it takes effect at
> backend start and can be changed without a full postmaster restart.  It
> is like local_preload_libraries except that it is still only settable by
> a superuser.  This can be a better way to load modules such as
> auto_explain.

I had a pretty hard time getting my head around this new one, and I'm not
sure I totally did. The important things to me are:

  - No need to manually copy the lib into the plugin directory,
  - ALTER ROLE support.

So basically it's a very good solution for auto_explain and any other
module you want to load eagerly but not for everyone, and which does not use
shared memory.

I have a feeling that something simpler could be made, but I will have
to continue thinking about it.

I found it strange that these two paragraphs read differently while saying the
same thing:

> +preloaded at connection start.  This parameter cannot be changed 
> after
> +the start of a particular session.  If a specified library is not

> +The parameter value only takes effect at the start of the connection.
> +Subsequent changes have no effect.  If a specified library is not

I will review the code in more detail; I wanted to get back first to the context
where this patch is useful, and to better understand the trade-offs involved.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] refresh materialized view concurrently

2013-06-21 Thread Andres Freund
On 2013-06-21 02:43:23 -0700, Hitoshi Harada wrote:
> On Fri, Jun 21, 2013 at 2:20 AM, Hitoshi Harada wrote:
> 
> >
> >
> >
> > On Fri, Jun 14, 2013 at 9:05 AM, Kevin Grittner  wrote:
> >
> >> Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
> >> 9.4 CF1.  The goal of this patch is to allow a refresh without
> >> interfering with concurrent reads, using transactional semantics.
> >>
> >
> > I spent a few hours to review the patch.
> >
> 
> Oh, BTW, though it is not part of this patch, but I came across this.
> 
> ! /*
> !  * We're not using materialized views in the system catalogs.
> !  */
>   Assert(!IsSystemRelation(matviewRel));
> 
> Of course we don't have built-in matviews in the system catalogs, but it is
> possible to create one with allow_system_table_mods=true, so the Assert
> doesn't look good to me.

Anybody using allow_system_table_mods gets to keep the pieces. There are
so many ways to break just about everything using it that I don't
think worrying much makes sense.
If you want you could replace that by an elog(), but...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] refresh materialized view concurrently

2013-06-21 Thread Hitoshi Harada
On Fri, Jun 21, 2013 at 2:20 AM, Hitoshi Harada wrote:

>
>
>
> On Fri, Jun 14, 2013 at 9:05 AM, Kevin Grittner  wrote:
>
>> Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
>> 9.4 CF1.  The goal of this patch is to allow a refresh without
>> interfering with concurrent reads, using transactional semantics.
>>
>
> I spent a few hours to review the patch.
>

Oh, BTW, though it is not part of this patch, but I came across this.

! /*
!  * We're not using materialized views in the system catalogs.
!  */
  Assert(!IsSystemRelation(matviewRel));

Of course we don't have built-in matviews in the system catalogs, but it is
possible to create one with allow_system_table_mods=true, so the Assert
doesn't look good to me.

Thanks,
-- 
Hitoshi Harada


[HACKERS] Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)

2013-06-21 Thread Andres Freund
On 2013-06-20 22:36:45 -0400, Alvaro Herrera wrote:
> Noah Misch escribió:
> > On Thu, Jun 20, 2013 at 12:33:25PM -0400, Alvaro Herrera wrote:
> > > MauMau escribi?:
> > > > Here, "reliable" means that the database server is certainly shut
> > > > down when pg_ctl returns, not telling a lie that "I shut down the
> > > > server processes for you, so you do not have to be worried that some
> > > > postgres process might still remain and write to disk".  I suppose
> > > > reliable shutdown is crucial especially in HA cluster.  If pg_ctl
> > > > stop -mi gets stuck forever when there is an unkillable process (in
> > > > what situations does this happen? OS bug, or NFS hard mount?), I
> > > > think the DBA has to notice this situation from the unfinished
> > > > pg_ctl, investigate the cause, and take corrective action.
> > > 
> > > So you're suggesting that keeping postmaster up is a useful sign that
> > > the shutdown is not going well?  I'm not really sure about this.  What
> > > do others think?
> > 
> > It would be valuable for "pg_ctl -w -m immediate stop" to have the property
> > that an subsequent start attempt will not fail due to the presence of some
> > backend still attached to shared memory.  (Maybe that's true anyway or can 
> > be
> > achieved a better way; I have not investigated.)
> 
> Well, the only case where a process that's been SIGKILLed does not go
> away, as far as I know, is when it is in some uninterruptible sleep due
> to in-kernel operations that get stuck.  Personally I have never seen
> this happen in any other case than some network filesystem getting
> disconnected, or a disk that doesn't respond.  And whenever the
> filesystem starts to respond again, the process gets out of its sleep
> only to die due to the signal.

Those are the situations in which it takes a really long time, yes. But
there can be timing issues involved. Consider a backend that's currently
stuck in a write() because it hit the dirtying limit.  Say you have a
postgres cluster that's currently slowing down to a crawl because it's
overloaded and hitting the dirty limit. Somebody very well might just
want to restart it with -m immediate. In that case a delay of a second
or two till enough dirty memory has been written that write() can
continue is enough for the postmaster to start up again and try to
attach to shared memory where it will find the shared memory to be still
in use.
I don't really see any argument for *not* waiting. Sure, it might take a
bit longer until it's shut down, but not waiting will cause
problems down the road.

> If we leave postmaster running after SIGKILLing its children, the only
> thing we can do is have it continue to SIGKILL processes continuously
> every few seconds until they die (or just sit around doing nothing until
> they all die).  I don't think this will have a different effect than
> postmaster going away trusting the first SIGKILL to do its job
> eventually.

I think it should just wait till all its child processes are dead. No
need to repeat sending the signals - as you say, that won't help.



What we could do to improve the robustness a bit - at least on linux -
is to prctl(PR_SET_PDEATHSIG, SIGKILL) which would cause children to be
killed if the postmaster goes away...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Config reload/restart preview

2013-06-21 Thread Thom Brown
On 21 June 2013 05:47, Gurjeet Singh  wrote:

> On Thu, Jun 20, 2013 at 9:33 AM, Magnus Hagander wrote:
>
>> On Thu, Jun 20, 2013 at 2:54 PM, Dimitri Fontaine
>>  wrote:
>> > Magnus Hagander  writes:
>> >>> Should we have a way of previewing changes that would be applied if we
>> >>> reloaded/restarted the server?
>> >>
>> >> Yes, we should.
>> >
>> > +1
>> >
>> >> This would go well with something I started working on some time ago
>> >> (but haven't actually gotten far on at all), which is the ability for
>> >> pg_ctl to be able to give feedback at all. Meaning a "pg_ctl reload"
>> >> should also be able to tell you which parameters were changed, without
>> >> having to go to the log. Obviously that's almost exactly the same
>> >> feature.
>> >
>> > It could probably connect to the server and issue the SQL command to
>> > reload, and that one could probably get enhanced to return modified
>> > variable as NOTICE, or be run with the right client_min_message:
>> >
>> > SELECT pg_reload_conf();
>> >
>> > The pg_ctl client would then have to know to display the messages sent
>> > back by the server.
>>
>> The problem with that is that now you must *always* have your system
>> set up to allow the postgres user to log in in pg_hba.conf or it
>> fails.
>>
>> But yes, one option would be to use SQL instead of opening a socket.
>> Maybe that's a better idea - have pg_ctl try to use that if available,
>> and if not send a regular signal and not try to collect the output.
>>
>
> I started working on it yesterday after Thom proposed this idea internally
> at EDB. The discussion until now doesn't seem to be hostile to a SQL
> interface, so attached is a hack attempt, which hopefully will serve at
> least as a POC. A sample session is shown below, while changing a few
> values in postgresql.conf files.
>
> The central idea is to use the SIGHUP processing function to do the work
> for us and report potential changes via DEBUG2, instead of having to write
> a new parsing engine. The (GUC-nesting + PGC_S_TEST) is nice to have since
> it avoids the current session from adopting the values that are different
> in conf file. This approach is susceptible to the fact that the connected
> superuser may have its GUC values picked up from user/database/session
> level settings (ALTER USER/DATABASE .. SET ; or SET param TO val;).
>
> $ pgsql
> Expanded display is used automatically.
> psql (9.4devel)
> Type "help" for help.
>
> postgres=# show work_mem;
>  work_mem
> --
>  1MB
> (1 row)
>
> postgres=# set client_min_messages = debug2;
> SET
> postgres=# select pg_test_reload_conf();
> DEBUG:  parameter "work_mem" changed to "70MB"
>  pg_test_reload_conf
> -
>  t
> (1 row)
>
> postgres=# show work_mem;
>  work_mem
> --
>  1MB
> (1 row)
>
> postgres=# select pg_test_reload_conf();
> DEBUG:  parameter "shared_buffers" cannot be changed without restarting
> the server
> DEBUG:  configuration file
> "/home/gurjeet/dev/pgdbuilds/report_guc_chanege_pre_reload/db/data/postgresql.conf"
> contains errors; unaffected changes were applied
>  pg_test_reload_conf
> -
>  t
> (1 row)
>
> postgres=# select pg_test_reload_conf();
> DEBUG:  parameter "log_min_messages" removed from configuration file,
> reset to default
>  pg_test_reload_conf
> -
>  t
> (1 row)
>

Thanks for taking a look at this Gurjeet.  This seems to be a step towards
it, but there are a few issues:

1) As you say, if the superuser has role-level settings, it will provide
incorrect feedback.

2) Settings that require a restart don't tell you what they are currently
set to and what they would be set to.  I'm not sure the currently-logged
text provided by a SIGHUP is necessarily appropriate for such a feature.

3) I'd expect that a DBA who goes editing postgresql.conf wouldn't then go
logging in to the database to run this command, if they even knew it
existed.  Whereas they would typically be familiar with
"/etc/init.d/postgresql " or "pg_ctl ".  Now if the SQL
function were a medium to achieve this, that would be fine, but I'm not
clear on what would be required technically to achieve that kind of
intuitive, admin-friendly interface.  Or maybe there's a better method of
checking the config that I haven't considered.

-- 
Thom


Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER

2013-06-21 Thread Hitoshi Harada
On Thu, Jun 20, 2013 at 7:24 PM, Craig Ringer wrote:
>
> I've missed this feature more than once, and am curious about whether
> any more recent changes may have made it cleaner to tackle this, or
> whether consensus can be formed on adding the new entries to btree's
> opclass to avoid the undesirable explicit lookups of the '+' and '-'
> oprators.
>
>


As far as I know, later development didn't add anything to help this
conversation.  I initially thought range types or KNN GiST would add
something, but they turned out to be quite far from this.  On the other hand,
if this makes it in, it'll also open the door to range PARTITION BY for the
CREATE TABLE command, so the impact will be bigger than you may think.
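
For concreteness, this is the sort of query the feature is about (a sketch
only, using the spec syntax; the table name is made up):

SELECT ts,
       avg(val) OVER (ORDER BY ts
                      RANGE BETWEEN interval '1 hour' PRECEDING
                                AND CURRENT ROW)
FROM measurements;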

I also later found that we are missing not only notion of '+' or '-', but
also notion of 'zero value' in our catalog.  Per spec, RANGE BETWEEN needs
to detect ERROR if the offset value is negative, but it is not always easy
if you think about interval, numeric types as opposed to int64 used in ROWS
BETWEEN.

Thanks,
-- 
Hitoshi Harada


Re: [HACKERS] refresh materialized view concurrently

2013-06-21 Thread Hitoshi Harada
On Fri, Jun 14, 2013 at 9:05 AM, Kevin Grittner  wrote:

> Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
> 9.4 CF1.  The goal of this patch is to allow a refresh without
> interfering with concurrent reads, using transactional semantics.
>

I spent a few hours to review the patch.

As far as I can tell, the overall approach is as follows.

- create a new temp heap as the non-concurrent path does, but with ExclusiveLock on
the matview, so that readers aren't blocked
- with this temp table and the matview, run a FULL JOIN query and extract the
difference between the original matview and the temp heap (via SPI)
- this operation requires a unique index, for performance reasons (and
correctness too)
- run ANALYZE on this diff table
- run UPDATE, INSERT and DELETE via SPI, to do the merge
- close these temp heaps

If I'm not missing something, the requirement for the CONCURRENTLY option is
to allow simple SELECT readers to read the matview concurrently while the
view is being populated with the new data, while INSERT/UPDATE/DELETE and SELECT FOR
UPDATE/SHARE are still blocked.  So I wonder why it is not possible to just
acquire ExclusiveLock on the matview while populating the data and swap
the relfile by taking a brief AccessExclusiveLock.  This lock escalation poses
no deadlock hazard, I suppose, because concurrent operations would block
each other at the point ExclusiveLock is acquired, and ExclusiveLock
conflicts with AccessExclusiveLock.  Then you don't need the complicated SPI
logic or the unique key index dependency.
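
In other words, from the user's point of view the requirement being discussed
boils down to something like this (a sketch only; mv and id are made-up names):

CREATE UNIQUE INDEX mv_id_idx ON mv (id);   -- unique index the patch requires
REFRESH MATERIALIZED VIEW CONCURRENTLY mv;  -- plain SELECTs keep working meanwhile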

Assuming my question above misses something and we go with the current approach,
here's what I've found in the patch.

- I'm not sure whether the unique key index requirement is reasonable or not,
because users only add the CONCURRENTLY option; they are not asking for merging
or incremental updates.
- This could overflow the diffname buffer:
+ quoteRelationName(tempname, tempRel);
+ strcpy(diffname, tempname);
+ strcpy(diffname + strlen(diffname) - 1, "_2\"");
- As others pointed out, quoteOneName can be replaced with quote_identifier
- This looks harmless, but I wonder if you need to change relkind.

*** 665,672  make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
OldHeap->rd_rel->relowner,
OldHeapDesc,
NIL,
!   OldHeap->rd_rel->relkind,
!   OldHeap->rd_rel->relpersistence,
false,
RelationIsMapped(OldHeap),
true,
--- 680,687 
OldHeap->rd_rel->relowner,
OldHeapDesc,
NIL,
!   RELKIND_RELATION,
!   relpersistence,
false,
RelationIsMapped(OldHeap),
true,

OldHeap->rd_rel->relkind has been working with 'm', too, not only 'r', hasn't it?
- I find the two additional parameters on make_new_heap ugly, but couldn't
come up with a better solution.  Maybe we could pass the Relation of the old
heap to the function instead of the Oid.

That's about it from me.


Thanks,
-- 
Hitoshi Harada


Re: [HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Albe Laurenz
Andres Freund wrote:
> On 2013-06-21 08:16:22 +, Albe Laurenz wrote:
> > I want to draw attention to this thread on -general:
> > camq5dgq4sujpbht2-9xlapasvknul2-bb0cpyci2fp+pfsf...@mail.gmail.com
> 
> There's also a bug reported for it:
> #8237: e1uovmc-0007ft...@wrigleys.postgresql.org
> 
> > Would you concur that this is a bug?
> 
> Yes, I think it's pretty clearly a bug - Tom doesn't seem think so
> though. If we can agree it is, the fix outlined over on -bugs seems to
> be easily enough implemented...

I think that it is surprising behaviour.

If fixing the behaviour is undesirable, at least the documentation
should be fixed.

Yours,
Laurenz Albe



Re: [HACKERS] updated emacs configuration

2013-06-21 Thread Dimitri Fontaine
Hi,

Peter Eisentraut  writes:
> I think the suggested emacs configuration snippets in
> src/tools/editors/emacs.samples no longer represent current best
> practices.  I have come up with some newer things that I'd like to
> propose for review.

Thanks for doing that!

> First, I propose adding a .dir-locals.el file to the top-level directory
> with basic emacs settings.  These get applied automatically.  This
> especially covers the particular tab and indentation settings that
> PostgreSQL uses.  With this, casual developers will not need to modify
> any of their emacs settings.

I've tested that on a new git clone and with the `emacs -q` command so
as not to load any of my local setup. While the indentation seemed ok,
the placement of the comments seems way off:

Compare what you see using those commands:

emacs -q src/backend/commands/extension.c
emacs -q -l ../emacs.samples src/backend/commands/extension.c

(When using macosx, you might have to replace the 'emacs' binary
 location with /Applications/Emacs.app/Contents/MacOS/Emacs).

I did also test on doc/src/sgml/extend.sgml and some Makefile, only with
using the emacs.samples file content though.

> With that, emacs.samples can be shrunk significantly.  The only real
> reason to keep is that that c-offsets-alist and (more dubiously)
> sgml-basic-offset cannot be set from .dir-locals.el because they are not
> "safe".  I have also removed many of the redundant examples and settled
> on a hook-based solution.

A couple of notes about your emacs.sample file:

  - Name the lambda used in the hook for easier removing / reference

  - A fresh git clone will create a directory named postgres, so I did
change your /postgresql/ regex to /postgres/ in my attached version

> I think together this setup would be significantly simpler and more
> practical.

Agreed.
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

;; -*- mode: emacs-lisp -*-

;; This file contains code to set up Emacs to edit PostgreSQL source
;; code.  Copy these snippets into your .emacs file or equivalent, or
;; use load-file to load this file directly.
;;
;; Note also that there is a .dir-locals.el file at the top of the
;; PostgreSQL source tree, which contains many of the settings shown
;; here.  So for light editing, you might not need any additional
;; Emacs configuration.


;;; C files

;; Style that matches the formatting used by
;; src/tools/pgindent/pgindent.  Many extension projects also use this
;; style.
(c-add-style "postgresql"
 '("bsd"
   (c-basic-offset . 4)
   (c-offsets-alist . ((case-label . +)))
   (fill-column . 79)
   (indent-tabs-mode . t)
   (tab-width . 4)))

(add-hook 'c-mode-hook
  (defun postgresql-c-mode-hook ()
(when (string-match "/postgres/" buffer-file-name)
  (c-set-style "postgresql"


;;; documentation files

(add-hook 'sgml-mode-hook
   (defun postgresql-sgml-mode-hook ()
 (when (string-match "/postgres/" buffer-file-name)
   (setq fill-column 79)
   (setq indent-tabs-mode nil)
   (setq sgml-basic-offset 1


;;; Makefiles

;; use GNU make mode instead of plain make mode
(add-to-list 'auto-mode-alist '("/postgres/.*Makefile.*" . makefile-gmake-mode))
(add-to-list 'auto-mode-alist '("/postgres/.*\\.mk\\'" . makefile-gmake-mode))



Re: [HACKERS] Support for REINDEX CONCURRENTLY

2013-06-21 Thread Andres Freund
On 2013-06-19 09:55:24 +0900, Michael Paquier wrote:
> Please find an updated patch. The regression test rules has been
> updated, and all the comments are addressed.
> 
> On Tue, Jun 18, 2013 at 6:35 PM, Andres Freund  wrote:
> > Hi,
> >
> > On 2013-06-18 10:53:25 +0900, Michael Paquier wrote:
> >> diff --git a/contrib/pg_upgrade/info.c b/contrib/pg_upgrade/info.c
> >> index c381f11..3a6342c 100644
> >> --- a/contrib/pg_upgrade/info.c
> >> +++ b/contrib/pg_upgrade/info.c
> >> @@ -321,12 +321,17 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
> >> "INSERT INTO 
> >> info_rels "
> >> "SELECT 
> >> reltoastrelid "
> >> "FROM info_rels i 
> >> JOIN pg_catalog.pg_class c "
> >> -   " ON 
> >> i.reloid = c.oid"));
> >> +   " ON 
> >> i.reloid = c.oid "
> >> +   " AND 
> >> c.reltoastrelid != %u", InvalidOid));
> >>   PQclear(executeQueryOrDie(conn,
> >> "INSERT INTO 
> >> info_rels "
> >> -   "SELECT 
> >> reltoastidxid "
> >> -   "FROM info_rels i 
> >> JOIN pg_catalog.pg_class c "
> >> -   " ON 
> >> i.reloid = c.oid"));
> >> +   "SELECT indexrelid 
> >> "
> >> +   "FROM pg_index "
> >> +   "WHERE indrelid IN 
> >> (SELECT reltoastrelid "
> >> +   "  FROM 
> >> pg_class "
> >> +   "  WHERE 
> >> oid >= %u "
> >> +   " AND 
> >> reltoastrelid != %u)",
> >> +   
> >> FirstNormalObjectId, InvalidOid));
> >
> > What's the idea behind the >= here?
> It is here to avoid fetching the toast relations of system tables. But
> I see your point, the inner query fetching the toast OIDs should do a
> join on the existing info_rels and not try to do a join on a plain
> pg_index, so I changed it that way.

I'd also rather not introduce knowledge about FirstNormalObjectId into
client applications... But you fixed it already.


> >>   /* Clean up. */
> >>   heap_freetuple(reltup1);
> >> @@ -1529,12 +1570,13 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
> >>   if (OidIsValid(newrel->rd_rel->reltoastrelid))
> >>   {
> >>   Relationtoastrel;
> >> - Oid toastidx;
> >>   charNewToastName[NAMEDATALEN];
> >> + ListCell   *lc;
> >> + int count = 0;
> >>
> >>   toastrel = 
> >> relation_open(newrel->rd_rel->reltoastrelid,
> >>
> >> AccessShareLock);
> >> - toastidx = toastrel->rd_rel->reltoastidxid;
> >> + RelationGetIndexList(toastrel);
> >>   relation_close(toastrel, AccessShareLock);
> >>
> >>   /* rename the toast table ... */
> >> @@ -1543,11 +1585,23 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap,
> >>   RenameRelationInternal(newrel->rd_rel->reltoastrelid,
> >>  
> >> NewToastName, true);
> >>
> >> - /* ... and its index too */
> >> - snprintf(NewToastName, NAMEDATALEN, 
> >> "pg_toast_%u_index",
> >> -  OIDOldHeap);
> >> - RenameRelationInternal(toastidx,
> >> -
> >> NewToastName, true);
> >> + /* ... and its indexes too */
> >> + foreach(lc, toastrel->rd_indexlist)
> >> + {
> >> + /*
> >> +  * The first index keeps the former toast 
> >> name and the
> >> +  * following entries have a suffix appended.
> >> +  */
> >> + if (count == 0)
> >> + snprintf(NewToastName, NAMEDATALEN, 
> >> "pg_toast_%u_index",
> >> +  OIDOldHeap);
> >> + else
> >> +   

Re: FILTER for aggregates [was Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division]

2013-06-21 Thread Dean Rasheed
On 21 June 2013 06:16, David Fetter  wrote:
> On Fri, Jun 21, 2013 at 12:10:25AM -0400, Alvaro Herrera wrote:
>> David Fetter escribió:
>> > On Thu, Jun 20, 2013 at 08:59:27PM +0100, Dean Rasheed wrote:
>>
>> > > In my testing of sub-queries in the FILTER clause (an extension to the
>> > > spec), I was able to produce the following error:
>> >
>> > Per the spec,
>> >
>> > B) A  shall not contain a , a > > function>, or an outer reference.
>>
>> If this is not allowed, I think there should be a clearer error message.
>> What Dean showed is an internal (not user-facing) error message.
>
> Please find attached a patch which allows subqueries in the FILTER
> clause and adds regression testing for same.
>

Nice!

I should have pointed out that it was already working for some
sub-queries, just not all. It's good to see that it was basically just
a one-line fix, because I think it would have been a shame to not
support sub-queries.

I hope to take another look at it over the weekend.

Regards,
Dean




Re: FILTER for aggregates [was Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division]

2013-06-21 Thread Dean Rasheed
On 21 June 2013 05:01, David Fetter  wrote:
> What tests do you think should be there that aren't?
>

I think I expected to see more tests related to some of the specific
code changes, such as

CREATE TABLE t AS SELECT * FROM generate_series(1,10) t(x);

-- Should fail (filter can't be used for non-aggregates)
SELECT abs(x) FILTER (WHERE x > 5) FROM t;

-- Should fail (filter clause can't contain aggregates)
SELECT array_agg(x) FILTER (WHERE x > AVG(x)) FROM t;

-- OK (aggregate in filter sub-query)
SELECT array_agg(x) FILTER (WHERE x > (SELECT AVG(x) FROM t)) FROM t;

-- OK (trivial sub-query)
SELECT array_agg(x) FILTER (WHERE (SELECT x > 5)) FROM t;

-- OK
SELECT array_agg(x) FILTER (WHERE (SELECT x > (SELECT AVG(x) FROM t))) FROM t;

-- OK
SELECT array_agg(x) FILTER (WHERE (SELECT (SELECT t.x > AVG(t2.x) FROM
t as t2))) FROM t;

-- Should fail
SELECT array_agg(x) FILTER (WHERE (SELECT (SELECT 5 > AVG(t.x) FROM t
as t2))) FROM t;

-- OK (filtered aggregate in window context)
SELECT x, AVG(x) OVER(ORDER BY x), AVG(x) FILTER (WHERE x > 5)
OVER(ORDER BY x) FROM t;

-- Should fail (non-aggregate window function with filter)
SELECT x, rank() OVER(ORDER BY x), rank() FILTER (WHERE x > 5)
OVER(ORDER BY x) FROM t;

-- View definition test
CREATE VIEW v AS SELECT array_agg(x) FILTER (WHERE (SELECT (SELECT t.x > AVG(t2.x) FROM t as t2))) FROM t;
\d+ v

etc...

You could probably dream up better examples --- I just felt that the
current tests don't give much coverage of the code changes.

Regards,
Dean




Re: [HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Andres Freund
Hi,


On 2013-06-21 08:16:22 +, Albe Laurenz wrote:
> I want to draw attention to this thread on -general:
> camq5dgq4sujpbht2-9xlapasvknul2-bb0cpyci2fp+pfsf...@mail.gmail.com

There's also a bug reported for it:
#8237: e1uovmc-0007ft...@wrigleys.postgresql.org

> Would you concur that this is a bug?

Yes, I think it's pretty clearly a bug - Tom doesn't seem think so
though. If we can agree it is, the fix outlined over on -bugs seems to
be easily enough implemented...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Frontend/backend protocol improvements proposal (request).

2013-06-21 Thread Dmitriy Igrishin
2013/6/21 Albe Laurenz 

> Dmitriy Igrishin wrote:
> > Sent: Thursday, June 20, 2013 5:09 PM
> > To: PostgreSQL Hackers
> > Subject: [HACKERS] Frontend/backend protocol improvements proposal
> (request).
> >
> > Hackers,
> >
> > While developing a C++ client library for Postgres I felt lack of extra
> > information in command tags in the CommandComplete (B) message
> > for the following commands:
> >   PREPARE;
> >   DEALLOCATE;
> >   DECLARE;
> >   CLOSE;
> >   LISTEN;
> >   UNLISTEN;
> >   SET;
> >   RESET.
> > Namely, for example, users of my library can prepare statements by using
> > protocol directly or via PREPARE command. Since the protocol does not
> > support prepared statement deallocation, I wrote a wrapper over
> DEALLOCATE
> > command. The library knows about all prepared statements and
> > invalidates them automatically when user performs deallocate() wrapper.
> > But users can go with DEALLOCATE command directly and in these cases
> > I need to query the database to get the list of currently prepared
> statements
> > whenever CommandComplete message with DEALLOCATE command tag
> > is consumed. Moreover, I need to do it *synchronously* and this breaks
> > asynchronous API.
> > I propose to include name of the object in the CommandComplete (B)
> > message for the above commands.
>
> That would be a change in the protocol, so it's not likely to happen
> soon.  There is a page where proposed changes to the wire protocol
> are collected: http://wiki.postgresql.org/wiki/Todo#Wire_Protocol_Changes

Well, even if this proposal moves to the TODO, it would be nice.

>
>
> It seems like bad design to me to keep a list of prepared statements
> on the client side when it is already kept on the server side
> (accessible with the pg_prepared_statements view).
>

> What's wrong with the following:
> If the user wants to deallocate an individual prepared statement,
> just send "DEALLOCATE " to the server.  If the
> statement does not exist, the server will return an error.
> If the user wants to deallocate all statements, just send
> "DEALLOCATE ALL".
> Why do you need to track prepared statements on the client side?
>
Nothing is wrong with that, if the user wants to deal with scary and cumbersome code.
As a library author, I want to help people make things simpler.
To understand me, please look at the pseudo C++ code below.

// A class designed to work with prepared statements
class Prepared_statement {
public:
  // Methods to generate a Bind message, like
  Prepared_statement* bind(Position, Value);
  // ... and more
  // Methods to send Execute message, like
  void execute();
  void execute_async();
};

class Connection {
public:
  // many stuff ...
  void close();

  Prepared_statement* prepare(Name, Query);
  void prepare_async(Statement);

  // Make yet another instance of prepared statement.
  Prepared_statement* prepared_statement(Name);

  // etc.
};

The Connection class is a factory for Prepared_statement instances.
As you can see, Connection::prepare() returns a new instance of a
*synchronously* prepared statement. Next, the user can bind values
and execute the statement, like this:

void f(Connection* cn)
{
  // Prepare unnamed statement and execute it.
  cn->prepare("SELECT $1::text")->bind(0, "Albe")->execute();
  // PS: don't worry about the absence of delete; we are using smart pointers :-)
}

But there is a another possible case:

void f(Connection* cn)
{
  Prepared_statement* ps = cn->prepare("SELECT $1::text");
  cn->close(); // THIS SHOULD invalidate all Prepared_statement instances
...
  ps->bind(0, "Albe"); // ... to throw the exception here
}

Moreover, consider:

void f(Connection* cn)
{
  Prepared_statement* ps1 = cn->prepare("ps1", "SELECT $1::text");
  cn->deallocate("ps1"); // THIS invalidates ps1 object...
  ps1->bind(0, "Albe"); // ... to throw the exception here

  Prepared_statement* ps2 = cn->prepare("ps2", "SELECT $1::text");
  cn->perform("DEALLOCATE ps2"); // THIS SHOULD ALSO invalidate ps2
object...
  ps2->bind(0, "Albe"); // ... to throw the exception here
}

In the latter case, when the user deallocates a named prepared statement
directly, the implementation of Connection can invalidate the prepared
statement (ps2) by parsing the CommandComplete command tag to get its name.

And please note that the user can send DEALLOCATE asynchronously. And there
are only two ways to get the prepared statement's (or another session object's)
name:
  1) Parse the SQL command which the user attempts to send;
  2) Just get it from the CommandComplete command tag.

I believe that 1) is a 100% bad idea.

PS: this C++11 library is not publicly available yet, but I hope it will be
available this year.

-- 
// Dmitriy.


Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-06-21 Thread KONDO Mitsumasa

Hi,

I took benchmark results for each of my patches separately and for original PG.

* Result of DBT-2
              | TPS      90%tile    Average  Maximum
-----------------------------------------------------
original_0.7  | 3474.62  18.348328  5.739    36.977713
original_1.0  | 3469.03  18.637865  5.842    41.754421
fsync         | 3525.03  13.872711  5.382    28.062947
write         | 3465.96  19.653667  5.804    40.664066
fsync + write | 3564.94  16.31922   5.1      34.530766

 - 'original_*' indicates the checkpoint_completion_target setting in PG 9.2.4.
 - In the other, patched runs, checkpoint_completion_target is set to 0.7.
 - 'write' has the write patch applied, and 'fsync' has the fsync patch applied.
 - 'fsync + write' has both patches applied.


* Investigation of the results
 - A large checkpoint_completion_target, in original PG and with the write patch,
leads to slow latency in the benchmark transactions, because slowly written pages
cause long fsync IO at the end of the checkpoint.
 - The fsync patch affects the latency of each file's fsync. Back-to-back fsyncs
on each file cause slow latency, so it is good for latency if the fsync stage of
the checkpoint sleeps after a slow fsync IO.
 - The fsync + write patches together seemed to improve TPS. I think the write
patch disturbs transactions doing full-page-write WAL writes less than
original (plain) PG does.


I will send a more detailed investigation and more results next week, and I will
also take results with pgbench. If you would like other benchmark results or
different postgres parameters, please tell me.


Best Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center




[HACKERS] Possible bug in CASE evaluation

2013-06-21 Thread Albe Laurenz
I want to draw attention to this thread on -general:
camq5dgq4sujpbht2-9xlapasvknul2-bb0cpyci2fp+pfsf...@mail.gmail.com

Would you concur that this is a bug?

The fine manual says about CASE:

  If the condition's result is true, the value of the CASE expression
  is the result that follows the condition, and the remainder of the
  CASE expression is not processed.

But:

test=> CREATE FUNCTION zero() RETURNS integer IMMUTABLE LANGUAGE SQL AS 'SELECT 0';
CREATE FUNCTION
test=> SELECT CASE WHEN (SELECT zero()) = 0 THEN -1 ELSE 1/zero() END;
ERROR:  division by zero
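
The failure seems to come from early evaluation, during constant folding, of
the immutable function; with a volatile variant (zero_v is just a made-up name)
the CASE short-circuits as documented:

test=> CREATE FUNCTION zero_v() RETURNS integer VOLATILE LANGUAGE SQL AS 'SELECT 0';
CREATE FUNCTION
test=> SELECT CASE WHEN (SELECT zero_v()) = 0 THEN -1 ELSE 1/zero_v() END;
-- returns -1, no error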

Yours,
Laurenz Albe



Re: [HACKERS] Frontend/backend protocol improvements proposal (request).

2013-06-21 Thread Albe Laurenz
Dmitriy Igrishin wrote:
> Sent: Thursday, June 20, 2013 5:09 PM
> To: PostgreSQL Hackers
> Subject: [HACKERS] Frontend/backend protocol improvements proposal (request).
> 
> Hackers,
> 
> While developing a C++ client library for Postgres I felt lack of extra
> information in command tags in the CommandComplete (B) message
> for the following commands:
>   PREPARE;
>   DEALLOCATE;
>   DECLARE;
>   CLOSE;
>   LISTEN;
>   UNLISTEN;
>   SET;
>   RESET.
> Namely, for example, users of my library can prepare statements by using
> protocol directly or via PREPARE command. Since the protocol does not
> support prepared statement deallocation, I wrote a wrapper over DEALLOCATE
> command. The library knows about all prepared statements and
> invalidates them automatically when user performs deallocate() wrapper.
> But users can go with DEALLOCATE command directly and in these cases
> I need to query the database to get the list of currently prepared statements
> whenever CommandComplete message with DEALLOCATE command tag
> is consumed. Moreover, I need to do it *synchronously* and this breaks
> asynchronous API.
> I propose to include name of the object in the CommandComplete (B)
> message for the above commands.

That would be a change in the protocol, so it's not likely to happen
soon.  There is a page where proposed changes to the wire protocol
are collected: http://wiki.postgresql.org/wiki/Todo#Wire_Protocol_Changes

It seems like bad design to me to keep a list of prepared statements
on the client side when it is already kept on the server side
(accessible with the pg_prepared_statements view).

What's wrong with the following:
If the user wants to deallocate an individual prepared statement,
just send "DEALLOCATE " to the server.  If the
statement does not exist, the server will return an error.
If the user wants to deallocate all statements, just send
"DEALLOCATE ALL".
Why do you need to track prepared statements on the client side?
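
For example (a minimal sketch of the server-side approach; the statement name
is made up):

PREPARE stmt_albe AS SELECT $1::text;
SELECT name, statement FROM pg_prepared_statements;  -- server-side bookkeeping
DEALLOCATE stmt_albe;   -- errors if the statement does not exist
DEALLOCATE ALL;         -- drops every prepared statement in this session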

Yours,
Laurenz Albe



Re: [HACKERS] Review: UNNEST (and other functions) WITH ORDINALITY

2013-06-21 Thread Dean Rasheed
On 21 June 2013 08:02, Dean Rasheed  wrote:
> On 21 June 2013 06:54, David Fetter  wrote:
>>> For example "SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS file"
>>
>> The spec is pretty specific about the "all or none" nature of naming
>> in the AS clause...unless we can figure out a way of getting around it
>> somehow.
>
> We already support and document the ability to provide a subset of the
> columns in the alias. I didn't realise that was beyond the spec, but
> since we already have it...
>
>
>> I'm weighing in on the side of a name that's like ?columnN? elsewhere
>> in the code, i.e. one that couldn't sanely be an actual column name,
>> whether in table, view, or SRF.
>
> I don't think we need to be overly concerned with coming up with a
> unique name for the column. Duplicate column names are fine, except if
> the user wants to refer to them elsewhere in the query, in which case
> they need to provide aliases to distinguish them, otherwise the query
> won't be accepted.
>
> I'd be happy if you just replaced "?column?" with ordinality in a
> couple of places in your original patch.
>

Expanding on that, I think it's perfectly acceptable to allow
potentially duplicate column names in this context. For the majority
of simple queries the user wouldn't have to (and wouldn't feel
compelled to) supply aliases. Where there was ambiguity they would be
forced to do so, but people are already used to that.

If I write a simple query that selects from a single unnest() with or
without ordinality, I probably won't want to supply aliases.

If I select from 2 unnest()'s *without* ordinality, I already have to
provide aliases.

If I select from 2 SRFs with ordinality, I won't be too
surprised if I am also forced to provide aliases.

Regards,
Dean




Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread Hitoshi Harada
On Thu, Jun 20, 2013 at 3:40 PM, MauMau  wrote:
>
>  Here, "reliable" means that the database server is certainly shut
>>> down when pg_ctl returns, not telling a lie that "I shut down the
>>> server processes for you, so you do not have to be worried that some
>>> postgres process might still remain and write to disk".  I suppose
>>> reliable shutdown is crucial especially in HA cluster.  If pg_ctl
>>> stop -mi gets stuck forever when there is an unkillable process (in
>>> what situations does this happen? OS bug, or NFS hard mount?), I
>>> think the DBA has to notice this situation from the unfinished
>>> pg_ctl, investigate the cause, and take corrective action.
>>>
>>
>> So you're suggesting that keeping postmaster up is a useful sign that
>> the shutdown is not going well?  I'm not really sure about this.  What
>> do others think?
>>
>
> I think you are right, and there is no harm in leaving postgres processes
> in unkillable state.  I'd like to leave the decision to you and/or others.
>
>
+1 for leaving the processes rather than waiting for long (or possibly forever;
remember, not all people are running postgres on such cluster ware).  I'm
sure some users rely on the current behavior of immediate shutdown.  If
someone wants to ensure processes are finished when pg_ctl returns, isn't that
fast shutdown rather than immediate shutdown?  To me the current immediate
shutdown is reliable, in that it without doubt returns control back to the
terminal, after killing the postmaster at least.

Thanks,
-- 
Hitoshi Harada


Re: [HACKERS] Review: UNNEST (and other functions) WITH ORDINALITY

2013-06-21 Thread Dean Rasheed
On 21 June 2013 06:54, David Fetter  wrote:
>> For example "SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS file"
>
> The spec is pretty specific about the "all or none" nature of naming
> in the AS clause...unless we can figure out a way of getting around it
> somehow.

We already support and document the ability to provide a subset of the
columns in the alias. I didn't realise that was beyond the spec, but
since we already have it...


> I'm weighing in on the side of a name that's like ?columnN? elsewhere
> in the code, i.e. one that couldn't sanely be an actual column name,
> whether in table, view, or SRF.

I don't think we need to be overly concerned with coming up with a
unique name for the column. Duplicate column names are fine, except if
the user wants to refer to them elsewhere in the query, in which case
they need to provide aliases to distinguish them, otherwise the query
won't be accepted.

I'd be happy if you just replaced "?column?" with ordinality in a
couple of places in your original patch.

Regards,
Dean

