Re: [GENERAL] Replication terminated due to PANIC

2013-04-24 Thread Adarsh Sharma
Thanks, Sergey, for such a quick response, but I don't think this is the
same patch problem, because we have other DB servers running fine on the
same version, and the message is also different:

host= PANIC: _bt_restore_page: cannot add item to page

Replication works fine the whole day, but at midnight, when the log
rotates, it shows the below messages:

2013-04-24 00:00:00 UTC [26989]: [4945032-1] user= db= host= LOG:
checkpoint starting: time
2013-04-24 00:00:00 UTC [26989]: [4945033-1] user= db= host= ERROR:
could not open file "global/14078": No such file or directory
2013-04-24 00:00:00 UTC [26989]: [4945034-1] user= db= host= CONTEXT:
writing block 0 of relation global/14078
2013-04-24 00:00:00 UTC [26989]: [4945035-1] user= db= host= WARNING:
could not write block 0 of global/14078
2013-04-24 00:00:00 UTC [26989]: [4945036-1] user= db= host= DETAIL:
Multiple failures --- write error might be permanent.

Looks like some index corruption.


Thanks






On Thu, Apr 25, 2013 at 8:14 AM, Sergey Konoplev  wrote:

> On Wed, Apr 24, 2013 at 5:05 PM, Adarsh Sharma wrote:
> > I have a PostgreSQL 9.2 instance running on a CentOS 6.3 box. Yesterday I
> > set up a hot standby using pg_basebackup. Today I got the below alert
> > from the standby box:
> >
> > [1] (from line 412,723)
> > 2013-04-24 23:07:18 UTC [13445]: [6-1] user= db= host= PANIC:
> > _bt_restore_page: cannot add item to page
> >
> > When I checked, replication had terminated due to the slave DB shutting
> > down. From the logs I can see the below messages:
>
> I am not sure that it is your situation, but take a look at this thread:
>
>
> http://www.postgresql.org/message-id/CAL_0b1t=WuM6roO8dki=w8dhh8p8whhohbpjreymmqurocn...@mail.gmail.com
>
> There is a patch by Andres Freund at the end of the discussion. Three
> weeks have passed since I installed the patched version, and it looks
> like the patch fixed my issue.
>
> >
> > 2013-04-24 23:17:16 UTC [26989]: [5360083-1] user= db= host= ERROR:
>  could
> > not open file "global/14078": No such file or directory
> > 2013-04-24 23:17:16 UTC [26989]: [5360084-1] user= db= host= CONTEXT:
> > writing block 0 of relation global/14078
> > 2013-04-24 23:17:16 UTC [26989]: [5360085-1] user= db= host= WARNING:
>  could
> > not write block 0 of global/14078
> > 2013-04-24 23:17:16 UTC [26989]: [5360086-1] user= db= host= DETAIL:
> > Multiple failures --- write error might be permanent.
> >
> > I checked in the global directory of the master; the file 14078 doesn't
> > exist.
> >
> > Has anyone faced the above issue?
> >
> > Thanks
>
>
>
> --
> Kind regards,
> Sergey Konoplev
> Database and Software Consultant
>
> Profile: http://www.linkedin.com/in/grayhemp
> Phone: USA +1 (415) 867-9984, Russia +7 (901) 903-0499, +7 (988) 888-1979
> Skype: gray-hemp
> Jabber: gray...@gmail.com
>


Re: [GENERAL] DISTINCT ON changes sort order

2013-04-24 Thread Tom Lane
Alexander Reichstadt  writes:
> I think I solved it:
> SELECT * FROM (SELECT DISTINCT ON(refid_messages) refid_messages as msgid, * 
> FROM messagehistorywithcontent WHERE   
> (lower(ARRAY_TO_STRING(ARRAY[login,firstname,lastname,content,msgtitle], ' 
> ')) LIKE '%gg%') ORDER BY refid_messages DESC) as foo ORDER BY messagekind ASC

This is not really a solution, unless you don't care which of the rows
with a particular refid_messages value comes out.  Usually, use of
"DISTINCT ON (x)" should be accompanied by "ORDER BY x, y", where y
determines which row of a given x value comes out.  If you write it
like this, you get the row with the smallest y for each x; or you could
write "ORDER BY x, y DESC" to get the row with the largest y.  See the
DISTINCT ON example in the SELECT reference page.
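
For illustration, a minimal sketch against the table from this thread
(created_at is a hypothetical timestamp column, not one named upthread):

SELECT DISTINCT ON (refid_messages) *
FROM messagehistorywithcontent
-- for each refid_messages, keep the row with the newest created_at
ORDER BY refid_messages, created_at DESC;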

regards, tom lane




Re: [GENERAL] custom session variables?

2013-04-24 Thread Darren Duncan

On 2013.04.24 7:16 PM, Fabrízio de Royes Mello wrote:

Maybe you should take a look at this extension [1] ;-)

[1] http://pgxn.org/dist/session_variables/

Fabrízio de Royes Mello


Thanks for your response.



/*
 * Author: Fabrízio de Royes Mello
 * Created at: Thu Oct 27 14:37:36 -0200 2011
 *
 */

CREATE FUNCTION set_value(TEXT, TEXT) RETURNS void AS $$
BEGIN
  PERFORM set_config('session_variables.'||$1, $2, false);
  RETURN;
END;
$$ LANGUAGE plpgsql;
COMMENT ON FUNCTION set_value(TEXT, TEXT) IS
'Create/Assign value to a new/existing session variable';

SET check_function_bodies TO OFF;
CREATE FUNCTION get_value(TEXT) RETURNS TEXT AS $$
  SELECT current_setting('session_variables.'||$1);
$$ LANGUAGE sql;
COMMENT ON FUNCTION get_value(TEXT) IS
'Returns the value of session variable passed as a parameter';



So, ok, basically the same as 
http://frefo.blogspot.ca/2009/04/session-variables-in-postgresql.html which I 
initially pointed to.


I'll take that as several people endorsing set_config()/current_setting() as a 
preferred way to do this.


The main limitation seems to be that those builtins just store and return text 
values, but a little casting on store/fetch should take care of that.
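
For example, a minimal sketch of typed wrappers over those builtins (the
integer variant is an assumption of mine, not part of the extension):

CREATE FUNCTION set_int_value(TEXT, INT) RETURNS void AS $$
BEGIN
  -- set_config only stores text, so serialize the integer on the way in
  PERFORM set_config('session_variables.'||$1, $2::text, false);
  RETURN;
END;
$$ LANGUAGE plpgsql;

CREATE FUNCTION get_int_value(TEXT) RETURNS INT AS $$
  -- cast back to the desired type on the way out
  SELECT current_setting('session_variables.'||$1)::int;
$$ LANGUAGE sql;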


The temporary table approach, in contrast, wouldn't need casting.

Well, I can choose either then as the circumstances warrant.

Thanks for the feedback, I now consider my question to be answered.

-- Darren Duncan





Re: [GENERAL] Confusing error message.

2013-04-24 Thread Sergey Konoplev
On Wed, Apr 24, 2013 at 4:21 AM, Dmitriy Igrishin  wrote:
> I've spent some time finding a bug in the application, which performed
> a query with the entire schema-qualified relation name quoted as one
> identifier (i.e. "schema.relation" instead of "schema"."relation"
> or just schema.relation), and the error handler printed a confusing
> message to the log.
>
> Thoughts?

+1

[local]:5432 grayhemp@grayhemp=# table "бла.бла";
ERROR:  relation "бла.бла" does not exist
LINE 1: table "бла.бла";
  ^

It looks ambiguous indeed.
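
For reference, a minimal demonstration of the three spellings, following
the example from the original message:

CREATE SCHEMA test;
CREATE TABLE test.test ();
TABLE test.test;      -- works: unquoted schema-qualified name
TABLE "test"."test";  -- works: schema and relation quoted separately
TABLE "test.test";    -- fails: one identifier whose name contains a dot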

--
Kind regards,
Sergey Konoplev
Database and Software Consultant

Profile: http://www.linkedin.com/in/grayhemp
Phone: USA +1 (415) 867-9984, Russia +7 (901) 903-0499, +7 (988) 888-1979
Skype: gray-hemp
Jabber: gray...@gmail.com




Re: [GENERAL] CLUSTER, REINDEX and VACUUM on batch ops

2013-04-24 Thread Sergey Konoplev
On Wed, Apr 24, 2013 at 8:49 AM, François Beausoleil wrote:
> Presently, I'm migrating each partition individually to add NOT NULL, set a 
> default value and update the table to have correct values. Essentially, I'm 
> doing this:
>
> ALTER TABLE parent ADD COLUMN new_field int; -- adds the field to all child 
> tables - runs quickly
>
> -- the bulk of the data transfer
> for each partition in partitions:
>   BEGIN;
>   UPDATE partition SET new_field = 0;
>   ALTER TABLE partition
>   ALTER COLUMN new_field SET NOT NULL
> , ALTER COLUMN new_field SET DEFAULT 0;
>   COMMIT;
>
>   CLUSTER partition USING partition_pkey;
>   REINDEX TABLE partition;
>   VACUUM ANALYZE partition;
> done
>
> After I've clustered the table, must I reindex and vacuum as well? It is 
> unclear to me if clustering a table reindexes or not: the docs at 
> http://www.postgresql.org/docs/current/static/sql-cluster.html are silent on 
> the matter, but do mention that an ANALYZE is in order.

CLUSTER does a full table rewrite, including all of its indexes, so REINDEX
is not required after it. It is mentioned implicitly in the docs:

<<
When an index scan is used, a temporary copy of the table is created
that contains the table data in the index order. Temporary copies of
each index on the table are created as well. Therefore, you need free
space on disk at least equal to the sum of the table size and the
index sizes.

When a sequential scan and sort is used, a temporary sort file is also
created, so that the peak temporary space requirement is as much as
double the table size, plus the index sizes.
>>

BTW, you do not need to do the ALTERs for each partition. What you
need to do is:

1. add the column to the parent,
2. set the default constraint on the column of the parent,
3. update the column in partitions to the value,
4. set the not null constraint on the parent.

It will be better from the point of view of inheritance, as the new
column will be fully inherited from the parent rather than partially
overloaded in partitions.
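
A sketch of those steps in SQL, using the table names from the question:

-- 1. add the column to the parent (recurses to all partitions)
ALTER TABLE parent ADD COLUMN new_field int;
-- 2. set the default on the parent (also inherited)
ALTER TABLE parent ALTER COLUMN new_field SET DEFAULT 0;
-- 3. backfill each partition, one transaction per partition
UPDATE partition SET new_field = 0 WHERE new_field IS NULL;
-- 4. once every partition is backfilled, constrain on the parent
ALTER TABLE parent ALTER COLUMN new_field SET NOT NULL;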

--
Kind regards,
Sergey Konoplev
Database and Software Consultant

Profile: http://www.linkedin.com/in/grayhemp
Phone: USA +1 (415) 867-9984, Russia +7 (901) 903-0499, +7 (988) 888-1979
Skype: gray-hemp
Jabber: gray...@gmail.com




Re: [GENERAL] Replication terminated due to PANIC

2013-04-24 Thread Sergey Konoplev
On Wed, Apr 24, 2013 at 5:05 PM, Adarsh Sharma  wrote:
> I have a PostgreSQL 9.2 instance running on a CentOS 6.3 box. Yesterday I
> set up a hot standby using pg_basebackup. Today I got the below alert from
> the standby box:
>
> [1] (from line 412,723)
> 2013-04-24 23:07:18 UTC [13445]: [6-1] user= db= host= PANIC:
> _bt_restore_page: cannot add item to page
>
> When I checked, replication had terminated due to the slave DB shutting
> down. From the logs I can see the below messages:

I am not sure that it is your situation, but take a look at this thread:

http://www.postgresql.org/message-id/CAL_0b1t=WuM6roO8dki=w8dhh8p8whhohbpjreymmqurocn...@mail.gmail.com

There is a patch by Andres Freund at the end of the discussion. Three
weeks have passed since I installed the patched version, and it looks
like the patch fixed my issue.

>
> 2013-04-24 23:17:16 UTC [26989]: [5360083-1] user= db= host= ERROR:  could
> not open file "global/14078": No such file or directory
> 2013-04-24 23:17:16 UTC [26989]: [5360084-1] user= db= host= CONTEXT:
> writing block 0 of relation global/14078
> 2013-04-24 23:17:16 UTC [26989]: [5360085-1] user= db= host= WARNING:  could
> not write block 0 of global/14078
> 2013-04-24 23:17:16 UTC [26989]: [5360086-1] user= db= host= DETAIL:
> Multiple failures --- write error might be permanent.
>
> I checked in the global directory of the master; the file 14078 doesn't exist.
>
> Has anyone faced the above issue?
>
> Thanks



--
Kind regards,
Sergey Konoplev
Database and Software Consultant

Profile: http://www.linkedin.com/in/grayhemp
Phone: USA +1 (415) 867-9984, Russia +7 (901) 903-0499, +7 (988) 888-1979
Skype: gray-hemp
Jabber: gray...@gmail.com




Re: [GENERAL] Memory usage after upgrade to 9.2.4

2013-04-24 Thread Adrian Klaver

On 04/24/2013 11:20 AM, Daniel Cristian Cruz wrote:

I've done an explain analyze under the test environment, and there is no
aggressive memory usage.

So I dropped the database in the new cluster and restored a fresh dump
from production (in theory, that's the difference between the two
environments).

A few minutes later I got an answer: after a dump/restore, there is no
problem. The same cluster, just with a dump/restore.

Since I had no idea on what is the problem, and the structure from the
dump of the bad database is equal to the dump from the production (it
differs only in check constraints where "((turma.situacao)::text = ANY
((ARRAY['Aguardando Atualização'::character varying, 'Em
Andamento'::character varying])::text[])))" became
"((turma.situacao)::text = ANY (ARRAY[('Aguardando
Atualização'::character varying)::text, ('Em Andamento'::character
varying)::text])))"), how can I report an issue in pg_upgrade?


The bug reporting page is here:

http://www.postgresql.org/support/submitbug/



Thanks, Adrian, Tomas and Tom. It's a specific issue; the database is on
backup until May 1st. Until then, if someone would like to know something
about it, just ask me; I would like to help get the issue resolved.






--
Daniel Cristian Cruz
クルズ クリスチアン ダニエル



--
Adrian Klaver
adrian.kla...@gmail.com




Re: [GENERAL] Regarding template1 database

2013-04-24 Thread Adrian Klaver

On 04/24/2013 06:48 PM, S H wrote:

What should the interval be for calling VACUUM on the template1 and
postgres databases, in the case where

1. no tables are created in template1?

Should I pass the ANALYZE parameter when vacuuming template1 and postgres?


I would suggest reading this section of the docs:

http://www.postgresql.org/docs/9.2/interactive/routine-vacuuming.html

If anything is still unclear afterwards, feel free to ask follow-up
questions.



--
Adrian Klaver
adrian.kla...@gmail.com




Re: [GENERAL] custom session variables?

2013-04-24 Thread Fabrízio de Royes Mello
On Wed, Apr 24, 2013 at 9:47 PM, Darren Duncan wrote:

> Thank you for that.
>
> I had actually thought of this, but it seemed to me that using a temporary
> table was a heavy-handed approach and that a temporary scalar variable
> would be more efficient or less verbose to use.
>
> It is *a* solution certainly, and potentially a better one than the url I
> pointed to, but I was hoping for something less bulky.
>
> Granted, I realize that sequence generators are basically small tables
> like this behind the scenes, but is there any appreciable overhead of
> creating and dropping a temporary table for every session?
>
>
Hi,

Maybe you should take a look at this extension [1] ;-)


[1] http://pgxn.org/dist/session_variables/

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello


Re: [GENERAL] Regarding template1 database

2013-04-24 Thread S H
What should the interval be for calling VACUUM on the template1 and
postgres databases, in the case where
1. no tables are created in template1?
Should I pass the ANALYZE parameter when vacuuming template1 and postgres?
  

Re: [GENERAL] custom session variables?

2013-04-24 Thread Darren Duncan

Thank you for that.

I had actually thought of this, but it seemed to me that using a temporary table 
was a heavy-handed approach and that a temporary scalar variable would be more 
efficient or less verbose to use.


It is *a* solution certainly, and potentially a better one than the url I 
pointed to, but I was hoping for something less bulky.


Granted, I realize that sequence generators are basically small tables like this 
behind the scenes, but is there any appreciable overhead of creating and 
dropping a temporary table for every session?


-- Darren Duncan

On 2013.04.24 5:07 PM, Christian Hammers wrote:

Hello

You could just use temporary tables like:

  BEGIN;
  CREATE TEMPORARY TABLE csid (i int);
-- somehow write the ID you want into that table
-- and then...
  INSERT INTO other_table (changeset_ids, msg) VALUES
((SELECT i FROM csid), 'Some log message');
  COMMIT;

When inserting a new changeset, you fill the temporary table
afterwards with currval('changeset_id_seq'). If you just want
to add more referring lines, you seem to know the ID already
and can just insert it. In any case, the following INSERTs into
the "other tables" will look exactly the same, which seems to
be your goal.

bye,

-christian-







Re: [GENERAL] custom session variables?

2013-04-24 Thread Christian Hammers
Hello

You could just use temporary tables like:

 BEGIN;
 CREATE TEMPORARY TABLE csid (i int);
-- somehow write the ID you want into that table
-- and then...
 INSERT INTO other_table (changeset_ids, msg) VALUES
   ((SELECT i FROM csid), 'Some log message');
 COMMIT;

When inserting a new changeset, you fill the temporary table
afterwards with currval('changeset_id_seq'). If you just want
to add more referring lines, you seem to know the ID already
and can just insert it. In any case, the following INSERTs into
the "other tables" will look exactly the same, which seems to
be your goal.
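
To make that concrete, a sketch of the whole flow for a new changeset
(assuming the changesets table and changeset_id_gen sequence from the
original question; ON COMMIT DROP is an optional extra):

BEGIN;
CREATE TEMPORARY TABLE csid (i int) ON COMMIT DROP;
-- the changesets pk defaults to nextval('changeset_id_gen')
INSERT INTO changesets (purpose_of_changes) VALUES ('nightly import');
INSERT INTO csid VALUES (currval('changeset_id_gen'));
INSERT INTO other_table (changeset_ids, msg)
  VALUES ((SELECT i FROM csid), 'Some log message');
COMMIT;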

bye,

-christian-


On Wed, 24 Apr 2013 16:17:58 -0700, Darren Duncan wrote:

> Hello,
> 
> I would like to have one or more session-scoped global variables that
> are useable in a similar way to sequence generators, via analogies to 
> setval()+currval().
> 
> Here's a (simplified) scenario ...
> 
> Say that for auditing purposes all regular database tables have a
> changeset_id column, which is a foreign key into a changesets table
> that has extra columns like when_occurred and who_did_it and
> purpose_of_changes etc.  There is a sequence generator
> changeset_id_gen that is typically used to generate the changeset_id
> values.
> 
> During typical day to day use, when a set of data manipulation work
> is done to enact some application task, a changesets record is added
> with a newly generated changeset_id, and that changeset_id is then used
> in the other records added/updated/deleted (there are also
> audit/history tables) to associate everything that was done as a
> logical unit for some task.
> 
> So in the changesets table we have this as its pk:
> 
>  changeset_id integer not null default
> nextval('changeset_id_gen'::regclass)
> 
> ... and in the other tables we have this as a non-pk field:
> 
>  changeset_id integer not null default
> currval('changeset_id_gen'::regclass)
> 
> ... or there may also be a trigger to similar effect of the latter,
> so it is changed for a record update too.
> 
> Now I understand that within a particular database session currval
> will fail if nextval or setval weren't called on that sequence
> generator yet.  And so I depend on this behavior to enforce a general
> business rule that a changesets record has to be inserted before
> other changes in the current session.
> 
> However, under some circumstances, we may want alternately to
> associate some regular changes with a prior changesets record, or
> otherwise with some changesets record whose changeset_id didn't come
> from the sequence generator.
> 
> To make the database simpler or cut down on verbosity, I would like
> in those cases to effectively setval() changeset_id_gen at the start
> of that changeset to some explicit value, so subsequent changes in
> that session can just use that value instead of some newly generated
> one.
> 
> Now here's the main point of this message ...
> 
> When I effectively setval() in the session, I want that change to
> only be visible in the session and have no effect outside of it, such
> as affecting what nextval() produces in other sessions.
> 
> It seemed to me that the proper way to do this would be to have some
> other session scope variable, say changeset_id_to_use, that is an
> intermediary between the sequence generator and nearly all the code
> using it.
> 
> So at the start of a session, changeset_id_to_use would be manually
> set by the application, either from nextval(changeset_id_gen) or from
> an explicit value, and then the table default defs or triggers would
> read changeset_id_to_use rather than reading
> currval(changeset_id_gen).
> 
> The changeset_id_to_use should start off null/missing at the start of
> the session so code that uses it will fail per business rules without
> it being set first.
> 
> I would like to know the proper/best way to declare and access the 
> aforementioned changeset_id_to_use variable?
> 
> I found a 2009 blog post 
> http://frefo.blogspot.ca/2009/04/session-variables-in-postgresql.html
> which looks relevant to what I want to do, but that looks more like
> abuse of the system rather than using it as intended, though I could
> be wrong.  Is there a better solution than that, or what should I be
> doing?
> 
> Thank you in advance.
> 
> -- Darren Duncan
> 
> 




[GENERAL] Replication terminated due to PANIC

2013-04-24 Thread Adarsh Sharma
Hi all,

I have a PostgreSQL 9.2 instance running on a CentOS 6.3 box. Yesterday I
set up a hot standby using pg_basebackup. Today I got the below alert
from the standby box:

[1] (from line 412,723)
2013-04-24 23:07:18 UTC [13445]: [6-1] user= db= host= PANIC:
_bt_restore_page: cannot add item to page


When I checked, replication had terminated due to the slave DB shutting
down. From the logs I can see the below messages:

2013-04-24 23:17:16 UTC [26989]: [5360083-1] user= db= host= ERROR:  could
not open file "global/14078": No such file or directory
2013-04-24 23:17:16 UTC [26989]: [5360084-1] user= db= host= CONTEXT:
writing block 0 of relation global/14078
2013-04-24 23:17:16 UTC [26989]: [5360085-1] user= db= host= WARNING:
could not write block 0 of global/14078
2013-04-24 23:17:16 UTC [26989]: [5360086-1] user= db= host= DETAIL:
Multiple failures --- write error might be permanent.

I checked in the global directory of the master; the file 14078 doesn't exist.

Has anyone faced the above issue?

Thanks


[GENERAL] custom session variables?

2013-04-24 Thread Darren Duncan

Hello,

I would like to have one or more session-scoped global variables that are 
useable in a similar way to sequence generators, via analogies to 
setval()+currval().


Here's a (simplified) scenario ...

Say that for auditing purposes all regular database tables have a changeset_id 
column, which is a foreign key into a changesets table that has extra columns 
like when_occurred and who_did_it and purpose_of_changes etc.  There is a 
sequence generator changeset_id_gen that is typically used to generate the 
changeset_id values.


During typical day to day use, when a set of data manipulation work is done to 
enact some application task, a changesets record is added with a newly generated 
changeset_id, and that changeset_id is then used in the other records
added/updated/deleted (there are also audit/history tables) to associate 
everything that was done as a logical unit for some task.


So in the changesets table we have this as its pk:

changeset_id integer not null default nextval('changeset_id_gen'::regclass)

... and in the other tables we have this as a non-pk field:

changeset_id integer not null default currval('changeset_id_gen'::regclass)

... or there may also be a trigger to similar effect of the latter, so it is 
changed for a record update too.


Now I understand that within a particular database session currval will fail if 
nextval or setval weren't called on that sequence generator yet.  And so I 
depend on this behavior to enforce a general business rule that a changesets 
record has to be inserted before other changes in the current session.


However, under some circumstances, we may want alternately to associate some 
regular changes with a prior changesets record, or otherwise with some 
changesets record whose changeset_id didn't come from the sequence generator.


To make the database simpler or cut down on verbosity, I would like in those 
cases to effectively setval() changeset_id_gen at the start of that changeset to 
some explicit value, so subsequent changes in that session can just use that 
value instead of some newly generated one.


Now here's the main point of this message ...

When I effectively setval() in the session, I want that change to only be 
visible in the session and have no effect outside of it, such as affecting what 
nextval() produces in other sessions.


It seemed to me that the proper way to do this would be to have some other 
session scope variable, say changeset_id_to_use, that is an intermediary between 
the sequence generator and nearly all the code using it.


So at the start of a session, changeset_id_to_use would be manually set by the 
application, either from nextval(changeset_id_gen) or from an explicit value, 
and then the table default defs or triggers would read changeset_id_to_use 
rather than reading currval(changeset_id_gen).


The changeset_id_to_use should start off null/missing at the start of the 
session so code that uses it will fail per business rules without it being set 
first.


I would like to know the proper/best way to declare and access the 
aforementioned changeset_id_to_use variable?


I found a 2009 blog post 
http://frefo.blogspot.ca/2009/04/session-variables-in-postgresql.html which 
looks relevant to what I want to do, but that looks more like abuse of the 
system rather than using it as intended, though I could be wrong.  Is there a
better solution than that, or what should I be doing?


Thank you in advance.

-- Darren Duncan




Re: [GENERAL] Set Returning Functions and array_agg()

2013-04-24 Thread Stephen Scheck
I'm guessing the reason is something like this: even though the "things"
returned by these two statements are the same logical entity (from a
mathematics/set theory standpoint):

pg_dev=# select * from unnest(array[1,2,3]);
 unnest
--------
      1
      2
      3
(3 rows)

pg_dev=# select unnest(array[1,2,3]);
 unnest
--------
      1
      2
      3
(3 rows)

The processing code-path for an aggregate function gets fed row-by-row and
is not just handed a complete set to work on. That would explain why
set-returning functions are allowed in the columns-clause (no general
prohibition on that) but not passable to aggregate functions.

But then, shouldn't it be possible to write something like array_agg that
takes a set as input and returns an array, that is not an aggregate
function, and is callable from the columns-clause?
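
There already is such a construct, in a sense: an ARRAY(...) subquery
collects a set into an array without involving an aggregate at all:

pg_dev=# select array(select unnest(array[1,2,3]));
  array
---------
 {1,2,3}
(1 row)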



On Wed, Apr 24, 2013 at 2:26 PM, Stephen Scheck wrote:

> Possibly due to my lack of thorough SQL understanding. Perhaps there's a
> better way of doing what I'm ultimately trying to accomplish, but still the
> question remains - why does this work:
>
> pg_dev=# select unnest(array[1,2,3]);
>  unnest
> --------
>       1
>       2
>       3
> (3 rows)
>
> But not this:
>
> pg_dev=# select array_agg(unnest(array[1,2,3]));
> ERROR:  set-valued function called in context that cannot accept a set
>
> The solution to the problem is actually of less interest right now than in
> understanding what's going on in the two statements above. It seems a bit
> inconsistent to me. If an aggregate function cannot handle rows generated
> in the columns-part of the statement, then why is a single-column row(s)
> result acceptable in the first statement?
>
>
>
> On Wed, Apr 24, 2013 at 1:29 PM, hubert depesz lubaczewski <
> dep...@depesz.com> wrote:
>
>> On Wed, Apr 24, 2013 at 12:48:44PM -0700, Stephen Scheck wrote:
>> > I have a UDF (written in C) that returns SETOF RECORD of an anonymous
>> > record type
>> > (defined via OUT parameters). I'm trying to use array_agg() to transform
>> > its output to
>> > an array:
>> > pg_dev=# SELECT array_agg((my_setof_record_returning_func()).col1);
>> > ERROR:  set-valued function called in context that cannot accept a set
>>
>> Is there any reason why you're not using normal syntax:
>> select array_agg(col1) from my_setof_record_returning_func();
>> ?
>>
>> Best regards,
>>
>> depesz
>>
>>
>


Re: [GENERAL] Set Returning Functions and array_agg()

2013-04-24 Thread Stephen Scheck
Possibly due to my lack of thorough SQL understanding. Perhaps there's a
better way of doing what I'm ultimately trying to accomplish, but still the
question remains - why does this work:

pg_dev=# select unnest(array[1,2,3]);
 unnest
--------
      1
      2
      3
(3 rows)

But not this:

pg_dev=# select array_agg(unnest(array[1,2,3]));
ERROR:  set-valued function called in context that cannot accept a set

The solution to the problem is actually of less interest right now than in
understanding what's going on in the two statements above. It seems a bit
inconsistent to me. If an aggregate function cannot handle rows generated
in the columns-part of the statement, then why is a single-column row(s)
result acceptable in the first statement?



On Wed, Apr 24, 2013 at 1:29 PM, hubert depesz lubaczewski <
dep...@depesz.com> wrote:

> On Wed, Apr 24, 2013 at 12:48:44PM -0700, Stephen Scheck wrote:
> > I have a UDF (written in C) that returns SETOF RECORD of an anonymous
> > record type
> > (defined via OUT parameters). I'm trying to use array_agg() to transform
> > its output to
> > an array:
> > pg_dev=# SELECT array_agg((my_setof_record_returning_func()).col1);
> > ERROR:  set-valued function called in context that cannot accept a set
>
> Is there any reason why you're not using normal syntax:
> select array_agg(col1) from my_setof_record_returning_func();
> ?
>
> Best regards,
>
> depesz
>
>


Re: [GENERAL] Set Returning Functions and array_agg()

2013-04-24 Thread hubert depesz lubaczewski
On Wed, Apr 24, 2013 at 12:48:44PM -0700, Stephen Scheck wrote:
> I have a UDF (written in C) that returns SETOF RECORD of an anonymous
> record type
> (defined via OUT parameters). I'm trying to use array_agg() to transform
> its output to
> an array:
> pg_dev=# SELECT array_agg((my_setof_record_returning_func()).col1);
> ERROR:  set-valued function called in context that cannot accept a set

Is there any reason why you're not using normal syntax:
select array_agg(col1) from my_setof_record_returning_func();
?

Best regards,

depesz





Re: [GENERAL] DISTINCT ON changes sort order

2013-04-24 Thread Alexander Reichstadt
I think I solved it:

SELECT * FROM (SELECT DISTINCT ON(refid_messages) refid_messages as msgid, * 
FROM messagehistorywithcontent WHERE   
(lower(ARRAY_TO_STRING(ARRAY[login,firstname,lastname,content,msgtitle], ' ')) 
LIKE '%gg%') ORDER BY refid_messages DESC) as foo ORDER BY messagekind ASC

Thanks
Alex


On 24.04.2013, at 16:03, Alexander Reichstadt wrote:

> Hi,
> 
> following a query:
> SELECT DISTINCT ON (msgid) msgid FROM (SELECT refid_messages as msgid FROM 
> messagehistorywithcontent WHERE 1=1 AND  
> (lower(ARRAY_TO_STRING(ARRAY[login,firstname,lastname,content,msgtitle], ' 
> ')) LIKE '%gg%') ORDER BY messagekind DESC) as foo
> 
> This query rearranges the sort order though.
> 
> When I only execute the inner SELECT I get this:
> 
> 53
> 29
> 46
> 46
> 51
> 52
> 53
> 29
> 46
> 47
> 48
> 48
> 49
> 49
> 49
> 49
> 49
> 49
> 49
> 49
> 49
> 49
> 49
> 49
> 49
> 49
> 49
> 49
> 49
> 50
> 49
> 49
> 46
> 46
> 46
> 46
> 43
> 43
> 43
> 43
> 43
> 4
> (Ignore that the last entry is 4, it's a copy-paste error and should be 43 as 
> well. Anyway.)
> 
> 
> The order is correct. Now from the outer SELECT I would expect then to get:
> 53
> 29
> 46
> 51
> 52
> 53
> 46
> .
> .
> .
> .
> 43
> 
> But this is not the case. 43 is the id of the only record with messagekind 
> 'AM' where all other have messagekind 'PD'. Yet the order in which the full 
> query returns the results is:
> 29
> 43
> 46
> 47
> .
> .
> .
> 
> Which is wrong. I can't figure out why this is wrong, but from toying around 
> I found that depending on whether I use DISTINCT msgid or DISTINCT ON (msgid)
> msgid I get different results. Still, both results are wrong.
> 
> The 'must sort by what you distinct on' behavior gives me a total headache. 
> What does one have to do to sort by a criterion and at the same time not use 
> that criterion in the distinct clause?
> 
> Thank you
> A. Reichstadt
> 



[GENERAL] Set Returning Functions and array_agg()

2013-04-24 Thread Stephen Scheck
Hi,

I have a UDF (written in C) that returns SETOF RECORD of an anonymous
record type
(defined via OUT parameters). I'm trying to use array_agg() to transform
its output to
an array:

pg_dev=# SELECT array_agg((my_setof_record_returning_func()).col1);
ERROR:  set-valued function called in context that cannot accept a set

Or the alternate syntax:

pg_dev=# SELECT array_agg(col1(my_setof_record_returning_func()));
ERROR:  set-valued function called in context that cannot accept a set

Can somebody explain why I get the error message? Presumably the parser is
deciding that
the expression provided to array_agg() is a set, based on
my_setof_record_returning_func()'s
definition. But shouldn't the column selection (or equivalent
column-as-func-call notation) supply
the expected context to array_agg()?

Thanks.


Re: [GENERAL] Memory usage after upgrade to 9.2.4

2013-04-24 Thread Daniel Cristian Cruz
I've done an explain analyze under the test environment, and there is no
aggressive memory usage.

So I dropped the database in the new cluster and restored a fresh dump from
production (in theory, that's the difference between the two environments).

A few minutes later I got an answer: after a dump/restore, there is no
problem. The same cluster, just with a dump/restore.

Since I had no idea on what is the problem, and the structure from the dump
of the bad database is equal to the dump from the production (it differs
only in check constraints where "((turma.situacao)::text = ANY
((ARRAY['Aguardando Atualização'::character varying, 'Em
Andamento'::character varying])::text[])))" became "((turma.situacao)::text
= ANY (ARRAY[('Aguardando Atualização'::character varying)::text, ('Em
Andamento'::character varying)::text])))"), how can I report an issue in
pg_upgrade?

Thanks, Adrian, Tomas and Tom. It's a specific issue; the database is on
backup until May 1st. Until then, if someone would like to know something
about it, just ask me; I would like to help get the issue resolved.


2013/4/23 Daniel Cristian Cruz 

> 2013/4/23 Adrian Klaver 
>
>> On 04/23/2013 04:23 AM, Daniel Cristian Cruz wrote:
>>
>>> 2013/4/22 Daniel Cristian Cruz
>>>
>>>
>>> query1:
>>> EXPLAIN ANALYZE SELECT ced.id_evento, ced.inicio, ced.termino,
>>> ced.evento, ced.id_eventos IS NOT NULL AS aula_dividida, ac.titulo,
>>> ced.id_tipo_evento, ced.tipo_evento, ac.media_referencia, p.nome,
>>> ef.nome AS nomeEspacoFisico, ( SELECT count ( pre2.presente ) > 0
>>> FROM turma.presenca pre2 WHERE pre2.id_aula = ac.id_aula AND
>>> pre2.id_evento = ac.id_evento AND pre2.id_diario = '64469' ) AS
>>> presenca, ced.id_aula FROM recurso.consulta_evento_diario ced LEFT
>>> JOIN recurso.evento e USING ( id_evento ) LEFT JOIN
>>> recurso.espaco_fisico ef USING ( id_espaco_fisico ) LEFT JOIN
>>> turma.aula_calendario ac USING ( id_aula, id_evento ) LEFT JOIN
>>> recurso.evento_participante ep USING ( id_evento ) LEFT JOIN
>>> senai.pessoa p USING ( id_pessoa ) WHERE id_diario = '64469' AND
>>> ced.id_evento NOT IN ( SELECT ec.id_evento_sobreposto FROM
>>> recurso.evento_conflito ec WHERE ec.id_evento_sobreposto =
>>> ced.id_evento AND ec.ignorado IS NULL ) AND ced.inicio BETWEEN
>>> '2013-04-14 00:00:00' AND '2013-04-20 23:59:59.99' ORDER BY
>>> inicio;
>>>
>>> server 9.1:
>>> http://explain.depesz.com/s/fmM
>>>
>>> server 9.2:
>>> http://explain.depesz.com/s/wXm
>>>
>>> After run this one on server 9.2, RES memory reached 6.5GB, VIRT
>>> 15GB.
>>>
>>>
>>> Since there is no response, is this memory usage normal? The same query
>>> on version 9.1 doesn't use that much memory.
>>>
>>
>> Not sure how it applies but I noticed that a GroupAggregate in 9.1 that
>> took 1.22 secs became a HashAggregate in the 9.2 query and took 12.54
>> secs.
>
>
> I used to read an explain and find something, but this one is huge.
> Unfortunately I'm still working on data migration from the 9.2 to 9.1 and
> didn't get time to read it in detail...
>
> I'm concerned about this because there is only one report like
>>> that. Does someone else have the same pattern when using inherited tables?
>>>
>>
>> Also noticed that in your 9.2 production conf:
>>
>> (no constraint_exclusion set)
>>
>> Does this mean the default of 'partition' was left as is or that the
>> setting was set to 'off'?
>
>
> No, default:
>
> senai=# show constraint_exclusion ;
>  constraint_exclusion
> --
>  partition
> (1 row)
>
> --
> Daniel Cristian Cruz
> クルズ クリスチアン ダニエル
>



-- 
Daniel Cristian Cruz
クルズ クリスチアン ダニエル


Re: [GENERAL] Log messages regarding automatic vacuum and exclusive locks

2013-04-24 Thread jonesd

I'll give using TRUNCATE to clear the tables a try and see what happens.


Dominic Jones


Quoting Sergey Konoplev :


On Tue, Apr 23, 2013 at 8:50 AM,   wrote:

Good morning.  I'm seeing several of the following log messages each morning
(for example, there were five this morning, spaced approximately one minute
apart, with the closest interval between messages being 44 seconds).
They're occurring during a daily cycle of deleting all rows from a table and
then repopulating it from another database; the first message came


If all the rows are deleted from the table every time to repopulate it
later, I would suggest using TRUNCATE table_name; instead of DELETE FROM
table_name. TRUNCATE does not leave any free space in the table, so
autovacuum will have nothing to do with it.


LOG:  automatic vacuum of table "x.y.z": could not (re)acquire exclusive
lock for truncate scan


Hence you will not get these messages, and BTW it will be faster.
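
A sketch of the daily cycle with TRUNCATE (table and file names are
hypothetical; TRUNCATE is transactional in PostgreSQL, so the swap can be
done atomically, though it holds an ACCESS EXCLUSIVE lock while it runs):

BEGIN;
TRUNCATE table_name;
COPY table_name FROM '/path/to/fresh_data.csv' WITH CSV;
COMMIT;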



I'm using default settings for autovacuum, with server version 9.2.4.

Some other smaller tables produce the same messages during the same process,
but apparently less frequently.

Any thoughts on the cause?  Chapters 18 and 23 of the documentation provided
some hints, but nothing conclusive.


Thanks in advance,

Dominic Jones






--
Kind regards,
Sergey Konoplev
Database and Software Consultant

Profile: http://www.linkedin.com/in/grayhemp
Phone: USA +1 (415) 867-9984, Russia +7 (901) 903-0499, +7 (988) 888-1979
Skype: gray-hemp
Jabber: gray...@gmail.com








Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-24 Thread Amador Alvarez
I would also try turning on statistics on the service_id and follower_id
fields and tuning the collection of distinct values for the optimizer.

Cheers,

Amador A.


On Wed, Apr 24, 2013 at 9:04 AM, François Beausoleil wrote:

> Hi!
>
> On 2013-04-17, at 14:15, Jeff Janes wrote:
>
> On Wed, Apr 17, 2013 at 4:26 AM, François Beausoleil wrote:
>
>>
>>
>>  Insert on public.persona_followers  (cost=139261.12..20483497.65
>> rows=6256498 width=16) (actual time=4729255.535..4729255.535 rows=0 loops=1)
>>Buffers: shared hit=33135295 read=4776921
>>->  Subquery Scan on t1  (cost=139261.12..20483497.65 rows=6256498
>> width=16) (actual time=562265.156..578844.999 rows=6819520 loops=1)
>>
>
>
> It looks like 12% of the time is being spent figuring out what rows to
> insert, and 88% actually doing the insertions.
>
> So I think that index maintenance is killing you.  You could try adding a
> sort to your select so that rows are inserted in index order, or inserting
> in batches in which the batches are partitioned by service_id (which is
> almost the same thing as sorting, since service_id is the lead column)
>
>
>
> To close out the thread, the final results are in
> http://explain.depesz.com/s/xOe :
>
> Insert on public.persona_followers  (cost=149905.33..149906.58 rows=100
> width=24) (actual time=19.837..19.837 rows=0 loops=1)
>   Buffers: shared hit=206, local hit=1 read=105
>   ->  Sort  (cost=149905.33..149905.58 rows=100 width=24) (actual
> time=19.534..19.536 rows=6 loops=1)
> Output: persona_followers_import.service_id,
> persona_followers_import.follower_id,
> (min(persona_followers_import.valid_at))
> Sort Key: persona_followers_import.follower_id
> Sort Method: quicksort  Memory: 25kB
> Buffers: shared hit=176, local hit=1 read=105
> ->  HashAggregate  (cost=149901.01..149902.01 rows=100 width=24)
> (actual time=19.514..19.526 rows=6 loops=1)
>   Output: persona_followers_import.service_id,
> persona_followers_import.follower_id, min(persona_followers_import.valid_at)
>   Buffers: shared hit=176, local hit=1 read=105
>   ->  Bitmap Heap Scan on pg_temp_35.persona_followers_import
>  (cost=93051.86..149734.25 rows=22234 width=24) (actual time=14.350..19.505
> rows=6 loops=1)
> Output: persona_followers_import.service_id,
> persona_followers_import.follower_id, persona_followers_import.valid_at
> Recheck Cond: ((persona_followers_import.service_id =
> 362513855) AND (persona_followers_import.follower_id IS NOT NULL))
> Filter: (NOT (hashed SubPlan 1))
> Buffers: shared hit=176, local hit=1 read=105
> ->  Bitmap Index Scan on
> persona_followers_import_service_id  (cost=0.00..1134.32 rows=44469
> width=0) (actual time=1.752..1.752 rows=1 loops=1)
>   Index Cond:
> ((persona_followers_import.service_id = 362513855) AND
> (persona_followers_import.follower_id IS NOT NULL))
>   Buffers: local hit=1 read=40
> SubPlan 1
>   ->  Bitmap Heap Scan on public.persona_followers
>  (cost=661.54..91851.35 rows=24252 width=8) (actual time=2.309..6.400
> rows=14317 loops=1)
> Output: public.persona_followers.follower_id
> Recheck Cond:
> (public.persona_followers.service_id = 362513855)
> Buffers: shared hit=176
> ->  Bitmap Index Scan on
> persona_followers_pkey  (cost=0.00..655.48 rows=24252 width=0) (actual
> time=2.284..2.284 rows=14317 loops=1)
>   Index Cond:
> (public.persona_followers.service_id = 362513855)
>   Buffers: shared hit=88
> Total runtime: 19.917 ms
>
> Runtime is under 20 milliseconds, per imported service_id. I have a few
> thousand such items per day, and that's fine. The final script looks like
> this:
>
> CREATE TEMPORARY TABLE persona_followers_import( service_id bigint,
> follower_id bigint );
> COPY persona_followers_import FROM stdin;
> ...
> \.
>
> CREATE INDEX index_persona_followers_import_on_service_id ON
> persona_followers_import(service_id, follower_id);
>
> service_ids := SELECT DISTINCT service_id FROM persona_followers_import;
> for each service_id in service_ids:
>   EXPLAIN ( ANALYZE, VERBOSE, COSTS, BUFFERS )
>   INSERT INTO persona_followers(service_id, follower_id, valid_at)
> SELECT service_id, follower_id, MIN(valid_at)
> FROM persona_followers_import
> WHERE follower_id IS NOT NULL
>   AND follower_id NOT IN (SELECT follower_id FROM persona_followers
> WHERE service_id = :service_id)
>   AND service_id = :service_id
> GROUP BY service_id, follower_id
> ORDER BY follower_id
>
> This seems to give me the best possible throughput. I was able to import
> days of data in an hour, compared to hours of

Re: [GENERAL] Most efficient way to insert without duplicates

2013-04-24 Thread François Beausoleil
Hi!

On 2013-04-17, at 14:15, Jeff Janes wrote:

> On Wed, Apr 17, 2013 at 4:26 AM, François Beausoleil wrote:
> 
> 
>  Insert on public.persona_followers  (cost=139261.12..20483497.65 
> rows=6256498 width=16) (actual time=4729255.535..4729255.535 rows=0 loops=1)
>Buffers: shared hit=33135295 read=4776921
>->  Subquery Scan on t1  (cost=139261.12..20483497.65 rows=6256498 
> width=16) (actual time=562265.156..578844.999 rows=6819520 loops=1)
> 
> 
> It looks like 12% of the time is being spent figuring out what rows to 
> insert, and 88% actually doing the insertions.
> 
> So I think that index maintenance is killing you.  You could try adding a 
> sort to your select so that rows are inserted in index order, or inserting in 
> batches in which the batches are partitioned by service_id (which is almost 
> the same thing as sorting, since service_id is the lead column)


To close out the thread, the final results are in 
http://explain.depesz.com/s/xOe :

Insert on public.persona_followers  (cost=149905.33..149906.58 rows=100 
width=24) (actual time=19.837..19.837 rows=0 loops=1)
  Buffers: shared hit=206, local hit=1 read=105
  ->  Sort  (cost=149905.33..149905.58 rows=100 width=24) (actual 
time=19.534..19.536 rows=6 loops=1)
Output: persona_followers_import.service_id, 
persona_followers_import.follower_id, (min(persona_followers_import.valid_at))
Sort Key: persona_followers_import.follower_id
Sort Method: quicksort  Memory: 25kB
Buffers: shared hit=176, local hit=1 read=105
->  HashAggregate  (cost=149901.01..149902.01 rows=100 width=24) 
(actual time=19.514..19.526 rows=6 loops=1)
  Output: persona_followers_import.service_id, 
persona_followers_import.follower_id, min(persona_followers_import.valid_at)
  Buffers: shared hit=176, local hit=1 read=105
  ->  Bitmap Heap Scan on pg_temp_35.persona_followers_import  
(cost=93051.86..149734.25 rows=22234 width=24) (actual time=14.350..19.505 
rows=6 loops=1)
Output: persona_followers_import.service_id, 
persona_followers_import.follower_id, persona_followers_import.valid_at
Recheck Cond: ((persona_followers_import.service_id = 
362513855) AND (persona_followers_import.follower_id IS NOT NULL))
Filter: (NOT (hashed SubPlan 1))
Buffers: shared hit=176, local hit=1 read=105
->  Bitmap Index Scan on 
persona_followers_import_service_id  (cost=0.00..1134.32 rows=44469 width=0) 
(actual time=1.752..1.752 rows=1 loops=1)
  Index Cond: ((persona_followers_import.service_id = 
362513855) AND (persona_followers_import.follower_id IS NOT NULL))
  Buffers: local hit=1 read=40
SubPlan 1
  ->  Bitmap Heap Scan on public.persona_followers  
(cost=661.54..91851.35 rows=24252 width=8) (actual time=2.309..6.400 rows=14317 
loops=1)
Output: public.persona_followers.follower_id
Recheck Cond: (public.persona_followers.service_id 
= 362513855)
Buffers: shared hit=176
->  Bitmap Index Scan on persona_followers_pkey  
(cost=0.00..655.48 rows=24252 width=0) (actual time=2.284..2.284 rows=14317 
loops=1)
  Index Cond: 
(public.persona_followers.service_id = 362513855)
  Buffers: shared hit=88
Total runtime: 19.917 ms

Runtime is under 20 milliseconds, per imported service_id. I have a few 
thousand such items per day, and that's fine. The final script looks like this:

CREATE TEMPORARY TABLE persona_followers_import( service_id bigint, follower_id 
bigint );
COPY persona_followers_import FROM stdin;
...
\.

CREATE INDEX index_persona_followers_import_on_service_id ON 
persona_followers_import(service_id, follower_id);

service_ids := SELECT DISTINCT service_id FROM persona_followers_import;
for each service_id in service_ids:
  EXPLAIN ( ANALYZE, VERBOSE, COSTS, BUFFERS )
  INSERT INTO persona_followers(service_id, follower_id, valid_at)
SELECT service_id, follower_id, MIN(valid_at)
FROM persona_followers_import
WHERE follower_id IS NOT NULL
  AND follower_id NOT IN (SELECT follower_id FROM persona_followers WHERE 
service_id = :service_id)
  AND service_id = :service_id
GROUP BY service_id, follower_id
ORDER BY follower_id

This seems to give me the best possible throughput. I was able to import days 
of data in an hour, compared to hours of work for one day of data.
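
For completeness, a sketch of that driver loop done server-side as a
PL/pgSQL DO block (a sketch only, not the script actually run; it assumes
the import table also carries a valid_at column, as the plans above show,
and it drops the per-statement EXPLAIN):

DO $$
DECLARE
  sid bigint;
BEGIN
  -- one insert per distinct service_id, matching the pseudocode above
  FOR sid IN SELECT DISTINCT service_id FROM persona_followers_import LOOP
    INSERT INTO persona_followers(service_id, follower_id, valid_at)
      SELECT service_id, follower_id, MIN(valid_at)
      FROM persona_followers_import
      WHERE follower_id IS NOT NULL
        AND follower_id NOT IN (SELECT follower_id FROM persona_followers
                                WHERE service_id = sid)
        AND service_id = sid
      GROUP BY service_id, follower_id
      ORDER BY follower_id;
  END LOOP;
END;
$$;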

Thanks for all suggestions, and PostgreSQL rocks!
François Beausoleil



[GENERAL] CLUSTER, REINDEX and VACUUM on batch ops

2013-04-24 Thread François Beausoleil
Hi all!

I have a partitioned table with millions of rows per weekly partition. I am 
adding new fields, with null values and no default values to ensure I had a 
reasonable migration time. All downstream code knows how to work with null 
fields.

Presently, I'm migrating each partition individually to add NOT NULL, set a 
default value and update the table to have correct values. Essentially, I'm 
doing this:

ALTER TABLE parent ADD COLUMN new_field int; -- adds the field to all child 
tables - runs quickly

-- the bulk of the data transfer
for each partition in partitions:
  BEGIN;
  UPDATE partition SET new_field = 0;
  ALTER TABLE partition
  ALTER COLUMN new_field SET NOT NULL
, ALTER COLUMN new_field SET DEFAULT 0;
  COMMIT;

  CLUSTER partition USING partition_pkey;
  REINDEX TABLE partition;
  VACUUM ANALYZE partition;
done

After I've clustered the table, must I reindex and vacuum as well? It is 
unclear to me if clustering a table reindexes or not: the docs at 
http://www.postgresql.org/docs/current/static/sql-cluster.html are silent on 
the matter, but do mention that an ANALYZE is in order.

Thanks!
François Beausoleil



Re: [GENERAL] run COPY as user other than postgres

2013-04-24 Thread Kirk Wythers

On Apr 24, 2013, at 6:14 AM, Bill Moran  wrote:
>
> Write your own client that uses the copy interface to
> load a file from wherever and send it to the server.
> 
> Or just use the one built in to psql, as Jasen suggested.
> 

I am using "copy to" to write data from the db out to csv files. The psql 
suggestion of using \copy worked like a charm. 
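
For anyone finding this later, a sketch of the \copy form (the query and
path here are placeholders, not the real ones):

\copy (SELECT * FROM some_table WHERE created_at >= '2013-01-01') TO '/home/me/out.csv' CSV HEADER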







Re: [GENERAL] DISTINCT ON changes sort order

2013-04-24 Thread Kevin Grittner
Alexander Reichstadt  wrote:

> SELECT
>   DISTINCT ON (msgid)
> msgid
>   FROM (
>  SELECT refid_messages as msgid
>    FROM messagehistorywithcontent
>    WHERE 1=1
>  AND 
>(lower(ARRAY_TO_STRING(ARRAY[login,firstname,lastname,content,msgtitle], ' ')) 
>LIKE '%gg%')
>    ORDER BY messagekind DESC
>    ) as foo

> [does not return results in the order of in subquery]

> Which is wrong.

No, your query is underspecified.  The subquery in the FROM clause
is an input relation to the top-level query, which then does a form
of aggregation on that.  There are no guarantees that a SELECT
statement will return values in an order based on any of its input
relations, unless you specify it that way.

All the top-level SELECT sees is a list of msgid values, and you
are asking it to eliminate duplicates to return a set of distinct
values.  It is doing exactly that in the way which it estimates
will be the fastest.  If you actually want something different from
that, you must specify it.

I'm not any more sure than the planner what it is that you *really*
want, but maybe this?:

SELECT msgid
  FROM (
 SELECT distinct messagekind, refid_messages as msgid
   FROM messagehistorywithcontent
   WHERE 
(lower(ARRAY_TO_STRING(ARRAY[login,firstname,lastname,content,msgtitle], ' ')) 
LIKE '%gg%')
   ) as foo
  ORDER BY
    messagekind DESC,
    msgid;

If that doesn't give you what you're after, we need a better
explanation of what you want.  A self-contained test case, where
you create a table and load it with data and show a query, would be
best.

-- 
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [GENERAL] DISTINCT ON changes sort order

2013-04-24 Thread Shaun Thomas

On 04/24/2013 09:03 AM, Alexander Reichstadt wrote:


The order is correct. Now from the outer SELECT I would expect then to get:
53
29
46


Please re-read the manual on DISTINCT ON.

"SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of 
each set of rows where the given expressions evaluate to equal. The 
DISTINCT ON expressions are interpreted using the same rules as for 
ORDER BY (see above). Note that the "first row" of each set is 
unpredictable unless ORDER BY is used to ensure that the desired row 
appears first."


You're running into the "unpredictable" part in that stanza. The 
distinct on may change the query plan, which will result in a different 
ordering of tuples than your inner clause, unless you used an 
optimization fence.


Use an order-by clause. It's good practice to choose the column you want 
so that it always comes first, and eliminates other candidates.


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email




[GENERAL] DISTINCT ON changes sort order

2013-04-24 Thread Alexander Reichstadt
Hi,

following a query:
SELECT DISTINCT ON (msgid) msgid FROM (SELECT refid_messages as msgid FROM 
messagehistorywithcontent WHERE 1=1 AND  
(lower(ARRAY_TO_STRING(ARRAY[login,firstname,lastname,content,msgtitle], ' ')) 
LIKE '%gg%') ORDER BY messagekind DESC) as foo

This query rearranges the sort order though.

When I only execute the inner SELECT I get this:

53
29
46
46
51
52
53
29
46
47
48
48
49
49
49
49
49
49
49
49
49
49
49
49
49
49
49
49
49
50
49
49
46
46
46
46
43
43
43
43
43
4
(Ignore that the last entry is 4, it's a copy-paste error and should be 43 as 
well. Anyway.)


The order is correct. Now from the outer SELECT I would expect then to get:
53
29
46
51
52
53
46
.
.
.
.
43

But this is not the case. 43 is the id of the only record with messagekind 'AM' 
where all other have messagekind 'PD'. Yet the order in which the full query 
returns the results is:
29
43
46
47
.
.
.

Which is wrong. I can't figure out why this is wrong, but from toying around I 
found that depending on whether I use DISTINCT msgid or DISTINCT ON (msgid)
msgid I get different results. Still, both results are wrong.

The 'must sort by what you distinct on' behavior gives me a total headache. 
What does one have to do to sort by a criterion and at the same time not use 
that criterion in the distinct clause?

Thank you
A. Reichstadt



Re: [GENERAL] pg_stop_backup running for 2h10m?

2013-04-24 Thread François Beausoleil

On 2013-04-24, at 09:15, hubert depesz lubaczewski wrote:

> On Tue, Apr 23, 2013 at 03:08:52PM -0400, François Beausoleil wrote:
>> I used omnipitr to launch a base backup, but I fumbled a couple of
>> things, so I Ctrl+C'd *once* the console where I had
>> omnipitr-backup-master running. omnipitr-backup-master correctly
>> launched pg_stop_backup, but pg_stop_backup has been active for 2h10
>> minutes, as reported by pg_stat_activity.
> 
> Most likely your archive_command is not doing its job. Why - hard to
> tell without knowing more about the setup, but that's the direction you
> should be looking in.

Yes, after looking at more logs, I found out I had a permission issue. I ran
backup-master as root, and omnipitr-archive wasn't able to write to the local
backup directory. The docs should mention who should run the tool, as I was
confused.

Thanks!
François



Re: [GENERAL] Regarding template1 database

2013-04-24 Thread Adrian Klaver

On 04/24/2013 04:57 AM, S H wrote:

 $ sudo -u postgres psql template1
 psql (9.2.4)
 Type "help" for help.

 template1=# \d
 No relations found.

I am getting the output above. I am not sure what is making vacuuming
take a long time.


If you are getting 'No relations found' then there is nothing in template1.

From psql try:

\l+

This will show a list of databases in the cluster and their sizes.



- What is the recommended vacuuming interval for the wraparound issue on
template1: once a month/week/year?

- Is it required to run ANALYZE on the template1 and postgres databases
(the system databases)?

Since it is already in production, I can at most change the interval and
vacuum options and nothing more...



--
Adrian Klaver
adrian.kla...@gmail.com




Re: [GENERAL] pgdump error "Could not open file pg_clog/0B8E: No such file or directory"

2013-04-24 Thread Adrian Klaver

On 04/24/2013 03:35 AM, jesse.wat...@gmail.com wrote:

I am receiving an error when running a pg_dump. These are older legacy
systems, and upgrading them is not in the plan.
Any help will be appreciated.


CentOS 5.3 (64bit)
psql (PostgreSQL) 8.3.11

full error message:
  pg_dump: SQL command failed
  pg_dump: Error message from server: ERROR:  could not access status
of transaction 3101708884
  DETAIL:  Could not open file pg_clog/0B8E: No such file or directory.


Does the above file actually exist in the pg_clog directory?


  pg_dump: The command was: COPY blob_store.blobs (blob_id, mime_type,
binary_data, create_ts) TO stdout;

Command used, pretty standard
pg_dump -Fc  -f 


TIA,

  Jesse Waters





--
Adrian Klaver
adrian.kla...@gmail.com




Re: [GENERAL] pg_stop_backup running for 2h10m?

2013-04-24 Thread hubert depesz lubaczewski
On Tue, Apr 23, 2013 at 03:08:52PM -0400, François Beausoleil wrote:
> I used omnipitr to launch a base backup, but I fumbled a couple of
> things, so I Ctrl+C'd *once* the console where I had
> omnipitr-backup-master running. omnipitr-backup-master correctly
> launched pg_stop_backup, but pg_stop_backup has been active for 2h10
> minutes, as reported by pg_stat_activity.

Most likely your archive_command is not doing its job. Why - hard to
tell without knowing more about the setup, but that's the direction you
should be looking in.
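
A quick way to confirm that (a sketch, assuming a standard 9.x data directory 
layout): WAL segments waiting for archive_command show up as .ready status 
files, and a growing pile of them means the command keeps failing:

ls $PGDATA/pg_xlog/archive_status/ | grep -c '\.ready'

The server log should then contain the archive_command failures themselves.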

Best regards,

depesz

-- 
The best thing about modern society is how easy it is to avoid contact with it.
 http://depesz.com/




Re: [GENERAL] Regarding template1 database

2013-04-24 Thread S H
$ sudo -u postgres psql template1
psql (9.2.4)
Type "help" for help.

template1=# \d
No relations found.

I am getting the output above. I am not sure what is making the vacuum take 
so long.
- What is the recommended vacuum frequency for the wraparound issue on 
template1: once a month, week, or year?
- Is it required to run ANALYZE on template1 and the postgres database 
(system databases)?
Since it is already in production, I can at most change the interval and 
vacuum options, nothing more...

[GENERAL] Confusing error message.

2013-04-24 Thread Dmitriy Igrishin
Hey,

It seems to me that this is confusing:

dmitigr=> create schema test;
CREATE SCHEMA
dmitigr=> create table test.test();
CREATE TABLE
dmitigr=> table "test.test";
ERROR:  relation "test.test" does not exist
LINE 1: table "test.test";
  ^
dmitigr=> table test.test1;
ERROR:  relation "test.test1" does not exist
LINE 1: table test.test1;

Shouldn't the first error look like this:

ERROR:  relation ""test.test"" does not exist
LINE 1: table "test.test";
  ^
(with quoted relation name *as specified in the query*)

I've spent some time finding a bug in an application that performed a query 
with the entire schema-qualified relation name quoted as a single identifier 
(i.e. "schema.relation" instead of "schema"."relation" or just 
schema.relation), and the error handler printed a confusing message to 
the log.
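
For anyone hitting the same thing, the rule at work, restated as a sketch:

table "test.test";     -- ONE identifier whose name contains a dot;
                       -- looked up via search_path, hence "does not exist"
table "test"."test";   -- each part quoted separately: schema test, relation test
table test.test;       -- unquoted equivalent of the line above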

Thoughts?

-- 
// Dmitriy.


Re: [GENERAL] run COPY as user other than postgres

2013-04-24 Thread Bill Moran

> On 2013-04-23, Kirk Wythers  wrote:
> > I would like to run the COPY command as a user other than "postgres". I 
> > find it a bit of a pain (or at least requiring an extra step or two) to 
> > have the postgres user own the files that I am creating with COPY TO. Here 
> > is a simple example where the location '/some/path/to/file/file.csv' is 
> > owned by another user and it would be very spiffy if I could run the COPY 
> > TO as that user. Any ideas? 

Write your own client that uses the copy interface to
load a file from wherever and send it to the server.

Or just use the one built in to psql, as Jasen suggested.

-- 
Bill Moran 




Re: [GENERAL] run COPY as user other than postgres

2013-04-24 Thread Jasen Betts
On 2013-04-23, Kirk Wythers  wrote:
> I would like to run the COPY command as a user other than "postgres". I find 
> it a bit of a pain (or at least requiring an extra step or two) to have the 
> postgres user own the files that I am creating with COPY TO. Here is a simple 
> example where the location '/some/path/to/file/file.csv' is owned by another 
> user and it would be very spiffy if I could run the COPY TO as that user. Any 
> ideas? 
>
>
>
> COPY (
>   SELECT * FROM
>   some_table
>   WHERE
>   2012 = EXTRACT (YEAR FROM some_column) --AND value IS NOT NULL
> ) 
>   TO '/some/path/to/file/file.csv' WITH CSV HEADER;

Can you use the \copy macro in psql?
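
For the archives, a sketch reusing the original query: \copy executes COPY on 
the server but reads and writes the file on the client side, so the output 
file ends up owned by whoever runs psql. The whole command has to stay on one 
physical line, since psql reads \copy up to the end of the line:

\copy (SELECT * FROM some_table WHERE 2012 = EXTRACT(YEAR FROM some_column)) TO '/some/path/to/file/file.csv' CSV HEADER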



-- 
⚂⚃ 100% natural





[GENERAL] pgdump error "Could not open file pg_clog/0B8E: No such file or directory"

2013-04-24 Thread jesse . waters
I am receiving an error when running pg_dump. These are older legacy
systems and upgrading them is not in the plan.
Any help will be appreciated.


CentOS 5.3 (64bit)
psql (PostgreSQL) 8.3.11

full error message:
 pg_dump: SQL command failed
 pg_dump: Error message from server: ERROR:  could not access status
of transaction 3101708884
 DETAIL:  Could not open file pg_clog/0B8E: No such file or directory.
 pg_dump: The command was: COPY blob_store.blobs (blob_id, mime_type,
binary_data, create_ts) TO stdout;

Command used, pretty standard
pg_dump -Fc  -f 


TIA,

 Jesse Waters




Re: [GENERAL] Regression test fails v9.2.4

2013-04-24 Thread Manning John
>> [regression tests have different plans or row orderings]

>> It seems that the problem only occurs when configuring the make
>> with these settings :
>> 
>> --with-libraries=/lib64 --with-blocksize=2 --with-wal-blocksize=2

>> is this problem common, i.e. the expected results files need to
>> be changed ... ?

>I don't find it too surprising that a different page size could
>cause different plans to be chosen or different row orderings. 
>After verifying that the differences are benign, you could add
>alternative "expected" files for your build environment.

>I can't help being a little curious why you are overriding these
>defaults.

Thanks Kevin, we made the changes for query performance reasons. Our schema is 
such that the changes above make enough of an impact to make it worthwhile.

I think we're now looking at a two-phase validation process: an initial 
regression test on any new version we take, then a further check using our 
settings. I guess it's actually more useful that we must now inspect the 
results, since it might help us squeeze even more performance out of 
PostgreSQL.
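
For reference, registering an alternative expected file is just a numbered 
copy next to the stock one: pg_regress accepts <test>.out, <test>_1.out, up 
to <test>_9.out, and the test passes if any of them matches. A sketch (the 
test name here is only an example):

cd src/test/regress
diff expected/select_distinct.out results/select_distinct.out   # confirm the differences are benign
cp results/select_distinct.out expected/select_distinct_1.out   # register the variant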

All the best, John







Re: [GENERAL] is there a way to deliver an array over column from a query window?

2013-04-24 Thread Rafał Pietrak

On 03/24/2013 12:11 PM, Rafał Pietrak wrote:

On 03/24/2013 12:06 PM, Misa Simic wrote:

maybe,

SELECT DISTINCT issuer,amount, array_agg(REFERENCE) over (partition 
by invoice_nr) from invoices;


Right, thanks. (And the first thing I did was read the docs on 
array_agg(); what stress makes of people :(




Actually, I have a problem with that (which I hadn't noticed earlier 
because my data doesn't have too many "duplicates" that trigger it). 
The problem is that:

--
SELECT DISTINCT a, b, c, array_agg(k.d) OVER (PARTITION BY k.c ) FROM 
testy k where k.e <> 'email' and k.c='1035049' ;

  a   |b |c|   array_agg
--+--+-+---
 1035 | 10410053 | 1035049 | {9902031328529,5951948640868}
---

is _almost_ fine. But I actually need control over the order in 
which the array gathers its values. So I try:


SELECT DISTINCT a, b, c, array_agg(k.d) OVER (PARTITION BY k.c ORDER BY 
k.e)  FROM testy k where k.e <> 'email' and k.c='1035049' ;

  a   |b |c|   array_agg
--+--+-+---
 1035 | 10410053 | 1035049 | {5951948640868}
 1035 | 10410053 | 1035049 | {5951948640868,9902031328529}
(2 rows)
--

And this is not at all what I expected - the aggregate function 
returned different values over the selected partition.


I understand that this behavior (the aggregate's return value changing from 
row to row) exists so that sum() and similar aggregate functions can return 
accumulating/averaged/etc. values, as in the example in the PostgreSQL 
documentation ( 
http://www.postgresql.org/docs/9.1/static/tutorial-window.html)


But array_agg() is significantly different from other aggregate 
functions - it keeps every value passed to it; under those 
circumstances, is it reasonable for it to copy that behavior (of 
OVER (PARTITION BY ... ORDER BY ...))?
A particular value relevant to a particular row (when SELECTing without 
DISTINCT) can be retrieved by using the RANK() function as an index into the 
resulting array.


But if (unfortunately) this functionality has to stay: can somebody 
please help me cook up SQL that returns the same array_agg() value 
over the entire partition, while letting me control the order of the 
aggregated values based on the order of column "E"? (Two common 
answers are sketched after the table below.)


My table for the showcase was:
---
SELECT * FROM testy;
  a   |b |c|  d   |   e
--+--+-+--+---
 1035 | 10410053 | 1035049 | 9902031328529| tel
 1035 | 10410053 | 1035049 | 5291286...@gmail.com | email
 1035 | 10410053 | 1035049 | 5951948640868| tel2
(3 rows)
--
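
For the record, two standard ways out, sketched against the testy table 
above. Either give the window an explicit frame, since the default frame 
with ORDER BY only reaches up to the current row (which is exactly what 
produced the growing arrays above), or skip the window entirely and use the 
aggregate's own ORDER BY with GROUP BY (available since 9.0):

-- 1. window with an explicit whole-partition frame
SELECT DISTINCT a, b, c,
       array_agg(k.d) OVER (PARTITION BY k.c ORDER BY k.e
                            ROWS BETWEEN UNBOUNDED PRECEDING
                                     AND UNBOUNDED FOLLOWING)
FROM testy k WHERE k.e <> 'email' AND k.c = '1035049';

-- 2. plain aggregate with ordered input
SELECT a, b, c, array_agg(d ORDER BY e)
FROM testy WHERE e <> 'email'
GROUP BY a, b, c;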

thx

-R


[GENERAL] session_replication_role 'replica' behavior

2013-04-24 Thread Manos Tsahakis
Hello all,

In our application we set session_replication_role TO 'replica' in
certain situations so that triggers will not fire on a table during DML
operations. However, we observed that with session_replication_role set
to 'replica', referential integrity constraints do not fire on a table
either.

A simple example is given below:

dynacom=# create table parent (id serial primary key, name text not null);

dynacom=# create table child (id serial primary key, name text not null,pid
int NOT NULL REFERENCES parent(id) ON DELETE CASCADE);

dynacom=# insert into parent (name) values ('test 1');
INSERT 0 1

dynacom=# insert into parent (name) values ('test 2');
INSERT 0 1

dynacom=# insert into child (name,pid) values ('test kid2',2);
INSERT 0 1
dynacom=# begin ;
BEGIN
dynacom=# set session_replication_role TO 'replica';
SET
dynacom=# delete from parent where id=2;
DELETE 1
dynacom=# commit ;
COMMIT

dynacom=# select * from child;
 id |   name| pid
+---+-
  2 | test kid2 |   2
(1 row)

dynacom=# select * from parent;
 id | name
+--
(0 rows)

So we are left, basically, with an inconsistent database.

1. The 9.2 documentation 
(http://www.postgresql.org/docs/9.2/static/sql-altertable.html), in the 
"DISABLE/ENABLE [ REPLICA | ALWAYS ] TRIGGER" section, distinguishes 
between USER (non-system-constraint) triggers and ALL triggers, but does not
state that merely ENABLEd system (non-user) constraint triggers will
not fire when session_replication_role = replica. Shouldn't non-user
triggers *not* be affected by session_replication_role?

2. Is there any way to just find the name of the FK constraint trigger and
convert it to
ENABLE ALWAYS?
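
On (2), the FK constraint triggers are internal triggers on the referencing 
and referenced tables, so they can be listed from pg_trigger and flipped to 
ALWAYS. A sketch against the example tables; the trigger names below are 
illustrative only, since the numeric suffix is an OID and the exact names 
vary by version and database:

SELECT tgname
FROM pg_trigger
WHERE tgrelid IN ('parent'::regclass, 'child'::regclass)
  AND tgisinternal;   -- constraint triggers are internal

-- then, substituting the names actually returned:
ALTER TABLE child  ENABLE ALWAYS TRIGGER "RI_ConstraintTrigger_c_16402";
ALTER TABLE parent ENABLE ALWAYS TRIGGER "RI_ConstraintTrigger_a_16400";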

For the above test we used PostgreSQL 9.2; currently we are running
PostgreSQL 9.0 in production.

Kind Regards,
manos