Re: [GENERAL] pg_sample

2016-10-18 Thread Adrian Klaver

On 10/18/2016 08:15 PM, Charles Clavadetscher wrote:

Hello

On 10/19/2016 04:58 AM, Greg Sabino Mullane wrote:


-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


Patrick B  writes:
...

However, this new database test server doesn't need to have all the data. I
would like to have only the first 100 rows (for example) of each table in my
database.

...

This should do what you ask.

If the order does not matter, leave out the ORDER BY.

This assumes everything of interest is in the public schema.

$ createdb testdb
$ pg_dump realdb --schema-only | psql -q testdb
$ psql realdb

psql> \o dump.some.rows.sh
psql> select format($$psql realdb -c 'COPY (select * from %I order by 1 limit %s) TO STDOUT' | psql testdb -c 'COPY %I FROM STDIN' $$, table_name, 100, table_name)
        from information_schema.tables
       where table_schema = 'public' and table_type = 'BASE TABLE';
psql> \q

$ sh dump.some.rows.sh


I may be overlooking something, but what about dependencies between
tables, sequences, indexes, etc.? If one takes the first 100 rows of a
referenced table, there is no guarantee that the first 100 rows of the
referencing table won't contain foreign keys pointing at rows that were
never copied.


Well there is:

https://github.com/18F/rdbms-subsetter

That still does not guarantee that the rows selected cover your test 
cases though.
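
If foreign keys are the only blocker, one hedged workaround is to suppress
trigger firing while loading the sampled rows, since PostgreSQL enforces
foreign keys through system triggers. This needs superuser and simply leaves
any orphaned rows in place. A sketch, with child_table as a placeholder:

$ psql testdb -c 'ALTER TABLE child_table DISABLE TRIGGER ALL'
$ psql realdb -c 'COPY (select * from child_table limit 100) TO STDOUT' | psql testdb -c 'COPY child_table FROM STDIN'
$ psql testdb -c 'ALTER TABLE child_table ENABLE TRIGGER ALL'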




Regards
Charles



- --
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201610182256
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAlgG4NkACgkQvJuQZxSWSsge4ACePhBOBtBFnGNxXt5qpY7X+w3o
d04AoKTzAgxcaqy8qfIE0LPuzG9x0KIU
=sS+m
-END PGP SIGNATURE-









--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_sample

2016-10-18 Thread Charles Clavadetscher

Hello

On 10/19/2016 04:58 AM, Greg Sabino Mullane wrote:


-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


Patrick B  writes:
...

However, this new database test server doesn't need to have all the data. I
would like to have only the first 100 rows (for example) of each table in my
database.

...

This should do what you ask.

If the order does not matter, leave out the ORDER BY.

This assumes everything of interest is in the public schema.

$ createdb testdb
$ pg_dump realdb --schema-only | psql -q testdb
$ psql realdb

psql> \o dump.some.rows.sh
psql> select format($$psql realdb -c 'COPY (select * from %I order by 1 limit %s) TO STDOUT' | psql testdb -c 'COPY %I FROM STDIN' $$, table_name, 100, table_name)
        from information_schema.tables
       where table_schema = 'public' and table_type = 'BASE TABLE';
psql> \q

$ sh dump.some.rows.sh


I may be overlooking something, but what about dependencies between
tables, sequences, indexes, etc.? If one takes the first 100 rows of a
referenced table, there is no guarantee that the first 100 rows of the
referencing table won't contain foreign keys pointing at rows that were
never copied.


Regards
Charles



- --
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201610182256
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAlgG4NkACgkQvJuQZxSWSsge4ACePhBOBtBFnGNxXt5qpY7X+w3o
d04AoKTzAgxcaqy8qfIE0LPuzG9x0KIU
=sS+m
-END PGP SIGNATURE-






--
Swiss PostgreSQL Users Group
c/o Charles Clavadetscher
Treasurer
Motorenstrasse 18
CH – 8005 Zürich

http://www.swisspug.org

+---+
|     __  ___   |
|  /)/  \/   \  |
| ( / ___\) |
|  \(/ o)  ( o)   ) |
|   \_  (_  )   \ ) _/  |
| \  /\_/\)/|
|  \/ |
|   _|  |   |
|   \|_/|
|   |
| PostgreSQL 1996-2016  |
|  20 Years of Success  |
|   |
+---+


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_sample

2016-10-18 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


Patrick B  writes:
...
> However, this new database test server doesn't need to have all the data. I
> would like to have only the first 100 rows (for example) of each table in my
> database.
...

This should do what you ask.

If the order does not matter, leave out the ORDER BY.

This assumes everything of interest is in the public schema.

$ createdb testdb
$ pg_dump realdb --schema-only | psql -q testdb
$ psql realdb

psql> \o dump.some.rows.sh
psql> select format($$psql realdb -c 'COPY (select * from %I order by 1 limit %s) TO STDOUT' | psql testdb -c 'COPY %I FROM STDIN' $$, table_name, 100, table_name)
        from information_schema.tables
       where table_schema = 'public' and table_type = 'BASE TABLE';
psql> \q

$ sh dump.some.rows.sh
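
One caveat with the \o step: by default psql writes aligned output with a
column header and a row-count footer, which would land inside
dump.some.rows.sh and break it as a shell script. A hedged addition (not part
of the original recipe) is to switch to unaligned, tuples-only output first:

psql> \a
psql> \t on
psql> \o dump.some.rows.sh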

- -- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201610182256
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAlgG4NkACgkQvJuQZxSWSsge4ACePhBOBtBFnGNxXt5qpY7X+w3o
d04AoKTzAgxcaqy8qfIE0LPuzG9x0KIU
=sS+m
-END PGP SIGNATURE-




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_sample

2016-10-18 Thread Melvin Davidson
On Tue, Oct 18, 2016 at 10:21 PM, Adrian Klaver wrote:

> On 10/18/2016 06:30 PM, Patrick B wrote:
>
>>
>>
>> 2016-10-19 13:39 GMT+13:00 Michael Paquier:
>>
>> On Wed, Oct 19, 2016 at 9:24 AM, Patrick B wrote:
>> > However, this new database test server doesn't need to have all the
>> > data. I would like to have only the first 100 rows (for example) of
>> > each table in my database.
>> >
>> > I'm using pg_sample to do that, but unfortunately it doesn't work
>> > well. It doesn't get the first 100 rows; it gets 100 random rows.
>>
>> Why aren't 100 random rows enough to fulfill what you are looking for?
>> What you are trying here is to test the server with some sample data,
>> no? In this case, having the first 100 rows, or a set of random ones
>> should not matter much (never tried pg_sample to be honest).
>> --
>> Michael
>>
>>
>>
>> Actually it does matter because there is some essential data that has to
>> be in there so the code can work.
>>
>
> Well, random does not know "essential"; it is, after all, random. If you
> want to test specific cases then you will need to build appropriate data sets.
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

The following query should generate statements you can use to get the first
100 rows of every table.
You may need to tweak a bit as order is not guaranteed.

SELECT 'COPY (SELECT * FROM ' || quote_ident(n.nspname) || '.' || quote_ident(c.relname)
|| ' LIMIT 100) TO '''
--|| 'C:\temp\'
|| '/tmp/'
|| quote_ident(n.nspname) || '_' || quote_ident(c.relname) || '.csv'''
|| ' WITH CSV HEADER FORCE QUOTE *;'
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE c.relkind = 'r'
   AND c.relname NOT LIKE 'pg_%'
   AND c.relname NOT LIKE 'sql_%';
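
As a hedged aside: on psql 9.6 you can skip the copy-paste step by ending the
generating query with \gexec instead of a semicolon, which runs each result
row as a statement. Note that server-side COPY ... TO a file also requires
superuser, and the server must be able to write to the target directory:

psql> SELECT 'COPY (SELECT * FROM ' || ...   -- the query above, without the final ';'
psql> \gexec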

-- 
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] pg_sample

2016-10-18 Thread Adrian Klaver

On 10/18/2016 06:30 PM, Patrick B wrote:



2016-10-19 13:39 GMT+13:00 Michael Paquier:

On Wed, Oct 19, 2016 at 9:24 AM, Patrick B wrote:
> However, this new database test server doesn't need to have all the data.
> I would like to have only the first 100 rows (for example) of each table
> in my database.
>
> I'm using pg_sample to do that, but unfortunately it doesn't work well.
> It doesn't get the first 100 rows; it gets 100 random rows.

Why aren't 100 random rows enough to fulfill what you are looking for?
What you are trying here is to test the server with some sample data,
no? In this case, having the first 100 rows, or a set of random ones
should not matter much (never tried pg_sample to be honest).
--
Michael



Actually it does matter because there is some essential data that has to
be in there so the code can work.


Well, random does not know "essential"; it is, after all, random. If you want
to test specific cases then you will need to build appropriate data sets.



--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_sample

2016-10-18 Thread Patrick B
2016-10-19 13:39 GMT+13:00 Michael Paquier :

> On Wed, Oct 19, 2016 at 9:24 AM, Patrick B wrote:
> > However, this new database test server doesn't need to have all the
> > data. I would like to have only the first 100 rows (for example) of
> > each table in my database.
> >
> > I'm using pg_sample to do that, but unfortunately it doesn't work well.
> > It doesn't get the first 100 rows; it gets 100 random rows.
>
> Why aren't 100 random rows enough to fulfill what you are looking for?
> What you are trying here is to test the server with some sample data,
> no? In this case, having the first 100 rows, or a set of random ones
> should not matter much (never tried pg_sample to be honest).
> --
> Michael
>


Actually it does matter because there is some essential data that has to be
in there so the code can work.


Re: [GENERAL] pg_sample

2016-10-18 Thread Michael Paquier
On Wed, Oct 19, 2016 at 9:24 AM, Patrick B  wrote:
> However, this new database test server doesn't need to have all the data. I
> would like to have only the first 100 rows (for example) of each table in my
> database.
>
> I'm using pg_sample to do that, but unfortunately it doesn't work well.
> It doesn't get the first 100 rows; it gets 100 random rows.

Why aren't 100 random rows enough to fulfill what you are looking for?
What you are trying here is to test the server with some sample data,
no? In this case, having the first 100 rows, or a set of random ones
should not matter much (never tried pg_sample to be honest).
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] json rendering without pretty option (compact)

2016-10-18 Thread Michael Paquier
On Wed, Oct 19, 2016 at 2:00 AM, Nicolas Paris  wrote:
> I want to minimize PostgreSQL JSON size when I fetch it.
> I translate columnar tables to JSON through json_build_object/array or even
> row_to_json.
>
> While row_to_json does have a "pretty_bool" option, the others do not. Each
> JSON object/array I build contains spaces.
>
> Is there a workaround ?

Not any direct way that I know of :( If you have a lot of spaces in a
JSON blob between each object, it may be worth casting it once to
jsonb. This takes an extra round of conversion overhead though.
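
For illustration, a small sketch of that casting workaround (exact spacing
may vary by version):

SELECT json_build_object('a', 1, 'b', 2)::text;
-- {"a" : 1, "b" : 2}
SELECT json_build_object('a', 1, 'b', 2)::jsonb::text;
-- {"a": 1, "b": 2}   (jsonb's canonical form drops the padding before colons)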

> Is there a plan to add this feature on future version ?

I don't recall plans in this area. Maybe there's room for a
jsonb_compact(): core already has jsonb_pretty(), and that would be
what has the least overhead when doing the binary/text conversion.
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] pg_sample

2016-10-18 Thread Patrick B
Hi guys,

I have a very big database that I need to export (dump) to a new test
server.

However, this new database test server doesn't need to have all the data. I
would like to have only the first 100 rows (for example) of each table in my
database.

I'm using pg_sample to do that, but unfortunately it doesn't work well.
It doesn't get the first 100 rows; it gets 100 random rows.

Do you guys have any idea how could I do this?
Thanks
Patrick


Re: [GENERAL] Getting the currently used sequence for a SERIAL column

2016-10-18 Thread Thomas Kellerer

Tom Lane schrieb am 18.10.2016 um 16:11:

I thought pg_depend only stores the dependency if the sequence was assigned
an owning column (through OWNED BY).


No, there will be regular expression dependencies as well.

That 'a' dependency is the one that pg_get_serial_sequence() looks for,
but the default-to-sequence dependency will be there in any case.

regression=# create table t2(f2 int default nextval('t1_f1_seq'));
CREATE TABLE
regression=# select pg_describe_object(classid,objid,objsubid) as
obj, pg_describe_object(refclassid,refobjid,refobjsubid) as ref,
deptype from pg_depend where objid = 't1_f1_seq'::regclass or
refobjid = 't1_f1_seq'::regclass;
              obj               |        ref         | deptype
--------------------------------+--------------------+---------
 type t1_f1_seq                 | sequence t1_f1_seq | i
 sequence t1_f1_seq             | schema public      | n
 default for table t1 column f1 | sequence t1_f1_seq | n
 sequence t1_f1_seq             | table t1 column f1 | a
 default for table t2 column f2 | sequence t1_f1_seq | n


Great, thanks. I meant to include that dependency in my SQL Workbench as well,
but could never find the correct way of joining the tables.







--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tablesample performance

2016-10-18 Thread Simon Riggs
On 18 October 2016 at 22:06, Tom Lane  wrote:
> Simon Riggs  writes:
>> On 18 October 2016 at 19:34, Tom Lane  wrote:
>>> If you don't want to have an implicit bias towards earlier blocks,
>>> I don't think that either standard tablesample method is really what
>>> you want.
>>>
>>> The contrib/tsm_system_rows tablesample method is a lot closer, in
>>> that it will start at a randomly chosen block, but if you just do
>>> "tablesample system_rows(1)" then you will always get the first row
>>> in whichever block it lands on, so it's still not exactly unbiased.
>
>> Is there a reason why we can't fix the behaviours of the three methods
>> mentioned above by making them all start at a random block and a
>> random item between min and max?
>
> The standard tablesample methods are constrained by other requirements,
> such as repeatability.  I am not sure that loading this one on top of
> that is a good idea.  The bias I referred to above is *not* the fault
> of the sample methods, rather it's the fault of using "LIMIT 1".

Hmm, yeh, that would make it a little too much of a special case.

> It does seem like maybe it'd be nice for tsm_system_rows to start at a
> randomly chosen entry in the first block it visits, rather than always
> dumping that entire block.  Then "tablesample system_rows(1)" would
> actually give you a pretty random row, and I think we aren't giving up
> any useful properties it has now.

OK, will patch that.

-- 
Simon Riggs   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tablesample performance

2016-10-18 Thread Tom Lane
Simon Riggs  writes:
> On 18 October 2016 at 19:34, Tom Lane  wrote:
>> If you don't want to have an implicit bias towards earlier blocks,
>> I don't think that either standard tablesample method is really what
>> you want.
>> 
>> The contrib/tsm_system_rows tablesample method is a lot closer, in
>> that it will start at a randomly chosen block, but if you just do
>> "tablesample system_rows(1)" then you will always get the first row
>> in whichever block it lands on, so it's still not exactly unbiased.

> Is there a reason why we can't fix the behaviours of the three methods
> mentioned above by making them all start at a random block and a
> random item between min and max?

The standard tablesample methods are constrained by other requirements,
such as repeatability.  I am not sure that loading this one on top of
that is a good idea.  The bias I referred to above is *not* the fault
of the sample methods, rather it's the fault of using "LIMIT 1".

It does seem like maybe it'd be nice for tsm_system_rows to start at a
randomly chosen entry in the first block it visits, rather than always
dumping that entire block.  Then "tablesample system_rows(1)" would
actually give you a pretty random row, and I think we aren't giving up
any useful properties it has now.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tablesample performance

2016-10-18 Thread Simon Riggs
On 18 October 2016 at 19:34, Tom Lane  wrote:
> Andy Colson  writes:
>> On 10/18/2016 11:44 AM, Francisco Olarte wrote:
>>> This should be faster, but to me it seems it does a different thing.
>
>> Ah, yes, you're right, there is a bit of a difference there.
>
> If you don't want to have an implicit bias towards earlier blocks,
> I don't think that either standard tablesample method is really what
> you want.
>
> The contrib/tsm_system_rows tablesample method is a lot closer, in
> that it will start at a randomly chosen block, but if you just do
> "tablesample system_rows(1)" then you will always get the first row
> in whichever block it lands on, so it's still not exactly unbiased.

Is there a reason why we can't fix the behaviours of the three methods
mentioned above by making them all start at a random block and a
random item between min and max?

It wasn't ever intended to be biased, and bernoulli in particular ought
to have strictly no bias.

Happy to patch if we agree.

-- 
Simon Riggs   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Problem changing default data_directory in PG 9.6 + CentOS6

2016-10-18 Thread Devrim Gündüz

Hi,

On Mon, 2016-10-17 at 17:38 -0300, Edilmar LISTAS wrote:
> I have an env running a changed data_directory fine in a devel machine PG 9.4
> using Fedora23. 
> Now, I have a server machine with CentOS where I downloaded the RPMs from
> repo https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-6-x86_64/pgdg-centos96-9.6-3.noarch.rpm.
> All the configs run fine if I don't change the default data_directory. But
> I need to use the path /sistemas/sat4/bdpg.
> 
> I did these commands:
> 
> mkdir /sistemas/sat4/bdpg
> chown postgres /sistemas/sat4/bdpg
> chmod 700 /sistemas/sat4/bdpg
> su - postgres 
> /usr/pgsql-9.6/bin/initdb -D /sistemas/sat4/bdpg
> exit
> 
> Then, I changed data_directory to /sistemas/sat4/bdpg and tried to restart
> PG:
> service postgresql-9.6 restart
> STOP => OK
> START => FAILED

On CentOS, the appropriate way to do this is described in README file shipped
with the RPMs. Related part is:

===
As an example, let us create a secondary postmaster called, creatively enough,
'secondary'.  Here are the steps:
1.) Create a hard link in /etc/rc.d/init.d (or equivalent location)
    to postgresql-9.6 named 'secondary-9.6': ln postgresql-9.6 secondary-9.6
    Pick a name not already used in /etc/rc.d/init.d!
2.) Create a file in /etc/sysconfig/pgsql named secondary.  This file is
    a shell script -- typically you would define PGDATA and PGOPTS
    here.  Since $PGDATA/postgresql.conf will override many of these
    settings, except PGDATA, you might be surprised on startup.
3.) Create the target PGDATA.
4.) Initdb the target PGDATA as documented in the main documentation.
    Automatic initdb may or may not work for you, so a manual one is
    preferred.  This must be done as user 'postgres'.
5.) Edit postgresql.conf to change the port, address, tcpip settings, etc.
6.) Start the postmaster with 'service secondary-9.6 start'.
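
For step 2, that sysconfig file is just shell variable assignments. A minimal
hypothetical example (the file name must match what the init script expects):

# /etc/sysconfig/pgsql/secondary-9.6
PGDATA=/sistemas/sat4/bdpg
PGPORT=5433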


Regards,
-- 
Devrim GÜNDÜZ
EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR




Re: [GENERAL] Dump all the indexes/constraints/roles

2016-10-18 Thread Melvin Davidson
On Tue, Oct 18, 2016 at 3:11 PM, Patrick B  wrote:

> Thank you guys... good to know that pg_dump does all the job for me :)
>
> So.. If I only dump using the --schema-only option, it will dump all the
> schemas, constraints, indexes and tables?
>
> Because probably, I'll have to import the data manually. Not in a single
> pg_restore I mean. (AWS issue)
>

>...If I only dump using the --schema-only option, it will dump all the
schemas, constraints, indexes and tables?


Yes! Please RTFM, it's there for a reason. --->
https://www.postgresql.org/docs/9.5/static/app-pgdump.html


Also, please note you can use -a to only dump data.
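
A hedged illustration of the two modes (mydb is a placeholder):

$ pg_dump --schema-only mydb > schema.sql   # DDL only: schemas, tables, indexes, constraints
$ pg_dump --data-only mydb > data.sql       # table contents only (-a for short)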

-- 
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] Dump all the indexes/constraints/roles

2016-10-18 Thread Patrick B
Thank you guys... good to know that pg_dump does all the job for me :)

So.. If I only dump using the --schema-only option, it will dump all the
schemas, constraints, indexes and tables?

Because probably, I'll have to import the data manually. NOt in a single
pg_restore I mean. (AWS issue)


Re: [GENERAL] out-of-order XID insertion in KnownAssignedXids

2016-10-18 Thread Andres Freund
Hi,

On 2016-10-18 14:57:52 +0200, fred...@huitfeldt.com wrote:
> we are running many postgresql master/slave setups. The slaves are
> initialised from a pg_basebackup from the master and are sync
> streaming from the master. When we determine the master has failed,
> the slave is promoted. Some time after that, the old master is again
> initialised with a pg_basebackup and starts streaming from the new
> master.

Could you describe in a bit more detail how exactly you're setting up
the standbys? E.g. the exact recovery.conf used, whether you remove any
files during starting a standby. Also how exactly you're promoting
standbys?

> Recently, we have gotten this error a fair amount of times: "out-of-order XID 
> insertion in KnownAssignedXids" when postgresql attempts to start after being 
> initialised with a pg_basebackup from the current master.


Which version are you encountering this on precisely?


> Once the issue has occurred, a subsequent re-initialisation (with a 
> completely new pg_basebackup) does not resolve the issue.

How have you recovered from this so far?


> I have a setup in the failing state, so I can produce any kind of log mesages 
> / details that would be helpful.

Could you use pg_xlogdump to dump the WAL file on which replay failed?
And then attach the output in a compressed manner?
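
A hedged illustration (the segment name is a placeholder; use the WAL file
named in the error):

$ pg_xlogdump $PGDATA/pg_xlog/000000010000000A00000042 | gzip > replay-fail.txt.gz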

Greetings,

Andres Freund


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tablesample performance

2016-10-18 Thread Francisco Olarte
Andy:

On Tue, Oct 18, 2016 at 7:17 PM, Andy Colson  wrote:
> Ah, yes, you're right, there is a bit of a difference there.
>
> Speed wise:
> 1) select one from ones order by random() limit 1;
>> about 360ms
> 2) select one from ones tablesample bernoulli(1) limit 1 ;
>> about 4ms
> 3) select one from ones tablesample bernoulli(1) order by random() limit 1;
>> about 80ms

Expected. It would be nice if you had provided some table structure / size data.
>
> Using the third option in batch, I'm getting about 15 transactions a second.
>
> Oddly:
> select one from ones tablesample bernoulli(0.25) order by random()
> takes almost 80ms also.

mmm, it depends a lot on your total rows and average rows per page.

> bernoulli(0.25) returns 3k rows
> bernoulli(1) returns 14k rows

This hints at 1.4M rows (14k / 1%). If your rows are small and you have
more than 400 rows per page I would expect that, as a 0.25% sample would
hit every page.

Tom hinted you at an extension. Also, if you are in a function (
which can loop ) you can do a little trick: instead of bernoulli(1)
use bernoulli(N/table_size). This way you will select very few rows
and speed up the last phase. Anyway, I fear bernoulli must read all
the table too, to be able to discard randomly, so you may not win
anything ( I would compare the query time against a simple 'count(one)'
query, to have a benchmark of how much time the server spends
reading the table. I would bet for 'about 80 ms' ).
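
A hedged, untested sketch of that fraction trick (it assumes the thread's
table ones with a text column one, and leans on the planner's reltuples
estimate):

CREATE OR REPLACE FUNCTION pick_one() RETURNS text AS $$
DECLARE
    pct numeric;
    x   text;
BEGIN
    -- aim for roughly 100 candidate rows whatever the table size
    SELECT 100.0 * 100 / greatest(reltuples, 1) INTO pct
      FROM pg_class WHERE oid = 'ones'::regclass;
    EXECUTE format(
        'SELECT one FROM ones TABLESAMPLE bernoulli(%s) ORDER BY random() LIMIT 1',
        least(pct, 100))
      INTO x;
    RETURN x;  -- NULL means the sample came back empty; the caller can retry
END;
$$ LANGUAGE plpgsql;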

Francisco Olarte.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tablesample performance

2016-10-18 Thread Tom Lane
Andy Colson  writes:
> On 10/18/2016 11:44 AM, Francisco Olarte wrote:
>> This should be faster, but to me it seems it does a different thing.

> Ah, yes, you're right, there is a bit of a difference there.

If you don't want to have an implicit bias towards earlier blocks,
I don't think that either standard tablesample method is really what
you want.

The contrib/tsm_system_rows tablesample method is a lot closer, in
that it will start at a randomly chosen block, but if you just do
"tablesample system_rows(1)" then you will always get the first row
in whichever block it lands on, so it's still not exactly unbiased.
Maybe you could select "tablesample system_rows(100)" or so and then
do the order-by-random trick on that sample.  This would be a lot
faster than selecting 100 random rows with either built-in sample
method, since the rows it grabs will be consecutive.
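
A hedged sketch of that combination (tsm_system_rows ships in contrib):

CREATE EXTENSION IF NOT EXISTS tsm_system_rows;
SELECT one FROM ones TABLESAMPLE system_rows(100) ORDER BY random() LIMIT 1;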

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tablesample performance

2016-10-18 Thread Andy Colson

On 10/18/2016 11:44 AM, Francisco Olarte wrote:

On Tue, Oct 18, 2016 at 5:06 PM, Andy Colson  wrote:

I wanted to report an awesome performance boost using tablesample.
In my stored function I was getting a random row using:
select one into x from ones order by random() limit 1;
When the table was smaller it worked fine, but the performance has slowly
gotten worse.  This morning I was getting around 8 transactions a second.


Which is not a surprise, as it has to at least read all the rows and
generate a random() for each one and keep track of the minimum.


I just replaced it with:
select one into x from ones tablesample bernoulli(1) limit 1;


This should be faster, but to me it seems it does a different thing.
This seems to select each row of the table with probability 1% and
return the first selected, i.e., something similar to

select one into x from ones where random() < 0.01 limit 1.

Which has the ( diminishing with table size ) risk of selecting zero
rows and is going to select one of the first 100 or so rows with high
probability, unless I'm missing something.

I say this because the docs state it returns a 'randomly chosen' sample,
not a 'randomly ORDERED' one, and the straightforward implementation
of sampling returns rows in the primitive scan order. I suppose it
could be easily tested by selecting bernoulli(100), but I have no
server access now to verify it.

With a big table it seems:

select one into x from ones where random() < 0.01 order by random() limit 1
or
select one into x from ones tablesample bernoulli(1) order by random() limit 1;

Is more similar to what you originally did ( and the run time should
possibly be something in between ).


I would recommend you execute the function and verify it does what
you want ( as you say it's fast, I would try selecting several
thousand and eyeballing the result; if it does what I fear the
grouping should be obvious ).

Maybe you do not mind it, in which case it's ok, but a one-minute run
should let you know exactly what you are doing.

Francisco Olarte.




Ah, yes, you're right, there is a bit of a difference there.

Speed wise:
1) select one from ones order by random() limit 1;
> about 360ms
2) select one from ones tablesample bernoulli(1) limit 1 ;
> about 4ms
3) select one from ones tablesample bernoulli(1) order by random() limit 1;
> about 80ms

Using the third option in batch, I'm getting about 15 transactions a second.

Oddly:
select one from ones tablesample bernoulli(0.25) order by random()

takes almost 80ms also.

bernoulli(0.25) returns 3k rows
bernoulli(1) returns 14k rows


Thanks,

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Problem changing default data_directory in PG 9.6 + CentOS6

2016-10-18 Thread Edilmar LISTAS

Hi,

After some tests, the problem really is with the original script from the
PG 9.6 RPM, /etc/rc.d/init.d/postgresql-9.6. If I use pg_ctl directly,
start/stop with the new data_directory works fine!


On 18-10-2016 13:49, Edilmar LISTAS wrote:

On 18-10-2016 11:33, Melvin Davidson wrote:



On Tue, Oct 18, 2016 at 10:20 AM, Edilmar LISTAS wrote:

1) I changed /etc/rc.d/init.d/postgresql-9.6 like this:
PGDATA=/sistemas/sat4/bdpg
#PGDATA=/var/lib/pgsql/9.6/data

2) I copied postgresql.conf and pg_hba.conf from
/var/lib/pgsql/9.6/data to /sistemas/sat4/bdpg

3) I changed postgresql.conf like this:
data_directory = '/sistemas/sat4/bdpg'

4) service postgresql-9.6 start
Iniciando o serviço postgresql-9.6: [FAILED]

On my devel machine, I only did step 3); PG looks for the
default configs in /var/lib/pgsql/data and uses my databases in the
alternative path /sistemas/sat4/bdpg.


On 17-10-2016 21:22, Adrian Klaver wrote:

On 10/17/2016 01:38 PM, Edilmar LISTAS wrote:

I have an env running a changed data_directory fine in a
devel machine
PG 9.4 using Fedora23.
Now, I have a server machine with CentOS where I downloaded
the RPMs
from repo

https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-6-x86_64/pgdg-centos96-9.6-3.noarch.rpm.


All the configs run fine if I don't change the default
data_directory.
But I need to use the path /sistemas/sat4/bdpg.

I did these commands:

mkdir /sistemas/sat4/bdpg
chown postgres /sistemas/sat4/bdpg
chmod 700 /sistemas/sat4/bdpg
su - postgres
/usr/pgsql-9.6/bin/initdb -D /sistemas/sat4/bdpg
exit

Then, I changed data_directory to /sistemas/sat4/bdpg and
tried to


Changed data_directory where?

restart PG:
service postgresql-9.6 restart
STOP => OK
START => FAILED

I disabled se_linux.
The file /var/lib/pgsql/9.6/pgstartup.log just said to see
future output
in pg_log.
The file data/pg_log/postgresql-Mon.log doesn't say anything
about
errors.

The strange is that startup arises a FAILED message, but the
process
"/usr/pgsql-9.6/bin/postmaster -D /var/lib/pgsql/9.6/data"
is running


So /var/lib/pgsql/9.6/data is where the original install is?

Best guess is that some script is starting the original install
and when
you go to start your custom location it fails because the new
cluster is
trying to use the port(5432 I am assuming) as the original
cluster.

Have you tried giving the new cluster a different port number,
say 5442,
and the starting it?


(and the children logger/checkpointer/etc). But I don't get
to connect
using pgAdmin3. Then, I have to kill manually postmaster
(because
service script doesn't understand postmaster.pid in the new
data dir),
comment data_directory to use default place, start and
connect to
pgAdmin3. Then start/stop/etc. run fine, looking for
postmaster.pid
in /var/lib/pgsql/9.6/data.


So you either need to change the start script to point to the new
cluster or create a new one for it.






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




Instead of

4) service postgresql-9.6 start

Try:

sudo su postgres
pg_ctl start -D /sistemas/sat4/bdpg

If that works, then your problem is in the postgresql service file.

--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


This works...
/usr/pgsql-9.6/bin/pg_ctl -D /sistemas/sat4/bdpg -l logfile start
but psql can't log in:
-bash-4.1$ psql
Senha:
psql: fe_sendauth: no password supplied

If I use the default data_directory = /var/lib/pgsql/9.6/data, psql runs
fine without password.






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] json rendering without pretty option (compact)

2016-10-18 Thread Nicolas Paris
Hello,

I want to minimize PostgreSQL JSON size when I fetch it.
I translate columnar tables to JSON through json_build_object/array or even
row_to_json.

While row_to_json does have a "pretty_bool" option, the others do not. Each
JSON object/array I build contains spaces.

Is there a workaround ?
Is there a plan to add this feature on future version ?

(PG 9.6)

Thank you by advance,


Re: [GENERAL] Problem changing default data_directory in PG 9.6 + CentOS6

2016-10-18 Thread Edilmar LISTAS

On 18-10-2016 11:33, Melvin Davidson wrote:



On Tue, Oct 18, 2016 at 10:20 AM, Edilmar LISTAS wrote:

1) I changed /etc/rc.d/init.d/postgresql-9.6 like this:
PGDATA=/sistemas/sat4/bdpg
#PGDATA=/var/lib/pgsql/9.6/data

2) I copied postgresql.conf and pg_hba.conf from
/var/lib/pgsql/9.6/data to /sistemas/sat4/bdpg

3) I changed postgresql.conf like this:
data_directory = '/sistemas/sat4/bdpg'

4) service postgresql-9.6 start
Iniciando o serviço postgresql-9.6: [FAILED]

On my devel machine, I only did step 3); PG looks for the
default configs in /var/lib/pgsql/data and uses my databases in the
alternative path /sistemas/sat4/bdpg.


On 17-10-2016 21:22, Adrian Klaver wrote:

On 10/17/2016 01:38 PM, Edilmar LISTAS wrote:

I have an env running a changed data_directory fine in a
devel machine
PG 9.4 using Fedora23.
Now, I have a server machine with CentOS where I downloaded
the RPMs
from repo

https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-6-x86_64/pgdg-centos96-9.6-3.noarch.rpm.

All the configs run fine if I don't change the default
data_directory.
But I need to use the path /sistemas/sat4/bdpg.

I did these commands:

mkdir /sistemas/sat4/bdpg
chown postgres /sistemas/sat4/bdpg
chmod 700 /sistemas/sat4/bdpg
su - postgres
/usr/pgsql-9.6/bin/initdb -D /sistemas/sat4/bdpg
exit

Then, I changed data_directory to /sistemas/sat4/bdpg and
tried to


Changed data_directory where?

restart PG:
service postgresql-9.6 restart
STOP => OK
START => FAILED

I disabled se_linux.
The file /var/lib/pgsql/9.6/pgstartup.log just said to see
future output
in pg_log.
The file data/pg_log/postgresql-Mon.log doesn't say anything
about
errors.

The strange is that startup arises a FAILED message, but the
process
"/usr/pgsql-9.6/bin/postmaster -D /var/lib/pgsql/9.6/data"
is running


So /var/lib/pgsql/9.6/data is where the original install is?

Best guess is that some script is starting the original install
and when
you go to start your custom location it fails because the new
cluster is
trying to use the port(5432 I am assuming) as the original cluster.

Have you tried giving the new cluster a different port number,
say 5442,
and the starting it?


(and the children logger/checkpointer/etc). But I don't get
to connect
using pgAdmin3. Then, I have to kill manually postmaster
(because
service script doesn't understand postmaster.pid in the new
data dir),
comment data_directory to use default place, start and
connect to
pgAdmin3. Then start/stop/etc. run fine, looking for
postmaster.pid
in /var/lib/pgsql/9.6/data.


So you either need to change the start script to point to the new
cluster or create a new one for it.






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




Instead of

4) service postgresql-9.6 start

Try:

sudo su postgres
pg_ctl start -D /sistemas/sat4/bdpg

If that works, then your problem is in the postgresql service file.

--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


This works...
/usr/pgsql-9.6/bin/pg_ctl -D /sistemas/sat4/bdpg -l logfile start
but psql can't log in:
-bash-4.1$ psql
Senha:
psql: fe_sendauth: no password supplied

If I use the default data_directory = /var/lib/pgsql/9.6/data, psql runs 
fine without password.



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tablesample performance

2016-10-18 Thread Francisco Olarte
On Tue, Oct 18, 2016 at 5:06 PM, Andy Colson  wrote:
> I wanted to report an awesome performance boost using tablesample.
> In my stored function I was getting a random row using:
> select one into x from ones order by random() limit 1;
> When the table was smaller it worked fine, but the performance has slowly
> gotten worse.  This morning I was getting around 8 transactions a second.

Which is not a surprise, as it has to at least read all the rows and
generate a random() for each one and keep track of the minimum.

> I just replaced it with:
> select one into x from ones tablesample bernoulli(1) limit 1;

This should be faster, but to me it seems it does a different thing.
This seems to select each row of the table with probability 1% and
return the first selected, i.e., something similar to

select one into x from ones where random() < 0.01 limit 1.

Which has the ( diminishing with table size ) risk of selecting zero
rows and is going to select one of the first 100 or so rows with high
probability, unless I'm missing something.

I say this because the docs state it returns a 'randomly chosen' sample,
not a 'randomly ORDERED' one, and the straightforward implementation
of sampling returns rows in the primitive scan order. I suppose it
could be easily tested by selecting bernoulli(100), but I have no
server access now to verify it.

With a big table it seems:

select one into x from ones where random() < 0.01 order by random() limit 1
or
select one into x from ones tablesample bernoulli(1) order by random() limit 1;

Is more similar to what you originally did ( and the run time should
possibly be something in between ).


I would recommend you execute the function and verify it does what
you want ( as you say it's fast, I would try selecting several
thousand and eyeballing the result; if it does what I fear the
grouping should be obvious ).

Maybe you do not mind it, in which case it's ok, but a one-minute run
should let you know exactly what you are doing.

Francisco Olarte.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Dump all the indexes/constraints/roles

2016-10-18 Thread Jeff Janes
On Mon, Oct 17, 2016 at 8:32 PM, Patrick B  wrote:

> Hi guys,
>
> I need to export an entire database to another server, for testing purpose.
>
> Is there any way to export all indexes and constraints ?
> Postgres 9.2
> Patrick
>

By default pg_dump will export the constraints.  It will export the
definition of the indexes, but not the physical contents of them, so they
have to be recreated upon import.  That can be slow depending on the sizes
and types involved.

Or you can clone the whole thing with pg_basebackup.  (If you have multiple
databases on the same cluster, this will take all of them.  The only thing
you can do about that is drop the ones you don't want.)
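
A hedged sketch of the pg_basebackup route (host, user, and target directory
are placeholders):

$ pg_basebackup -h master.example.com -U repl_user -D /var/lib/pgsql/9.2/testdata -X stream -P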


Re: [GENERAL] Generic way to test input arguments

2016-10-18 Thread Pavel Stehule
2016-10-18 16:42 GMT+02:00 Saïd Assemlal :

> I am writing database functions with plpgsql. (I am using Postgresql 9.4
> with centos 6)
>
> Here is an example of what I would like to improve:
>
> CREATE OR REPLACE FUNCTION usp_locking_trial(p_trial_code VARCHAR(50),
> p_trial_key VARCHAR(500))
> RETURNS TEXT AS $$
> DECLARE
> BEGIN
> IF is_empty_or_null(p_trial_code) THEN
> RAISE EXCEPTION 'trial code argument is empty/null.';
> END IF;
>
> IF is_empty_or_null(p_trial_key) THEN
> RAISE EXCEPTION 'trial key argument is empty/null';
> END IF;
> ..
>
> END;
> $$ LANGUAGE plpgsql;
>
> I have many functions where I check if the arguments are null or empty.
> This code is repetitive and is almost the same across functions.
>
> For example: a function that takes all the input arguments, checks them
> one by one for null, and raises an exception with the name of the
> argument.
>
> Would it be a good idea ?
>
Some smarter (generic) function can be written in C (you should use the
polymorphic type "any"). With plpgsql you cannot do anything beyond what
you already do.

Regards

Pavel
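
For the text-only case, a hedged plpgsql sketch (check_args is illustrative,
not a library function; note it can only report an argument's position, not
its name, which is where C and "any" would be needed):

CREATE OR REPLACE FUNCTION check_args(VARIADIC args text[])
RETURNS void AS $$
DECLARE
    i int;
BEGIN
    FOR i IN 1 .. coalesce(array_length(args, 1), 0) LOOP
        IF args[i] IS NULL OR args[i] = '' THEN
            RAISE EXCEPTION 'argument % is empty/null', i;
        END IF;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

-- usage inside another function: PERFORM check_args(p_trial_code, p_trial_key);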


> Thanks.
> On 2016-10-17 at 3:09 PM, Raymond O'Donnell wrote:
>
> On 17/10/16 16:40, said assemlal wrote:
>
> Hello,
>
> I am looking for a way to test generically input arguments to raise an
> exception if one is either null or empty.
>
> I was thinking to create a function who takes an array to check them but
> not sure if it's really good.
>
>
> It's not clear what you want to do here. Can you explain in more detail?
>
> What do you mean by "input arguments"? Are you writing functions in the
> database? If so, which language? Some examples of what you've tried so far
> would help too.
>
> Ray.
>
>
>
>


Re: [GENERAL] Generic way to test input arguments

2016-10-18 Thread oyoun

I will try to reproduce this approach on PostgreSQL 9.4.


Thanks


On 2016-10-17 at 3:30 PM, Pavel Stehule wrote:

Hi

2016-10-17 21:09 GMT+02:00 Raymond O'Donnell:


On 17/10/16 16:40, said assemlal wrote:

Hello,

I am looking for a way to test generically input arguments to
raise an
exception if one is either null or empty.

I was thinking to create a function who takes an array to
check them but
not sure if it's really good.


It's not clear what you want to do here. Can you explain in more
detail?

What do you mean by "input arguments"? Are you writing functions
in the database? If so, which language? Some examples of what
you've tried so far would help too.

Ray.


9.6 has functions for this purpose - num_nulls and num_nonnulls

Regards

Pavel
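
For reference, a minimal sketch of those 9.6 functions (the values are
arbitrary):

SELECT num_nulls(1, NULL, 'a');     -- returns 1
SELECT num_nonnulls(1, NULL, 'a');  -- returns 2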




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general







[GENERAL] tablesample performance

2016-10-18 Thread Andy Colson

I wanted to report an awesome performance boost using tablesample.

In my stored function I was getting a random row using:
select one into x from ones order by random() limit 1;

When the table was smaller it worked fine, but the performance has 
slowly gotten worse.  This morning I was getting around 8 transactions a 
second.


I just replaced it with:
select one into x from ones tablesample bernoulli(1) limit 1;

And now I'm getting 376 transactions a second!

Thank you dev's!  Thank you PG 9.5!

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Generic way to test input arguments

2016-10-18 Thread Saïd Assemlal
I am writing database functions with plpgsql. (I am using Postgresql 9.4 
with centos 6)


Here is an example of what I would like to improve:

CREATE OR REPLACE FUNCTION usp_locking_trial(p_trial_code VARCHAR(50), 
p_trial_key VARCHAR(500))

RETURNS TEXT AS $$
DECLARE
BEGIN
IF is_empty_or_null(p_trial_code) THEN
RAISE EXCEPTION 'trial code argument is empty/null.';
END IF;

IF is_empty_or_null(p_trial_key) THEN
RAISE EXCEPTION 'trial key argument is empty/null';
END IF;

..

END;
$$ LANGUAGE plpgsql;

I have many functions where I check if the arguments are null or empty. 
This code is repetitive and is almost the same across functions.


For example: a function that takes all the input arguments, checks them
one by one for null, and raises an exception with the name of the
argument.


Would it be a good idea ?

Thanks.

On 2016-10-17 at 3:09 PM, Raymond O'Donnell wrote:

On 17/10/16 16:40, said assemlal wrote:

Hello,

I am looking for a way to test generically input arguments to raise an
exception if one is either null or empty.

I was thinking to create a function who takes an array to check them but
not sure if it's really good.


It's not clear what you want to do here. Can you explain in more detail?

What do you mean by "input arguments"? Are you writing functions in 
the database? If so, which language? Some examples of what you've 
tried so far would help too.


Ray.






Re: [GENERAL] Problem changing default data_directory in PG 9.6 + CentOS6

2016-10-18 Thread Melvin Davidson
On Tue, Oct 18, 2016 at 10:20 AM, Edilmar LISTAS wrote:

> 1) I changed /etc/rc.d/init.d/postgresql-9.6 like this:
> PGDATA=/sistemas/sat4/bdpg
> #PGDATA=/var/lib/pgsql/9.6/data
>
> 2) I copied postgresql.conf and pg_hba.conf from /var/lib/pgsql/9.6/data
> to /sistemas/sat4/bdpg
>
> 3) I changed postgresql.conf like this:
> data_directory = '/sistemas/sat4/bdpg'
>
> 4) service postgresql-9.6 start
> Iniciando o serviço postgresql-9.6: [FAILED]
>
> On my devel machine, I only did step 3); PG looks for the default
> configs in /var/lib/pgsql/data and uses my databases in the alternative
> path /sistemas/sat4/bdpg.
>
>
> On 17-10-2016 21:22, Adrian Klaver wrote:
>
>> On 10/17/2016 01:38 PM, Edilmar LISTAS wrote:
>>
>>> I have an env running a changed data_directory fine in a devel machine
>>> PG 9.4 using Fedora23.
>>> Now, I have a server machine with CentOS where I downloaded the RPMs
>>> from repo
>>> https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-6-x86_64/pgdg-centos96-9.6-3.noarch.rpm.
>>>
>>> All the configs run fine if I don't change the default data_directory.
>>> But I need to use the path /sistemas/sat4/bdpg.
>>>
>>> I did these commands:
>>>
>>> mkdir /sistemas/sat4/bdpg
>>> chown postgres /sistemas/sat4/bdpg
>>> chmod 700 /sistemas/sat4/bdpg
>>> su - postgres
>>> /usr/pgsql-9.6/bin/initdb -D /sistemas/sat4/bdpg
>>> exit
>>>
>>> Then, I changed data_directory to /sistemas/sat4/bdpg and tried to
>>>
>>
>> Changed data_directory where?
>>
>> restart PG:
>>> service postgresql-9.6 restart
>>> STOP => OK
>>> START => FAILED
>>>
>>> I disabled se_linux.
>>> The file /var/lib/pgsql/9.6/pgstartup.log just said to see future output
>>> in pg_log.
>>> The file data/pg_log/postgresql-Mon.log doesn't say anything about
>>> errors.
>>>
>>> The strange is that startup arises a FAILED message, but the process
>>> "/usr/pgsql-9.6/bin/postmaster -D /var/lib/pgsql/9.6/data" is running
>>>
>>
>> So /var/lib/pgsql/9.6/data is where the original install is?
>>
>> Best guess is that some script is starting the original install and when
>> you go to start your custom location it fails because the new cluster is
>> trying to use the port(5432 I am assuming) as the original cluster.
>>
>> Have you tried giving the new cluster a different port number, say 5442,
>> and the starting it?
>>
>>
>> (and the children logger/checkpointer/etc). But I don't get to connect
>>> using pgAdmin3. Then, I have to kill manually postmaster (because
>>> service script doesn't understand postmaster.pid in the new data dir),
>>> comment data_directory to use default place, start and connect to
>>> pgAdmin3. Then start/stop/etc. run fine, looking for postmaster.pid
>>> in /var/lib/pgsql/9.6/data.
>>>
>>
>> So you either need to change the start script to point to the new
>> cluster or create a new one for it.
>>
>>
>>
>>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Instead of

4) service postgresql-9.6 start

Try:

sudo su postgres
pg_ctl start -D /sistemas/sat4/bdpg

If that works, then your problem is in the postgresql service file.

--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] Problem changing default data_directory in PG 9.6 + CentOS6

2016-10-18 Thread Edilmar LISTAS

1) I changed /etc/rc.d/init.d/postgresql-9.6 like this:
PGDATA=/sistemas/sat4/bdpg
#PGDATA=/var/lib/pgsql/9.6/data

2) I copied postgresql.conf and pg_hba.conf from /var/lib/pgsql/9.6/data 
to /sistemas/sat4/bdpg


3) I changed postgresql.conf like this:
data_directory = '/sistemas/sat4/bdpg'

4) service postgresql-9.6 start
Iniciando o serviço postgresql-9.6: [FAILED]

On my devel machine, I only did step 3); PG looks for the default
configs in /var/lib/pgsql/data and uses my databases in the alternative
path /sistemas/sat4/bdpg.



On 17-10-2016 21:22, Adrian Klaver wrote:

On 10/17/2016 01:38 PM, Edilmar LISTAS wrote:

I have an env running a changed data_directory fine in a devel machine
PG 9.4 using Fedora23.
Now, I have a server machine with CentOS where I downloaded the RPMs
from repo
https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-6-x86_64/pgdg-centos96-9.6-3.noarch.rpm.

All the configs run fine if I don't change the default data_directory.
But I need to use the path /sistemas/sat4/bdpg.

I did these commands:

mkdir /sistemas/sat4/bdpg
chown postgres /sistemas/sat4/bdpg
chmod 700 /sistemas/sat4/bdpg
su - postgres
/usr/pgsql-9.6/bin/initdb -D /sistemas/sat4/bdpg
exit

Then, I changed data_directory to /sistemas/sat4/bdpg and tried to


Changed data_directory where?


restart PG:
service postgresql-9.6 restart
STOP => OK
START => FAILED

I disabled se_linux.
The file /var/lib/pgsql/9.6/pgstartup.log just said to see future output
in pg_log.
The file data/pg_log/postgresql-Mon.log doesn't say anything about
errors.

The strange is that startup arises a FAILED message, but the process
"/usr/pgsql-9.6/bin/postmaster -D /var/lib/pgsql/9.6/data" is running


So /var/lib/pgsql/9.6/data is where the original install is?

Best guess is that some script is starting the original install and when
you go to start your custom location it fails because the new cluster is
trying to use the port(5432 I am assuming) as the original cluster.

Have you tried giving the new cluster a different port number, say 5442,
and the starting it?



(and the children logger/checkpointer/etc). But I don't get to connect
using pgAdmin3. Then, I have to kill manually postmaster (because
service script doesn't understand postmaster.pid in the new data dir),
comment data_directory to use default place, start and connect to
pgAdmin3. Then start/stop/etc. run fine, looking for postmaster.pid
in /var/lib/pgsql/9.6/data.


So you either need to change the start script to point to the new
cluster or create a new one for it.







--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Getting the currently used sequence for a SERIAL column

2016-10-18 Thread Tom Lane
Thomas Kellerer  writes:
> Tom Lane schrieb am 18.10.2016 um 15:20:
>> Personally, I'd try looking in pg_depend to see if the column's default
>> expression has a dependency on a relation of type sequence.  That avoids
>> all the fun of parsing the expression and turns it into a simple SQL
>> join problem.

> I thought pg_depend only stores the dependency if the sequence was assigned
> an owning column (through OWNED BY).

No, there will be regular expression dependencies as well.

regression=# create table t1 (f1 serial);
CREATE TABLE
regression=# select * from pg_depend where objid = 't1_f1_seq'::regclass or 
refobjid = 't1_f1_seq'::regclass;
 classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype 
---------+-------+----------+------------+----------+-------------+---------
    1247 | 47198 |        0 |       1259 |    47197 |           0 | i
    1259 | 47197 |        0 |       2615 |     2200 |           0 | n
    2604 | 47202 |        0 |       1259 |    47197 |           0 | n
    1259 | 47197 |        0 |       1259 |    47199 |           1 | a
(4 rows)

regression=# select pg_describe_object(classid,objid,objsubid) as obj, 
pg_describe_object(refclassid,refobjid,refobjsubid) as ref, deptype from 
pg_depend where objid = 't1_f1_seq'::regclass or refobjid = 
't1_f1_seq'::regclass;
              obj               |        ref         | deptype
--------------------------------+--------------------+---------
 type t1_f1_seq                 | sequence t1_f1_seq | i
 sequence t1_f1_seq             | schema public      | n
 default for table t1 column f1 | sequence t1_f1_seq | n
 sequence t1_f1_seq             | table t1 column f1 | a
(4 rows)

That 'a' dependency is the one that pg_get_serial_sequence() looks for,
but the default-to-sequence dependency will be there in any case.

regression=# create table t2(f2 int default nextval('t1_f1_seq'));
CREATE TABLE
regression=# select pg_describe_object(classid,objid,objsubid) as obj, 
pg_describe_object(refclassid,refobjid,refobjsubid) as ref, deptype from 
pg_depend where objid = 't1_f1_seq'::regclass or refobjid = 
't1_f1_seq'::regclass;
              obj               |        ref         | deptype
--------------------------------+--------------------+---------
 type t1_f1_seq                 | sequence t1_f1_seq | i
 sequence t1_f1_seq             | schema public      | n
 default for table t1 column f1 | sequence t1_f1_seq | n
 sequence t1_f1_seq             | table t1 column f1 | a
 default for table t2 column f2 | sequence t1_f1_seq | n
(5 rows)

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Getting the currently used sequence for a SERIAL column

2016-10-18 Thread Thomas Kellerer
Tom Lane schrieb am 18.10.2016 um 15:20:
>> Furthermore, what's stored in the column seems to be a string of the
>> format "nextval('sequencename'::regclass)". Is there a function to
>> parse this, to return just the sequence name, or will the sequence
>> name always be without for instance a schema name so that a naive
>> parser of our own will do? Googling found no candidates.
> 
> Personally, I'd try looking in pg_depend to see if the column's default
> expression has a dependency on a relation of type sequence.  That avoids
> all the fun of parsing the expression and turns it into a simple SQL
> join problem.


I thought pg_depend only stores the dependency if the sequence was assigned
an owning column (through OWNED BY). 

I don't see any entries in pg_depend for a simple "default 
nextval('some_sequence')" expression 
but maybe I am just missing something. 

Thomas



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Getting the currently used sequence for a SERIAL column

2016-10-18 Thread Tom Lane
Hanne Moa  writes:
> Until now we've been using pg_get_serial_sequence() to discover
> which sequence is in use, but can no longer do so due to two tables
> needing to share the same sequence (prior to being properly merged. No
> duplicate values, luckily). For one of the tables,
> pg_get_serial_sequence() won't be returning anything useful since it
> tracks which table *owns* a sequence and not which sequence is used
> by which column.

> The necessary information seems to be in the table
> "information_schema.columns", in  "column_default". Is this to be
> regarded as internal API or is it safe to use this to find the correct
> sequence? It works in all cases and on all the version of postgres
> that are relevant to us. The production system is currently running
> 9.3 (I'm pining for 9.5...)

> Furthermore, what's stored in the column seems to be a string of the
> format "nextval('sequencename'::regclass)". Is there a function to
> parse this, to return just the sequence name, or will the sequence
> name always be without for instance a schema name so that a naive
> parser of our own will do? Googling found no candidates.

Personally, I'd try looking in pg_depend to see if the column's default
expression has a dependency on a relation of type sequence.  That avoids
all the fun of parsing the expression and turns it into a simple SQL
join problem.
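
A hedged sketch of that join (table and column names are placeholders):

SELECT s.relname AS sequence_name
  FROM pg_attrdef ad
  JOIN pg_depend  d ON d.classid = 'pg_attrdef'::regclass
                   AND d.objid = ad.oid
                   AND d.refclassid = 'pg_class'::regclass
  JOIN pg_class   s ON s.oid = d.refobjid
                   AND s.relkind = 'S'
 WHERE ad.adrelid = 'mytable'::regclass
   AND ad.adnum = (SELECT attnum FROM pg_attribute
                    WHERE attrelid = 'mytable'::regclass
                      AND attname = 'mycolumn');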

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_upgrade not able to cope with pg_largeobject being in a different tablespace

2016-10-18 Thread Andreas Joseph Krogh
On Tuesday 18 October 2016 at 12:39:03, Magnus Hagander <mag...@hagander.net> wrote:
    On Thu, Oct 13, 2016 at 7:35 AM, Andreas Joseph Krogh wrote:
    On Thursday 13 October 2016 at 16:09:34, Bruce Momjian wrote:
On Thu, Oct 13, 2016 at 10:14:08AM +0200, Andreas Joseph Krogh wrote:
 > I would assume that having pg_largeobject in a separate tablespace is more and
 > more common these days, having real-cheap SAN vs. fast-SSD for normal
 > tables/indexes/wal.

 So common that no one has ever asked for this feature before?
 
 
Sometimes one gets the feeling that one is the only one in the universe doing 
something one considers "quite common":-)
 
> So - I'm wondering if we can fund development of pg_upgrade to cope with this
 > configuration or somehow motivate to getting this issue fixed?
 >  
 > Would any of the PG-companies (2ndQ, EDB, PgPro) take a stab at this?
 >  
 > Any feedback welcome, thanks.

 You would need to get buy-in that that community wants the relocation of
 pg_largeobject to be supported via an SQL command, and at that point
 pg_upgrade would be modified to support that.  It is unlikely pg_upgrade
 is going to be modified to support something that isn't supported at the
 SQL level.  Of course, you can create a custom version of pg_upgrade to
 do that.
 
Doesn't "ALTER TABLE pg_largeobject SET TABLESPACE myspace_lo" count as being 
"at the SQL-level"?
 
Well, it requires that you set allow_system_table_mods on, which is documented 
as a developer option. It's documented with things like "The following 
parameters are intended for work on the PostgreSQL source code, and in some 
cases to assist with recovery of severely damaged databases. There should be no 
reason to use them on a production database.".

 Perhaps we should add another disclaimer there saying that if you make 
changes in this mode, tools like pg_upgrade (or for that matter, pg_dump or 
pretty much anything at all) may not work anymore?
 
 
The whole problem seems to come from the fact that BLOBs are stored in 
pg_largeobject which for some reason is implemented as a system-catalogue in 
PG, which imposes all kinds of weird problems, from a DBA-perspective.
 
Yes, there are several issues related to how lo style large objects work. I've 
often gone to similar implementations but in userspace on top of custom tables 
to work around those.
 
 
Can we pay you at EDB for making such a custom version of pg_upgrade for 9.6?
 
 
You're assuming pg_upgrade is the only potential problem. If you are willing 
to spend towards it, it would probably be better to spend towards the "upper 
layer" problem which would be to make it possible to move pg_largeobject to a 
different tablespace *without* turning on system_table_mods.
 
That said, I cannot comment on the complexity of either doing that *or* doing 
a custom pg_upgrade that would support it. But solving a long-term problem 
seems better than solving a one-off one.



 
I totally agree that investing in a long-term solution is the best. However, I 
need (would like very much) to upgrade a 9.5 cluster to 9.6 and would rather 
not wait for a solution to land in 10.x.
 
IIRC there was a discussion on -hackers not too long ago about pg_largeobject 
and releasing it from being a "system catalogue", but I think it stalled and 
went nowhere.
 
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com 
www.visena.com 
 

 


[GENERAL] out-of-order XID insertion in KnownAssignedXids

2016-10-18 Thread fredrik
Hi All,

we are running many postgresql master/slave setups. The slaves are initialised 
from a pg_basebackup from the master and are sync streaming from the master. 
When we determine the master has failed, the slave is promoted. Some time after 
that, the old master is again initialised with a pg_basebackup and starts 
streaming from the new master.

Our setup seems pretty "stock" and has been running for us for some years (with 
different versions of postgresql but also different OSes).

Recently, we have gotten this error a fair amount of times: "out-of-order XID 
insertion in KnownAssignedXids" when postgresql attempts to start after being 
initialised with a pg_basebackup from the current master.

The only reference I can find to that particular error is from 2012, and the 
resulting issue has long been fixed in our version of postgresql (9.5.3) ...


Once the issue has occurred, a subsequent re-initialisation (with a completely 
new pg_basebackup) does not resolve the issue.

I have a setup in the failing state, so I can produce any kind of log messages / 
details that would be helpful.


Thank you for your support,
Fredrik





Re: [GENERAL] Collations and codepages

2016-10-18 Thread Albe Laurenz
Raimo Jormakka wrote:
> In Windows 7, and using PostgreSQL 9.4.5, the collation gets set to 
> "English_United States.1252" when
> I select the "English, United States" locale in the installer. In Linux, the 
> collation is set to
> "en_US.UTF-8". The encoding is set to UTF-8 in both instances.
>
> Will these two instances behave identically in terms of collation logic? And 
> if not, is there
> something I can do about it? In general, what's the impact of the codepage 
> part of a collation to
> begin with?

The two collations will probably not behave identically, since PostgreSQL uses
the operating system collations instead of having its own, and odds are that
Microsoft's collations and glibc's are slightly different.

I don't know if the impact will be large; maybe run a couple of tests to see
if the ordering is similar enough for your purposes.
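
For example, running something like this on both systems and diffing the
output would show whether the two orderings agree (an untested sketch; which
collation names exist depends on the OS and can be checked in pg_collation --
on Windows it would be e.g. "English_United States.1252" rather than "en_US"):

SELECT w
FROM (VALUES ('apple'), ('Banana'), ('Äpfel'), ('zebra'), ('éclair')) AS v(w)
ORDER BY w COLLATE "en_US";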

I don't think that the actual encoding (UTF-8 or Windows-1252) has any impact
on the ordering, but I am not certain.

Yours,
Laurenz Albe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Immutable datastore library?

2016-10-18 Thread Mike Sofen
From: Guyren Howe   Sent: Monday, October 17, 2016 1:40 PM
I would like to use Postgres as an immutable data store. A subject table
would have a timestamp column, and that would be added to what would
otherwise be the primary key.  Trap updates and turn them into inserts. Have
an associated _deleted table. Trap deletes and turn them into inserts of the
primary key into that table.  Create a view that only shows the non-deleted
records with the most recent timestamps.

Stored procedure to do all that to a table. Event trigger to drop and
re-create the view on changes to the table columns.
--
A couple years ago at another firm, I designed and built a real time ODS
(operational data store) for a large healthcare company, on SQL Server 2012
that supported your exact requirements, since the ODS was the primary data
feed for their data warehouse.

My solution leveraged the T-SQL MERGE statement ("upsert" in PG terms) to detect
inserts/updates/deletes.  We don't allow physical row deletes in medical
data, so these were logical deletes aka an update to an
"InactivatedDatetime" column making it not null.  I used a checksum function
in the Update branch to detect if the inbound data had changed at all, to
avoid creating dry updates (no change in the data but a new Update row would
be written otherwise).

Ok that's the internals for the write to the ODS.  I wrapped the entire
Merge statement inside of another insert statement using the equivalent of
the PG "Into" function, which took every column from the ODS write and wrote
the same data to a "History" table of the same name - those writes were
always inserts, creating a persistent, complete picture of every write to
the ODS.  Each row going into the History tables was marked with a "D"
(delete), "I" (insert) or "U" (update).  The History data was used for both
auditing and for nightly batches feeding the data warehouse, where row type
(D, I, or U) drove their Type 2 processing.  As you can imagine, the table
design was crucial to the success of this model.
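
In Postgres terms, a minimal sketch of that double-write might look like the
following (untested; the ods.patient tables and their columns are placeholder
names, ON CONFLICT needs 9.5+, and the "xmax = 0" test for telling inserts
from updates apart is a well-known trick rather than a documented interface):

WITH upsert AS (
    INSERT INTO ods.patient AS t (id, payload, updated_at)
    VALUES ($1, $2, now())
    ON CONFLICT (id) DO UPDATE
        SET payload = EXCLUDED.payload,
            updated_at = now()
        WHERE t.payload IS DISTINCT FROM EXCLUDED.payload   -- suppress "dry" updates
    RETURNING id, payload, updated_at, (xmax = 0) AS inserted
)
INSERT INTO ods.patient_history (id, payload, changed_at, change_type)
SELECT id, payload, updated_at,
       CASE WHEN inserted THEN 'I' ELSE 'U' END   -- 'D' comes from the logical-delete path
FROM upsert;

Because it is a single statement, both writes commit or fail together, which
matches the one-transaction behaviour described above.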

This was ultra-efficient on the real time data flowing in from the hospital
and clinical EMRs (different systems) - in one transaction/one query, I was
able to double-write the data and ensure both writes completed or fail both
and error.  The "batches" were small - up to 100k rows or less, and
processed in under 100ms.  But even when there was a network outage and we
had to do a catch up load with millions of rows, it ran very quickly.  IOWs,
the double write overhead was very modest, especially with modern disk
performance.

Mike Sofen (Synthetic Genomics)



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_upgrade not able to cope with pg_largeobject being in a different tablespace

2016-10-18 Thread Magnus Hagander
On Thu, Oct 13, 2016 at 7:35 AM, Andreas Joseph Krogh 
wrote:

> On Thursday, 13 October 2016 at 16:09:34, Bruce Momjian <
> br...@momjian.us>:
>
> On Thu, Oct 13, 2016 at 10:14:08AM +0200, Andreas Joseph Krogh wrote:
> > I would assume that having pg_largeobject in a separate tablespace is
> more and
> > more common these days, having real-cheap SAN vs. fast-SSD for normal
> tables/
> > indexes/wal.
>
> So common that no one has ever asked for this feature before?
>
>
>
> Sometimes one gets the feeling that one is the only one in the universe
> doing something one considers "quite common":-)
>
>
> > So - I'm wondering if we can fund development of pg_upgrade to cope with
> this
> > configuration or somehow motivate to getting this issue fixed?
> >
> > Would any of the PG-companies (2ndQ, EDB, PgPro) take a stab at this?
> >
> > Any feedback welcome, thanks.
>
> You would need to get buy-in that that community wants the relocation of
> pg_largeobject to be supported via an SQL command, and at that point
> pg_upgrade would be modified to support that.  It is unlikely pg_upgrade
> is going to be modified to support something that isn't supported at the
> SQL level.  Of course, you can create a custom version of pg_upgrade to
> do that.
>
>
> Doesn't "ALTER TABLE pg_largeobject SET TABLESPACE myspace_lo" count as
> being "at the SQL-level"?
>

Well, it requires that you set allow_system_table_mods on, which is
documented as a developer option. It's documented with things like "The
following parameters are intended for work on the PostgreSQL source code,
and in some cases to assist with recovery of severely damaged databases. There
should be no reason to use them on a production database.".
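
Concretely, the change looks like this (a sketch; in the 9.x releases this
parameter can only be set at server start, it remains a developer option not
meant for production, and "myspace_lo" must already exist as a tablespace):

# postgresql.conf, followed by a server restart
allow_system_table_mods = on

-- then, as superuser:
ALTER TABLE pg_largeobject SET TABLESPACE myspace_lo;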

Perhaps we should add another disclaimer there saying that if you make
changes in this mode, tools like pg_upgrade (or for that matter, pg_dump or
pretty much anything at all) may not work anymore?



> The whole problem seems to come from the fact that BLOBs are stored in
> pg_largeobject which for some reason is implemented as a system-catalogue
> in PG, which imposes all kinds of weird problems, from a DBA-perspective.
>

Yes, there are several issues related to how lo style large objects work.
I've often gone to similar implementations but in userspace on top of
custom tables to work around those.



> Can we pay you at EDB for making such a custom version of pg_upgrade for
> 9.6?
>
>
You're assuming pg_upgrade is the only potential problem. If you are
willing to spend towards it, it would probably be better to spend towards
the "upper layer" problem which would be to make it possible to move
pg_largeobject to a different tablespace *without* turning on
system_table_mods.

That said, I cannot comment on the complexity of either doing that *or*
doing a custom pg_upgrade that would support it. But solving a long-term
problem seems better than solving a one-off one.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [GENERAL] Getting the currently used sequence for a SERIAL column

2016-10-18 Thread Thomas Kellerer
> Is this to be regarded as internal API or is it safe to use this to
> find the correct sequence?

I think it's safe to use. 


> Furthermore, what's stored in the column seems to be a string of the
> format "nextval('sequencename'::regclass)". Is there a function to
> parse this, to return just the sequence name, or will the sequence
> name always be without for instance a schema name so that a naive
> parser of our own will do? Googling found no candidates.

In my experience, this could also be in the form 
"nextval('schema.sequencename'::regclass)" 
if the sequence is not in the same schema as the table.
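
A sketch of extracting the name with a regular expression and letting the
regclass cast resolve it against the search_path (untested; table and column
names are placeholders, and it assumes the default really is a plain
nextval(...) call):

SELECT substring(column_default FROM $$nextval\('([^']+)'$$)::regclass AS seq
FROM information_schema.columns
WHERE table_schema = 'public'
  AND table_name = 'mytable'
  AND column_name = 'id';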

Thomas



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Getting the currently used sequence for a SERIAL column

2016-10-18 Thread Hanne Moa
Hi,

Until now we've been using pg_get_serial_sequence() to discover
which sequence is in use, but can no longer do so due to two tables
needing to share the same sequence (prior to being properly merged. No
duplicate values, luckily). For one of the tables,
pg_get_serial_sequence() won't be returning anything useful since it
tracks which table *owns* a sequence and not which sequence is used
by which column.

The necessary information seems to be in the table
"information_schema.columns", in  "column_default". Is this to be
regarded as internal API or is it safe to use this to find the correct
sequence? It works in all cases and on all the versions of postgres
that are relevant to us. The production system is currently running
9.3 (I'm pining for 9.5...)

Furthermore, what's stored in the column seems to be a string of the
format "nextval('sequencename'::regclass)". Is there a function to
parse this, to return just the sequence name, or will the sequence
name always be without for instance a schema name so that a naive
parser of our own will do? Googling found no candidates.


HM


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Dump all the indexes/constraints/roles

2016-10-18 Thread Condor

On 18-10-2016 06:32, Patrick B wrote:

Hi guys,

I need to export an entire database to another server, for testing
purpose.

Is there any way to export all indexes and constraints ?
Postgres 9.2
Patrick



Hello,

pg_dump of the database is what you need. But if you are asking whether you 
can export the data plus the indexes as binary data, so that you can restore 
quickly without waiting for all the indexes to be created/rebuilt on the 
other server, the answer is: no, you can't export them.
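
For example, a plain dump and restore (database names are placeholders):

$ pg_dump -Fc -f mydb.dump mydb        # custom format: schema, data, index and constraint definitions
$ pg_restore -j 4 -d testdb mydb.dump  # indexes/constraints are recreated here; -j parallelises the rebuild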



Regards,
Hristo S


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Collations and codepages

2016-10-18 Thread Raimo Jormakka
Hi all,

In Windows 7, and using PostgreSQL 9.4.5, the collation gets set to
"English_United States.1252" when I select the "English, United States"
locale in the installer. In Linux, the collation is set to "en_US.UTF-8".
The encoding is set to UTF-8 in both instances.

Will these two instances behave identically in terms of collation logic?
And if not, is there something I can do about it? In general, what's the
impact of the codepage part of a collation to begin with?

Cheers,
Raimo