Re: [GENERAL] Recursive Arrays 101

2015-11-05 Thread Achilleas Mantzios

On 04/11/2015 17:53, Rob Sargent wrote:

On 11/04/2015 03:03 AM, Achilleas Mantzios wrote:

Sorry for being kind of late to the party (I was at 2015.PgConf.EU!!) and not
having read most of the replies, but what we have been successfully doing for
this problem in our app is this:

parents int[] -- where parents stores the path from the node to the root of the tree

and then these indexes:

btree (first(parents))
btree (level(parents)) -- length
btree (last(parents))
gin (parents gin__int_ops) -- the most important

This has been described as the "genealogical tree" approach, and it works very
well, IMHO much better than nested sets.

Is there a more complete description of this approach available? By the title one might assume it could be
applied to populations as opposed to phylogeny (the OP's use case). Does it deal with
consanguinity? Does it perform well going "up" the tree (which is of course branched at every level)?


From here https://en.wikipedia.org/wiki/Phylogenetic_tree I assume that
phylogenetic trees are normal trees, and I see no reason why they could not be
modeled with the genealogical approach described. The earliest paper I based
my work on was:
https://download.samba.org/pub/unpacked/ldb/ldb_sqlite3/trees.ps

Finding the root is O(1). Going "up" the tree or finding common ancestry is
reduced to the problem of finding overlaps/intersections/containment between
PostgreSQL arrays.

The indexes, functions and operators provided by contrib/intarray were
fundamental to the success of this approach.
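
To make this concrete, here is a minimal sketch of the kind of schema, indexes
and queries described above. The first/level/last helpers and all table and
column names are illustrative assumptions; this post does not show the actual
definitions.

   create extension if not exists intarray;

   create table nodes (
       id      integer primary key,
       parents integer[] not null  -- path from this node up to the root
   );

   -- illustrative helpers backing the btree indexes below; they must be
   -- immutable to be usable in index expressions
   create function first(integer[]) returns integer
       as 'select $1[1]' language sql immutable;
   create function level(integer[]) returns integer
       as 'select coalesce(array_upper($1, 1), 0)' language sql immutable;
   create function last(integer[]) returns integer
       as 'select $1[array_upper($1, 1)]' language sql immutable;

   create index nodes_root_idx   on nodes (first(parents));
   create index nodes_level_idx  on nodes (level(parents));
   create index nodes_parent_idx on nodes (last(parents));
   create index nodes_path_idx   on nodes using gin (parents gin__int_ops);

   -- all descendants of node 42, i.e. rows whose path contains 42
   -- (served by the gin index)
   select * from nodes where parents @> '{42}';

   -- common ancestors of two nodes: intersect their paths with intarray's &
   select a.parents & b.parents
   from   nodes a, nodes b
   where  a.id = 100 and b.id = 200;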


--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt



Re: [GENERAL] Deadlock detected after pg_repack receives SIGINT

2015-11-05 Thread Kevin Grittner
On Thursday, November 5, 2015 12:16 AM, Jiří Hlinka  
wrote:

> My opinion is, that pg_repack should guarantee a consistent,
> deadlock-free behaviour via proper locking policy

I would be very interested in seeing a description of what locking
policy would guarantee deadlock-free behavior when run concurrently
with unknown software.  If you have a link to a paper on the topic,
that would serve as well as a description here.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [GENERAL] Recursive Arrays 101

2015-11-05 Thread Rob Sargent

On 11/05/2015 11:08 AM, Gavin Flower wrote:

On 06/11/15 04:33, Rob Sargent wrote:

On 11/05/2015 04:56 AM, Achilleas Mantzios wrote:

On 04/11/2015 17:53, Rob Sargent wrote:

On 11/04/2015 03:03 AM, Achilleas Mantzios wrote:
Sorry for being kind of late to the party (I was at 2015.PgConf.EU!!) and 
not having read most of the replies, but what we have been successfully 
doing for this problem in our app is this:

parents int[] -- where parents stores the path from the node to the 
root of the tree

and then these indexes:
btree (first(parents))
btree (level(parents)) -- length
btree (last(parents))
gin (parents gin__int_ops) -- the most important

This has been described as the "genealogical tree" approach, and it works 
very well, IMHO much better

than nested sets.

Is there a more complete description of this approach available?  
By the title one might assume it could be applied to populations as 
opposed to phylogeny (the OP's use case). Does it deal with 
consanguinity?  Does it perform well going "up" the tree (which is 
of course branched at every level)?


From here https://en.wikipedia.org/wiki/Phylogenetic_tree I assume 
that phylogenetic trees are normal trees, and I see no reason why 
they could not be modeled with the genealogical approach described. 
The earliest paper I based my work on was:
https://download.samba.org/pub/unpacked/ldb/ldb_sqlite3/trees.ps

Finding the root is O(1). Going "up" the tree or finding common 
ancestry is reduced to the problem of finding 
overlaps/intersections/containment between PostgreSQL arrays.

The indexes, functions and operators provided by contrib/intarray 
were fundamental to the success of this approach.

Going "up" a genealogy to me means getting two parents, four 
grandparents, 8 great grandparents etc.  On a good day, at least when 
there are no loops.  This isn't, to my understanding, how phylogeny 
works (but my genetics degree was thirty year ago) so perhaps I'm 
still confused by the titles used.  And certainly not to say that 
your approach isn't what the OP really needs!



You're actually going 'DOWN' the tree, in terms of how trees are used 
in computer science & graph theory!


See http://www.mathcove.net/petersen/lessons/get-lesson?les=32


Cheers,
Gavin



Fine.  Be that way :)  Still, what about the question of loops/consanguinity?







[GENERAL] Lock contention in TransactionIdIsInProgress()

2015-11-05 Thread Julian v. Bock
Hi

I have been investigating a problem with one of our production databases
where every few weeks the database suddenly came to a grinding
halt. When this happened, queries needed much more time to complete and
just opening a local db connection with psql could take up to 30
seconds.

The PostgreSQL version is 9.2.13.

The database has a central table (about 60GB including indexes) with
records which can be searched via a web interface. The table is a
materialized view that is updated via several custom triggers on other
tables. When data is changed, all relevant records are DELETEd and then
INSERTed from a view.

The table is replicated via slony1 (2.1.4). All search queries are
executed on a slave db. The databases run on Xeon servers with 24 cores
and 96GB of ram.

I managed to reproduce the problem using the following steps:

- Perform an update on the master server which causes about 50,000 rows
  to be repeatedly deleted and inserted in a single transaction (a rough
  sketch follows these steps). For testing I did this 20 times, for about
  a million deletes and inserts. (*)

- Slony starts to replicate the individual deletes and inserts to the
  slave db.

- After this has been running for a few minutes, I run a particular
  query(**) on the slave repeatedly in up to 20 psql connections. For
  each query a new connection is opened.

- After a few seconds the backend processes go up to almost 100% cpu
  usage and take a long time to complete. When I run psql on the console
  it takes anywhere from 0s (almost immediately) to tens of seconds to
  get to the prompt. (The average time gets worse the longer the
  replication is running.)
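
A rough sketch of the kind of master-side workload described in the first
step above (the table, view and column names are hypothetical, not taken
from the report):

   begin;
   -- one pass; the update described above repeats something like this
   -- roughly 20 times, each pass touching about 50,000 rows
   delete from search_records where group_id = 42;
   insert into search_records
   select * from search_records_view where group_id = 42;
   commit;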

After doing some debugging it seems that the backend processes are
spending most of their time in spinlocks in TransactionIdIsInProgress()
trying to get a lock on ProcArrayLock. This function is called more
often (when processing searches) the longer the replication transaction
is running.

TransactionIdIsInProgress() conveniently has some debug counters
enabled via #defining XIDCACHE_DEBUG. When the backend processes
processing the searches finish their transaction, the xc_by_main_xid
counter is about the same for each search. The longer the slony
replication transaction is running, the higher the number gets.

Serious slowdowns start at about 100,000, and I had it get up to more
than 2,000,000 (not sure whether that corresponds to the number of
deleted / inserted tuples).


Is this a known problem?

Unfortunately, I have not yet managed to create a self-contained
example. Just copying the table to another database on the slave and
deleting / inserting the tuples without slony did not reproduce the
issue. I am not sure whether this is due to reduced bloat in the table
after pg_dump / pg_restore or whether it is relevant that slony deletes
and inserts the rows individually (or something entirely different).


(*) On the production server a poor interaction between triggers and the
way the application updates data caused the rows to be deleted /
inserted many times. Fortunately, we could work around this so there
should be no immediate danger that this comes up on the live db again.

(**) The search query I used to reproduce the problem does not look
particularly nasty and returns about 260 rows in a few hundred
milliseconds under normal circumstances. It does not do a full table scan
but uses a few indexes.

The returned rows are among the rows that get deleted and inserted
repeatedly.


Regards,
Julian v. Bock

-- 
Julian v. Bock   Projektleitung Software-Entwicklung
OpenIT GmbH  Tel +49 211 239 577-0
In der Steele 33a-41 Fax +49 211 239 577-10
D-40599 Düsseldorf   http://www.openit.de

HRB 38815 Amtsgericht Düsseldorf USt-Id DE 812951861
Geschäftsführer: Oliver Haakert, Maurice Kemmann




Re: [GENERAL] Recursive Arrays 101

2015-11-05 Thread Gavin Flower

On 06/11/15 04:33, Rob Sargent wrote:

On 11/05/2015 04:56 AM, Achilleas Mantzios wrote:

On 04/11/2015 17:53, Rob Sargent wrote:

On 11/04/2015 03:03 AM, Achilleas Mantzios wrote:
Sorry for being kind of late to the party (I was at 2015.PgConf.EU!!) and 
not having read most of the replies, but what we have been successfully 
doing for this problem in our app is this:

parents int[] -- where parents stores the path from the node to the 
root of the tree

and then these indexes:
btree (first(parents))
btree (level(parents)) -- length
btree (last(parents))
gin (parents gin__int_ops) -- the most important

This has been described as the "genealogical tree" approach, and it works 
very well, IMHO much better

than nested sets.

Is there a more complete description of this approach available?  By 
the title one might assume it could be applied to populations as 
opposed to phylogeny (the OP's use case).  Does it deal with 
consanguinity?  Does it perform well going "up" the tree (which is 
of course branched at every level)?


From here https://en.wikipedia.org/wiki/Phylogenetic_tree I assume 
that phylogenetic trees are normal trees, and I see no reason why 
they could not be modeled with the genealogical approach described. 
The earliest paper I based my work on was:
https://download.samba.org/pub/unpacked/ldb/ldb_sqlite3/trees.ps

Finding the root is O(1). Going "up" the tree or finding common 
ancestry is reduced to the problem of finding 
overlaps/intersections/containment between PostgreSQL arrays.

The indexes, functions and operators provided by contrib/intarray 
were fundamental to the success of this approach.

Going "up" a genealogy to me means getting two parents, four 
grandparents, 8 great grandparents etc.  On a good day, at least when 
there are no loops.  This isn't, to my understanding, how phylogeny 
works (but my genetics degree was thirty year ago) so perhaps I'm 
still confused by the titles used.  And certainly not to say that your 
approach isn't what the OP really needs!



You're actually going 'DOWN' the tree, in terms of how trees are used in 
computer science & graph theory!


See http://www.mathcove.net/petersen/lessons/get-lesson?les=32


Cheers,
Gavin




Re: [GENERAL] Lock contention in TransactionIdIsInProgress()

2015-11-05 Thread Jeff Janes
On Thu, Nov 5, 2015 at 10:38 AM, Julian v. Bock  wrote:

> After doing some debugging it seems that the backend processes are
> spending most of their time in spinlocks in TransactionIdIsInProgress()
> trying to get a lock on ProcArrayLock. This function is called more
> often (when processing searches) the longer the replication transaction
> is running.

This is a known issue and is fixed for 9.6 in commit 8a7d0701814a4e.

I don't know why you can't reproduce it without Slony, though.  I
know that hot standby gets bit by this problem particularly hard
because it sometimes refrains from setting hint bits as aggressively
as it could.  I would not think that Slony would have that problem,
though.

Cheers,

Jeff




Re: [GENERAL] is there any difference DROP PRIMARY KEY in oracle and postgres?

2015-11-05 Thread Melvin Davidson
In PostgreSQL, there is no “DROP PRIMARY KEY”. Instead, you drop the
underlying constraint:
ALTER TABLE table_name DROP CONSTRAINT constraint_name;

http://www.postgresql.org/docs/9.4/interactive/sql-altertable.html


On Thu, Nov 5, 2015 at 10:52 PM, M Tarkeshwar Rao <
m.tarkeshwar@ericsson.com> wrote:

> Hi,
>
>
>
> One thing: in Oracle, is there any difference between “DROP PRIMARY KEY”
> (used directly in Oracle to drop the primary key) and “DROP CONSTRAINT
> CDRAUDITPOINT_pk”? The first syntax is not available in Postgres, where we
> need to give the primary key's constraint name to drop it. So right now, to
> delete the primary key, I am using the second approach; is there any
> difference between the two?
>
>
>
> Regards
>
> Tarkeshwar
>
>
>



-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


[GENERAL] is there any difference DROP PRIMARY KEY in oracle and postgres?

2015-11-05 Thread M Tarkeshwar Rao
Hi,

One thing: in Oracle, is there any difference between "DROP PRIMARY KEY" (used 
directly in Oracle to drop the primary key) and "DROP CONSTRAINT CDRAUDITPOINT_pk"? 
The first syntax is not available in Postgres, where we need to give the primary 
key's constraint name to drop it. So right now, to delete the primary key, I am 
using the second approach; is there any difference between the two?

Regards
Tarkeshwar



Re: [GENERAL] is there any difference DROP PRIMARY KEY in oracle and postgres?

2015-11-05 Thread Joshua D. Drake

On 11/05/2015 07:52 PM, M Tarkeshwar Rao wrote:

Hi,

One thing: in Oracle, is there any difference between “DROP PRIMARY KEY”
(used directly in Oracle to drop the primary key) and “DROP CONSTRAINT
CDRAUDITPOINT_pk”? The first syntax is not available in Postgres, where we
need to give the primary key's constraint name to drop it. So right
now, to delete the primary key, I am using the second approach; is there any
difference between the two?


No. As a PRIMARY KEY is technically a constraint, we use ALTER TABLE 
DROP CONSTRAINT.
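
For example, against the constraint named in the original question (the table
name below is only a guess inferred from that constraint name, and the
spelling/quoting must match however the constraint was actually created):

   alter table cdrauditpoint drop constraint cdrauditpoint_pk;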


JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
New rule for social situations: "If you think to yourself not even
JD would say this..." Stop and shut your mouth. It's going to be bad.




Re: [GENERAL] Deadlock detected after pg_repack receives SIGINT

2015-11-05 Thread Jiří Hlinka
Hi Kevin,

my point was that pg_repack deadlocked itself. I think it should be possible
to guarantee deadlock-free behavior at least via an advisory lock for
pg_repack's own operations (I understand it is not possible to guarantee this
across multiple applications). If that is not true, I'd be glad to hear I'm
wrong (really!).

Thanks,
Jiri

On Thu, Nov 5, 2015 at 5:43 PM, Kevin Grittner  wrote:

> On Thursday, November 5, 2015 12:16 AM, Jiří Hlinka 
> wrote:
>
> > My opinion is, that pg_repack should guarantee a consistent,
> > deadlock-free behaviour via proper locking policy
>
> I would be very interested in seeing a description of what locking
> policy would guarantee deadlock-free behavior when run concurrently
> with unknown software.  If you have a link to a paper on the topic,
> that would serve as well as a description here.
>
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



-- 
Bc. Jiří Hlinka
Tel.: 725 315 263


Re: [GENERAL] Deadlock detected after pg_repack receives SIGINT

2015-11-05 Thread Michael Paquier
On Fri, Nov 6, 2015 at 4:08 PM, Jiří Hlinka  wrote:
> my point was that pg_repack deadlocked itself. I think it should be possible
> to guarantee deadlock-free behavior at least via an advisory lock for
> pg_repack's own operations (I understand it is not possible to guarantee this
> across multiple applications). If that is not true, I'd be glad to hear I'm
> wrong (really!).

Er, well. Based on the information given, there is actually no evidence
that pg_repack deadlocked. The code path calling the cleanup callback
after receiving SIGINT waits for the transaction working on the trigger
being dropped to finish. We could only say that there is a deadlock if
the transaction inserting data into repack.log% is actually holding a
lock that conflicts with the trigger being dropped.
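
For what it's worth, one rough way to check that from another session is to
look at pg_locks; a generic sketch, not specific to pg_repack:

   -- for each lock request that is waiting on a relation, show who already
   -- holds a granted lock on that same relation
   select w.pid as waiting_pid, w.mode as wanted_mode,
          h.pid as holding_pid, h.mode as held_mode,
          w.relation::regclass as relation
   from pg_locks w
   join pg_locks h on h.relation = w.relation and h.granted
   where not w.granted;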
-- 
Michael




Re: [GENERAL] is there any difference DROP PRIMARY KEY in oracle and postgres?

2015-11-05 Thread Thomas Kellerer
M Tarkeshwar Rao wrote on 06.11.2015 at 04:52:
> One thing: in Oracle, is there any difference between “DROP PRIMARY KEY”
> (used directly in Oracle to drop the primary key) and “DROP CONSTRAINT
> CDRAUDITPOINT_pk”? The first syntax is not available in Postgres, where we
> need to give the primary key's constraint name to drop it. So right now, to
> delete the primary key, I am using the second approach; is there any
> difference between the two?


Unlike Oracle, Postgres gives the PK constraint a sensible (and reproducible) 
name. 

So even if you did not specify a constraint name when creating the table, you 
know the name: it's always "tablename_pkey".

The statement: 

   create table foo (id integer primary key);

will create a PK constraint named "foo_pkey", and therefore you can drop it 
using:

   alter table foo drop constraint foo_pkey;

I don't know which name gets chosen when the table name is so long that adding 
_pkey would yield an identifier that is too long (more than 63 characters).

But having an "alter table drop primary key" would indeed be nice. 
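
If in doubt (for instance with very long table names), the actual constraint 
name can always be looked up in the catalog; a quick sketch:

   select conname
   from   pg_constraint
   where  conrelid = 'foo'::regclass
   and    contype = 'p';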





Re: [GENERAL] Recursive Arrays 101

2015-11-05 Thread Rob Sargent

On 11/05/2015 04:56 AM, Achilleas Mantzios wrote:

On 04/11/2015 17:53, Rob Sargent wrote:

On 11/04/2015 03:03 AM, Achilleas Mantzios wrote:
Sorry for being kind of late to the party (I was at 2015.PgConf.EU!!) and 
not having read most of the replies, but what we have been successfully 
doing for this problem in our app is this:

parents int[] -- where parents stores the path from the node to the 
root of the tree

and then these indexes:
btree (first(parents))
btree (level(parents)) -- length
btree (last(parents))
gin (parents gin__int_ops) -- the most important

This has been described as the "genealogical tree" approach, and it works 
very well, IMHO much better

than nested sets.

Is there a more complete description of this approach available?  By 
the title one might assume it could be applied to populations as opposed 
to phylogeny (the OP's use case).  Does it deal with consanguinity?  
Does it perform well going "up" the tree (which is of course branched 
at every level)?


From here https://en.wikipedia.org/wiki/Phylogenetic_tree I assume 
that phylogenetic trees are normal trees, and I see no reason why 
they could not be modeled with the genealogical approach described. 
The earliest paper I based my work on was:
https://download.samba.org/pub/unpacked/ldb/ldb_sqlite3/trees.ps

Finding the root is O(1). Going "up" the tree or finding common 
ancestry is reduced to the problem of finding 
overlaps/intersections/containment between PostgreSQL arrays.

The indexes, functions and operators provided by contrib/intarray 
were fundamental to the success of this approach.

Going "up" a genealogy to me means getting two parents, four 
grandparents, 8 great grandparents etc.  On a good day, at least when 
there are no loops.  This isn't, to my understanding, how phylogeny 
works (but my genetics degree was thirty year ago) so perhaps I'm still 
confused by the titles used.  And certainly not to say that your 
approach isn't what the OP really needs!