Re: [HACKERS] Enhancement to pg_dump

2008-11-27 Thread Rob Kirkbride

Gregory Stark wrote:

There is documentation

http://www.postgresql.org/docs/8.3/static/explicit-locking.html

However I found it very confusing when I was first learning. It's not really
the documentation's fault either, there are just a lot of different lock
levels with a lot of different combinations possible.

All DML, even selects, take a table-level shared lock on the tables involved
which blocks the tables from being dropped or truncated while the query is
running.

DELETE and UPDATE (and SELECT FOR UPDATE) take exclusive row-level locks. A
SELECT can read the old version of the record but another UPDATE will block
until your transaction finishes so it can update the most recent version. But
an update which doesn't need to look at that record won't be affected at all.

TRUNCATE and DROP take exclusive table-level locks which blocks anyone else
from even selecting from the table. It also means they can't proceed until all
queries which have already started reading the table finish.

DROP is still a lot heavier than TRUNCATE because it also has to drop (or
search for and throw an error) anything else dependent on the table. triggers,
views, etc.

  


Thanks for that  - it's very useful. As you say I believe the 
documentation is pretty good, it's just that we're not dealing in simple 
issues here.


I definitely think I should do a delete rather than a truncate (or drop) 
in my case.



Regards

Rob

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Enhancement to pg_dump

2008-11-26 Thread Gregory Stark
"Rob Kirkbride" <[EMAIL PROTECTED]> writes:

> I must admit I've not read up on the various locks that are set so that's a
> good point. Is there a good reference for me to read and understand these?
>
> I'm guessing though that a delete from and then an insert never requires an
> exclusive lock, what about adding/deleting constraints?

There is documentation

http://www.postgresql.org/docs/8.3/static/explicit-locking.html

However I found it very confusing when I was first learning. It's not really
the documentation's fault either, there are just a lot of different lock
levels with a lot of different combinations possible.

All DML, even selects, take a table-level shared lock on the tables involved
which blocks the tables from being dropped or truncated while the query is
running.

DELETE and UPDATE (and SELECT FOR UPDATE) take exclusive row-level locks. A
SELECT can read the old version of the record but another UPDATE will block
until your transaction finishes so it can update the most recent version. But
an update which doesn't need to look at that record won't be affected at all.

TRUNCATE and DROP take exclusive table-level locks which blocks anyone else
from even selecting from the table. It also means they can't proceed until all
queries which have already started reading the table finish.

DROP is still a lot heavier than TRUNCATE because it also has to drop (or
search for and throw an error) anything else dependent on the table. triggers,
views, etc.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Enhancement to pg_dump

2008-11-26 Thread Rob Kirkbride
I must admit I've not read up on the various locks that are set so that's a
good point. Is there a good reference for me to read and understand these?

I'm guessing though that a delete from and then an insert never requires an
exclusive lock, what about adding/deleting constraints?

Rob



2008/11/26 Gregory Stark <[EMAIL PROTECTED]>

> Rob Kirkbride <[EMAIL PROTECTED]> writes:
>
> > Richard,
> >
> > Yes, I've changed it use TRUNCATE rather than DELETE and it's working
> well for
> > us now.
>
> I'm a bit surprised actually as it sounded like you were aiming to avoid
> the
> table lock. A TRUNCATE does require an exclusive lock on the table. It
> still
> has advantages over DROP in that there is no window when the table does not
> exist and any existing references to the table from views or functions will
> continue to function.
>
>
> --
>   Gregory Stark
>  EnterpriseDB  http://www.enterprisedb.com
>   Ask me about EnterpriseDB's RemoteDBA services!
>


Re: [HACKERS] Enhancement to pg_dump

2008-11-26 Thread Gregory Stark
Rob Kirkbride <[EMAIL PROTECTED]> writes:

> Richard,
>
> Yes, I've changed it use TRUNCATE rather than DELETE and it's working well for
> us now.

I'm a bit surprised actually as it sounded like you were aiming to avoid the
table lock. A TRUNCATE does require an exclusive lock on the table. It still
has advantages over DROP in that there is no window when the table does not
exist and any existing references to the table from views or functions will
continue to function.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Enhancement to pg_dump

2008-11-26 Thread Rob Kirkbride

Richard,

Yes, I've changed it use TRUNCATE rather than DELETE and it's working 
well for us now.


The switching of the database is a good idea - thanks. Unfortunately, 
we've not got enough disk space currently to do that, but if we get 
problems in the future that will definitely be something we'll consider.


Rob

Richard Huxton wrote:

Rob Kirkbride wrote:
  

I've introduced a --delete-not-drop option which simply does a DELETE FROM %
rather than 'DROP and then CREATE'.



Beware foreign-keys slowing you - TRUNCATE all relevant tables should be
the fastest method if possible.

  

I hope this sounds sensible and I haven't missed something - I'm still
learning!



Have you considered restoring to a completely different database
(report1/report2) and just switching between them?

  



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Enhancement to pg_dump

2008-11-26 Thread Richard Huxton
Rob Kirkbride wrote:
> I've introduced a --delete-not-drop option which simply does a DELETE FROM %
> rather than 'DROP and then CREATE'.

Beware foreign-keys slowing you - TRUNCATE all relevant tables should be
the fastest method if possible.

> I hope this sounds sensible and I haven't missed something - I'm still
> learning!

Have you considered restoring to a completely different database
(report1/report2) and just switching between them?

-- 
  Richard Huxton
  Archonet Ltd

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Enhancement to pg_dump

2008-11-25 Thread Rob Kirkbride
OK thanks for the advice.

What I'm trying to overcome is where we've got a long report running and the
process that is taking data from the main database cannot complete because
of the drop table. I believe a DELETE (and possibly TRUNCATE?) doesn't need
an exclusive lock on the table and therefore can continue.

I've introduced a --delete-not-drop option which simply does a DELETE FROM %
rather than 'DROP and then CREATE'.

I hope this sounds sensible and I haven't missed something - I'm still
learning!

Rob


2008/11/25 Gregory Stark <[EMAIL PROTECTED]>

> "Rob Kirkbride" <[EMAIL PROTECTED]> writes:
>
> > Once I'm happy with it (I'm a bit rusty at C!), do I post the patch here?
>
> I would say you should post *before* you have a patch you're happy with. As
> soon as you have a specific plan of what you want to do it's best to post
> an
> outline of it. That way you at least have a chance of avoiding wasting work
> in
> the wrong direction.
>
> Sometimes things don't really work out that way -- sometimes the plan
> sounds
> good and it only becomes apparent there's a better way later -- but you're
> best off getting the best chance you can.
>
> Incidentally, I don't know exactly what the use case you're trying to cover
> here is but you should consider looking at TRUNCATE instead of DELETE if
> you're really deleting all the records in the table and can accept locking
> the
> table.
>
> --
>  Gregory Stark
>  EnterpriseDB  http://www.enterprisedb.com
>  Ask me about EnterpriseDB's Slony Replication support!
>


Re: [HACKERS] Enhancement to pg_dump

2008-11-25 Thread Gregory Stark
"Rob Kirkbride" <[EMAIL PROTECTED]> writes:

> Once I'm happy with it (I'm a bit rusty at C!), do I post the patch here?

I would say you should post *before* you have a patch you're happy with. As
soon as you have a specific plan of what you want to do it's best to post an
outline of it. That way you at least have a chance of avoiding wasting work in
the wrong direction.

Sometimes things don't really work out that way -- sometimes the plan sounds
good and it only becomes apparent there's a better way later -- but you're
best off getting the best chance you can.

Incidentally, I don't know exactly what the use case you're trying to cover
here is but you should consider looking at TRUNCATE instead of DELETE if
you're really deleting all the records in the table and can accept locking the
table.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Enhancement to pg_dump

2008-11-25 Thread Rob Kirkbride
Dave,

Ok thanks. Yes, we've got over 1/2 billion rows in one of our tables which
is interesting!

Will post back soon.

Rob

2008/11/25 Dave Page <[EMAIL PROTECTED]>

> On Tue, Nov 25, 2008 at 8:39 PM, Rob Kirkbride <[EMAIL PROTECTED]>
> wrote:
> > Hi,
> >
> > I'm very new to hacking postgresql but am using on a very big site
> > (http://ojp.nationalrail.co.uk). One of the issues that we have is
> moving
> > data from a live database to a reports one. I've hacked an extra option
> to
> > pg_dump to delete from tables rather than dropping them.
>
> National Rail use Postgres for their journey planner? Cool :-)
>
> > Once I'm happy with it (I'm a bit rusty at C!), do I post the patch here?
>
> Yes (and please add details to
> http://wiki.postgresql.org/wiki/CommitFestOpen so it doesn't get
> lost), but please note that we're in the middle of the final phase of
> the development cycle at the moment, so new patches are unlikely to be
> looked at for at least a couple of months.
>
>
> --
> Dave Page
> EnterpriseDB UK:   http://www.enterprisedb.com
>


Re: [HACKERS] Enhancement to pg_dump

2008-11-25 Thread Dave Page
On Tue, Nov 25, 2008 at 8:39 PM, Rob Kirkbride <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'm very new to hacking postgresql but am using on a very big site
> (http://ojp.nationalrail.co.uk). One of the issues that we have is moving
> data from a live database to a reports one. I've hacked an extra option to
> pg_dump to delete from tables rather than dropping them.

National Rail use Postgres for their journey planner? Cool :-)

> Once I'm happy with it (I'm a bit rusty at C!), do I post the patch here?

Yes (and please add details to
http://wiki.postgresql.org/wiki/CommitFestOpen so it doesn't get
lost), but please note that we're in the middle of the final phase of
the development cycle at the moment, so new patches are unlikely to be
looked at for at least a couple of months.


-- 
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Enhancement to pg_dump

2008-11-25 Thread Rob Kirkbride
Hi,

I'm very new to hacking postgresql but am using on a very big site (
http://ojp.nationalrail.co.uk). One of the issues that we have is moving
data from a live database to a reports one. I've hacked an extra option to
pg_dump to delete from tables rather than dropping them.

Once I'm happy with it (I'm a bit rusty at C!), do I post the patch here?

Thanks

Rob


Re: [HACKERS] enhancement to pg_dump: supress columns

2005-10-14 Thread Merlin Moncure
> On N, 2005-10-13 at 15:13 -0400, Merlin Moncure wrote:
> > I have a situation where I need to hack pg_dump not to dump columns
with
> > a particular name.  If this is of interest to the community I can
spend
> > a little extra effort and work up a patch.  I'd be curious to see if
> > anyone else thinks this is worthwhile.
> >
> > Why would I want to do this?  I use a global sequence for a database
> > wide unique identifier for purposes of locking (to hook into the
user
> > lock module).  This works great but our clients frequently like to
make
> > copies of data for testing purposes and a dump/reload into a
separate
> > schema makes a copy of the generated identifier in the database.
> >
> > Basically, I need a field to revert to default in a dump/reload
cycle.
> > A command line switch to pg_dump seems the easiest way to handle
this.
> > A specialized domain qualifier which prevents the column from being
> > dumped is perhaps more elegant but more work.
> 
> I think that general ability to *exclude* things (schemas, tables,
> functions) from pg_dump output would be great:
> 
> pg_dump mydb -X schema.view -X "schema.function(int,text,text)" -X
> "schema.table.*fieldnamepart*"


hmm, ok, that works.  Also schema.domain

Merlin

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] enhancement to pg_dump: supress columns

2005-10-14 Thread Hannu Krosing
On N, 2005-10-13 at 15:13 -0400, Merlin Moncure wrote:
> I have a situation where I need to hack pg_dump not to dump columns with
> a particular name.  If this is of interest to the community I can spend
> a little extra effort and work up a patch.  I'd be curious to see if
> anyone else thinks this is worthwhile.
> 
> Why would I want to do this?  I use a global sequence for a database
> wide unique identifier for purposes of locking (to hook into the user
> lock module).  This works great but our clients frequently like to make
> copies of data for testing purposes and a dump/reload into a separate
> schema makes a copy of the generated identifier in the database.
> 
> Basically, I need a field to revert to default in a dump/reload cycle.
> A command line switch to pg_dump seems the easiest way to handle this.
> A specialized domain qualifier which prevents the column from being
> dumped is perhaps more elegant but more work.  

I think that general ability to *exclude* things (schemas, tables,
functions) from pg_dump output would be great:

pg_dump mydb -X schema.view -X "schema.function(int,text,text)" -X
"schema.table.*fieldnamepart*"

-- 
Hannu Krosing <[EMAIL PROTECTED]>


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] enhancement to pg_dump: supress columns

2005-10-14 Thread Merlin Moncure
Christopher wrote:
> A general ability to be able to dump views as if they were tables
would
> be more broadly applicable methinks?
> 
> Merlin Moncure wrote:
> > I have a situation where I need to hack pg_dump not to dump columns
with
> > a particular name.  If this is of interest to the community I can
spend
> > a little extra effort and work up a patch.  I'd be curious to see if
> > anyone else thinks this is worthwhile.
> >

Yes, although that would not solve my particular problem.  Since I don't
want to dump a particular column that is present in all tables, using
views would mean creating a lot of views for one purpose :(.  Basically,
I have a simulated oid which is superior to the provided oid in all ways
except one, in that the oid allows you to keep it from dumping.

Another example of a column you might not want to dump is
passwords...although that's a fairly weak case.  hmm, I might be out in
the wilderness on this one.

Merlin

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] enhancement to pg_dump: supress columns

2005-10-13 Thread Christopher Kings-Lynne
A general ability to be able to dump views as if they were tables would 
be more broadly applicable methinks?


Merlin Moncure wrote:

I have a situation where I need to hack pg_dump not to dump columns with
a particular name.  If this is of interest to the community I can spend
a little extra effort and work up a patch.  I'd be curious to see if
anyone else thinks this is worthwhile.

Why would I want to do this?  I use a global sequence for a database
wide unique identifier for purposes of locking (to hook into the user
lock module).  This works great but our clients frequently like to make
copies of data for testing purposes and a dump/reload into a separate
schema makes a copy of the generated identifier in the database.

Basically, I need a field to revert to default in a dump/reload cycle.
A command line switch to pg_dump seems the easiest way to handle this.
A specialized domain qualifier which prevents the column from being
dumped is perhaps more elegant but more work.  


Merlin

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[HACKERS] enhancement to pg_dump: supress columns

2005-10-13 Thread Merlin Moncure
I have a situation where I need to hack pg_dump not to dump columns with
a particular name.  If this is of interest to the community I can spend
a little extra effort and work up a patch.  I'd be curious to see if
anyone else thinks this is worthwhile.

Why would I want to do this?  I use a global sequence for a database
wide unique identifier for purposes of locking (to hook into the user
lock module).  This works great but our clients frequently like to make
copies of data for testing purposes and a dump/reload into a separate
schema makes a copy of the generated identifier in the database.

Basically, I need a field to revert to default in a dump/reload cycle.
A command line switch to pg_dump seems the easiest way to handle this.
A specialized domain qualifier which prevents the column from being
dumped is perhaps more elegant but more work.  

Merlin

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly