Re: [PERFORM] unlogged tables

2015-04-14 Thread Jeff Janes
On Tue, Apr 14, 2015 at 8:41 AM, Yves Dorfsman y...@zioup.com wrote:

 On 2015-04-13 17:49, Jeff Janes wrote:
 
  One way would be to lock dirty buffers from unlogged relations into
  shared_buffers (which hardly seems like a good thing) until the start of a
  super-checkpoint and then write them all out as fast as possible (which kind
  of defeats checkpoint_completion_target).  And then if the crash happened
  during a super-checkpoint, the data would still be inconsistent and need to
  be truncated.
 

 What do you call a super-checkpoint?


A hypothetical checkpoint which includes writing and flushing pages of
unlogged tables.

Presumably you wouldn't want every checkpoint to do this, because if done
the way I described the super-checkpoint is a vulnerable period.  Crashes
that happen during it would result in truncation of the unlogged relation.
Since that is the very thing we want to avoid, you would want to make these
vulnerable periods rare.

Cheers,

Jeff


Re: [PERFORM] unlogged tables

2015-04-14 Thread Jeff Janes
On Mon, Apr 13, 2015 at 8:28 PM, David G. Johnston 
david.g.johns...@gmail.com wrote:

 On Mon, Apr 13, 2015 at 7:45 PM, Jim Nasby jim.na...@bluetreble.com
 wrote:



 There's been recent discussion of adding support for read-only tables. If
 we had those, we might be able to support something like...

 INSERT INTO unlogged;
 ALTER TABLE unlogged SET READ ONLY;
 CHECKPOINT;
 /* take backup */

 This should be safe as long as we WAL log changes to read-only status
 (which presumably we would).

 How much work that would entail though, I don't know.


Right.  I've been keeping an eye on that discussion with the same
intention.  The big question is how, during recovery, the system would know
what state the table was in without being able to read the system
catalogs.  Perhaps it would be the checkpointer's duty, at the end of the
checkpoint, to remove the init fork for unlogged relations that were turned
read-only before that checkpoint started.



 Ultimately you still have to get the data over to the other machine
 anyway. ISTM it'd be a LOT more useful to look at ways to make the WAL
 logging of bulk inserts (and especially COPY into a known empty table) a
 lot more efficient.


 Jeff Janes makes a comment about wanting "...to avoid blowing out the
 log archive..."; which I also don't quite follow...


I think the WAL logging of bulk COPY is pretty space-efficient already,
provided the table is not indexed at the time of the COPY.  But no amount of
efficiency improvement is going to make the WAL small enough for me to want
to keep the logs around beyond the next base backup.
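
(For concreteness, the kind of load I mean is the usual pattern of COPYing
into an empty, index-free table and building indexes only afterwards; a
minimal sketch, where the table, file, and column names are purely
illustrative:

  CREATE TABLE staging_events (id bigint, payload text);
  COPY staging_events FROM '/tmp/events.csv' WITH (FORMAT csv);
  CREATE INDEX ON staging_events (id);  -- build the index after the COPY
)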

What I would really want is a way to make two separate WAL streams: changes
to this set of tables go to the "keep forever, for PITR" stream, and
changes to this other set of tables go to the "keep until pg_basebackup is
next run" stream.  Of course you couldn't have FK constraints between the
two different sets of tables.

Having to get the data over to the other machine doesn't bother me; it is
just a question of how to do it without permanently intermingling it with
WAL logs which I want to keep forever.

The FDW would be a good option, except the overhead (both execution
overhead and the overhead of poor plans) seems to be too large.  I haven't
explored it as much as I would like.
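
(The arrangement I have been meaning to try is just the stock postgres_fdw
setup; a minimal sketch, where the server name, connection options, and
table definition are only illustrative:

  CREATE EXTENSION postgres_fdw;
  CREATE SERVER archive_srv FOREIGN DATA WRAPPER postgres_fdw
      OPTIONS (host 'archive-host', dbname 'archive', port '5432');
  CREATE USER MAPPING FOR CURRENT_USER SERVER archive_srv
      OPTIONS (user 'archiver', password 'secret');
  -- queries against events_remote execute on the remote server
  CREATE FOREIGN TABLE events_remote (id bigint, payload text)
      SERVER archive_srv OPTIONS (schema_name 'public', table_name 'events');
)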

Cheers,

Jeff


Re: [PERFORM] unlogged tables

2015-04-14 Thread dgabriel
David G Johnston wrote
 Well, that is half right anyway.  UNLOGGED tables obey checkpoints just
 like any other table.  The missing feature is an option to leave the table
 as it was at the last checkpoint.  Instead, not knowing whether there were
 changes since the last checkpoint, the system truncates the relation.

 What use case is there for a behavior where the last-checkpoint data is
 left in the relation upon restarting, without knowing whether other data
 could have been written subsequently?

If it is possible to restore the table to its state as of the last checkpoint,
that will be more than enough. I don't care about the changes since the last
checkpoint; I am willing to lose those changes.  There are use cases where it
is acceptable to lose some data; for example, in a cache system it is not a
big issue if we lose some cached data.
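
(For example, the sort of thing I have in mind is just an unlogged cache
table like the sketch below; the names are made up, and today the whole
table is truncated after a crash rather than kept at its last-checkpoint
state:

  CREATE UNLOGGED TABLE page_cache (
      cache_key  text PRIMARY KEY,
      value      bytea,
      expires_at timestamptz
  );
)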







Re: [PERFORM] unlogged tables

2015-04-14 Thread Yves Dorfsman
On 2015-04-13 17:49, Jeff Janes wrote:
 
 One way would be to lock dirty buffers from unlogged relations into
 shared_buffers (which hardly seems like a good thing) until the start of a
 super-checkpoint and then write them all out as fast as possible (which kind
 of defeats checkpoint_completion_target).  And then if the crash happened
 during a super-checkpoint, the data would still be inconsistent and need to be
 truncated.
  

What do you call a super-checkpoint?

-- 
http://yves.zioup.com
gpg: 4096R/32B0F416





[PERFORM] Performance of vacuumlo

2015-04-14 Thread Andreas Joseph Krogh
Hi all.

I have a pg_largeobject of ~300GB size, and when I run "vacuumlo -n dbname" I
get:

  Would remove 82172 large objects from database "dbname".

So I'm running without -n to do the actual work, but it seems to take forever.
The disks are 8 SAS 10K HDD drives in RAID5.

Any hints on how long this is supposed to take?
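
(A rough way to watch progress seems to be to count the large objects that
are still present, from psql; this assumes vacuumlo unlinks the orphaned
objects one at a time, and that the server is 9.0 or later, where
pg_largeobject_metadata exists:

  SELECT count(*) FROM pg_largeobject_metadata;
)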

Thanks.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com

Re: [PERFORM] Some performance testing?

2015-04-14 Thread Josh Berkus
Scott,

 Can confirm that for pg purposes, 3.2 is basically broken in some not
 too great ways. We've had servers that were overloaded at load factors
 of 20 or 30 drop down to 5 or less with just a change from 3.2 to
 3.11/3.13 on Ubuntu 12.04.

That's correct, and 3.5 shares the same problems.  The underlying issue
was that 3.X was tweaked to be MUCH more aggressive about
cache-clearing, to the point where it would evict data from the FS cache
which had just been read in and hadn't even been used yet.  For some
reason, this aggressive eviction got worse the more processes on the
system were using the FS cache, so you really see it when you have more
cache-using processes than you have cores.

It's pretty easy to demonstrate just using pgbench, with a database
larger than RAM, and 2X as many clients as cores.  You'll see that
kernels 3.2 and 3.5 will do 3X to 5X as much IO for the same workload as
3.10 and later will do.

Graeme,

On 04/09/2015 04:01 AM, Graeme B. Bell wrote:

 performance with 2.6 (pgbench, size 100, 32 clients)

  48 651 transactions per second (read only)
  6 504 transactions per second (read-write)

 performance with 3.18 (pgbench, size 100, 32 clients)

  129 303 transactions per second (read only)
  16 895 transactions per second (read-write)

Thanks for that data!  I'm glad to see that 3.18 has improved so much.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [PERFORM] unlogged tables

2015-04-14 Thread Jim Nasby

On 4/14/15 10:56 AM, dgabriel wrote:

David G Johnston wrote

Well, that is half right anyway.  UNLOGGED tables obey checkpoints just
like any other table.  The missing feature is an option to leave the table
as it was at the last checkpoint.  Instead, not knowing whether there were
changes since the last checkpoint, the system truncates the relation.

What use case is there for a behavior where the last-checkpoint data is
left in the relation upon restarting, without knowing whether other data
could have been written subsequently?


If it is possible to restore the table to its state as of the last checkpoint,
that will be more than enough. I don't care about the changes since the last
checkpoint; I am willing to lose those changes.  There are use cases where it
is acceptable to lose some data; for example, in a cache system it is not a
big issue if we lose some cached data.


It is not. Unless you ensure that data is written to WAL (on disk) 
BEFORE it is written to the data pages, you will probably have 
corruption after a crash, and have no way to prevent or possibly even 
detect the corruption.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

