Re: [HACKERS] [RFC] Incremental backup v3: incremental PoC

2015-01-27 Thread Giuseppe Broccolo
Hi Marco,

On 16/01/15 16:55, Marco Nenciarini wrote:
> On 14/01/15 17:22, Gabriele Bartolini wrote:
> >
> > My opinion, Marco, is that for version 5 of this patch, you:
> >
> > 1) update the information on the wiki (it is outdated - I know you have
> > been busy with LSN map optimisation)
>
> Done.
>
> > 2) modify pg_basebackup in order to accept a directory (or tar file) and
> > automatically detect the LSN from the backup profile
>
> New version of patch attached. The -I parameter now requires a backup
> profile from a previous backup. I've added a sanity check that forbids
> incremental file-level backups if the base timeline is different from
> the current one.
>
> > 3) add the documentation regarding the backup profile and pg_basebackup
> >
>
> Next on my TODO list.
>
> > Once we have all of this, we can continue trying the patch. Some
> > unexplored paths are:
> >
> > * tablespace usage
>
> I've improved my pg_restorebackup python PoC. It now supports tablespaces.

About tablespaces, I noticed that any reference to tablespace locations is
lost during the recovery of an incremental backup when the tablespace
mapping is changed (-T option). Here are the steps I followed:

   - create and fill a test database with pgbench:

   psql -c "CREATE DATABASE pgbench"
   pgbench -U postgres -i -s 5 -F 80 pgbench

   - take a first base backup with pg_basebackup:

   mkdir -p backups/$(date '+%d%m%y%H%M')/data && pg_basebackup -v -F p -D backups/$(date '+%d%m%y%H%M')/data -x

   - create a new tablespace and move the table "pgbench_accounts" into it:

   mkdir -p /home/gbroccolo/pgsql/tbls
   psql -c "CREATE TABLESPACE tbls LOCATION '/home/gbroccolo/pgsql/tbls'"
   psql -c "ALTER TABLE pgbench_accounts SET TABLESPACE tbls" pgbench

   - do some work on the database:

   pgbench -U postgres -T 120 pgbench

   - take a second, incremental backup with pg_basebackup, specifying the
   new tablespace location through the tablespace mapping:

   mkdir -p backups/$(date '+%d%m%y%H%M')/data backups/$(date '+%d%m%y%H%M')/tbls && \
       pg_basebackup -v -F p -D backups/$(date '+%d%m%y%H%M')/data -x \
           -I backups/2601151641/data/backup_profile \
           -T /home/gbroccolo/pgsql/tbls=/home/gbroccolo/pgsql/backups/$(date '+%d%m%y%H%M')/tbls

   - run a recovery using the pg_restorebackup.py tool attached to
   http://www.postgresql.org/message-id/54b9428e.9020...@2ndquadrant.it

   ./pg_restorebackup.py backups/2601151641/data backups/2601151707/data /tmp/data \
       -T /home/gbroccolo/pgsql/backups/2601151707/tbls=/tmp/tbls


In the last step, I obtained the following stack trace:

Traceback (most recent call last):
  File "./pg_restorebackup.py", line 74, in 
shutil.copy2(base_file, dest_file)
  File "/home/gbroccolo/.pyenv/versions/2.7.5/lib/python2.7/shutil.py",
line 130, in copy2
copyfile(src, dst)
  File "/home/gbroccolo/.pyenv/versions/2.7.5/lib/python2.7/shutil.py",
line 82, in copyfile
with open(src, 'rb') as fsrc:
IOError: [Errno 2] No such file or directory:
'backups/2601151641/data/base/16384/16406_fsm'


Any idea what's going wrong?

Thanks,
Giuseppe.
-- 
Giuseppe Broccolo - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
giuseppe.brocc...@2ndquadrant.it | www.2ndQuadrant.it


Re: [HACKERS] [RFC] Incremental backup v3: incremental PoC

2015-01-16 Thread Marco Nenciarini
On 14/01/15 17:22, Gabriele Bartolini wrote:
> 
> My opinion, Marco, is that for version 5 of this patch, you:
> 
> 1) update the information on the wiki (it is outdated - I know you have
> been busy with LSN map optimisation)

Done.

> 2) modify pg_basebackup in order to accept a directory (or tar file) and
> automatically detect the LSN from the backup profile

New version of patch attached. The -I parameter now requires a backup
profile from a previous backup. I've added a sanity check that forbids
incremental file-level backups if the base timeline is different from
the current one.
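
For illustration, the client-side check boils down to reading the base
backup's profile header. A rough Python sketch follows, in the spirit of the
existing PoC scripts rather than the patch's actual code; it assumes the
backup_profile header format shown elsewhere in this thread (a "START WAL
LOCATION: X/X (file <24-hex-digit segment>)" line), and the helper names are
hypothetical:

import re

def read_profile_start(path):
    """Extract (start_lsn, timeline) from a backup_profile header.

    The timeline is taken from the first 8 hex digits of the WAL
    segment file name quoted in the START WAL LOCATION line.
    """
    with open(path) as f:
        for line in f:
            m = re.match(r'START WAL LOCATION: (\S+) \(file ([0-9A-Fa-f]{24})\)', line)
            if m:
                return m.group(1), int(m.group(2)[:8], 16)
    raise ValueError("no START WAL LOCATION line found in %s" % path)

def check_base_backup(profile_path, current_timeline):
    """Hypothetical sanity check: refuse an incremental backup if the
    base backup belongs to a different timeline than the server."""
    start_lsn, base_timeline = read_profile_start(profile_path)
    if base_timeline != current_timeline:
        raise SystemExit("base backup is on timeline %d, server is on timeline %d"
                         % (base_timeline, current_timeline))
    return start_lsn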

> 3) add the documentation regarding the backup profile and pg_basebackup
> 

Next on my TODO list.

> Once we have all of this, we can continue trying the patch. Some
> unexplored paths are:
> 
> * tablespace usage

I've improved my pg_restorebackup python PoC. It now supports tablespaces.

> * tar format
> * performance impact (in both "read-only" and heavily updated contexts)

From the server point of view, the current code generates a load similar
to a normal backup. It only adds an initial scan of each data file to
decide whether it has to send it: as soon as it finds a single newer page,
it stops scanning and starts sending the file. The I/O impact should not
be that big thanks to the filesystem cache, but I agree with you that it
has to be measured.
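
For what it's worth, the early-exit decision can be summarised with a small
Python sketch. This is conceptual only, not the patch's C code: the block
size, the little-endian pd_lsn layout in the page header and the numeric LSN
threshold are assumptions.

import struct

BLCKSZ = 8192  # assumes the default PostgreSQL block size

def file_has_newer_pages(path, start_lsn):
    """Scan a relation file page by page and stop at the first page whose
    header LSN is newer than the incremental start LSN; in that case the
    whole file has to be sent, otherwise it can be skipped entirely."""
    with open(path, 'rb') as f:
        while True:
            page = f.read(BLCKSZ)
            if len(page) < BLCKSZ:
                return False            # reached EOF without finding a newer page
            xlogid, xrecoff = struct.unpack('<II', page[:8])
            if ((xlogid << 32) | xrecoff) > start_lsn:
                return True             # early exit: this file must be backed up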

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
From f7cf8b9dd7d32f64a30dafaeeaeb56cbcd2eafff Mon Sep 17 00:00:00 2001
From: Marco Nenciarini 
Date: Tue, 14 Oct 2014 14:31:28 +0100
Subject: [PATCH] File-based incremental backup v5

Add backup profile to pg_basebackup
INCREMENTAL option implementation
---
 src/backend/access/transam/xlog.c  |   7 +-
 src/backend/access/transam/xlogfuncs.c |   2 +-
 src/backend/replication/basebackup.c   | 335 +++--
 src/backend/replication/repl_gram.y|   6 +
 src/backend/replication/repl_scanner.l |   1 +
 src/bin/pg_basebackup/pg_basebackup.c  | 147 +--
 src/include/access/xlog.h  |   3 +-
 src/include/replication/basebackup.h   |   4 +
 8 files changed, 473 insertions(+), 32 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 629a457..1e50625 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
*** XLogFileNameP(TimeLineID tli, XLogSegNo 
*** 9249,9255 
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
   char **labelfile)
  {
boolexclusive = (labelfile == NULL);
--- 9249,9256 
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast,
!                  XLogRecPtr incremental_startpoint, TimeLineID *starttli_p,
   char **labelfile)
  {
boolexclusive = (labelfile == NULL);
*** do_pg_start_backup(const char *backupids
*** 9468,9473 
--- 9469,9478 
  						 (uint32) (startpoint >> 32), (uint32) startpoint, xlogfilename);
  	appendStringInfo(&labelfbuf, "CHECKPOINT LOCATION: %X/%X\n",
  					 (uint32) (checkpointloc >> 32), (uint32) checkpointloc);
+ 	if (incremental_startpoint > 0)
+ 		appendStringInfo(&labelfbuf, "INCREMENTAL FROM LOCATION: %X/%X\n",
+ 						 (uint32) (incremental_startpoint >> 32),
+ 						 (uint32) incremental_startpoint);
  	appendStringInfo(&labelfbuf, "BACKUP METHOD: %s\n",
  					 exclusive ? "pg_start_backup" : "streamed");
  	appendStringInfo(&labelfbuf, "BACKUP FROM: %s\n",
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2179bf7..ace84d8 100644
*** a/src/backend/access/transam/xlogfuncs.c
--- b/src/backend/access/transam/xlogfuncs.c
*** pg_start_backup(PG_FUNCTION_ARGS)
*** 59,65 
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
   errmsg("must be superuser or replication role to run a 
backup")));
  
!   startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL);
  
PG_RETURN_LSN(startpoint);
  }
--- 59,65 
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
   errmsg("must be superuser or replication role to run a 
backup")));
  
!   startpoint = do_pg_start_backup(backupidstr, fast, 0, NULL, NULL);
  
PG_RETURN_LSN(startpoint);
  }
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 07030a2..05

Re: [HACKERS] [RFC] Incremental backup v3: incremental PoC

2015-01-14 Thread Gabriele Bartolini
Hi Marco,

  thank you for sending an updated patch. I am writing down a report of
this initial (and partial) review.

IMPORTANT: This patch is not complete, as stated by Marco. See the
"Conclusions" section for my proposed TODO list.

== Patch application

I have been able to successfully apply your patch and compile it.
Regression tests passed.

== Initial run

I have created a fresh instance of PostgreSQL and activated streaming
replication to be used by pg_basebackup. I have done a pgbench run with
scale 100.

I have taken a full consistent backup with pg_basebackup (in plain format):

pg_basebackup -v -F p -D $BACKUPDIR/backup-$(date '+%s') -x


I have been able to verify that the backup_profile is correctly placed in
the destination PGDATA directory. Here is an excerpt:

POSTGRESQL BACKUP PROFILE 1
START WAL LOCATION: 0/358 (file 00010003)
CHECKPOINT LOCATION: 0/38C
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2015-01-14 10:07:07 CET
LABEL: pg_basebackup base backup
FILE LIST
\N  \N  t   1421226427  206 backup_label
\N  \N  t   1421225508  88  postgresql.auto.conf
...


As suggested by Marco, I have manually taken the LSN from this file (next
version must do this automatically).
I have then executed pg_basebackup and activated the incremental feature by
using the LSN from the previous backup, as follows:

LSN=$(awk '/^START WAL/{print $4}' backup_profile)

pg_basebackup -v -F p -D $BACKUPDIR/backup-$(date '+%s') -I $LSN -x


The time taken by this operation was much lower than the previous one, and
the size is much smaller (I had not done any operation in the meantime):

du -hs backup-1421226*
1,5Gbackup-1421226427
17M backup-1421226427


I have done some checks on the file system and then used the prototype of
recovery script in Python written by Marco.

./recover.py backup-1421226427 backup-1421226427 new-data

The cluster started successfully. I have then run a pg_dump of the pgbench
database and was able to reload it on the initial cluster.

== Conclusions

The first run of this patch seems promising.

While the discussion on the LSN map continues (which is mainly an
optimisation of this patch), I would really like to see this patch progress,
as it would be a killer feature in several contexts (though not in every one).

In this same period we are releasing file-based incremental backup for
Barman, and customers using the alpha version are experiencing an average
deduplication ratio between 50% and 70%. For example, this is an excerpt of
"barman show-backup" from one of our customers (a daily saving of 550 GB is
not bad):

 Base backup information:
Disk usage   : 1.1 TiB (1.1 TiB with WALs)
Incremental size : 564.6 GiB (-50.60%)
...

My opinion, Marco, is that for version 5 of this patch, you:

1) update the information on the wiki (it is outdated - I know you have
been busy with LSN map optimisation)
2) modify pg_basebackup in order to accept a directory (or tar file) and
automatically detect the LSN from the backup profile
3) add the documentation regarding the backup profile and pg_basebackup

Once we have all of this, we can continue trying the patch. Some unexplored
paths are:

* tablespace usage
* tar format
* performance impact (in both "read-only" and heavily updated contexts)
* consistency checks

I would then leave the pg_restorebackup utility for version 6 (unless you
want to do everything at once).

One limitation of the current recovery script is that it cannot accept
multiple incremental backups (it just accepts three parameters: base
backup, incremental backup and merge destination). Maybe you can change the
syntax as follows:

./recover.py DESTINATION BACKUP_1 BACKUP_2 [BACKUP_3, ...]
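
The argument handling for such a syntax could look roughly like this. It is
only a hedged sketch: merge_backup_into is a placeholder for the merge logic
the PoC already has, and tablespace mapping is left out.

#!/usr/bin/env python
import argparse

def merge_backup_into(backup_dir, destination):
    """Placeholder for the existing base/incremental merge logic."""
    raise NotImplementedError

def main():
    parser = argparse.ArgumentParser(
        description="Rebuild a full backup from a base backup plus a chain of incrementals")
    parser.add_argument("destination", help="directory where the full backup is rebuilt")
    parser.add_argument("backups", nargs="+",
                        help="base backup followed by its incremental backups, oldest first")
    args = parser.parse_args()

    # Apply the base backup first, then each incremental backup on top of it.
    for backup in args.backups:
        merge_backup_into(backup, args.destination)

if __name__ == "__main__":
    main()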

Thanks a lot for working on this.

I am looking forward to continuing the review.

Ciao,
Gabriele
--
 Gabriele Bartolini - 2ndQuadrant Italia - Managing Director
 PostgreSQL Training, Services and Support
 gabriele.bartol...@2ndquadrant.it | www.2ndQuadrant.it

2015-01-13 17:21 GMT+01:00 Marco Nenciarini :

> On 13/01/15 12:53, Gabriele Bartolini wrote:
> > Hi Marco,
> >
> >   could you please send an updated version of the patch against the current
> > HEAD in order to facilitate reviewers?
> >
>
> Here is the updated patch for file-based incremental backup.
>
> It is based on the current HEAD.
>
> I'm now working on the client tool to rebuild a full backup starting
> from a file-based incremental backup.
>
> Regards,
> Marco
>
> --
> Marco Nenciarini - 2ndQuadrant Italy
> PostgreSQL Training, Services and Support
> marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
>


Re: [HACKERS] [RFC] Incremental backup v3: incremental PoC

2015-01-13 Thread Marco Nenciarini
On 13/01/15 12:53, Gabriele Bartolini wrote:
> Hi Marco,
> 
>   could you please send an updated version of the patch against the current
> HEAD in order to facilitate reviewers?
> 

Here is the updated patch for file-based incremental backup.

It is based on the current HEAD.

I'm now working on the client tool to rebuild a full backup starting
from a file-based incremental backup.

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
From 50ff0872d3901a30b6742900170052eabe0e06dd Mon Sep 17 00:00:00 2001
From: Marco Nenciarini 
Date: Tue, 14 Oct 2014 14:31:28 +0100
Subject: [PATCH] File-based incremental backup v4

Add backup profile to pg_basebackup
INCREMENTAL option implementation
---
 src/backend/access/transam/xlog.c  |   7 +-
 src/backend/access/transam/xlogfuncs.c |   2 +-
 src/backend/replication/basebackup.c   | 335 +++--
 src/backend/replication/repl_gram.y|   6 +
 src/backend/replication/repl_scanner.l |   1 +
 src/bin/pg_basebackup/pg_basebackup.c  |  83 ++--
 src/include/access/xlog.h  |   3 +-
 src/include/replication/basebackup.h   |   4 +
 8 files changed, 409 insertions(+), 32 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 839ea7c..625a5df 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
*** XLogFileNameP(TimeLineID tli, XLogSegNo 
*** 9249,9255 
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,
   char **labelfile)
  {
boolexclusive = (labelfile == NULL);
--- 9249,9256 
   * permissions of the calling user!
   */
  XLogRecPtr
! do_pg_start_backup(const char *backupidstr, bool fast,
!                  XLogRecPtr incremental_startpoint, TimeLineID *starttli_p,
   char **labelfile)
  {
boolexclusive = (labelfile == NULL);
*** do_pg_start_backup(const char *backupids
*** 9468,9473 
--- 9469,9478 
  						 (uint32) (startpoint >> 32), (uint32) startpoint, xlogfilename);
  	appendStringInfo(&labelfbuf, "CHECKPOINT LOCATION: %X/%X\n",
  					 (uint32) (checkpointloc >> 32), (uint32) checkpointloc);
+ 	if (incremental_startpoint > 0)
+ 		appendStringInfo(&labelfbuf, "INCREMENTAL FROM LOCATION: %X/%X\n",
+ 						 (uint32) (incremental_startpoint >> 32),
+ 						 (uint32) incremental_startpoint);
  	appendStringInfo(&labelfbuf, "BACKUP METHOD: %s\n",
  					 exclusive ? "pg_start_backup" : "streamed");
  	appendStringInfo(&labelfbuf, "BACKUP FROM: %s\n",
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2179bf7..ace84d8 100644
*** a/src/backend/access/transam/xlogfuncs.c
--- b/src/backend/access/transam/xlogfuncs.c
*** pg_start_backup(PG_FUNCTION_ARGS)
*** 59,65 
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
   errmsg("must be superuser or replication role to run a 
backup")));
  
!   startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL);
  
PG_RETURN_LSN(startpoint);
  }
--- 59,65 
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
   errmsg("must be superuser or replication role to run a 
backup")));
  
!   startpoint = do_pg_start_backup(backupidstr, fast, 0, NULL, NULL);
  
PG_RETURN_LSN(startpoint);
  }
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 07030a2..05b19c5 100644
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***
*** 30,40 
--- 30,42 
  #include "replication/basebackup.h"
  #include "replication/walsender.h"
  #include "replication/walsender_private.h"
+ #include "storage/bufpage.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "utils/builtins.h"
  #include "utils/elog.h"
  #include "utils/ps_status.h"
+ #include "utils/pg_lsn.h"
  #include "utils/timestamp.h"
  
  
*** typedef struct
*** 46,56 
boolnowait;
boolincludewal;
uint32  maxrate;
  } basebackup_options;
  
  
! static int64 sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces);
! static int64 sendTablespace(char *path, bool sizeonly);
  static bool sendFile(char *readfilename, char *tarfilename,
 struct stat * statbuf, bool missing_ok);
  static void sendFileWithConte

Re: [HACKERS] [RFC] Incremental backup v3: incremental PoC

2015-01-13 Thread Gabriele Bartolini
Hi Marco,

  could you please send an updated version of the patch against the current
HEAD in order to facilitate reviewers?

Thanks,
Gabriele

--
 Gabriele Bartolini - 2ndQuadrant Italia - Managing Director
 PostgreSQL Training, Services and Support
 gabriele.bartol...@2ndquadrant.it | www.2ndQuadrant.it

2015-01-07 11:00 GMT+01:00 Marco Nenciarini :

> On 06/01/15 14:26, Robert Haas wrote:
> > I suggest leaving this out altogether for the first version.  I can
> > think of three possible ways that we can determine which blocks need
> > to be backed up.  One, just read every block in the database and look
> > at the LSN of each one.  Two, maintain a cache of LSN information on a
> > per-segment (or smaller) basis, as you suggest here.  Three, scan the
> > WAL generated since the incremental backup and summarize it into a
> > list of blocks that need to be backed up.  This last idea could either
> > be done when the backup is requested, or it could be done as the WAL
> > is generated and used to populate the LSN cache.  In the long run, I
> > think some variant of approach #3 is likely best, but in the short
> > run, approach #1 (scan everything) is certainly easiest.  While it
> > doesn't optimize I/O, it still gives you the benefit of reducing the
> > amount of data that needs to be transferred and stored, and that's not
> > nothing.  If we get that much working, we can improve things more
> > later.
> >
>
> Hi,
> The patch now uses approach #1, but I've just sent a patch that uses
> approach #2.
>
> 54ad016e.9020...@2ndquadrant.it
>
> Regards,
> Marco
>
> --
> Marco Nenciarini - 2ndQuadrant Italy
> PostgreSQL Training, Services and Support
> marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
>
>


Re: [HACKERS] [RFC] Incremental backup v3: incremental PoC

2015-01-07 Thread Marco Nenciarini
On 06/01/15 14:26, Robert Haas wrote:
> I suggest leaving this out altogether for the first version.  I can
> think of three possible ways that we can determine which blocks need
> to be backed up.  One, just read every block in the database and look
> at the LSN of each one.  Two, maintain a cache of LSN information on a
> per-segment (or smaller) basis, as you suggest here.  Three, scan the
> WAL generated since the incremental backup and summarize it into a
> list of blocks that need to be backed up.  This last idea could either
> be done when the backup is requested, or it could be done as the WAL
> is generated and used to populate the LSN cache.  In the long run, I
> think some variant of approach #3 is likely best, but in the short
> run, approach #1 (scan everything) is certainly easiest.  While it
> doesn't optimize I/O, it still gives you the benefit of reducing the
> amount of data that needs to be transferred and stored, and that's not
> nothing.  If we get that much working, we can improve things more
> later.
> 

Hi,
The patch now uses approach #1, but I've just sent a patch that uses
approach #2.

54ad016e.9020...@2ndquadrant.it

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it





Re: [HACKERS] [RFC] Incremental backup v3: incremental PoC

2015-01-06 Thread Jehan-Guillaume de Rorthais
On Tue, 6 Jan 2015 08:26:22 -0500
Robert Haas  wrote:

> Three, scan the WAL generated since the incremental backup and summarize it
> into a list of blocks that need to be backed up.

This can be done from the archive side.  I was talking about this some months
ago:

  http://www.postgresql.org/message-id/51c4dd20.3000...@free.fr

One of the traps I could think of is that it requires "full_page_writes=on" so
we can forge each block correctly. The corollary is that we need to start a
diff backup right after a checkpoint.

And even without "full_page_write=on", maybe we could add a function, say
"pg_start_backupdiff()", which would force to log full pages right after it
only, the same way "full_page_write" does after a checkpoint. Diff backups would
be possible from each LSN where we pg_start_backupdiff'ed till whenever.

Building this backup by merging block versions from the WAL is one big step.
But then there is a file format to define, a restore procedure to design, and
a decision about which tools/functions/GUCs to expose to admins. A conceptual
sketch of the merging step follows.
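
Purely as an illustration of the idea, not working WAL-decoding code: the
WalRecord type and the record stream below are hypothetical stand-ins for
whatever decodes the archived segments.

from collections import namedtuple

# Hypothetical decoded WAL record: its LSN, the block it touches and, when a
# full-page write was logged, the full-page image carried by the record.
WalRecord = namedtuple('WalRecord', ['lsn', 'block_ref', 'full_page_image'])

def collect_forged_blocks(records, start_lsn):
    """For every block modified after start_lsn, keep the most recent
    full-page image seen plus the records logged after it; a real tool
    would replay those records on top of the image to forge the block."""
    blocks = {}
    for rec in records:
        if rec.lsn <= start_lsn:
            continue
        image, pending = blocks.get(rec.block_ref, (None, []))
        if rec.full_page_image is not None:
            image, pending = rec.full_page_image, []   # a new FPI resets the pending records
        else:
            pending = pending + [rec]
        blocks[rec.block_ref] = (image, pending)
    return blocks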

After discussing with Magnus, he advised me to wait for a diff backup file
format to emerge from online tools, as discussed here (at the time it was
Michael's proposal based on pg_basebackup that was being discussed). But I
wonder how much easier it would be to do this the opposite way: if building a
diff backup offline from the archives is possible, wouldn't it remove a lot of
the trouble you are discussing here?

Regards,
-- 
Jehan-Guillaume de Rorthais
Dalibo
http://www.dalibo.com




Re: [HACKERS] [RFC] Incremental backup v3: incremental PoC

2015-01-06 Thread Robert Haas
On Tue, Oct 14, 2014 at 1:17 PM, Marco Nenciarini
 wrote:
> I would like to replace the getMaxLSN function with a more-or-less persistent
> structure which contains the maxLSN for each data segment.
>
> To make it work I would hook into the ForwardFsyncRequest() function in
> src/backend/postmaster/checkpointer.c and update an in-memory hash every
> time a block is going to be fsynced. The structure could be persisted on
> disk at some time (probably on checkpoint).
>
> I think a good key for the hash would be a BufferTag with blocknum
> "rounded" to the start of the segment.
>
> I'm here asking for comments and advices on how to implement it in an
> acceptable way.

I'm afraid this is going to be quite tricky to implement.  There's no
way to make the in-memory hash table large enough that it can
definitely contain all of the entries for the entire database.  Even
if it's big enough at a certain point in time, somebody can create
100,000 new tables and now it's not big enough any more.  This is not
unlike the problem we had with the visibility map and free space map
before 8.4 (and you probably remember how much fun that was).

I suggest leaving this out altogether for the first version.  I can
think of three possible ways that we can determine which blocks need
to be backed up.  One, just read every block in the database and look
at the LSN of each one.  Two, maintain a cache of LSN information on a
per-segment (or smaller) basis, as you suggest here.  Three, scan the
WAL generated since the incremental backup and summarize it into a
list of blocks that need to be backed up.  This last idea could either
be done when the backup is requested, or it could be done as the WAL
is generated and used to populate the LSN cache.  In the long run, I
think some variant of approach #3 is likely best, but in the short
run, approach #1 (scan everything) is certainly easiest.  While it
doesn't optimize I/O, it still gives you the benefit of reducing the
amount of data that needs to be transferred and stored, and that's not
nothing.  If we get that much working, we can improve things more
later.
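
As a footnote on the per-segment cache Marco sketches above (approach #2),
the key rounding he describes amounts to something like the following. This
is a hedged Python illustration rather than backend C, with RELSEG_SIZE
assumed at its default of 131072 blocks and the LSN treated as a plain
integer.

RELSEG_SIZE = 131072   # blocks per segment: 1 GB with the default 8 kB block size

def segment_key(rnode, forknum, blocknum):
    """Key for the per-segment LSN cache: the buffer tag with the block
    number rounded down to the first block of its segment."""
    return (rnode, forknum, blocknum - blocknum % RELSEG_SIZE)

def update_lsn_cache(lsn_cache, rnode, forknum, blocknum, page_lsn):
    """Remember the highest LSN seen for the segment containing this block."""
    key = segment_key(rnode, forknum, blocknum)
    if page_lsn > lsn_cache.get(key, 0):
        lsn_cache[key] = page_lsn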

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [RFC] Incremental backup v3: incremental PoC

2015-01-05 Thread Michael Paquier
On Mon, Jan 5, 2015 at 7:56 PM, Marco Nenciarini
 wrote:
> I've noticed that I forgot to add this to the commitfest.
>
> I've just added it.
>
> It is not meant to end up in a committable state, but at this point I'm
> looking for some code review and more discussion.
>
> I'm also about to send an additional patch to implement an LSN map as an
> additional fork for heap files.
Moved to CF 2015-02.
-- 
Michael




Re: [HACKERS] [RFC] Incremental backup v3: incremental PoC

2015-01-05 Thread Marco Nenciarini
I've noticed that I forgot to add this to the commitfest.

I've just added it.

It is not meant to end up in a committable state, but at this point I'm
looking for some code review and more discussion.

I'm also about to send an additional patch to implement an LSN map as an
additional fork for heap files.

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it


