On 02/03/15 14:21, Fujii Masao wrote:
> On Thu, Feb 12, 2015 at 10:50 PM, Marco Nenciarini
> <marco.nenciar...@2ndquadrant.it> wrote:
>> Hi,
>>
>> I've attached an updated version of the patch.
> 
> basebackup.c:1565: warning: format '%lld' expects type 'long long
> int', but argument 8 has type '__off_t'
> basebackup.c:1565: warning: format '%lld' expects type 'long long
> int', but argument 8 has type '__off_t'
> pg_basebackup.c:865: warning: ISO C90 forbids mixed declarations and code
> 

I'll add an explicit cast at those two lines.

> When I applied three patches and compiled the code, I got the above warnings.
> 
> How can we get the full backup that we can use for the archive recovery, from
> the first full backup and subsequent incremental backups? What commands should
> we use for that, for example? It's better to document that.
> 

I've sent a Python PoC that supports the plain format only (not the tar one).
I'm currently rewriting it in C (with tar support as well) and I'll send a new
patch containing it ASAP.

> What does "1" of the heading line in backup_profile mean?
> 

Nothing special. It's just a version number. If you think it's misleading, I will remove it.

> Sorry if this has been already discussed so far. Why is a backup profile file
> necessary? Maybe it's necessary in the future, but currently seems not.

It's necessary because it's the only way to detect deleted files.
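
To illustrate the point, here is a minimal sketch of how a profile lets a restore tool find deleted files. It assumes each profile can be reduced to a list of relative file paths; the function name and that representation are hypothetical, not the actual backup_profile format.

```python
def deleted_files(previous_profile, current_profile):
    """Return the paths listed in the previous backup's profile but not in the current one.

    Both arguments are iterables of relative paths (a hypothetical, simplified
    view of a profile). An incremental backup only contains changed files, so
    without the profile a missing file could mean either "unchanged" or "deleted".
    """
    return sorted(set(previous_profile) - set(current_profile))
```

For example, `deleted_files(["base/1/1234", "base/1/5678"], ["base/1/1234"])` returns `["base/1/5678"]`, which is a file the restore step should drop rather than carry over from the older backup.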

> Several infos like LSN, modification time, size, etc are tracked in a backup
> profile file for every backup files, but they are not used for now. If it's now
> not required, I'm inclined to remove it to simplify the code.

I've put the LSN there mainly for debugging purposes, but it can also be used to
check the file during pg_restorebackup execution. The sent field is probably
redundant (if sent = False and LSN is not set, we should probably simply avoid
writing a line about that file) and I'll remove it in the next patch.

> 
> We've really gotten the consensus about the current design, especially that
> every files basically need to be read to check whether they have been modified
> since last backup even when *no* modification happens since last backup?

The real problem here is that there is currently no way to detect that a file
has not changed since the last backup. We agreed not to use file system
timestamps, as they are not reliable for that purpose.
Using the LSN has a significant advantage over using a checksum, as we can start
the full copy as soon as we find a block with an LSN greater than the threshold.
There are two cases: 1) the file has changed, so we can assume that, on average,
we detect it after reading 50% of the file, then we send it taking advantage of
the file system cache; 2) the file has not changed, so we read it without
sending anything.
Either way, it will end up producing I/O comparable to a normal backup.
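
The early-exit check described above can be sketched as follows. This is only an illustration, not the code in the patch: it assumes the default 8 kB block size and that the file is read on the same machine that wrote it (the page LSN is stored as two native-endian 32-bit integers at the start of each page header). The function names are hypothetical.

```python
import io
import struct

BLCKSZ = 8192  # assumption: default PostgreSQL block size


def page_lsn(page):
    """Decode pd_lsn from the first 8 bytes of a page header."""
    xlogid, xrecoff = struct.unpack_from("=II", page, 0)
    return (xlogid << 32) | xrecoff


def file_changed_since(stream, threshold_lsn):
    """Return True as soon as one page has an LSN greater than threshold_lsn.

    Stops reading at the first such page (case 1 above); an unchanged file is
    read to the end and nothing is sent (case 2). A short read is treated as
    end of file in this sketch.
    """
    while True:
        page = stream.read(BLCKSZ)
        if len(page) < BLCKSZ:
            return False
        if page_lsn(page) > threshold_lsn:
            return True
```

The stream can be any binary file object, for example `open(path, "rb")` for a relation file, or an `io.BytesIO` buffer when experimenting.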

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
