Hi,

On Thu, Mar 20, 2014 at 11:05 AM, Alex Parvulescu
<[email protected]> wrote:
> On Thu, Mar 20, 2014 at 1:33 PM, Jukka Zitting <[email protected]> wrote:
>> Perhaps the comparison is between content in the source repository and
>> that in the backup repository? In that case the segment identifiers
>> wouldn't match, and the comparison would slow down as described.
>
> Yes exactly, the backup runs a diff between the source instance and the
> target instance.

Ideally it shouldn't, which is why the backup code tries to diff
against the checkpoint that was used for the previous backup instead
[1]. A comparison across repositories is much slower and produces some
slightly unexpected behavior, like what you're probably seeing here.

> One would expect that the backup is incremental in the
> sense that running it a consecutive time yields only the modifications that
> happened during that time period but these findings show otherwise.

If the comparison is done across repositories, the content diff will
call childNodeChanged() even on unmodified nodes as permitted by
OAK-914 [2]. The reason for this is that the child node identifiers
will not match across repositories, so there's no efficient way for
the content diff to tell whether the subtrees are equal or not.

However, AFAICT these extra childNodeChanged() calls should only
result in some slowdown as ApplyDiff recurses down the tree, not
actual changes to be written by SegmentWriter. Can you for example try
dumping the output of JsopDiff.diffToJsop() before line 74 in
FileStoreBackup to verify that no changes are unexpectedly showing up?
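
To illustrate the effect, here is a minimal, self-contained sketch. It
is a toy model, not Oak code: the Node class, the recordId field, the
Counters class and the build() helper are all hypothetical names made
up for this example. It only shows how a diff that can skip subtrees
with matching identifiers ends up visiting every unmodified node once
the identifiers come from different stores, while still reporting no
property changes.

```java
import java.util.Map;
import java.util.TreeMap;

class CrossRepoDiffSketch {

    /** Hypothetical stand-in for a stored node: an identifier, properties, children. */
    static final class Node {
        final String recordId;
        final Map<String, String> properties = new TreeMap<>();
        final Map<String, Node> children = new TreeMap<>();

        Node(String recordId) {
            this.recordId = recordId;
        }
    }

    /** Counts the callbacks a diff handler would receive. */
    static final class Counters {
        int childNodeChanged;
        int propertyChanged;
    }

    /**
     * Simplified diff: only added/changed properties and children present on
     * both sides are considered; removals and added children are left out.
     */
    static void diff(Node before, Node after, Counters c) {
        for (Map.Entry<String, String> e : after.properties.entrySet()) {
            if (!e.getValue().equals(before.properties.get(e.getKey()))) {
                c.propertyChanged++;
            }
        }
        for (Map.Entry<String, Node> e : after.children.entrySet()) {
            Node b = before.children.get(e.getKey());
            Node a = e.getValue();
            if (b == null) {
                continue; // added child, ignored in this sketch
            }
            if (b.recordId.equals(a.recordId)) {
                continue; // same identifier: subtree known to be equal, skip it
            }
            // Identifiers differ, so equality is unknown: report and recurse.
            c.childNodeChanged++;
            diff(b, a, c);
        }
    }

    /** Builds the same content every time; only the identifiers depend on the store. */
    static Node build(String store) {
        Node root = new Node(store + ":/");
        Node a = new Node(store + ":/a");
        a.properties.put("x", "1");
        Node c = new Node(store + ":/a/c");
        c.properties.put("y", "2");
        a.children.put("c", c);
        root.children.put("a", a);
        root.children.put("b", new Node(store + ":/b"));
        return root;
    }

    public static void main(String[] args) {
        Counters same = new Counters();
        diff(build("source"), build("source"), same);
        System.out.println("same store:    " + same.childNodeChanged
                + " childNodeChanged, " + same.propertyChanged + " propertyChanged");

        Counters cross = new Counters();
        diff(build("source"), build("backup"), cross);
        System.out.println("across stores: " + cross.childNodeChanged
                + " childNodeChanged, " + cross.propertyChanged + " propertyChanged");
    }
}
```

In this toy model the cross-store run visits all three child nodes but
finds no property changes, which is the pattern I'd expect the JSOP
dump to confirm: extra traversal, nothing to write.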

[1] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/backup/FileStoreBackup.java#L61
[2] https://issues.apache.org/jira/browse/OAK-914

BR,

Jukka Zitting
