Matthew Dillon <[EMAIL PROTECTED]> wrote:
>
> :Hi,
> :
> :I'm playing with the hammer mirroring feature and noticed
> :that streams generated by
> :hammer mirror-read filesystem <begin-tid>
> :don't always start with <begin-tid>.
> :
> :E.g:
> :
> :[EMAIL PROTECTED]:/home/hofmann >hammer mirror-read /hammer/ 0x000000010382d1a6 | hammer mirror-dump
> :...
> :Record obj=0000000100000054 key=0000000000000000 rt=01 ot=02
> :       tids 00000001061da24b:0000000000000000 data=128
> :Record obj=0000000100000055 key=0000000000000000 rt=01 ot=01
> :       tids 00000001061da25d:0000000000000000 data=128
> :..
> :
> :But 0x000000010382d1a6 is a valid existing tid and:
> :
> :[EMAIL PROTECTED]:/home/hofmann >hammer mirror-read /hammer/ 0x000000010382d1a6 | hammer mirror-dump | grep 000000010382d1a6
> :Mirror-read: Mirror from 000000010382d1a6 to 0000000f958af3c0
> :       tids 000000010382d1a6:0000000000000000 data=128
> :
> :Is this intended?
> :
> :	Johannes
>
> The mirroring dump should include all records with a creation or
> deletion TID >= the specified TID.
>
> BUT, it may ALSO include records with lower TIDs. The reason is because
> the code needs to supply the B-Tree infrastructure leading up to the
> desired records as well as provide the desired records. It is a side
> effect of the search. Providing the infrastructure helps the mirroring
> target do the merge (including any needed deletions) optimally.
>
> The search is still optimal, or close to it. You should not get too
> many extra records (from a bulk transfer point of view).
>
> The various mirroring record types: 'Skip', 'Pass', and 'Record',
> are used to discern the difference between infrastructure and bulk
> data records.
>
> * Skip records indicate that part of the B-Tree infrastructure is being
>   skipped and only contain the key range being skipped.
>
> * Pass records are records which the originator believes the target
>   should already have. The record header is included but not any data
>   references.
>
> * Record records are records (triple play there :-)) that the originator
>   believes the target might not have. These records contain everything:
>   key, record header, and any associated bulk data.
>
> The mirroring target uses these records to optimally scan the target
> B-Tree in the target HAMMER filesystem and to properly perform the
> merge.
>
> Because transaction ids are not really in any sort of sorted order,
> except for create_tid as a sub-sort, we can end up dumping records,
> particularly 'Pass' records, with unrelated transaction ids in order
> to include a 'Record' record with a related transaction id, so the
> mirroring target knows how to properly merge the stream into the
> target. i.e. the mirroring target needs to know whether it must delete
> physical records on the target or not when performing a merge, and it
> can't know that unless it is given all the records in the B-Tree leaf,
> even if some are outside the requested transaction id range.
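
To see the "extra" records in practice, here is a minimal Python sketch (not part of the hammer utility) that splits mirror-dump output into entries whose creation or deletion TID is >= the requested TID and entries that fall below it. It assumes that each entry is a header line followed by a "tids" line as in the sample above, that the two hex fields after "tids" are the creation and deletion TIDs in that order, and that Pass entries use the same header layout as Record entries. None of that is confirmed in this thread, and the name split_by_tid is made up for illustration.

```python
import re

# Assumed layout of a `hammer mirror-dump` entry, inferred only from the
# sample output quoted above: a header line, then an indented "tids" line.
HEADER_RE = re.compile(
    r"^\s*(?P<type>\w+)\s+obj=(?P<obj>[0-9a-fA-F]{16})\s+key=(?P<key>[0-9a-fA-F]{16})"
)
# Assumption: the two fields are create_tid:delete_tid, in that order.
TIDS_RE = re.compile(
    r"^\s*tids\s+(?P<create>[0-9a-fA-F]{16}):(?P<delete>[0-9a-fA-F]{16})"
)


def split_by_tid(dump_text, begin_tid):
    """Partition dumped entries into (requested, extra).

    requested: creation or deletion TID >= begin_tid
    extra:     both TIDs below begin_tid (infrastructure for the merge)
    """
    requested, extra = [], []
    current = None  # header of the entry currently being read
    for line in dump_text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = header.groupdict()
            continue
        tids = TIDS_RE.match(line)
        if tids and current is not None:
            create_tid = int(tids.group("create"), 16)
            delete_tid = int(tids.group("delete"), 16)
            entry = (current["type"], current["obj"], create_tid, delete_tid)
            if create_tid >= begin_tid or delete_tid >= begin_tid:
                requested.append(entry)
            else:
                extra.append(entry)
            current = None  # a "tids" line without a header is ignored
    return requested, extra


# The two records from the sample output above.
SAMPLE = """\
Record obj=0000000100000054 key=0000000000000000 rt=01 ot=02
       tids 00000001061da24b:0000000000000000 data=128
Record obj=0000000100000055 key=0000000000000000 rt=01 ot=01
       tids 00000001061da25d:0000000000000000 data=128
"""

requested, extra = split_by_tid(SAMPLE, 0x000000010382D1A6)
print(len(requested), "requested,", len(extra), "extra")
```

Anything that ends up in the second list would be the B-Tree infrastructure described above: entries outside the requested range that are sent so the target can perform the merge correctly.
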
Thanks a lot for the explanation!

	Johannes
