Re: tdbloader's info on batch count

Andy Seaborne Wed, 29 Feb 2012 05:25:25 -0800

On 29/02/12 13:18, Paolo Castagna wrote:

Andy Seaborne wrote:

An incremental version is quite possible.  It could load to a dataset,
ensuring the id are right, then do index-merging.


Hi Andy,
can you expand a little bit on "ensuring the id are right"
and "index-merging" bits? ;-)

To "ensure ids are right" the incremental loader would need
to re-use the same node table of the existing db, right?


Yes.

(Hash-ids don't remove the need but they would change the problem to
allowing two independent databases to be merged by messing around with
the lowest level data structures.)
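
As a rough illustration of why the node table has to be shared, here is a toy sketch. It is not TDB's node table: the class `NodeTable`, the method `get_or_allocate` and the sequential id scheme are all assumptions made up for this example. The point is only that ids stay consistent when the incremental load goes through the same term-to-id mapping as the existing data.

```python
class NodeTable:
    """Toy term -> id mapping (hypothetical, not TDB's implementation)."""

    def __init__(self):
        self._ids = {}

    def get_or_allocate(self, term):
        # Re-use the id if the term is already known, otherwise
        # hand out the next sequential id.
        node_id = self._ids.get(term)
        if node_id is None:
            node_id = len(self._ids)
            self._ids[term] = node_id
        return node_id


def encode(table, triples):
    # Turn (s, p, o) terms into (id, id, id) tuples using the given table.
    return [tuple(table.get_or_allocate(t) for t in triple) for triple in triples]


shared = NodeTable()
existing = encode(shared, [("ex:a", "ex:p", "ex:b")])
# The incremental load uses the SAME table, so ex:a and ex:p keep their ids.
incremental = encode(shared, [("ex:a", "ex:p", "ex:c")])
```

With two separate tables the same term could receive different ids in each load, and the resulting indexes could not be merged as plain sorted numbers.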

I have been thinking on how to merge two TDB indexes, but
it does not seem a trivial problem to me... not with the
current node ids.

The indexes are just a stream of sorted numbers (OK - the numbers are
192 bits long but that's what computers are for :-) It's a plain merge
of two already sorted streams, with duplicate removal, using the B+Tree
rebuilder.
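
The merge step described above can be sketched as follows. This is only an illustration, not the B+Tree rebuilder: it assumes each index entry is a tuple of three integer node ids (3 x 64 bits = 192 bits) and that both inputs are already sorted in the same index order. The name `merge_sorted_unique` is made up for the example.

```python
import heapq
from typing import Iterable, Iterator, Tuple

# Assumption: one index entry = three node ids, compared lexicographically.
Entry = Tuple[int, int, int]


def merge_sorted_unique(a: Iterable[Entry], b: Iterable[Entry]) -> Iterator[Entry]:
    """Merge two already sorted streams, dropping duplicate entries."""
    previous = None
    # heapq.merge consumes both inputs lazily and yields them in sorted order.
    for entry in heapq.merge(a, b):
        if entry != previous:
            yield entry
            previous = entry


old_index = [(1, 2, 3), (1, 2, 4), (5, 1, 1)]
new_index = [(1, 2, 3), (2, 0, 0), (5, 1, 1), (9, 9, 9)]
merged = list(merge_sorted_unique(old_index, new_index))
```

Because the output is itself a sorted, duplicate-free stream, it is the kind of input a bulk index builder can consume in a single pass.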


Paolo


        Andy
