> tdbloader2 builds B+trees bottom-up from sorted input. Because of
> this, blocks are streamed to disk sequentially, which is disk-efficient.
>
> It is a series of Java programs tied together by a shell script.
>
> tdbloader is pure Java. It builds the B+trees by inserting, which for
> some indexes is not optimal because it causes random inserts leading to
> random I/O, which is bad for disk performance.
>
> Andy
But why is tdbloader better for smaller datasets, whereas tdbloader2 is better
for very large datasets ("100M+ triples")? Wouldn't the approach of tdbloader2
be superior in all cases?
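
To check my understanding of the two strategies described above, here is a
minimal Python sketch. It is a toy model, not the actual tdbloader or
tdbloader2 code: the function names, the `fanout` parameter and the in-memory
"pages" are all my own assumptions for illustration. `bulk_load` packs sorted
keys into leaves and then builds each parent level from the children below it,
so every page is produced once, in order. `insert_leaf_trace` inserts keys one
at a time and records which leaf each insert touches.

```python
import bisect

def bulk_load(sorted_keys, fanout=4):
    """Build B+tree levels bottom-up from already-sorted keys.

    Returns a list of levels, leaves first. Each level is a list of pages,
    and each page is a list of at most `fanout` keys. (Hypothetical helper.)
    """
    # Pack the sorted keys into leaf pages, left to right.
    level = [sorted_keys[i:i + fanout] for i in range(0, len(sorted_keys), fanout)]
    levels = [level]
    while len(level) > 1:
        # Use the smallest key of each child page as its separator,
        # then pack the separators into pages of the next level up.
        seps = [page[0] for page in level]
        level = [seps[i:i + fanout] for i in range(0, len(seps), fanout)]
        levels.append(level)
    return levels

def insert_leaf_trace(keys, fanout=4):
    """Insert keys one by one into sorted leaves, splitting overfull leaves.

    Returns (leaves, trace), where trace[i] is the index of the leaf that
    the i-th insert touched. (Hypothetical helper; only the leaf level is
    modelled, not the interior nodes.)
    """
    leaves = [[]]
    trace = []
    for k in keys:
        # Locate the target leaf: the last one whose first key is <= k.
        firsts = [leaf[0] for leaf in leaves[1:]]
        idx = bisect.bisect_right(firsts, k)
        trace.append(idx)
        leaf = leaves[idx]
        bisect.insort(leaf, k)
        if len(leaf) > fanout:
            # Split the overfull leaf into two halves.
            mid = len(leaf) // 2
            leaves[idx:idx + 1] = [leaf[:mid], leaf[mid:]]
    return leaves, trace
```

With keys in arbitrary order, the trace from `insert_leaf_trace` jumps between
leaves, which I take to be the random I/O mentioned above. `bulk_load` never
revisits a page, but it needs the whole input sorted before it can start.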