: In my testing, when the filenames are the same, doing an xdelta on the
: files (mainly the file that contains most of the data, the .cfs file),
: there is a significant reduction in the size of the patch file created.
As noted elsewhere in this thread, the filenames themselves are significant because they are tracked in the segments file and indicate generational information. Files with the same names should be the same; files with different names should be very different.

But if your binary diff tool is finding commonalities between files in new segments as the index grows over time, and you feel like you can take advantage of this, then I would suggest using a simple tool like "tar" to combine all of the index files into a single file with a predictable name before running your diff tool.

-Hoss
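
A minimal sketch of the tar suggestion above, written in Python with the standard tarfile module rather than the tar command itself. The function name bundle_index and the paths in the usage note are hypothetical, and it assumes the index directory is flat and that the output file lives outside that directory. Files are added in sorted order and without compression, so unchanged data is more likely to line up between two bundles.

```python
import os
import tarfile


def bundle_index(index_dir, out_path):
    """Pack every regular file in index_dir into one uncompressed tar.

    Hypothetical helper: out_path should be a fixed, predictable name
    located outside index_dir so the bundle never includes itself.
    """
    # Sort the names so the archive layout is stable from run to run.
    names = sorted(
        n for n in os.listdir(index_dir)
        if os.path.isfile(os.path.join(index_dir, n))
    )
    # Mode "w" writes a plain tar; compression would hide the byte-level
    # similarities a binary diff tool relies on.
    with tarfile.open(out_path, "w") as tar:
        for name in names:
            tar.add(os.path.join(index_dir, name), arcname=name)
    return names
```

For example, bundle_index("/path/to/index", "/path/to/snapshots/index.tar") after each index update, then run your diff tool against the previous and the current index.tar.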