Remy Maucherat wrote:

>>I have the first prototype ready for a java based implementation of Josh
>>MacDonalds xdelta binary diff/delta algorithm. See
>>http://sourceforge.net/projects/xdelta for more information on this
>>algorithm.
>>
>>It is available for download here;
>>
> http://www.vertech.no/java-xdelta.tar.gz
> 
>>This implementation uses the GDIFF file format for storing binary
>>deltas. There is also a patcher that takes a source file and a patch
>>file in this format and produces the original file.
>>
>>When this implementation is mature it might be usefull if someone wants
>>to implement a versioned store implementation for slide with reduced
>>physical storage requirements.
>>
>>I am seeking feedback on this. Warning; this is pre-alpha quality code.
>>
> 
> Obviously, that's a very useful library. I'll try to play with it :)


Some notes; as you'll see, the source 'file' needs to be seekable, so I 
use a RandomAccessFile in this implementation. This is a property of it 
being a copy-insert algorithm, and not a delete-insert like vanilla 
diff. It is possible to create an implementation that doesn't the source 
doesn't need to be seekable. But it *has* to be readable twice.

There are implication eg. if one wants to store the last revision as the 
full one, and the previous as a delta; since they last revision might be 
available as an input stream from a http put. One way of doing it would 
be to write to the store first, and at the same time generate the hash 
fingerprints, then do the delta run on the source from the store. 
Reading the source from the store would probably imply that it is 
seekable to?

When patching, the source needs to be seekable as well, due to the same 
reason as above.

It is not very space optimized yet, as it uses a hashtable. I will use 
an array as the hashtable in the next release, as the algorithm really 
was designed for this.

The GDIFFWriter could really be a special outputstream. The same goes 
for the patcher; it could be an inputstream that took the source input 
and the delta input as parameters.

I have tried the algorithm on different input data, but it obviously 
needs a lot of testing. It would also be nice if someone could proof 
read my io code; I'm not shure I got all the signed / unsigned stuff right.

-- 
-Torgeir


--
To unsubscribe, e-mail:   <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>

Reply via email to