Philip Meyer wrote: 
> One way to do this automatically would be to first look for duplicates
> with a very small number of checksum bytes.  Use <= 4096 to avoid extra
> read operations.  Then take any that are found to be duplicates and
> re-scan with a larger md5_size value.  This can be repeated as many
> times as necessary.
> 
> That could be done on each rescan, to ensure that there are no
> false-positive duplicates.  We would need to store the number of chunks
> used to make the checksum value, which would make comparison harder.
> 
> i.e. If this is going to be used to re-attach persistent record data
> for a file that has moved, can't just do a lookup on the checksum
> value.
> 
> And if adding new music that has a duplicate checksum for one chunk of
> data, how would it know if the file was a false positive match, or
> really the same file that has moved to a new location?
> 
Don't worry about performance optimizations yet; there are many ways to
solve that, and I'm sure we can get decent performance one way or
another. At the moment it's more important that we can guarantee
uniqueness and that it works with all file formats.
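For reference, Philip's iterative re-scan idea could be sketched roughly
like this (a minimal sketch in Python, not anything from the actual
scanner code; the function names, the 4x chunk-growth factor, and the
1 MB cap are all my own assumptions):

```python
import hashlib
from collections import defaultdict

def chunk_md5(path, size):
    """MD5 of the first `size` bytes of a file (whole file if smaller)."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        h.update(f.read(size))
    return h.hexdigest()

def find_duplicates(paths, start_size=4096, max_size=1 << 20):
    """Group candidate duplicates by a small-chunk checksum, then
    re-scan only the colliding groups with progressively larger chunks."""
    groups = defaultdict(list)
    for p in paths:
        groups[chunk_md5(p, start_size)].append(p)

    size = start_size
    while size < max_size:
        size *= 4
        next_groups = defaultdict(list)
        for checksum, members in groups.items():
            if len(members) < 2:
                continue  # unique at this chunk size; no longer a candidate
            # keep the earlier checksum in the key so groups never merge
            for p in members:
                next_groups[(checksum, chunk_md5(p, size))].append(p)
        groups = next_groups
        if not groups:
            break

    return [members for members in groups.values() if len(members) > 1]
```

Only files that still collide at the small chunk size get re-read, so
unique files are never touched more than once; but as noted above, this
still doesn't answer the moved-file vs. false-positive question by
itself.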


-- 
erland

Erland Isaksson ('My homepage' (http://erland.isaksson.info))
(Developer of 'many plugins/applets'
(http://wiki.slimdevices.com/index.php/User:Erland). If my answer
helped you and you like to encourage future presence on this forum
and/or third party plugin/applet development, 'donations are always
appreciated' (http://erland.isaksson.info/donate))
------------------------------------------------------------------------
erland's Profile: http://forums.slimdevices.com/member.php?userid=3124
View this thread: http://forums.slimdevices.com/showthread.php?t=81679

_______________________________________________
beta mailing list
[email protected]
http://lists.slimdevices.com/mailman/listinfo/beta
