Hi,

While discussing with Philipp a few days ago, we noticed that identifying files by a long, as is done now, is a bit dangerous in terms of possible collisions. The probabilities are low (see https://en.wikipedia.org/wiki/Birthday_attack), but a bit higher than they should be.
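To put a rough number on this, here is a small sketch of the usual birthday-bound approximation, p ≈ 1 - exp(-n(n-1) / (2 · 2^bits)), assuming ids are drawn uniformly at random. The class name, the method name and the repository size of one billion files are hypothetical and only serve the illustration.

```java
public class CollisionEstimate {

    /**
     * Approximate probability that at least two of n uniformly random
     * ids collide when each id has the given number of bits.
     */
    public static double collisionProbability(double n, int bits) {
        double space = Math.pow(2.0, bits);
        double x = (n * (n - 1)) / (2.0 * space);
        // -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x
        return -Math.expm1(-x);
    }

    public static void main(String[] args) {
        double n = 1e9; // hypothetical number of files, not a Syncany figure

        System.out.println("64-bit ids:  " + collisionProbability(n, 64));
        System.out.println("128-bit ids: " + collisionProbability(n, 128));
    }
}
```

Under these assumptions the 64-bit case comes out at a few percent, while the 128-bit case is negligible.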
So I implemented a not so nice FileId (see my branch https://github.com/fabrice-rossi/syncany/tree/longer-file-id) to have a 16-byte id. Philipp was not happy for two reasons: the abstraction was leaking (you needed two longs to initialize a FileId) and there was no way to customize the number of bytes used to identify a file.

I've now implemented a better FileId based on Philipp's input. It's basically a trimmed down ByteArray. For now, its size is not configurable, but that would be super easy to add. As it is based on ObjectId, it can be generalized to other ids. Philipp, does that suit you better? If so, I will add proper documentation and tests, and unify this with the other ids.

To all: do you know a simple way to have something which does not waste memory, is (reasonably) type safe and is still configurable at runtime?

What I would like to have is a hierarchy of ObjectId, in order to have type safety (so that you cannot mistakenly search for a file using a chunk id, for instance). That's easy. But I would also like ids with a small memory footprint. For instance, if I use 16 bytes, I can pack them in two longs rather than putting them in a byte[] (long story short, this reduces the memory occupation from 48 bytes per id to 32 bytes on HotSpot running in 64-bit mode with compressed pointers). This is important if we want to aim at very large repositories. Again, this is easy: have ObjectId be an abstract class, then implement different size-constrained subclasses (such as TwoLongObjectId), and then specific subclasses, like a FileId which would derive from TwoLongObjectId.

But this prevents configuring the actual memory size at runtime, at least in a way that is not super annoying. I mean that I can of course design a factory and have multiple classes implementing a FileId interface (as a type marker) and inheriting from the different size-constrained subclasses, but this feels heavy. Any other solution?
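To make the fixed-size variant of the hierarchy concrete, here is a minimal sketch. It is not the code in my branch: ObjectId, TwoLongObjectId and FileId are the names used above, while ChunkId, the fromBytes factories, toBytes and the wrapping IdSketch class are assumptions made for the example. The factory takes a byte[] so that callers never see the two longs.

```java
import java.nio.ByteBuffer;

public class IdSketch {

    /** Root of the id hierarchy; subclasses decide how the bytes are stored. */
    public static abstract class ObjectId {
        public abstract byte[] toBytes();
    }

    /** 16-byte id packed into two longs instead of a byte[]. */
    public static abstract class TwoLongObjectId extends ObjectId {
        private final long high;
        private final long low;

        protected TwoLongObjectId(byte[] bytes) {
            if (bytes == null || bytes.length != 16) {
                throw new IllegalArgumentException("expected exactly 16 bytes");
            }
            ByteBuffer buffer = ByteBuffer.wrap(bytes); // big-endian by default
            this.high = buffer.getLong(0);
            this.low = buffer.getLong(8);
        }

        @Override
        public byte[] toBytes() {
            return ByteBuffer.allocate(16).putLong(high).putLong(low).array();
        }

        @Override
        public boolean equals(Object other) {
            // Comparing classes keeps a FileId distinct from a ChunkId with the same bits
            if (other == null || getClass() != other.getClass()) {
                return false;
            }
            TwoLongObjectId that = (TwoLongObjectId) other;
            return high == that.high && low == that.low;
        }

        @Override
        public int hashCode() {
            return 31 * Long.hashCode(high) + Long.hashCode(low);
        }
    }

    /** Type marker for file ids. */
    public static final class FileId extends TwoLongObjectId {
        private FileId(byte[] bytes) { super(bytes); }
        public static FileId fromBytes(byte[] bytes) { return new FileId(bytes); }
    }

    /** Type marker for chunk ids (hypothetical name). */
    public static final class ChunkId extends TwoLongObjectId {
        private ChunkId(byte[] bytes) { super(bytes); }
        public static ChunkId fromBytes(byte[] bytes) { return new ChunkId(bytes); }
    }

    /** A lookup declared on FileId: passing a ChunkId here does not compile. */
    public static String describe(FileId id) {
        return "file id of " + id.toBytes().length + " bytes";
    }
}
```

With this layout, a method such as describe(FileId) rejects a ChunkId at compile time, but the storage size is fixed by the superclass chosen at compile time, which is exactly the runtime configuration problem described above.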
Cheers,
Fabrice

