Many distributed systems (e.g. Git, Dynamo) use a pseudorandom (or random)
identifier of 16 bytes or more for documents.

It would be nice to refactor Lucene to return a variable-width document ID
so that indices could be implemented over databases such as HBase,
Accumulo, Cassandra, etc. using a large, non-sequential identifier instead
of the current system, which requires IDs to be sequential and 4 bytes wide.

Has anyone thought about doing this? Is there interest in such a
refactoring or prototype?
