How about a timestamp with a GUID appended to the end of it?
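A minimal sketch of what I mean, in Python only for illustration. The function name `make_key`, the 20-digit zero padding and the `-` separator are my own assumptions, not anything Solr prescribes:

```python
import time
import uuid


def make_key() -> str:
    """Return '<microseconds since epoch>-<random UUID hex>' (hypothetical format)."""
    micros = time.time_ns() // 1_000
    # Zero-pad to 20 digits so that string order matches numeric order.
    # uuid4() is random, so keys created in the same microsecond still differ.
    return f"{micros:020d}-{uuid.uuid4().hex}"
```

Keys sort by time first. Two keys from the same microsecond sort in an arbitrary order decided by the GUID part, but that order never changes once the keys exist.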
Dennis Gearon

Signature Warning
----------------
It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'

EARTH has a Right To Life, otherwise we all die.

--- On Sun, 10/31/10, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:

> From: Toke Eskildsen <t...@statsbiblioteket.dk>
> Subject: RE: Ensuring stable timestamp ordering
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Date: Sunday, October 31, 2010, 12:18 PM
>
> Dennis Gearon [gear...@sbcglobal.net] wrote:
> > Even microseconds may not be enough on some really good, fast machine.
>
> True, especially since the timer might not provide microsecond
> granularity although the returned value is in microseconds. However, a
> unique timestamp generator should keep track of the previous timestamp
> to guard against duplicates. Uniqueness can thus be guaranteed by
> waiting a bit or cheating on the decimals. With microseconds, one can
> produce 1 million timestamps per second. While I agree that duplicates
> within microseconds can occur on a fast machine, guaranteeing
> uniqueness by waiting should only be a performance problem when the
> number of duplicates is high. That's still a few years off, I think.
>
> As Michael pointed out, using normal timestamps as unique IDs might
> not be such a great idea, as it effectively locks index-building to a
> single JVM. By going the ugly route and expressing the time in nanos
> with only microsecond granularity and using the last 3 decimals for a
> builder ID, this could be fixed. Not very clean though, as the contract
> is not expressed in the data themselves but must nevertheless be obeyed
> by all builders to avoid collisions. It also raises the question of who
> should assign the builder IDs. Not trivial in an anarchistic setup
> where new builders can be added by different controllers.
>
> Pragmatists might use the PID % 1000 or similar for the builder ID, as
> it does not require coordination, but this is where the Birthday
> Paradox hits us again: the chance of two processes on different
> machines having the same PID (modulo 1000) is 10% if just 15 machines
> are used (1% for 5 machines, 50% for 37 machines). I don't like those
> odds, and that's assuming that the PIDs will be randomly distributed,
> which they won't. It could be lowered by reserving more decimals for
> the salt, but then we would decrease the maximum number of timestamps
> per second, still without guaranteed uniqueness. Guys a lot smarter
> than me have spent time on the unique ID problem and it's clearly not
> easy: Java's UUID takes up 128 bits.
>
> - Toke
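
For reference, here is a rough sketch of the scheme Toke describes: microsecond-granularity values on a nanosecond scale, with the last 3 decimals holding a builder ID, and "cheating on the decimals" by bumping the microsecond count when the clock has not advanced. It is in Python only for illustration. The class name `BuilderTimestamps`, the injectable `clock` parameter and the helper `collision_probability` are hypothetical names of mine, and the helper assumes uniformly random IDs, which Toke notes real PIDs are not.

```python
import threading
import time


class BuilderTimestamps:
    """Hypothetical generator of unique, increasing timestamps for one builder."""

    def __init__(self, builder_id: int, clock=time.time_ns):
        if not 0 <= builder_id < 1000:
            raise ValueError("builder_id must fit in 3 decimal digits")
        self._builder_id = builder_id
        self._clock = clock  # returns nanoseconds; injectable for testing
        self._last_micros = -1
        self._lock = threading.Lock()

    def next(self) -> int:
        with self._lock:
            micros = self._clock() // 1_000
            if micros <= self._last_micros:
                # Clock has not advanced: step past the previous value
                # instead of waiting for the next microsecond.
                micros = self._last_micros + 1
            self._last_micros = micros
            # Last 3 decimals carry the builder ID.
            return micros * 1_000 + self._builder_id


def collision_probability(machines: int, slots: int = 1000) -> float:
    """Birthday-problem chance that at least two of `machines` uniformly
    random IDs out of `slots` possible values coincide."""
    p_distinct = 1.0
    for i in range(machines):
        p_distinct *= (slots - i) / slots
    return 1.0 - p_distinct
```

With 1000 slots, `collision_probability` gives roughly 1%, 10% and 49% for 5, 15 and 37 machines, in line with the figures quoted above. Uniqueness across builders still depends on every builder getting a distinct ID, which is exactly the coordination problem Toke raises.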