You can't use timestamps - they're not strictly increasing, for
various reasons.
Why does it need to be strictly increasing? As already explained, the
version identifiers should IMO be opaque and purely a server
implementation issue. I still can't see any reason why this needs to be
set in stone in the protocol as a MUST; it should only be RECOMMENDED,
so that server implementors have an idea of where to start when
implementing this.
Firstly, two roster changes could happen at precisely the same moment.
To be fair, by introducing cluster node identifiers, and having a
strict strong ordering of them, you could avoid this.
It could, yes, but why is it a problem if they happen at the same time
and are marked with the same timestamp? It will just result in them both
being pushed. How is that an issue?
Secondly, the clock on a computer can, and surprisingly often does, go
backwards. That's a much harder problem to solve.
Maybe so. Do you have any more information on how prevalent this is, and
how long it typically lasts?
Thirdly, in a clustering situation, you'd have to ensure that the time
on each cluster node was perfectly synchronized.
No, you wouldn't necessarily, not if the timestamping happened at
the central data storage layer (i.e. the database server). Again,
this is just an implementation issue and is easily overcome; it doesn't
make timestamps impossible.
So the closest you can do would be a modified timestamp that had
additional logic during generation to ensure it never went backwards,
in which case you don't need the cluster identifier anymore, and
that's effectively the same as having a strictly increasing integer
sequence anyway, so it's easier to just do that. But even if you did
want to use timestamps, just representing them as an integer is pretty
trivial. Look at the definition of "modtime" in ACAP (RFC 2244), which
defines a strictly increasing modified timestamp represented using digits.
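
A minimal sketch of such a "modified timestamp", for illustration only.
The class name, the microsecond resolution and the injectable clock are
assumptions made here; they are not taken from ACAP or from the roster
versioning proposal.

```python
import threading
import time


class MonotonicModtime:
    """Hypothetical generator of strictly increasing, timestamp-like versions."""

    def __init__(self, clock=time.time):
        self._clock = clock            # injectable so a misbehaving clock can be simulated
        self._last = 0                 # last value handed out, in microseconds
        self._lock = threading.Lock()  # serialize callers within one process

    def next(self):
        with self._lock:
            now = int(self._clock() * 1_000_000)
            # If the clock stalled or went backwards, step just past the
            # previous value instead of reusing it or going below it.
            self._last = max(now, self._last + 1)
            return self._last
```

Once the clock misbehaves, the value is really just "previous plus one",
which is the point made above: it behaves like a plain counter. This
sketch covers a single process only; a cluster would still need one
shared source of values.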
Yes, I know I could represent them as integers, but I'd rather not if I
don't have to. I'd prefer to have the flexibility to compress and
shorten them to reduce bandwidth consumption as much as possible.
It's useful for clients to be able to determine the ordering locally,
on occasion. If we removed this, we'd also have to ensure that roster
pushes were sent to the client in-order, which currently we don't
mandate. (Making this a SHOULD is sane, but in the cluster case, it's
quite hard).
Well, I'm pretty sure XMPP dictates in-order processing of stanzas, so
surely the roster updates should arrive in order? Also, you haven't
really answered my concern that letting clients derive meaning from the
version identifier introduces the possibility of bugs and
interoperability problems, which IMO is a far more serious issue, and
one that doesn't exist if the client just treats them as opaque strings.
Also, even if the pushes were out of order (which I think is unlikely,
because it could only happen when several roster updates occur at the
same time), it doesn't really cause much of an issue as far as I can
see. You might have one or two roster pushes that you have already
cached pushed to you again, which is hardly the end of the world, and
it's something clients should be able to cope with anyway. What would
happen if a server's database crashed and had to be restored from a
backup, so that the most recent roster updates that had already been
pushed aren't there, or the server thinks it hasn't pushed the changes
yet and ends up re-pushing them? It shouldn't make any difference to the
client; some method to re-synchronize needs to be in place to handle
this. I think that to solve it, if the version id (be that a timestamp
or an incrementing id) the client specifies is further on than any of
the ones the server has, the server would need to re-push its entire
list, invalidating the client's list somehow (to ensure that contacts
created between the db backup and the crash, which now no longer exist,
do not hang around).
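
A rough sketch of that re-synchronization rule. The function name and
the in-memory change log are hypothetical. Note one substitution: to
keep the ids opaque, "further on than any the server has" is
approximated here as "not present in the server's change log".

```python
def plan_sync(client_version, change_log, full_roster):
    """Return ("full", roster) or ("delta", changes) for a reconnecting client.

    change_log is a hypothetical list of (version_id, change) tuples,
    oldest first; version ids are only compared for equality.
    """
    known = [version for version, _ in change_log]
    if client_version not in known:
        # The server never issued this id (for example after a restore from
        # an older backup), so the client's cached roster cannot be trusted.
        return ("full", list(full_roster))
    position = known.index(client_version)
    # Send only the changes recorded after the version the client holds.
    return ("delta", [change for _, change in change_log[position + 1:]])
```

A "full" result would tell the client to discard its cached roster,
which removes any contacts that only existed between the backup and the
crash.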
Plus, nobody can get it wrong.
How exactly are they going to get it wrong if it's an identifier whose
meaning only the server interprets?
It's the server I'm worried about. :-)
OK but that doesn't really answer my question.
You just use a 128-bit unsigned integer. There is no upper limit here
- in particular, there is no upper limit specified anywhere in this
document - XSD merely states that a xs:nonNegativeInteger is a
sequence of digits, and has "countably infinite" cardinality.
If you really and truly believe that practical limits of 64-bit
unsigned integers can cause problems in the real world, I honestly
don't know what to say except show you the figures - you could have
a thousand updates every millisecond, and still last over half a
million years - 584,542 roughly, assuming a fixed year length of
365.25 days.
I'm all for designing for the future, but you have to draw the line
somewhere, and besides, I figure we'll be on something bigger than
64-bit well before then - a jump to 128-bit gains us 10^25 years of
breathing space, and I'd like to imagine we can think up a solution
within that time, assuming that's prior to the heat death of the universe.
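
The arithmetic can be checked with a few lines. The rate of one thousand
updates per millisecond and the 365.25-day year are the assumptions; the
function name is made up for this illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # fixed year length of 365.25 days
UPDATES_PER_SECOND = 1_000 * 1_000         # one thousand updates per millisecond


def years_until_exhausted(bits):
    """Years before an unsigned counter of the given width runs out."""
    return (2 ** bits) / UPDATES_PER_SECOND / SECONDS_PER_YEAR


print(int(years_until_exhausted(64)))        # 584542
print(f"{years_until_exhausted(128):.1e}")   # on the order of 10^25
```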
Sure, you can keep increasing the bit size of the integer in your
implementation, but if the spec is going to mandate that approach, it
still needs to say how long the integer should be and what happens once
you reach overflow. I still fail to see why strictly increasing numbers
should be a MUST at the protocol level. IMO this is an internal server
implementation issue and not a protocol one. I'm all for recommending it
as a way for the server to implement this, but I still think it should
be RECOMMENDED rather than a MUST.
Richard