[
https://issues.apache.org/jira/browse/CASSANDRA-17048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430755#comment-17430755
]
Benedict Elliott Smith edited comment on CASSANDRA-17048 at 10/19/21, 8:19 PM:
-------------------------------------------------------------------------------
Without commenting on the patch overall, I have a preference not to proliferate
new varieties of UUID-like concepts. We already have v1 time UUID, and v7 is in
[draft|https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format#section-4.5]
with the IETF. I think it would be preferable to either use v7 UUID, or to use
v1 UUID but to serialise them to string so that they sort lexicographically
(this is a pretty simple shuffle).
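To illustrate the shuffle mentioned above, here is a minimal sketch in Java. It assumes plain {{java.util.UUID}} values of version 1; the class name {{SortableTimeUuid}}, its methods and the 15+16 hex digit string layout are hypothetical and are not taken from the patch or from CEP-14. The idea is simply to print the 60-bit timestamp most-significant-bits first, followed by the clock sequence and node bits.

```java
import java.util.UUID;

// Hypothetical helper, for illustration only; not part of Cassandra.
final class SortableTimeUuid {
    private SortableTimeUuid() {}

    /** Builds a v1 UUID from a 60-bit UUID timestamp and the raw low 64 bits. */
    static UUID fromTimestamp(long timestamp, long leastSigBits) {
        long timeLow = timestamp & 0xFFFFFFFFL;        // low 32 bits of the timestamp
        long timeMid = (timestamp >>> 32) & 0xFFFFL;   // middle 16 bits
        long timeHi  = (timestamp >>> 48) & 0x0FFFL;   // high 12 bits
        // v1 layout: time_low | time_mid | version (1) | time_hi
        long msb = (timeLow << 32) | (timeMid << 16) | 0x1000L | timeHi;
        return new UUID(msb, leastSigBits);
    }

    /** Serialises a v1 UUID so that string order follows timestamp order. */
    static String toSortableString(UUID uuid) {
        if (uuid.version() != 1)
            throw new IllegalArgumentException("not a v1 UUID: " + uuid);
        // UUID.timestamp() reassembles the 60-bit timestamp (15 hex digits).
        return String.format("%015x-%016x", uuid.timestamp(), uuid.getLeastSignificantBits());
    }

    /** Inverse of toSortableString. */
    static UUID fromSortableString(String s) {
        long timestamp = Long.parseUnsignedLong(s.substring(0, 15), 16);
        long lsb = Long.parseUnsignedLong(s.substring(16), 16);
        return fromTimestamp(timestamp, lsb);
    }
}
```

With the standard {{toString}}, a UUID whose timestamp has just rolled over its low 32 bits sorts before an earlier one, because {{time_low}} is printed first; the shuffled form does not have this problem.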
As part of CEP-14 I will be introducing a {{TimeUUID}} class to represent our
v1 uses of UUID, which stores the data internally in its lexicographic order
(this is intended primarily to ensure correctness, by avoiding accidental use of
the UUID-format timestamp instead of the unix one), so it would be quite simple
to extend this and modify {{toString}} (and {{fromString}}). I would also be
happy to extend this work to support v7 UUID as well, as these are probably
superior.
I also anticipate that in the near future we will begin issuing nodes in the
cluster a globally unique id, so that globally unique UUIDs may be issued
without any probabilistic component, which should be strictly superior to ULID.
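As a rough sketch of what issuing identifiers without a probabilistic component could look like, assuming each node has already been assigned a unique id: the class {{NodeScopedIdGenerator}}, the 16-bit id range and the string layout below are all hypothetical and do not describe any existing or planned Cassandra API. Uniqueness comes from a strictly increasing per-node timestamp combined with the node id, rather than from random bits. The sketch does not handle restarts; it assumes the clock has moved forward by then.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical generator, for illustration only; not part of Cassandra.
final class NodeScopedIdGenerator {
    private final int nodeId;                              // assumed unique per node in the cluster
    private final AtomicLong lastMicros = new AtomicLong(); // last timestamp handed out

    NodeScopedIdGenerator(int nodeId) {
        if (nodeId < 0 || nodeId > 0xFFFF)
            throw new IllegalArgumentException("nodeId must fit in 16 bits");
        this.nodeId = nodeId;
    }

    /** Strictly increasing on this node, even if the clock stalls or steps back. */
    long nextTimestamp(long nowMicros) {
        return lastMicros.updateAndGet(prev -> Math.max(prev + 1, nowMicros));
    }

    /** Timestamp first, so ids sort by time; the node id separates concurrent nodes. */
    String nextId(long nowMicros) {
        return String.format("%016x-%04x", nextTimestamp(nowMicros), nodeId);
    }
}
```

Two nodes calling {{nextId}} in the same microsecond produce ids that differ in the node suffix, and one node calling it twice in the same microsecond gets two consecutive timestamps.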
I assume this is already in use for supporting S3, but I think the aims of the
patch can probably be achieved without necessarily adopting ULID within the
Cassandra codebase?
> Replace sequential sstable generation identifier with ULID
> ----------------------------------------------------------
>
> Key: CASSANDRA-17048
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17048
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/SSTable
> Reporter: Jacek Lewandowski
> Assignee: Jacek Lewandowski
> Priority: Normal
> Fix For: 4.1
>
>
> Replace the current sequential sstable generation identifier with a ULID-based one.
> ULID is better because we do not need to scan the existing files to pick the
> starting number, and because we can generate globally unique identifiers.