[jira] [Comment Edited] (CASSANDRA-17048) Replace sequential sstable generation identifier with ULID

Benedict Elliott Smith (Jira) Tue, 19 Oct 2021 13:21:16 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-17048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430755#comment-17430755
 ]


Benedict Elliott Smith edited comment on CASSANDRA-17048 at 10/19/21, 8:19 PM:
-------------------------------------------------------------------------------

Without commenting on the patch overall, I have a preference not to proliferate 
new varieties of UUID-like concepts. We already have v1 time UUID, and v7 is in 
[draft|https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format#section-4.5]
 with the IETF. I think it would be preferable to either use v7 UUID, or to use 
v1 UUID but to serialise them to string so that they sort lexicographically 
(this is a pretty simple shuffle).
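
To illustrate the kind of shuffle meant here, the following is a minimal sketch, not code from the patch or from Cassandra. The class name {{SortableTimeUuid}} and its methods are hypothetical. It assumes the standard v1 layout, where the low 32 bits of the timestamp come first in the canonical string, and it moves the 60-bit timestamp to the front as fixed-width hex so that string order follows time order. The low bits are then compared as unsigned hex, which is only one possible tie-break.

```java
import java.util.UUID;

/** Hypothetical helper for illustration; not part of Cassandra or of the patch. */
final class SortableTimeUuid {
    private SortableTimeUuid() {}

    /** Builds a v1 UUID from a 60-bit UUID timestamp and raw low bits (illustrative helper). */
    static UUID fromParts(long timestamp, long lsb) {
        long timeLow = timestamp & 0xFFFFFFFFL;          // low 32 bits of the timestamp
        long timeMid = (timestamp >>> 32) & 0xFFFFL;     // middle 16 bits
        long timeHi = (timestamp >>> 48) & 0x0FFFL;      // high 12 bits
        long msb = (timeLow << 32) | (timeMid << 16) | 0x1000L | timeHi; // 0x1000 = version 1
        return new UUID(msb, lsb);
    }

    /** 15 hex digits of timestamp followed by 16 hex digits of the low bits. */
    static String toSortableString(UUID uuid) {
        if (uuid.version() != 1)
            throw new IllegalArgumentException("not a v1 UUID: " + uuid);
        return String.format("%015x%016x", uuid.timestamp(), uuid.getLeastSignificantBits());
    }

    /** Inverse of toSortableString. */
    static UUID fromSortableString(String s) {
        if (s.length() != 31)
            throw new IllegalArgumentException("expected 31 hex digits: " + s);
        long timestamp = Long.parseLong(s.substring(0, 15), 16);
        long lsb = Long.parseUnsignedLong(s.substring(15), 16);
        return fromParts(timestamp, lsb);
    }

    public static void main(String[] args) {
        // The earlier UUID has the larger time_low field, so canonical strings sort the wrong way.
        UUID earlier = fromParts(0x1FFFFFFFFL, 0x8000000000000001L);
        UUID later = fromParts(0x200000000L, 0x8000000000000001L);
        System.out.println(earlier + " -> " + toSortableString(earlier));
        System.out.println(later + " -> " + toSortableString(later));
    }
}
```

With the two values in {{main}}, the canonical strings compare in the wrong order because the earlier one begins with {{ffffffff}}, while the shuffled strings compare in timestamp order.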

As part of CEP-14 I will be introducing a {{TimeUUID}} class to represent our 
v1 uses of UUID, that stores the data internally in its lexicographic order 
(this is intended primarily to ensure correctness, to avoid accidentally using 
the UUID format timestamp instead of unix), so it would be quite simple to extend 
this and modify {{toString}} (and {{fromString}}). I would also be 
happy to extend this work to support v7 UUID as well, as these are probably 
superior.
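
As a rough illustration of the idea only (this is not the CEP-14 class, and the name {{OrderedTimeUuid}} and every member of it are hypothetical), a value type could hold the 60-bit timestamp as a single field in comparison order and expose the unix time explicitly, so callers cannot confuse it with the UUID-format timestamp. The only outside fact it relies on is that the UUID epoch (1582-10-15) precedes the unix epoch by 0x01B21DD213814000 intervals of 100ns.

```java
/** Hypothetical sketch for illustration; this is not the CEP-14 TimeUUID class. */
final class OrderedTimeUuid implements Comparable<OrderedTimeUuid> {
    /** Number of 100ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01 (unix epoch). */
    static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    private final long uuidTimestamp; // 100ns units since the UUID epoch, kept whole and in order
    private final long lsb;           // clock sequence and node bits, kept raw

    OrderedTimeUuid(long uuidTimestamp, long lsb) {
        this.uuidTimestamp = uuidTimestamp;
        this.lsb = lsb;
    }

    /** Converts from unix milliseconds so the caller never handles the UUID-format value. */
    static OrderedTimeUuid fromUnixMillis(long unixMillis, long lsb) {
        return new OrderedTimeUuid(unixMillis * 10_000L + UUID_EPOCH_OFFSET, lsb);
    }

    /** Timestamp in the UUID format (100ns units since 1582-10-15). */
    long uuidTimestamp() {
        return uuidTimestamp;
    }

    /** Timestamp as unix milliseconds. */
    long unixMillis() {
        return Math.floorDiv(uuidTimestamp - UUID_EPOCH_OFFSET, 10_000L);
    }

    /** Orders by timestamp first, then by the low bits treated as unsigned. */
    @Override
    public int compareTo(OrderedTimeUuid other) {
        int byTime = Long.compare(uuidTimestamp, other.uuidTimestamp);
        return byTime != 0 ? byTime : Long.compareUnsigned(lsb, other.lsb);
    }
}
```

Separately named accessors for the two timestamp units are one way to make the unit explicit at each call site; the real class may well take a different approach.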

I also anticipate that in the near future we will begin issuing nodes in the 
cluster a globally unique id, so that globally unique UUIDs may be issued 
without any probabilistic component, which should be strictly superior to ULID.

I assume this is already in use for supporting S3, but I think the aims of the 
patch can probably be achieved without necessarily adopting ULID within the 
Cassandra codebase?


was (Author: benedict):
Without commenting on the patch overall, I have a preference not to proliferate 
new varieties of UUID-like concepts. We already have v1 time UUID, and v7 is in 
[draft|https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format#section-4.5]
 with the IETF. I think it would be preferable to either use v7 UUID, or to use 
v1 UUID but to serialise them to string so that they sort lexicographically 
(this is a pretty simple shuffle).

As part of CEP-14 I will be introducing a {{TimeUUID}} class to represent our 
v1 uses of UUID, that stores the data internally in its lexicographic order 
(this is intended primarily to ensure correctness, to avoid accidentally using 
the UUID format timestamp instead of unix), so it would be quite simple to extend 
this and modify {{toString}} (and {{fromString}}).

I also anticipate that in the near future we will begin issuing nodes in the 
cluster a globally unique id, so that globally unique UUIDs may be issued 
without any probabilistic component, so that any other advantages of ULID will 
likely be obsolete very soon.

I assume this is already in use for supporting S3, but I think the aims of the 
patch can probably be achieved without necessarily adopting ULID within the 
Cassandra codebase?

> Replace sequential sstable generation identifier with ULID
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-17048
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17048
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/SSTable
>            Reporter: Jacek Lewandowski
>            Assignee: Jacek Lewandowski
>            Priority: Normal
>             Fix For: 4.1
>
>
> Replace the current sequential sstable generation identifier with a ULID-based one.
> ULID is better because we do not need to scan the existing files to pick the 
> starting number, and because we can generate globally unique identifiers.



