[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964277#comment-15964277 ] Aleksey Yeschenko commented on CASSANDRA-9200: -- The related issues were all closed with essentially "Won't Fix/Later" resolution, just like this ticket. The objections here are still standing, however, so I'm not sure reopening the ticket - or opening a new one - would lead to anything. > Sequences > - > > Key: CASSANDRA-9200 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9200 > Project: Cassandra > Issue Type: New Feature >Reporter: Jonathan Ellis > > UUIDs are usually the right choice for surrogate keys, but sometimes > application constraints dictate an increasing numeric value. > We could do this by using LWT to reserve "blocks" of the sequence for each > member of the cluster, which would eliminate paxos contention at the cost of > not being strictly increasing. > PostgreSQL syntax: > http://www.postgresql.org/docs/9.4/static/sql-createsequence.html -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960986#comment-15960986 ] Cyril Scetbon commented on CASSANDRA-9200: -- Hey [~iamaleksey], [~slebresne], if I look at the related issues, they are all resolved. Does that mean you're going to reopen it? I'm confused because of Sylvain's previous comment.
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15224709#comment-15224709 ] Aleksey Yeschenko commented on CASSANDRA-9200: -- For the reasons outlined, and ultimately what amounts to double binding vetoes, I'm politely closing the JIRA as Later for now, until and unless the issues raised get addressed. Sorry. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610732#comment-14610732 ] Jason Brown commented on CASSANDRA-9200: WRT to token/partition-scoping, this assumes that the tokens in a cluster are constant/never change. We will run into data races/edge cases with cached sequences being used as nodes are being added to the cluster, ranges being moved, and so on. Perhaps with stronger consistency when cluster ownership changes (CASSANDRA-9667), this can be alleviated. bq. I reject this slippery slope argument Selecting data by a monotonic value and asking that it be ordered is an everyday, common type of query that RDBMS users execute and expect with a sequence. Otherwise, why bother to have a monotonic value? In that case what you really want is uniqueness, which doesn't require monotonicity. Sequences - Key: CASSANDRA-9200 URL: https://issues.apache.org/jira/browse/CASSANDRA-9200 Project: Cassandra Issue Type: New Feature Reporter: Jonathan Ellis Assignee: Robert Stupp Fix For: 3.x UUIDs are usually the right choice for surrogate keys, but sometimes application constraints dictate an increasing numeric value. We could do this by using LWT to reserve blocks of the sequence for each member of the cluster, which would eliminate paxos contention at the cost of not being strictly increasing. PostgreSQL syntax: http://www.postgresql.org/docs/9.4/static/sql-createsequence.html
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610212#comment-14610212 ] Aleksey Yeschenko commented on CASSANDRA-9200: -- With a fixed size ring that we have, and nodes having explicit tokens assigned, can there be any way to project the token ring to a smaller-cardinality ID ring, and use tokens to map nodes to their pre-assigned blocks? Really not a fan of using LWT to pre-allocate blocks.
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610346#comment-14610346 ] Tupshin Harper commented on CASSANDRA-9200: --- An example of an application domain where strictly increasing integers are required is the IMAP protocol (https://tools.ietf.org/html/rfc3501#page-8), where this is mandatory. {{A 32-bit value assigned to each message, which when used with the unique identifier validity value (see below) forms a 64-bit value that MUST NOT refer to any other message in the mailbox or any subsequent mailbox with the same name forever. Unique identifiers are assigned in a strictly ascending fashion in the mailbox; as each message is added to the mailbox it is assigned a higher UID than the message(s) which were added previously. Unlike message sequence numbers, unique identifiers are not necessarily contiguous.}} Building this kind of system on top of C* today requires an external CP system (ick operational complexity), though it is likely the case that the sequences here really only need to be modeled as clustering keys and not partition keys.
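To make the quoted RFC 3501 requirement concrete, here is a minimal Python sketch of how a 32-bit UID and a 32-bit UIDVALIDITY value could be packed into one 64-bit identifier. The function name `imap_global_id` and the choice to put UIDVALIDITY in the high 32 bits are assumptions made for illustration; the RFC only says the two values together form a 64-bit value.

```python
def imap_global_id(uid_validity: int, uid: int) -> int:
    """Pack UIDVALIDITY (high 32 bits) and UID (low 32 bits) into one integer.

    The bit layout is an illustrative assumption, not something RFC 3501 mandates.
    """
    for name, value in (("uid_validity", uid_validity), ("uid", uid)):
        # Both inputs must fit in an unsigned 32-bit field.
        if not 0 <= value < 2**32:
            raise ValueError(f"{name} must fit in 32 bits, got {value}")
    return (uid_validity << 32) | uid


def is_strictly_ascending(uids: list[int]) -> bool:
    """Check the RFC's ordering rule: each UID is higher than the previous one.

    Gaps are allowed; only ordering matters.
    """
    return all(earlier < later for earlier, later in zip(uids, uids[1:]))
```

With this layout, UIDs within one mailbox incarnation only need to be strictly ascending, not contiguous, so `is_strictly_ascending([1, 2, 10])` holds while `is_strictly_ascending([1, 3, 3])` does not.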
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610437#comment-14610437 ] Robert Stupp commented on CASSANDRA-9200: - bq. providing unique, small, human-readable values, with no other restrictions ... +1
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610387#comment-14610387 ] Jason Brown commented on CASSANDRA-9200: I'd like to echo [~iamaleksey]'s sentiments. It feels like we're trying to jam more RDBMS features into cassandra that don't fit very well in a distributed system. When you say 'sequence', most RDBMS-expatriates will think it will behave like Oracle/MySQL/SQLServer sequences, which it won't. Further, those users will also expect to be able to list the rows in order by that sequence value, as in select * from foo order by bar_id desc - which, for cassandra, is a distributed range query, something we've done our damnedest to steer users away from. Gotta say I'm rather -1 on this in general, and that's before we start diving into the failure cases.
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610419#comment-14610419 ] Aleksey Yeschenko commented on CASSANDRA-9200: -- To elaborate on my previous comment - maybe longs are good enough (as in, small enough). Then we can just rely on tokens and ranges as they are. I'm with Jason when it comes to providing RDBMS-like features. No to that. What we might try to address is providing 'small' and 'unique' values, with no other restrictions. No monotonicity guarantees. Just unique, small, human-readable values (smaller than UUIDs). Longs, derived from token ranges, should fit the bill.
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610896#comment-14610896 ] Sylvain Lebresne commented on CASSANDRA-9200: - bq. It's expected that Cassandra will not provide you with all the functionality you can possibly need. Having people roll their own is fine sometimes, if the alternative is putting everything in C*. Especially features that are bad architectural fit with Cassandra model. I'm currently leaning towards that sentiment. Surely there aren't *that* many users who re-implement IMAP :). More seriously, sequences are something you should avoid if you can help it in a distributed system, and as far as I can tell, at least the LWT-based solution can be done client side, so it's not as if clients can't already do this with Cassandra.
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610638#comment-14610638 ] Jonathan Ellis commented on CASSANDRA-9200: --- Are you worried about bugs causing sequence conflicts?
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610599#comment-14610599 ] Jonathan Ellis commented on CASSANDRA-9200: --- bq. oracle and postgresql have the concept of a sequence cache size where sequence ids are reserved per session as I am proposing here per coordinator Oracle blog Ask Tom explains further: bq. A sequence has one purpose: assign unique numbers to stuff. Nothing else. There will be gaps, gaps are normal, expected, good, ok, fine. They will be there, there is no avoiding them [even with cache size of 1]. This is not a problem, it is expected, it is not fixable - a rollback for example will generate a gap if some session selected a sequence. Do not assume they are gap free and all is well in the world. https://asktom.oracle.com/pls/apex/f?p=100:11:0P11_QUESTION_ID:369390500346406705
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610630#comment-14610630 ] Patrick McFadin commented on CASSANDRA-9200: bq. where sequence ids are reserved per session as I am proposing here per coordinator. The more I think of use cases and potential failure modes, I'm less -1 if we enforce that sequences are never used in partition keys. That will eliminate a ton of potential mis-use and disasters. I've been through sequence hell in RDBMS land. Reset counters or buffer under-run can make for a long weekend. If the proposal is to use a sequence with a partition key, then I'm at a loss as to why that is useful.
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610652#comment-14610652 ] Aleksey Yeschenko commented on CASSANDRA-9200: -- bq. Not sure what your counterproposal is. Projecting tokens to longs is going to result in duplicate longs which is not acceptable. (And as tupshin notes, some users need a 32bit solution as well.) It would work for 64 bits. For 32 bits some math would be required. After some offline conversations - I'm fine with partition-scoped sequences (assuming someone comes up with a decent plan for them). bq. This is the key, if we don't deliver something good enough users will be forced to roll something worse themselves. This is true for many features X. It's expected that Cassandra will not provide you with all the functionality you can possibly need. Having people roll their own is fine sometimes, if the alternative is putting everything in C*. Especially features that are bad architectural fit with Cassandra model.
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610651#comment-14610651 ] Robert Stupp commented on CASSANDRA-9200: - Probably sequences that have been reset to a value that has previously been used as the primary key (causing sleepless weekends resolving _ORA-1: unique constraint violated_ issues in the application/data).
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610595#comment-14610595 ] Jonathan Ellis commented on CASSANDRA-9200: --- bq. Really not a fan of using LWT to pre-allocate blocks. Not sure what your counterproposal is. Projecting tokens to longs is going to result in duplicate longs which is not acceptable. (And as tupshin notes, some users need a 32bit solution as well.) bq. When you say 'sequence', most RDBMS-expatriates will think it will behave like Oracle/MySQL/SQLServer sequences, which it won't. In this case, a mostly increasing sequence is what I am (implicitly) proposing, and in fact oracle and postgresql have the concept of a sequence cache size where sequence ids are reserved per session as I am proposing here per coordinator. I think this is a reasonable compromise, and while it may be surprising to RDBMS users at first it is a surprise they will discover quickly. (Surprises that you don't see in test but do see in production are worse.) bq. Further, those users will also expect to be able to list the rows in order by that sequence value I reject this slippery slope argument. We don't support ORDER BY in the general case already; it's not reasonable to say this will suddenly make people expect it. bq. Building this kind of system on top of C* today requires an external CP system (ick operational complexity) This is the key, if we don't deliver something good enough users will be forced to roll something worse themselves.
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610660#comment-14610660 ] Patrick McFadin commented on CASSANDRA-9200: Bugs, user error, sun spots, anything that may cause values to become overlapping. You get the safety of unique checks in RDBMS to stop you from overwriting. If I were to advise someone using a sequence on a partition key, I would say: always use IF NOT EXISTS on insert. If it were a 32 bit value, what do you get after inserting 4 billion keys? If these were scoped per partition, the chance of data loss is much less. In addition, I can see the general usefulness of having an increasing number for ordering within a partition without the need for something like a timeUUID or timestamp. [~tupshin] not sure if this matches with your IMAP use case. Seems like it would.
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610829#comment-14610829 ] Aleksey Yeschenko commented on CASSANDRA-9200: -- bq. WRT to token/partition-scoping, this assumes that the tokens in a cluster are constant/never change. It doesn't assume that. It assumes that we properly handle those changes. CASSANDRA-9667 would probably make it simpler.
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548280#comment-14548280 ] Aleksey Yeschenko commented on CASSANDRA-9200: -- Sure. I'll elaborate later, if needed, to meet the criteria for -1 bindingness (:
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548345#comment-14548345 ] Robert Stupp commented on CASSANDRA-9200: - IMO use of sequences should be avoided whenever possible. But there might be rare needs for them. Generating unique numeric IDs can be a benefit. E.g. generating numbers for bar codes (EAN codes), serial numbers - or generally numbers that a human should be able to read (time uuids are not really human readable). Constraint: people should be aware that these numbers are not strictly in/decreasing (for performance reasons) and that there is no support for wrapping around ({{CYCLE}}). So it would definitely be an _anti-pattern_ to generate strictly in/decreasing values, since Paxos overhead is really heavy. Sure, you can lose _A_ as with any QUORUM or SERIAL statement. The syntax for {{nextval}} is (currently) {{SELECT nextval(keyspace_name.sequence_name) ...}}. There's room for performance optimization like an atomic, serial CAS UPDATE using a system or user-defined function (e.g. {{UPDATE CAS system.seq_reservations SET next_val=system.updateNextval(next_val, bound_val)}}) to avoid the CAS loop with _QUORUM SELECT + SERIAL UPDATE_ on the coordinator. But that's stuff for later - out of scope of this ticket.
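The block-reservation idea discussed in this thread (a read followed by a compare-and-set, retried on conflict) can be sketched in a few lines of Python. This is an in-memory simulation only, not the patch under discussion: `ReservationTable` stands in for a table updated via LWT, and `BlockSequence` stands in for one coordinator's cached block. Both class names and the `block_size` parameter are hypothetical.

```python
import threading


class ReservationTable:
    """In-memory stand-in for a row that is updated with compare-and-set."""

    def __init__(self, start: int = 1) -> None:
        self._next_val = start
        self._lock = threading.Lock()

    def read(self) -> int:
        with self._lock:
            return self._next_val

    def compare_and_set(self, expected: int, new: int) -> bool:
        # Succeeds only if nobody else moved next_val since we read it.
        with self._lock:
            if self._next_val != expected:
                return False
            self._next_val = new
            return True


class BlockSequence:
    """One coordinator's view: hands out values from a locally cached block."""

    def __init__(self, table: ReservationTable, block_size: int = 100) -> None:
        self._table = table
        self._block_size = block_size
        self._next = 0
        self._limit = 0  # exclusive upper bound of the cached block

    def _reserve_block(self) -> None:
        # CAS loop: read the current bound, try to advance it, retry on conflict.
        while True:
            current = self._table.read()
            if self._table.compare_and_set(current, current + self._block_size):
                self._next, self._limit = current, current + self._block_size
                return

    def nextval(self) -> int:
        if self._next >= self._limit:
            self._reserve_block()
        value = self._next
        self._next += 1
        return value
```

With two coordinators sharing one table and `block_size=3`, alternating calls yield values drawn from different blocks: unique across the cluster, increasing per coordinator, but not strictly increasing overall. That is the trade-off the ticket description names. Values left in a cached block when a coordinator stops are simply skipped, which produces gaps.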
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548314#comment-14548314 ] Patrick McFadin commented on CASSANDRA-9200: Can somebody expound a bit on the use case dictating this change? In an RDBMS they are critical for unique PKs and scanning sequential records. A UUID vs a numeric is going to be just as random once it is hashed. Are we talking about the ability for an application to scan based on a sequence? Twitter Snowflake tried to accomplish something similar with Zookeeper. I know there were failure modes that were difficult to manage, and it is no longer available on GitHub. I can't think of any project that has done this without introducing very restrictive failure modes.
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548364#comment-14548364 ] Brian Hess commented on CASSANDRA-9200: Can you not get a lot of what you want with a UDF that maps a UUID to a BIGINT hash? That will not be increasing/decreasing numbers, but will be an 8-byte integer. Can someone explain more on how the increasing/decreasing quality is needed/leveraged/used by customers?
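As a rough illustration of the UUID-to-BIGINT idea, here is a small Python sketch that hashes a UUID down to a signed 64-bit integer. It is not a Cassandra UDF; the function name `uuid_to_bigint` and the use of SHA-256 are assumptions chosen for the example, and any well-mixed hash would serve.

```python
import hashlib
import struct
import uuid


def uuid_to_bigint(value: uuid.UUID) -> int:
    """Map a 128-bit UUID to a signed 64-bit integer (the range of a BIGINT).

    SHA-256 is an arbitrary choice here; only the first 8 digest bytes are used.
    """
    digest = hashlib.sha256(value.bytes).digest()
    # ">q" unpacks 8 bytes as a big-endian signed 64-bit integer.
    return struct.unpack(">q", digest[:8])[0]


if __name__ == "__main__":
    key = uuid.uuid4()
    print(key, "->", uuid_to_bigint(key))
```

Because 128 bits are folded into 64, distinct UUIDs can collide, so the result is small and numeric but not guaranteed unique, and it carries no ordering.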
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548361#comment-14548361 ] Aleksey Yeschenko commented on CASSANDRA-9200: -- [~pmcfadin] Right. What I don't want is another counters situation - a feature that's not a great fit for Cassandra, full of caveats (at least in its initial implementation), biting unsuspecting people. I'll be -1 on it until I'm convinced that there are no cases that might bite you there. Will elaborate later. In other words, my bar for the implementation to go into C* is going to be very high, if at all possible to reach.
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548355#comment-14548355 ] Robin Schumacher commented on CASSANDRA-9200: Unfortunately, I don't have a lot of detail on how customers want to use them. Current requests come from Target, Family Search, and Openwave. My notes just say they've tried counters and instead want Oracle-like sequences. If more detail is needed, we can try and go back to them to see if more info can be had. Lastly, this isn't a constantly recurring request so not exactly a white hot priority.
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548371#comment-14548371 ] Jeremiah Jordan commented on CASSANDRA-9200: If you aren't going to be Strictly Increasing is it really more useful than UUIDs?
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548277#comment-14548277 ] Jonathan Ellis commented on CASSANDRA-9200: Also for the record, I find the "we shouldn't do useful things because some people will misuse them" argument to be a weak one in general. :)
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548288#comment-14548288 ] Jonathan Ellis commented on CASSANDRA-9200: ... however, I don't think we should hide the CP-ness by adding e.g. auto-sequence PK on insert. If the sequence is exposed explicitly as a nextval call, I think the tradeoff is more clear.
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548272#comment-14548272 ] Aleksey Yeschenko commented on CASSANDRA-9200: For the record, I'm still not convinced that it is a good idea to have sequences support in Cassandra. It would eventually lead to their overuse, and for global sequences you lose the A in AP under certain conditions.
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548436#comment-14548436 ] Brian Hess commented on CASSANDRA-9200: [~jjordan] - I was responding to this comment from [~snazy]: "numbers that a human should be able to read". It would at least cover that space. Hence my asking for the real customer need for increasing/decreasing values.
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542046#comment-14542046 ] Jonathan Ellis commented on CASSANDRA-9200: Why do we need CONSISTENCY LEVEL distinct from the SERIAL one? Put another way, what non-Paxos replicated writes are we doing?
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542070#comment-14542070 ] Robert Stupp commented on CASSANDRA-9200:
It's due to the loop:
{code}
SELECT next_val, exhausted FROM system_distributed.seq_reservations WHERE ... -- done with CONSISTENCY LEVEL
nextVal = boundedAddAndExhaustedCheck ( nextVal, exhausted )
UPDATE system_distributed.seq_reservations SET ... IF next_val = ? -- done with SERIAL CONSISTENCY LEVEL
{code}
The other writes are the initial INSERT (w/SERIAL) and final DELETE.
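The read-then-conditional-update loop in Robert Stupp's comment above can be sketched roughly as follows. This is a minimal in-memory illustration in Python, not the code from the linked branch: `ReservationRow` stands in for a row of `system_distributed.seq_reservations`, a lock stands in for the LWT, and `reserve_block`, `block_size`, and `max_value` are hypothetical names.

```python
import threading


class ReservationRow:
    """In-memory stand-in (assumption) for one row of the reservations table."""

    def __init__(self, start):
        self._lock = threading.Lock()
        self._next_val = start

    def read(self):
        # Plain read: corresponds to the SELECT step of the loop.
        with self._lock:
            return self._next_val

    def compare_and_set(self, expected, new):
        # Conditional write: corresponds to UPDATE ... IF next_val = ?
        with self._lock:
            if self._next_val != expected:
                return False
            self._next_val = new
            return True


def reserve_block(row, block_size, max_value):
    """Reserve a block and return (first, last), inclusive.

    Retries whenever another writer changed next_val between read and write.
    """
    while True:
        current = row.read()
        if current > max_value:
            raise RuntimeError("sequence exhausted")
        # Bounded add: never hand out values beyond max_value.
        last = min(current + block_size - 1, max_value)
        if row.compare_and_set(current, last + 1):
            return current, last


if __name__ == "__main__":
    row = ReservationRow(start=1)
    print(reserve_block(row, block_size=10, max_value=25))
    print(reserve_block(row, block_size=10, max_value=25))
```

Only the conditional write needs to be linearizable; the preceding read can be stale, because a stale value simply makes the conditional write fail and the loop retry.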
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541556#comment-14541556 ] Robert Stupp commented on CASSANDRA-9200:
The linked branch contains an implementation of sequences. The code is still ugly but works, including utests. Syntax is explained in {{CQL.textile}} like this:
{code}
create-sequence-stmt ::= CREATE ( OR REPLACE )? SEQUENCE ( IF NOT EXISTS )? ( keyspace '.' )? sequencename
                         ( INCREMENT ( BY )? integer )?
                         ( ( MINVALUE integer ) | ( NO MINVALUE ) )?
                         ( ( MAXVALUE integer ) | ( NO MAXVALUE ) )?
                         ( START ( WITH )? integer )?
                         ( CACHE integer )?
                         ( CACHE LOCAL integer )?
                         ( SERIAL CONSISTENCY LEVEL consistencyLevel )?
                         ( CONSISTENCY LEVEL consistencyLevel )?
{code}
Access to sequences:
{code}
SELECT nextval('keyspace_name', 'sequence_name') FROM table;
{code}
The implementation uses three tables:
# {{system.schema_sequences}} holding the sequence definitions
# {{system_distributed.seq_reservations}} tracking the next available sequence value
# {{system.sequence_local}} tracking the range of values exclusively assigned to a node

Notes:
* {{CACHE}} defines the number of values to acquire from the overall range ({{system_distributed.seq_reservations}}).
* {{CACHE LOCAL}} (defaults to {{CACHE}}) defines the number of values to take from {{system.sequence_local}} - not sure whether to keep that option. It's not really necessary and its only limited use is when a node crashes without being gracefully shut down.
* The consistency levels are those used to read from/modify {{system_distributed.seq_reservations}}.
* I've not included a {{CYCLE}} option. The reason is that it would require a global synchronization of all nodes to ensure that no node still owns a range before wrapping around.
* It also has some limited support for permissions (CASSANDRA-9372).
* {{SELECT nextval}} would benefit from virtual tables (CASSANDRA-7622) - i.e. {{SELECT nextval(...) FROM DUAL}}
* The implementation of the {{nextval()}} function requires the ability to pass constant values as function arguments (sneaked into the branch).
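To illustrate the {{CACHE}} behaviour described in the notes above, here is a minimal single-process sketch in Python. It is not the branch's implementation: `GlobalAllocator` is a hypothetical stand-in for the shared reservations table, `NodeSequence` for one node's locally held range, and an increment of 1 with no MAXVALUE is assumed.

```python
class GlobalAllocator:
    """Hypothetical stand-in for the cluster-wide 'next available value' row."""

    def __init__(self, start=1):
        self.next_val = start

    def acquire(self, count):
        # Hand out the next `count` values as an inclusive (first, last) range.
        first = self.next_val
        self.next_val += count
        return first, first + count - 1


class NodeSequence:
    """One node's view: serves values from a locally cached range."""

    def __init__(self, allocator, cache):
        self.allocator = allocator
        self.cache = cache
        self.next_local = 0
        self.last_local = -1  # empty range, forces an acquire on first use

    def nextval(self):
        if self.next_local > self.last_local:
            # Local range used up: fetch a fresh block of `cache` values.
            self.next_local, self.last_local = self.allocator.acquire(self.cache)
        value = self.next_local
        self.next_local += 1
        return value


if __name__ == "__main__":
    allocator = GlobalAllocator()
    node_a = NodeSequence(allocator, cache=3)
    node_b = NodeSequence(allocator, cache=3)
    calls = [node_a, node_b, node_a, node_a, node_a]
    print([node.nextval() for node in calls])
```

Interleaving calls across the two nodes yields values that are unique but not globally increasing, which is the tradeoff named in the ticket description.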
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520469#comment-14520469 ] Robert Stupp commented on CASSANDRA-9200:
We could do it a bit better than "just" integer/numeric sequences. By involving UDFs to generate the next block, people could do very fancy things - somewhat similar to user-defined aggregates.
{code}
CREATE FUNCTION wordListIncrementer(previous text) RETURNS text LANGUAGE java AS $$ return previous + "x"; $$;
CREATE FUNCTION wordListNextBlock(previous text) RETURNS text LANGUAGE java AS $$
  switch (previous) {
    case "alpha": return "beta";
    case "beta": return "gamma";
    ...
CREATE SEQUENCE wordListSequence START WITH 'alpha' INCREMENT USING wordListIncrementer SKIP_BLOCK USING wordListNextBlock;
CREATE TABLE foo ( pk text PRIMARY KEY, ...);
INSERT INTO foo (pk, ...) VALUES (wordListSequence.nextval, ...);
{code}
By writing the pseudo code I realized that we either need something like {{SELECT xyz() FROM DUAL}} or the ability to return a result set from an {{INSERT}}/{{UPDATE}} statement to actually return the generated values to the client.

I'm just brain-dumping: We could also distinguish between per-node and global sequences, where the per-node sequences could optionally include the host-id or IP (if and how the user wants it). Maybe the user only wants some sequence per node just to generate some part of a primary key. Per-node sequences would also not need to use cluster-wide synchronization. "Global" sequences could possibly fail if one DC is not available.

Probably start with local and global integer sequences and add UDFs on top?
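The per-node sequence idea from Robert Stupp's comment above (a local counter optionally combined with the host-id) could look roughly like this. A minimal Python sketch under stated assumptions: the class name `PerNodeSequence` and the `host_id:counter` format are made up for illustration, and persistence across restarts is ignored.

```python
import itertools


class PerNodeSequence:
    """Hypothetical per-node sequence: local counter prefixed with a host id.

    No cluster-wide synchronization is needed because the host id keeps
    values from different nodes distinct.
    """

    def __init__(self, host_id, start=1):
        self.host_id = host_id
        self._counter = itertools.count(start)

    def nextval(self):
        # The "host_id:counter" format is an assumption chosen for readability.
        return f"{self.host_id}:{next(self._counter)}"


if __name__ == "__main__":
    node_a = PerNodeSequence("node-a")
    node_b = PerNodeSequence("node-b")
    print(node_a.nextval(), node_a.nextval(), node_b.nextval())
```

Values are increasing only within a single node; ordering across nodes is undefined, so this fits the "part of a primary key" use rather than a global ordering.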