Yeah I think that’s intuitive enough. I had been thinking about multiple 
condition branches, but was thinking about something closer to 

IF select.column=5
  UPDATE ... SET ... WHERE key=1;
ELSE IF select.column=6
  UPDATE ... SET ... WHERE key=2;
  UPDATE ... SET ... WHERE key=3;

Which would make the proposed COMMIT IF we're talking about now a shorthand. Of 
course this would be follow-on work.
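
To illustrate how a single-condition COMMIT IF could be read as shorthand for the branching form, here is a minimal Python sketch. It is not proposed syntax or an implementation: `run_branches`, `commit_if`, and the dictionary shape of `reads` are hypothetical names chosen for illustration only.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical shapes: named select -> row, and (condition, updates) branches.
Reads = Dict[str, Dict[str, int]]
Branch = Tuple[Callable[[Reads], bool], List[str]]


def run_branches(branches: List[Branch], reads: Reads) -> List[str]:
    """Return the updates of the first branch whose condition holds, else none."""
    for condition, updates in branches:
        if condition(reads):
            return updates
    return []


def commit_if(condition: Callable[[Reads], bool], updates: List[str], reads: Reads) -> List[str]:
    """COMMIT IF modelled as a transaction with exactly one branch."""
    return run_branches([(condition, updates)], reads)


reads = {"select": {"column": 6}}
branches = [
    (lambda r: r["select"]["column"] == 5, ["UPDATE ... WHERE key=1"]),
    (lambda r: r["select"]["column"] == 6,
     ["UPDATE ... WHERE key=2", "UPDATE ... WHERE key=3"]),
]
chosen = run_branches(branches, reads)  # second branch matches
```

In this reading, a failed COMMIT IF is simply the case where no branch matches and no updates are applied.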

> On Jun 8, 2022, at 1:20 PM, wrote:
> I imagine that conditions would be evaluated against the state prior to the 
> execution of the statement against which it is being evaluated, but after the 
> prior statements. I think that should be OK to reason about.
> i.e. we might have a contrived example like:
> UPDATE tbl SET a = 1 WHERE k = 1 AS q1
> UPDATE tbl SET a = q1.a + 1 WHERE k = 1 AS q2
> COMMIT TRANSACTION IF q1.a = 0 AND q2.a = 1
> So q1 would read a = 0, but q2 would read a = 1 and set a = 2.
> I think this is probably adequately intuitive? It is a bit atypical to have 
> conditions that wrap the whole transaction though.
> We have another option, of course, which is to offer IF x ROLLBACK 
> TRANSACTION, which is closer to SQL, which would translate the above to:
> SELECT a FROM tbl WHERE k = 1 AS q0
> UPDATE tbl SET a = 1 WHERE k = 1 AS q1
> UPDATE tbl SET a = q1.a + 1 WHERE k = 1 AS q2
> This is less succinct, but might be more familiar to users. We could also 
> eschew the ability to read from UPDATE statements entirely in this scheme, as 
> this would then look very much like SQL.
> From: Blake Eggleston
> Date: Wednesday, 8 June 2022 at 20:59
> To: <>
> Subject: Re: CEP-15 multi key transaction syntax
> > It affects not just RETURNING but also conditions that are evaluated 
> > against the row, and if we in future permit using the values from one 
> > select in a function call / write to another table (which I imagine we 
> > will).
> I hadn’t thought about that... using intermediate or even post update values 
> in condition evaluation or function calls seems like it would make it 
> difficult to understand why a condition is or is not applying. On the other 
> hand, it would be powerful, especially when using things like database generated 
> values in queries (auto incrementing integer clustering keys or server 
> generated timeuuids being examples that come to mind). Additionally, if we 
> return these values, I guess that would solve the visibility issues I’m 
> worried about. 
> Agreed intermediate values would be straightforward to calculate though.
> On Jun 6, 2022, at 4:33 PM, wrote:
> It affects not just RETURNING but also conditions that are evaluated against 
> the row, and if we in future permit using the values from one select in a 
> function call / write to another table (which I imagine we will).
> I think that for it to be intuitive we need it to make sense sequentially, 
> which means either calculating it or restricting what can be stated (or 
> abandoning the syntax).
> If we initially forbade multiple UPDATE/INSERT to the same key, but permitted 
> overlapping DELETE (and as many SELECT as you like) that would perhaps make 
> it simple enough? Require for now that SELECTS go first, then DELETE and then 
> INSERT/UPDATE (or vice versa, depending what we want to make simple)?
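
The restriction floated above could be checked up front, before any execution. The following is a minimal Python sketch under stated assumptions: the `(kind, primary_key)` statement representation, the `validate` and `is_valid` helpers, and the chosen phase order (SELECT, then DELETE, then INSERT/UPDATE) are all hypothetical, and the order could equally be reversed as the message notes.

```python
from typing import List, Tuple

# Hypothetical statement representation: (kind, primary_key).
Statement = Tuple[str, str]

# Assumed phase order: SELECT first, then DELETE, then INSERT/UPDATE.
PHASE = {"SELECT": 0, "DELETE": 1, "INSERT": 2, "UPDATE": 2}


def validate(statements: List[Statement]) -> None:
    """Raise ValueError if the transaction violates the sketched restrictions."""
    last_phase = 0
    written_keys = set()
    for kind, key in statements:
        phase = PHASE[kind]
        if phase < last_phase:
            raise ValueError(f"{kind} on {key!r} appears after a later-phase statement")
        last_phase = phase
        # Overlapping SELECTs and DELETEs are allowed; only INSERT/UPDATE
        # are limited to one statement per primary key.
        if kind in ("INSERT", "UPDATE"):
            if key in written_keys:
                raise ValueError(f"multiple INSERT/UPDATE statements for key {key!r}")
            written_keys.add(key)


def is_valid(statements: List[Statement]) -> bool:
    """Convenience wrapper that reports validity as a boolean."""
    try:
        validate(statements)
        return True
    except ValueError:
        return False
```

With these rules every statement sees either the pre-transaction state or a deletion, so no intermediate values need to be computed.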
> FWIW, I don’t think this is terribly onerous to calculate either, since it’s 
> restricted to single rows we are updating, so we could simply maintain a 
> collection of rows and upsert into them as we process the execution. Most 
> transactions won’t need it, I suspect, so we don’t need to worry about 
> perfect efficiency.
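
The "collection of rows" idea above can be sketched roughly as follows. This is an illustrative Python model only, not the CEP-15 implementation: the tuple statement shapes, the `execute` function, and the in-memory `storage` dictionary are assumptions made for the example.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
# Hypothetical statement shapes:
#   ("SELECT", name, key)
#   ("UPDATE", name, key, {column: new_value})


def execute(statements: List[tuple], storage: Dict[Any, Row]) -> Tuple[Dict[str, Row], Dict[Any, Row]]:
    """Process statements in order, upserting into a per-transaction row map.

    Each named statement records the row as it stands at that point in the
    sequence: SELECT sees earlier updates, UPDATE reports its post-update row.
    """
    working: Dict[Any, Row] = {}   # key -> row as seen by the transaction so far
    results: Dict[str, Row] = {}   # statement name -> snapshot of the row
    for stmt in statements:
        kind, name, key = stmt[0], stmt[1], stmt[2]
        if key not in working:
            # First touch of this key: copy the pre-transaction row.
            working[key] = dict(storage.get(key, {}))
        if kind == "UPDATE":
            working[key].update(stmt[3])  # upsert the new cell values
        results[name] = dict(working[key])
    return results, working


storage = {1: {"a": 0}}
results, working = execute(
    [
        ("SELECT", "q0", 1),
        ("UPDATE", "q1", 1, {"a": 1}),
        ("UPDATE", "q2", 1, {"a": 2, "b": 5}),
    ],
    storage,
)
```

Only rows the transaction actually touches are copied, which matches the point that this bookkeeping is limited to the single rows being updated.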
> From: Blake Eggleston
> Date: Tuesday, 7 June 2022 at 00:21
> To: <> 
> < <>>
> Subject: Re: CEP-15 multi key transaction syntax
> That's a good question. I'd lean towards returning the final state of things, 
> although I could understand expecting to see intermediate state. Regarding 
> range tombstones, we could require them to precede any updates like selects, 
> but there's still the question of how to handle multiple updates to the same 
> cell when the user has requested we return the post-update state of the cell.
> On Jun 6, 2022, at 4:00 PM, wrote:
> > if multiple updates end up touching the same cell, I’d expect the last one 
> > to win
> Hmm, yes I suppose range tombstones are a plausible and reasonable thing to 
> mix with inserts over the same key range.
> What’s your present thinking about the idea of handling returning the values 
> as of a given point in the sequential execution then?
> The succinct syntax is I think highly desirable for user experience, but this 
> does complicate it a bit if we want to remain intuitive.
> From: Blake Eggleston
> Date: Monday, 6 June 2022 at 23:17
> To: <> 
> < <>>
> Subject: Re: CEP-15 multi key transaction syntax
> Hi all,
> Thanks for all the input and questions so far. Glad people are excited about 
> this!
> I didn’t have any free time to respond this weekend, although it looks like 
> Benedict has responded to most of the questions so far, so if I don’t respond 
> to a question you asked here, you can interpret that as “what Benedict said” 
> :).
> Jeff, 
> > Is there a new keyword for “partition (not) exists” or is it inferred by 
> > the select?
> I'd intended this to be worked out from the select statement, ie: if the 
> read/reference is null/empty, then it doesn't exist, whether you're 
> interested in the partition, row, or cell. So I don't think we'd need an 
> additional keyword there. I think that would address partition exists / not 
> exists use cases?
> > And would you allow a transaction that had > 1 named select and no 
> > modification statements, but commit if 1=1 ?
> Yes, an unconditional commit (ie: just COMMIT TRANSACTION; without an IF) 
> would be part of the syntax. Also, running a txn that doesn’t contain updates 
> wouldn’t be a problem.
> Patrick, I think Benedict answered your questions? Glad you got the joke :)
> Alex,
> > 1. Dependant SELECTs
> > 2. Dependant UPDATEs
> > 3. UPDATE from secondary index (or SASI)
> > 5. UPDATE with predicate on non-primary key
> The full primary key must be defined as part of the statement, and you can’t 
> use column references to define them, so you wouldn’t be able to run these.
> > MVs
> To prevent being spread too thin, both in syntax design and implementation 
> work, I’d like to limit read and write operations in the initial 
> implementation to vanilla selects, updates, inserts, and deletes. Once we 
> have a solid implementation of multi-key/table transactions supporting 
> foundational operations, we can start figuring out how the more advanced 
> pieces can be best supported. Not a great answer to your question, but a 
> related tangent I should have included in my initial email.
> > ... RETURNING ...
> I like the idea of the returning statement, but to echo what Benedict said, I 
> think any scheme for specifying data to be returned should apply the same to 
> select and update statements, since updates can have underlying reads that 
> the user may be interested in. I’d mentioned having an optional RETURN 
> statement in addition to automatically returning selects in my original email.
> > ... WITH ...
> I like the idea of defining statement names at the beginning of a statement, 
> since I could imagine mapping names to selects might get difficult if there 
> are a lot of columns in the select or update, but beginning each statement 
> with `WITH <name>` reduces readability imo. Maybe putting the name after the 
> first term of the statement (ie: `SELECT * AS <name> WHERE...`, `UPDATE table 
> AS <name> SET ...`, `INSERT INTO table AS <name> (...) VALUES (...);`) would 
> make names easier to find without harming overall readability?
> Benedict,
> > I agree that SELECT statements should be required to go first.
> +1
> > There only remains the issue of conditions imposed upon 
> > UPDATE/INSERT/DELETE statements when there are multiple statements that 
> > affect the same primary key. I think we can (and should) simply reject such 
> > queries for now, as it doesn’t make much sense to have multiple statements 
> > for the same primary key in the same transaction.
> Unfortunately, I think there are use cases for both multiple selects and 
> updates for the same primary key in a txn. Selects aren’t as problematic, but 
> if multiple updates end up touching the same cell, I’d expect the last one to 
> win. This would make dealing with range tombstones a little trickier, since 
> the default behavior of alternating updates and range tombstones affecting 
> the same cells is not intuitive, but I don’t think it would be too bad.
> Something that’s come up a few times, and that I’ve also been thinking about 
> is whether to return the values that were originally read, or the values 
> written with the update to the client, and there are use cases for both. I 
> don’t remember who suggested it, but I think returning the original values 
> from named select statements, and the post-update values from named update 
> statements is a good way to handle both. Also, while returning the contents 
> of the mutation would be the easiest, implementation wise, swapping cell 
> values from the updates named read would be most useful, since a txn won’t 
> always result in an update, in which case we’d just return the select.
> Thanks,
> Blake
> On Jun 6, 2022, at 9:41 AM, Henrik Ingo wrote:
> On Mon, Jun 6, 2022 at 5:28 PM wrote:
> > One way to make it obvious is to require the user to explicitly type the 
> > SELECTs and then to require that all SELECTs appear before 
> Yes, I agree that SELECT statements should be required to go first.
> However, I think this is sufficient and we can retain the shorter format for 
> RETURNING. There only remains the issue of conditions imposed upon 
> UPDATE/INSERT/DELETE statements when there are multiple statements that 
> affect the same primary key. I think we can (and should) simply reject such 
> queries for now, as it doesn’t make much sense to have multiple statements 
> for the same primary key in the same transaction.
> I guess I was thinking ahead to a future where an UPDATE write set may or 
> may not intersect with a previous update due to allowing WHERE clause to use 
> secondary keys, etc.
> That said, I'm not saying we SHOULD require explicit SELECT statements for 
> every update. I'm sure that would be more annoying than useful. I was just 
> following a train of thought.
> > Returning the "result" from an UPDATE presents the question should it be 
> > the data at the start of the transaction or end state?
> I am inclined to only return the new values (as proposed by Alex) for the 
> purpose of returning new auto-increment values etc. If you require the prior 
> value, SELECT is available to express this.
> That's a great point!
> > I was thinking the following coordinator-side implementation would also 
> > allow the use of old drivers
> I am inclined to return just the first result set to old clients. I think 
> it’s fine to require a client upgrade to get multiple result sets.
> Possibly. I just wanted to share an idea for consideration. IMO the temp 
> table idea might not be too hard to implement*, but sure the syntax does feel 
> a bit bolted on.
> *) I'm maybe the wrong person to judge that, of course :-) 
> henrik
> -- 
> Henrik Ingo
> +358 40 569 7354
