> Is there a technical limitation that would prevent a range write that 
> functions the same way as a range tombstone, other than probably needing a 
> version bump of the storage format?
I think the limitation is less technical than cost/benefit, given how this 
intersects w/our architecture.

Range tombstones have taught us that something that should be relatively simple 
(merge in deletion mask at read time) introduces a significant amount of 
complexity on all the paths Benjamin enumerated, with a pretty long tail of 
bugs, data incorrectness issues, and edge cases. The work to get there, at a 
high-level glance, would be:
 1. Updates to CQL grammar, spec
 2. Updates to write path
 3. Updates to Accord. And thinking about how this intersects w/Accord's WAL / 
logic (I think? Consider me not well educated on details here)
 4. Updates to compaction w/consideration for edge cases on all the different 
compaction strategies
 5. Updates to iteration and merge logic
 6. Updates to paging logic
 7. Indexing
 8. repair, both full and incremental implications, support, etc
 9. the list probably goes on? There's always >= 1 thing we're not thinking of 
with a change like this. Usually more.
For all of the above we would also need extensive unit, integration, and fuzz 
testing to ensure the introduction of this new spanning concept on a write 
doesn't introduce edge cases where incorrect data is returned on merge.
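
To make the "merge in deletion mask at read time" point concrete, here is a 
rough Python sketch. It is not Cassandra's actual read path: every class and 
function name is made up for illustration, clustering keys are plain ints, and 
since nobody on the thread has defined what a range write means, it assumes a 
range write only overwrites rows that already exist inside the range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:              # hypothetical: one live value per clustering key
    key: int
    value: str
    ts: int              # write timestamp


@dataclass(frozen=True)
class RangeTombstone:    # hypothetical: covers [start, end)
    start: int
    end: int
    ts: int


@dataclass(frozen=True)
class RangeWrite:        # hypothetical: covers [start, end), carries a value
    start: int
    end: int
    value: str
    ts: int


def newest_cells(cells):
    """Keep only the newest cell for each key."""
    newest = {}
    for c in cells:
        if c.key not in newest or c.ts > newest[c.key].ts:
            newest[c.key] = c
    return newest


def read_with_tombstones(cells, tombstones):
    """Deletion mask: a cell survives only if it is strictly newer than
    every tombstone covering its key (ties go to the deletion here)."""
    result = {}
    for k, c in newest_cells(cells).items():
        deleted_at = max(
            (t.ts for t in tombstones if t.start <= k < t.end), default=-1
        )
        if c.ts > deleted_at:
            result[k] = c.value
    return result


def read_with_range_writes(cells, writes):
    """Assumed semantics: each existing key takes the value of the newest
    thing covering it, whether that is its own cell or a range write."""
    result = {}
    for k, c in newest_cells(cells).items():
        best_ts, best_val = c.ts, c.value
        for w in writes:
            if w.start <= k < w.end and w.ts > best_ts:
                best_ts, best_val = w.ts, w.value
        result[k] = best_val
    return result


cells = [Cell(1, "a", 10), Cell(5, "b", 10), Cell(9, "c", 30)]
tombstones = [RangeTombstone(0, 6, 20)]
writes = [RangeWrite(0, 6, "x", 20), RangeWrite(4, 10, "y", 25)]

deleted_view = read_with_tombstones(cells, tombstones)
updated_view = read_with_range_writes(cells, writes)
```

With tombstones, overlapping ranges only ever need the max timestamp per key. 
With range writes, which value wins depends on which overlapping range is 
newest at each key, and that per-key resolution is what would have to be 
carried through compaction, paging, indexing, and repair.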

All of which is to say: it's an interesting problem, but IMO given our 
architecture and what we know from past attempts to introduce an architectural 
concept like this, the costs of getting something like this production ready 
are pretty high.

To me the cost/benefit don't really balance out. Just my .02 though.

On Tue, May 14, 2024, at 2:50 PM, Benjamin Lerer wrote:
>> It would be a lot more constructive to apply our brains towards solving an 
>> interesting problem than pointing out all its potential flaws based on gut 
>> feelings.
> 
> It is not simply a gut feeling, Jon. This change impacts read, write, 
> indexing, storage, compaction, repair... The risk and cost associated with it 
> are pretty significant and I am not convinced at this point of its benefit.
> 
> On Tue, May 14, 2024 at 19:05, Jon Haddad <j...@jonhaddad.com> wrote:
>> Personally, I don't think that something being scary at first glance is a 
>> good reason not to explore an idea.  The scenario you've described here is 
>> tricky but I'm not expecting it to be any worse than say, SAI, which (the 
>> last I checked) has O(N) complexity on returning result sets with regard to 
>> rows returned.  We've also merged in Vector search which has O(N) overhead 
>> with the number of SSTables.  We're still fundamentally looking at, in most 
>> cases, a limited number of SSTables and some merging of values.
>> 
>> Write updates are essentially a timestamped mask, potentially overlapping, 
>> and I suspect potentially resolvable during compaction by propagating the 
>> values.  They could be eliminated or narrowed based on how they've 
>> propagated by using the timestamp metadata on the SSTable.
>> 
>> It would be a lot more constructive to apply our brains towards solving an 
>> interesting problem than pointing out all its potential flaws based on gut 
>> feelings.  We haven't even moved this past an idea.  
>> 
>> I think it would solve a massive problem for a lot of people and is 100% 
>> worth considering.  Thanks Patrick and David for raising this.
>> 
>> Jon
>> 
>> 
>> 
>> On Tue, May 14, 2024 at 9:48 AM Bowen Song via dev 
>> <dev@cassandra.apache.org> wrote:
>>> Ranged update sounds like a disaster for compaction and read performance.
>>> 
>>> Imagine compacting or reading some SSTables in which a large number of 
>>> overlapping but non-identical ranges were updated with different values. It 
>>> gives me a headache just thinking about it.
>>> 
>>> Ranged delete is much simpler, because the "value" is always the same 
>>> tombstone marker, and it is also guaranteed to expire and disappear 
>>> eventually, so the performance impact of dealing with them at read and 
>>> compaction time doesn't persist in the long term.
>>> 
>>> 
>>> On 14/05/2024 16:59, Benjamin Lerer wrote:
>>>> It should be like range tombstones ... only much worse ;-). A tombstone is 
>>>> a simple marker (deleted). An update can be far more complex.  
>>>> 
>>>> On Tue, May 14, 2024 at 15:52, Jon Haddad <j...@jonhaddad.com> wrote:
>>>>> Is there a technical limitation that would prevent a range write that 
>>>>> functions the same way as a range tombstone, other than probably needing 
>>>>> a version bump of the storage format?
>>>>> 
>>>>> 
>>>>> On Tue, May 14, 2024 at 12:03 AM Benjamin Lerer <ble...@apache.org> wrote:
>>>>>> Range restrictions (>, >=, <=, < and BETWEEN) do not work on UPDATEs. 
>>>>>> They do work on DELETE because, under the hood, C* translates them 
>>>>>> into range tombstones.
>>>>>> 
>>>>>> On Tue, May 14, 2024 at 02:44, David Capwell <dcapw...@apple.com> wrote:
>>>>>>> I would also include in UPDATE… but yeah, <3 BETWEEN and welcome this 
>>>>>>> work.
>>>>>>> 
>>>>>>>> On May 13, 2024, at 7:40 AM, Patrick McFadin <pmcfa...@gmail.com> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> This is a great feature addition to CQL! I get asked about it from 
>>>>>>>> time to time but then people figure out a workaround. It will be great 
>>>>>>>> to just have it available. 
>>>>>>>> 
>>>>>>>> And right on Simon! I think the only project I had as a high school 
>>>>>>>> senior was figuring out how many parties I could go to and still 
>>>>>>>> maintain a passing grade. Thanks for your work here. 
>>>>>>>> 
>>>>>>>> Patrick 
>>>>>>>> 
>>>>>>>> On Mon, May 13, 2024 at 1:35 AM Benjamin Lerer <ble...@apache.org> 
>>>>>>>> wrote:
>>>>>>>>> Hi everybody,
>>>>>>>>> 
>>>>>>>>> Just raising awareness that Simon is working on adding support for 
>>>>>>>>> the BETWEEN operator in WHERE clauses (SELECT and DELETE) in 
>>>>>>>>> CASSANDRA-19604. We plan to add support for it in conditions in a 
>>>>>>>>> separate patch.
>>>>>>>>> 
>>>>>>>>> The patch is available.
>>>>>>>>> 
>>>>>>>>> As a side note, Simon chose to do his high school senior project 
>>>>>>>>> contributing to Apache Cassandra. This patch is his first 
>>>>>>>>> contribution for his senior project (his second feature contribution 
>>>>>>>>> to Apache Cassandra).
>>>>>>>>> 
>>>>>>>>> 
