[jira] [Commented] (CASSANDRA-6167) Add end-slice termination predicate

Sylvain Lebresne (JIRA) Fri, 14 Feb 2014 00:35:22 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-6167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901224#comment-13901224
 ]


Sylvain Lebresne commented on CASSANDRA-6167:
---------------------------------------------

Playing devil's advocate here, but why wouldn't you just store the last 
aggregated value in a separate table? Granted, that assumes you know the last 
aggregation value, which in theory means 2 reads, but in practice it doesn't 
sound particularly hard for clients to cache that last aggregated value (of 
course, you'd want to refresh that cached value at some frequency, but that can 
be done in the background easily enough).
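
A minimal sketch of that client-side approach, assuming a hypothetical 
load() callable that reads (timestamp of last aggregation, aggregated value) 
from the separate table. The class name, the tuple shape and the max-age 
policy are illustrative assumptions, not an existing driver API:

```python
import time
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical shape of a row in the separate aggregate table:
# (timestamp of the last aggregated event, aggregated value).
Aggregate = Tuple[int, float]


class LastAggregateCache:
    """Client-side cache of the most recent pre-computed aggregate."""

    def __init__(self, load: Callable[[], Aggregate], max_age_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load            # assumed: reads the separate table
        self._max_age_s = max_age_s  # how long a cached value is trusted
        self._clock = clock
        self._value: Optional[Aggregate] = None
        self._loaded_at = 0.0

    def refresh(self) -> None:
        """Reload the aggregate (a periodic caller could invoke this)."""
        self._value = self._load()
        self._loaded_at = self._clock()

    def get(self) -> Aggregate:
        """Return the cached aggregate, reloading it if missing or stale."""
        stale = self._clock() - self._loaded_at >= self._max_age_s
        if self._value is None or stale:
            self.refresh()
        return self._value


def total_since(cache: LastAggregateCache,
                events: Iterable[Tuple[int, float]]) -> float:
    """Combine the cached aggregate with events newer than it."""
    last_ts, last_sum = cache.get()
    return last_sum + sum(v for ts, v in events if ts > last_ts)
```

With a warm cache, the common path is a single read of the events written 
since the last aggregation. The sketch is not thread-safe as written; calling 
refresh() from a background thread would need a lock around the two fields.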

Because my main problem with that example is that this sounds a lot like a hack. 
If I store floats, I want evtval to be a float, not some string that I abuse to 
store an aggregation in the middle of other stuff (because that's fairly error 
prone for any consumer of the table that doesn't care about the pre-computed 
aggregation). I really don't think we should "promote" such ways. Note that I 
understand it's "just an example", but it doesn't feel to me like we should 
add such a thing without a bunch of non-hacky examples of that being useful.

Also, there is CASSANDRA-4914. Once we have that, you'd want to use it for 
aggregation. Even if you still want to do the incremental aggregation like in 
your example, you'll still really want to use CASSANDRA-4914 to aggregate the 
values 'since last aggregation'. And I don't really see how the idea of this 
could cleanly cohabit with CASSANDRA-4914 (while it's trivial if you just 
store/cache the aggregation separately).  

> Add end-slice termination predicate
> -----------------------------------
>
>                 Key: CASSANDRA-6167
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6167
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API, Core
>            Reporter: Tupshin Harper
>            Priority: Minor
>              Labels: ponies
>
> When performing storage-engine slices, it would sometimes be beneficial 
> to have the slice terminate for reasons other than number of columns or 
> min/max cell name.
> Since we are able to look at the contents of each cell as we read it, this is 
> potentially doable with very little overhead. 
> Probably more challenging than the storage-engine implementation itself, is 
> to come up with appropriate CQL syntax (Thrift, should we decide to support 
> it, would be trivial).
> Two possibilities are:
> 1) special where function:
> SELECT pk,event from cf WHERE pk IN (1,5,10,11) AND 
> partition_predicate({predicate})
> or a bigger language change, but I think one I prefer, more like:
> 2) SELECT pk,event from cf where pk IN (1,5,10,11) UNTIL PARTITION event 
> {predicate}
> Neither feels perfect, but I do like the fact that the second one at least 
> clearly states what it is intended to do.
> By using "UNTIL PARTITION", we could re-use the UNTIL keyword to handle other 
> kinds of early-termination of selects that the coordinator might be able to 
> do, such as stop retrieving additional rows from shards after a particular 
> criterion was met.
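
The early-termination idea in the quoted description can be illustrated 
outside of any CQL syntax. This is only a sketch under stated assumptions: 
the (name, value) cell shape and the slice_until name are hypothetical, and 
the issue does not say whether the cell that satisfies the predicate is 
returned; here it is included.

```python
from typing import Callable, Iterable, Iterator, Tuple

Cell = Tuple[str, str]  # hypothetical (cell name, cell value) pair


def slice_until(cells: Iterable[Cell], stop: Callable[[Cell], bool],
                limit: int) -> Iterator[Cell]:
    """Yield cells in order until `limit` is reached or `stop` matches."""
    for count, cell in enumerate(cells, start=1):
        if count > limit:      # usual column-count termination
            return
        yield cell
        if stop(cell):         # assumption: the matching cell is included
            return


# Usage: stop at the first cell whose value carries a pre-computed aggregate.
cells = [("t1", "1.0"), ("t2", "agg:3.5"), ("t3", "2.0")]
result = list(slice_until(cells, lambda c: c[1].startswith("agg:"), 10))
```

Because the predicate is evaluated on each cell as it is read, the extra 
cost per cell is one function call, which matches the "very little overhead" 
expectation in the description.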



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
