[ 
https://issues.apache.org/jira/browse/CASSANDRA-6167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901454#comment-13901454
 ] 

Tupshin Harper commented on CASSANDRA-6167:
-------------------------------------------

The example above shows how this feature would allow for efficient client-side 
aggregation without having to do two round trips from the client. What you 
describe is a variant that I use today. However, even caching the last 
aggregation value is clearly insufficient in a massively distributed 
environment. The assumption here is that you are likely to have had another 
process do the last aggregation, so it is almost always necessary to do two 
round trips with the current approach.

I share your dislike of the string abuse hack, and am very open to other 
suggestions.

It is quite possible that this ticket should somehow be subsumed into 
CASSANDRA-4914, not as one more aggregate function that could be implemented 
there, but by extending CASSANDRA-4914 to include partition slice termination 
functions. It would be a mistake to assume that this ticket would only be 
used for aggregation. There are a lot of cases where I would want to slice 
backwards in time until event X occurred, where event X is determined by the 
value of the event itself.
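
As a rough illustration of that kind of query (a sketch only: slice_until and 
the (timestamp, event) cell layout are hypothetical and not part of Cassandra 
or CQL), a termination predicate behaves like a take-while over the cells of 
a partition read newest-first. Whether the matching cell itself should be 
returned is an open question; this sketch assumes it is not.

```python
from typing import Callable, Iterable, Iterator, Tuple

# Hypothetical cell layout for illustration: (timestamp, event value).
Cell = Tuple[int, str]


def slice_until(cells: Iterable[Cell], stop: Callable[[Cell], bool]) -> Iterator[Cell]:
    """Yield cells in iteration order until `stop` matches.

    The matching cell is not yielded (an assumption of this sketch),
    and nothing after it is consumed from `cells`.
    """
    for cell in cells:
        if stop(cell):
            return
        yield cell


# A partition stored oldest-first.
partition = [
    (1, "login"),
    (2, "click"),
    (3, "checkpoint"),
    (4, "click"),
    (5, "purchase"),
]

# Slice backwards in time until the most recent "checkpoint" event.
recent = list(slice_until(reversed(partition), lambda c: c[1] == "checkpoint"))
print(recent)
```

The stop condition depends only on the value of the cell being read, so the 
caller does not need to know in advance where in the partition event X lives.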

So, if 4914 became "Custom aggregate, filtering, and slice termination 
functions in CQL", then I would be all on board. But filtering is certainly not 
going to be sufficient, as it implies that the node would still have to read the 
entire partition (or the explicitly specified portion of the partition), which 
is exactly the opposite of the goal here.
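
To make the difference concrete, here is a minimal sketch that counts how many 
cells each approach touches. The function names and the in-memory list standing 
in for a partition are assumptions made for illustration; this is not how the 
storage engine is implemented.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical cell layout for illustration: (timestamp, event value).
Cell = Tuple[int, str]


def read_with_filter(cells: Iterable[Cell], keep: Callable[[Cell], bool]) -> Tuple[List[Cell], int]:
    """Filtering: every cell in the slice is read; non-matching cells are dropped."""
    out, cells_read = [], 0
    for cell in cells:
        cells_read += 1
        if keep(cell):
            out.append(cell)
    return out, cells_read


def read_with_termination(cells: Iterable[Cell], stop: Callable[[Cell], bool]) -> Tuple[List[Cell], int]:
    """Termination: reading stops at the first cell for which `stop` is true."""
    out, cells_read = [], 0
    for cell in cells:
        cells_read += 1
        if stop(cell):
            break
        out.append(cell)
    return out, cells_read


# 1000 cells, newest first, with a single "marker" event at timestamp 998.
cells = [(ts, "marker" if ts == 998 else "tick") for ts in range(1000, 0, -1)]

filtered, filter_reads = read_with_filter(cells, lambda c: c[0] > 998)
terminated, term_reads = read_with_termination(cells, lambda c: c[1] == "marker")

print(filter_reads, term_reads)
```

Both calls return the same two cells, but the filter walks all 1000 cells 
while the terminating read stops after 3. The filter in this sketch also has 
to be given the marker's timestamp up front, which is exactly the information 
a client would otherwise need an extra round trip to obtain.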


> Add end-slice termination predicate
> -----------------------------------
>
>                 Key: CASSANDRA-6167
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6167
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API, Core
>            Reporter: Tupshin Harper
>            Priority: Minor
>              Labels: ponies
>
> When performing storage-engine slices, it would sometimes be beneficial 
> to have the slice terminate for reasons other than number of columns or 
> min/max cell name.
> Since we are able to look at the contents of each cell as we read it, this is 
> potentially doable with very little overhead. 
> Probably more challenging than the storage-engine implementation itself, is 
> to come up with appropriate CQL syntax (Thrift, should we decide to support 
> it, would be trivial).
> Two possibilities are:
> 1) special where function:
> SELECT pk,event from cf WHERE pk IN (1,5,10,11) AND 
> partition_predicate({predicate})
> or a bigger language change, but the one I think I prefer, more like:
> 2) SELECT pk,event from cf where pk IN (1,5,10,11) UNTIL PARTITION event 
> {predicate}
> Neither feels perfect, but I do like the fact that the second one at least 
> clearly states what it is intended to do.
> By using "UNTIL PARTITION", we could re-use the UNTIL keyword to handle other 
> kinds of early-termination of selects that the coordinator might be able to 
> do, such as stop retrieving additional rows from shards after a particular 
> criterion was met.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
