[
https://issues.apache.org/jira/browse/CASSANDRA-6167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901454#comment-13901454
]
Tupshin Harper commented on CASSANDRA-6167:
-------------------------------------------
The example above shows how this feature would allow for efficient client-side
aggregation without having to do two round trips from the client. What you
describe is a variant that I use today. However, even caching the last
aggregation value is clearly insufficient in a massively distributed
environment. The assumption here is that you are likely to have had another
process do the last aggregation, so it is almost always necessary to do two
round trips with the current approach.
I share your dislike of the string abuse hack, and am very open to other
suggestions.
It is quite possible that this ticket should somehow be subsumed into
CASSANDRA-4914, not as an aggregate function that could be implemented, but
by extending CASSANDRA-4914 to include partition slice termination
functions. It would be very incorrect to assume that this ticket would only be
used for aggregation. There are a lot of cases where I would want to slice
backwards in time until event X occurred, where event X is determined by the
value of the event itself.
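
To illustrate the "slice backwards until event X" case, here is a minimal
Python sketch. It is not Cassandra code: Cell, slice_until, and the sample
events are hypothetical names chosen for illustration, and including the
terminating cell in the result is an assumption, since the ticket does not
say whether the matching cell should be returned.

```python
from typing import Callable, Iterable, Iterator, NamedTuple


class Cell(NamedTuple):
    """Hypothetical stand-in for one cell in a time-ordered partition."""
    timestamp: int
    event: str


def slice_until(cells: Iterable[Cell],
                stop: Callable[[Cell], bool]) -> Iterator[Cell]:
    """Yield cells in iteration order, ending after the first cell for
    which stop(cell) is true (that cell is included, by assumption)."""
    for cell in cells:
        yield cell
        if stop(cell):
            return


# A partition stored oldest-first; we read it newest-first.
partition = [
    Cell(1, "login"),
    Cell(2, "click"),
    Cell(3, "checkpoint"),
    Cell(4, "click"),
    Cell(5, "purchase"),
]

# Slice backwards in time until the most recent "checkpoint" event.
result = list(slice_until(reversed(partition),
                          lambda c: c.event == "checkpoint"))
```

The predicate looks only at the cell's own value, so the read can stop as
soon as the condition is met, without knowing the stopping timestamp in
advance.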
So, if 4914 became "Custom aggregate, filtering, and slice termination
functions in CQL", then I would be all on board. But filtering is certainly not
going to be sufficient, as it implies that the node would still have to read the
entire partition (or explicitly specified portion of the partition), which is
exactly the opposite of the goal here.
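
The difference between filtering and slice termination can be sketched by
counting how many cells each approach touches. This is a toy model under
stated assumptions, not the storage engine: read_with_filter and read_until
are hypothetical helpers, and the partition is modelled as a plain list of
integers read newest-first.

```python
from typing import Callable, List, Sequence, Tuple


def read_with_filter(cells: Sequence[int],
                     keep: Callable[[int], bool]) -> Tuple[List[int], int]:
    """Filtering: every cell is read, and only matching cells are returned."""
    reads = 0
    out: List[int] = []
    for cell in cells:
        reads += 1
        if keep(cell):
            out.append(cell)
    return out, reads


def read_until(cells: Sequence[int],
               stop: Callable[[int], bool]) -> Tuple[List[int], int]:
    """Termination: reading ends at the first cell for which stop is true.
    The terminating cell is counted as read but not returned."""
    reads = 0
    out: List[int] = []
    for cell in cells:
        reads += 1
        if stop(cell):
            break
        out.append(cell)
    return out, reads


# Toy partition of 100 cells, newest (100) first.
cells = list(range(100, 0, -1))

filtered, filter_reads = read_with_filter(cells, lambda v: v > 95)
sliced, slice_reads = read_until(cells, lambda v: v <= 95)
```

Both calls return the same five cells here, but the filter reads all 100
cells while the terminating slice reads only 6.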
> Add end-slice termination predicate
> -----------------------------------
>
> Key: CASSANDRA-6167
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6167
> Project: Cassandra
> Issue Type: Improvement
> Components: API, Core
> Reporter: Tupshin Harper
> Priority: Minor
> Labels: ponies
>
> When performing storage-engine slices, it would sometimes be beneficial
> to have the slice terminate for reasons other than number of columns or
> min/max cell name.
> Since we are able to look at the contents of each cell as we read it, this is
> potentially doable with very little overhead.
> Probably more challenging than the storage-engine implementation itself is
> coming up with appropriate CQL syntax (Thrift, should we decide to support
> it, would be trivial).
> Two possibilities are:
> 1) special where function:
> SELECT pk,event from cf WHERE pk IN (1,5,10,11) AND
> partition_predicate({predicate})
> or a bigger language change, but one I think I prefer, more like:
> 2) SELECT pk,event from cf where pk IN (1,5,10,11) UNTIL PARTITION event
> {predicate}
> Neither feels perfect, but I do like the fact that the second one at least
> clearly states what it is intended to do.
> By using "UNTIL PARTITION", we could re-use the UNTIL keyword to handle other
> kinds of early-termination of selects that the coordinator might be able to
> do, such as stop retrieving additional rows from shards after a particular
> criterion was met.