[ https://issues.apache.org/jira/browse/CASSANDRA-11521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15291428#comment-15291428 ]

Benedict commented on CASSANDRA-11521:
--------------------------------------
I would also like to voice my support for a separate path. The two needs are
really quite distinct. Optimising the normal read path is definitely something
we should be exploring in general, but this change would make system behaviour
on that path harder to reason about (wrt memory usage, reclaim, abort detection
etc) _and_ complicate its implementation (leading to bugs around those things,
for more critical use cases), while still being unlikely to yield the same
performance. That suggests it isn't the best approach for this goal.

However, I would caveat that evaluating the entire query into an off-heap
memory region is not what I have in mind. There's a sliding scale, starting
from a small buffer (or pair of buffers) kept just ahead of the client and
refilled from a persistent server-side cursor that just avoids repeating the
work of seeking into files. The ideal would be as close to this as possible,
with a potential time-bound on the lifespan of the cursor, after which it can
be reinitialised to permit cleanup of sstables. A configurable time limit on
isolation could be provided as an option to define this period. These streams
can be arbitrarily large, so we certainly don't want to evaluate the entire
query just to permit releasing the sstables.
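
To make the small-buffer end of that scale concrete, here is a minimal sketch
of a cursor that keeps one small buffer ahead of the client and reinitialises
itself once a configurable isolation limit has passed. Everything in it
(Snapshot, SnapshotSource, TimeBoundedCursor, readAfter, the Long row keys) is
hypothetical and invented for illustration; none of it is an existing
Cassandra API, and a real cursor would hold file positions rather than re-seek
by key.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical pinned view of a set of sstables; close() drops the references.
interface Snapshot extends AutoCloseable {
    // Returns up to `limit` keys strictly greater than `afterKey`
    // (null means "start from the beginning"), in ascending order.
    List<Long> readAfter(Long afterKey, int limit);

    @Override
    void close();
}

// Hypothetical factory that takes references on the current sstables.
interface SnapshotSource {
    Snapshot open();
}

final class TimeBoundedCursor implements AutoCloseable {
    private final SnapshotSource source;
    private final int bufferSize;
    private final long maxIsolationNanos; // the configurable isolation limit
    private final LongSupplier clock;     // e.g. System::nanoTime
    private final Deque<Long> buffer = new ArrayDeque<>();
    private Snapshot snapshot;
    private long openedAt;
    private Long lastKey;                 // resume point after a reinitialisation
    private boolean exhausted;
    int reinitialisations;                // exposed only for illustration

    TimeBoundedCursor(SnapshotSource source, int bufferSize,
                      long maxIsolationNanos, LongSupplier clock) {
        this.source = source;
        this.bufferSize = bufferSize;
        this.maxIsolationNanos = maxIsolationNanos;
        this.clock = clock;
    }

    // Returns the next key, or null once the stream is exhausted.
    Long next() {
        if (buffer.isEmpty() && !exhausted) {
            refill();
        }
        return buffer.poll();
    }

    private void refill() {
        long now = clock.getAsLong();
        // Past the isolation limit: release the old sstables so they can be
        // cleaned up, then continue from lastKey against a fresh view.
        if (snapshot != null && now - openedAt >= maxIsolationNanos) {
            snapshot.close();
            snapshot = null;
            reinitialisations++;
        }
        if (snapshot == null) {
            snapshot = source.open();
            openedAt = now;
        }
        List<Long> rows = snapshot.readAfter(lastKey, bufferSize);
        if (!rows.isEmpty()) {
            lastKey = rows.get(rows.size() - 1);
        }
        buffer.addAll(rows);
        if (rows.size() < bufferSize) {
            // Nothing more to read: release the references immediately.
            exhausted = true;
            close();
        }
    }

    @Override
    public void close() {
        if (snapshot != null) {
            snapshot.close();
            snapshot = null;
        }
    }
}
```

Only bufferSize rows are ever materialised at once, and no single set of
sstables stays pinned for much longer than the isolation limit (the check
runs on each refill), however large the stream is. The cost is that
isolation only holds within one such period.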

Note that OpOrder should not be used by these queries; actual references
should be taken, so that long lifespans have no impact.

The code that takes these references also really needs to be fixed, so that
races to update the data tracker don't cause temporary "infinite" loops, like
we see for range queries today.

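
For illustration only, one way to keep such a retry loop from spinning is to
bound the number of attempts and release any partially taken references before
retrying. The sketch below assumes a hypothetical RefCounted handle and
RefAcquirer helper; it is not the existing Cassandra code, and bounding the
attempts is an assumption here, not necessarily the fix intended above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical ref-counted handle standing in for an sstable.
final class RefCounted {
    // Starts at 1: the owner's own reference.
    private final AtomicInteger refs = new AtomicInteger(1);
    final String name;

    RefCounted(String name) {
        this.name = name;
    }

    // Takes a reference unless the count has already dropped to zero.
    boolean tryRef() {
        while (true) {
            int n = refs.get();
            if (n <= 0) {
                return false; // already released; must not be resurrected
            }
            if (refs.compareAndSet(n, n + 1)) {
                return true;
            }
        }
    }

    void release() {
        refs.decrementAndGet();
    }

    int count() {
        return refs.get();
    }
}

final class RefAcquirer {
    // Tries to reference every element of the current view. If the view
    // changed underneath us (some element was already released), drops the
    // references taken so far and retries against a fresh view, at most
    // maxAttempts times. Returns null if every attempt failed.
    static List<RefCounted> acquire(Supplier<List<RefCounted>> currentView,
                                    int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            List<RefCounted> taken = new ArrayList<>();
            boolean ok = true;
            for (RefCounted r : currentView.get()) {
                if (r.tryRef()) {
                    taken.add(r);
                } else {
                    ok = false;
                    break;
                }
            }
            if (ok) {
                return taken;
            }
            for (RefCounted r : taken) {
                r.release(); // undo the partial acquisition before retrying
            }
        }
        return null;
    }
}
```

A caller that gets null back can fail the request or back off, rather than
looping until the view happens to stabilise.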
> Implement streaming for bulk read requests
> ------------------------------------------
>
> Key: CASSANDRA-11521
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11521
> Project: Cassandra
> Issue Type: Sub-task
> Components: Local Write-Read Paths
> Reporter: Stefania
> Assignee: Stefania
> Fix For: 3.x
>
>
> Allow clients to stream data from a C* host, bypassing the coordination layer
> and eliminating the need to query individual pages one by one.