[jira] [Commented] (CASSANDRA-11521) Implement streaming for bulk read requests

Benedict (JIRA) Thu, 19 May 2016 09:23:47 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-11521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15291428#comment-15291428
 ]


Benedict commented on CASSANDRA-11521:
--------------------------------------

I would also like to voice my support for a separate path.  The two needs are 
really quite distinct.  Optimising the normal read path is definitely 
something we should be exploring in general, but folding bulk reads into it 
would make system behaviour on the normal path harder to reason about (wrt 
memory usage, reclaim, abort detection etc) _and_ complicate the 
implementation (leading to bugs around those things, for more critical use 
cases), while still being unlikely to yield the same performance.  That 
suggests it isn't the best approach for this goal. 

However, I would caveat that evaluating the entire query to an off-heap 
memory region is not what I would have in mind.  There's a sliding scale, 
starting from a small buffer (or pair of buffers) kept just ahead of the 
client, refilled from a persistent server-side cursor that just avoids 
repeating work to seek into files.  The ideal would be as close to this as 
possible, with a potential time-bound on the lifespan of the cursor, after 
which it can be reinitialised to permit cleanup of sstables.  A configurable 
time limit on isolation could be provided as an option to define this period.
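
As a rough illustration of the small-buffer end of that scale, here is a 
minimal sketch in Java.  Every name in it (BoundedCursor, openAt, 
maxLifespanNanos) is hypothetical and not an existing Cassandra API, and it 
assumes the underlying data can be cheaply reopened at a given row offset. 
In a real implementation the reopen step is where old sstable references 
would be released and fresh ones taken.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.IntFunction;
import java.util.function.LongSupplier;

/** Hypothetical sketch only; none of these names are Cassandra APIs. */
final class BoundedCursor<T> {
    private final IntFunction<Iterator<T>> openAt; // (re)opens the data at a row offset
    private final LongSupplier clockNanos;         // injectable clock, e.g. System::nanoTime
    private final long maxLifespanNanos;           // the "time limit on isolation"
    private final int bufferSize;                  // rows kept just ahead of the client

    private Iterator<T> cursor;
    private long openedAtNanos;
    private int position;    // rows handed out so far
    private int reopenCount; // how often the cursor has been reinitialised

    BoundedCursor(IntFunction<Iterator<T>> openAt, LongSupplier clockNanos,
                  long maxLifespanNanos, int bufferSize) {
        this.openAt = openAt;
        this.clockNanos = clockNanos;
        this.maxLifespanNanos = maxLifespanNanos;
        this.bufferSize = bufferSize;
        open();
    }

    private void open() {
        // A real implementation would release the old sstable references here
        // and take new ones before seeking back to the current position.
        cursor = openAt.apply(position);
        openedAtNanos = clockNanos.getAsLong();
    }

    /** Fills the next small buffer; an empty list means the stream is exhausted. */
    List<T> nextBuffer() {
        if (clockNanos.getAsLong() - openedAtNanos >= maxLifespanNanos) {
            reopenCount++;
            open(); // cursor outlived its time bound: reinitialise it
        }
        List<T> buffer = new ArrayList<>(bufferSize);
        while (buffer.size() < bufferSize && cursor.hasNext()) {
            buffer.add(cursor.next());
            position++;
        }
        return buffer;
    }

    int reopenCount() {
        return reopenCount;
    }
}
```

The server never holds more than bufferSize rows for a stream, and the 
cursor only pays the seek cost again when its time bound expires.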

However, these streams can be arbitrarily large, so we certainly don't want 
to evaluate the entire query up front just to permit releasing the sstables.

Note that the OpOrder should not be used by these queries - actual references 
should be taken so that long lifespans have no impact.

The code that takes these references also really needs to be fixed, so that 
races to update the data tracker don't cause temporary "infinite" loops, 
like we see for range queries today.
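
To make the reference-taking concern concrete, here is a minimal sketch in 
Java of acquiring references on every member of a view, with a bounded 
number of retries.  The names (RefCounted, References.acquireAll, 
maxAttempts) are hypothetical and not Cassandra's actual classes, and 
capping the retries is only one possible way to avoid spinning; it is an 
assumption of this sketch, not the fix proposed above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-in for a reference-counted sstable. */
final class RefCounted {
    private final AtomicInteger count = new AtomicInteger(1); // 1 = the owner's reference

    /** Takes a reference unless the resource has already been fully released. */
    boolean tryRef() {
        while (true) {
            int c = count.get();
            if (c <= 0) return false; // already released, cannot be revived
            if (count.compareAndSet(c, c + 1)) return true;
        }
    }

    void release() {
        count.decrementAndGet();
    }

    int count() {
        return count.get();
    }
}

final class References {
    /**
     * Tries to reference every member of the current view. If any member was
     * released concurrently, drops what was taken and re-reads the view, but
     * gives up after maxAttempts instead of looping indefinitely.
     */
    static List<RefCounted> acquireAll(Supplier<List<RefCounted>> currentView, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            List<RefCounted> taken = new ArrayList<>();
            boolean ok = true;
            for (RefCounted r : currentView.get()) {
                if (r.tryRef()) {
                    taken.add(r);
                } else {
                    ok = false; // view is stale: one member is already gone
                    break;
                }
            }
            if (ok) return taken;
            for (RefCounted r : taken) r.release(); // undo the partial acquisition
        }
        throw new IllegalStateException(
            "could not reference a stable view after " + maxAttempts + " attempts");
    }
}
```

Because the caller holds real references rather than an OpOrder group, the 
stream can live as long as it needs to, and each reference is dropped with 
release() when the cursor is closed or reinitialised.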

> Implement streaming for bulk read requests
> ------------------------------------------
>
>                 Key: CASSANDRA-11521
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11521
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local Write-Read Paths
>            Reporter: Stefania
>            Assignee: Stefania
>             Fix For: 3.x
>
>
> Allow clients to stream data from a C* host, bypassing the coordination layer 
> and eliminating the need to query individual pages one by one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
