[ 
https://issues.apache.org/jira/browse/CASSANDRA-5286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669594#comment-13669594
 ] 

Yuki Morishita commented on CASSANDRA-5286:
-------------------------------------------

There is still work to do, but before I go too far, here is my working draft.

Design doc: https://gist.github.com/yukim/5672508
Work in progress branch: https://github.com/yukim/cassandra/commits/5286-1

In this version, the new streaming API only works for bootstrap and move. I 
verified it using dtest's bootstrap test.

bq. I also think that streaming itself should be versioned separately from MS.

I separated the streaming message exchange from MS. It is not versioned yet, 
but it definitely should be. I think we also want to support streaming 
different versions of SSTables. Right now, the same major SSTable version is 
required.
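
To illustrate the same-major-version requirement, here is a minimal sketch of such a check. It assumes an SSTable version is a two-letter string such as "ia" or "ic" whose first letter is the major version; the class and method names are hypothetical and are not taken from the work in progress branch.

```java
class SSTableVersionCheckSketch {
    /** Returns the major part of a version string, ASSUMED to be its first letter. */
    static char major(String version) {
        if (version == null || version.length() != 2) {
            throw new IllegalArgumentException("expected a two-letter version, got: " + version);
        }
        return version.charAt(0);
    }

    /** Hypothetical rule: streaming is allowed only when both major versions match. */
    static boolean canStream(String senderVersion, String receiverVersion) {
        return major(senderVersion) == major(receiverVersion);
    }

    public static void main(String[] args) {
        System.out.println(canStream("ia", "ic")); // same major 'i'
        System.out.println(canStream("ic", "ja")); // majors 'i' and 'j' differ
    }
}
```

Supporting streaming between different SSTable versions would mean relaxing `canStream`, which is why the streaming protocol itself would need its own version.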

bq. I think it would be nice if streaming was able to recover on error.

True. I'm implementing a stream event handler to track stream state, so maybe 
we can use that to resume streaming.
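
As a rough sketch of how tracked state could drive a resume, the handler below remembers the latest state of each file and lists the ones that never completed. All names here (`FileState`, `EventHandler`, `Tracker`, `filesToResume`) are hypothetical and only illustrate the idea; the actual handler in the branch may look quite different.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class StreamStateTrackerSketch {
    /** Hypothetical per-file transfer states. */
    enum FileState { STARTED, COMPLETED, FAILED }

    /** Hypothetical callback interface, not the real streaming API. */
    interface EventHandler {
        void onFileEvent(String fileName, FileState state);
    }

    /** Remembers the latest state of every file so unfinished ones can be retried. */
    static class Tracker implements EventHandler {
        private final Map<String, FileState> states = new LinkedHashMap<>();

        @Override
        public synchronized void onFileEvent(String fileName, FileState state) {
            states.put(fileName, state);
        }

        /** Files that have not reached COMPLETED, in the order they were first seen. */
        public synchronized List<String> filesToResume() {
            List<String> pending = new ArrayList<>();
            for (Map.Entry<String, FileState> e : states.entrySet()) {
                if (e.getValue() != FileState.COMPLETED) {
                    pending.add(e.getKey());
                }
            }
            return pending;
        }
    }

    public static void main(String[] args) {
        Tracker tracker = new Tracker();
        tracker.onFileEvent("a-Data.db", FileState.STARTED);
        tracker.onFileEvent("a-Data.db", FileState.COMPLETED);
        tracker.onFileEvent("b-Data.db", FileState.STARTED);
        tracker.onFileEvent("b-Data.db", FileState.FAILED);
        tracker.onFileEvent("c-Data.db", FileState.STARTED);
        System.out.println(tracker.filesToResume()); // prints [b-Data.db, c-Data.db]
    }
}
```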

Still TODO:

* Implement a streaming status tracker using JMX and StreamEventHandler so that 
users can use nodetool to show the current status.
* Migrate completely to the new API. I still need to work on SSTableloader, 
repair and nodetool.
* Unit tests
                
> Streaming 2.0
> -------------
>
>                 Key: CASSANDRA-5286
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5286
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Yuki Morishita
>              Labels: streaming
>             Fix For: 2.0
>
>
> 2.0 is a good time to redesign the streaming API, including the protocol, to 
> make streaming more performant and reliable.
> Design goals that come to mind:
> *Better performance*
>   - Protocol optimization
>   - Stream multiple files in parallel (CASSANDRA-4663)
>   - Persistent connection (CASSANDRA-4660)
> *Better control*
>   - Cleaner API for error handling
>   - Integrate both IN/OUT streams into one session, so the 
> components (bootstrap, move, bulkload, repair...) that use streaming can 
> manage them easily.
> *Better reporting*
>   - Better logging/tracing
>   - More metrics
>   - Progress reporting API for external client

