[jira] [Commented] (CASSANDRA-8518) Cassandra Query Request Size Estimator

Benedict (JIRA) Wed, 14 Jan 2015 16:10:42 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277953#comment-14277953
 ]


Benedict commented on CASSANDRA-8518:
-------------------------------------

This is one of the two methods I proposed, and I'm comfortable aiming for the 
global threshold. Per-request thresholds are also a possibility, and seem 
reasonable. Whether we _throttle_ or simply discard some in-flight queries on 
exceeding our limit is another matter, though. I would prefer to discard some 
random in-flight queries, as this brings the system back to full health 
immediately, instead of letting it crawl along until the blockage clears.
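
A minimal sketch of the global-threshold-plus-random-discard idea follows, 
to make the behaviour concrete. Every name in it (InFlightLimiter, Request, 
admit, complete) is hypothetical and none of it is Cassandra code. It assumes 
each request already carries an estimated size in bytes, and it leaves open 
how a discarded request is reported back to the client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch: one global byte budget, random discard on overflow. */
final class InFlightLimiter {
    /** Hypothetical in-flight request carrying a pre-computed size estimate. */
    static final class Request {
        final long id;
        final long estimatedBytes;
        Request(long id, long estimatedBytes) {
            this.id = id;
            this.estimatedBytes = estimatedBytes;
        }
    }

    private final long globalLimitBytes;
    private final Random random;
    private final List<Request> inFlight = new ArrayList<>();
    private long inFlightBytes;

    InFlightLimiter(long globalLimitBytes, Random random) {
        this.globalLimitBytes = globalLimitBytes;
        this.random = random;
    }

    /**
     * Registers a request, then discards randomly chosen in-flight requests
     * (possibly the new one) until the total is back under the limit.
     * Returns the discarded requests so the caller can fail them.
     */
    synchronized List<Request> admit(Request request) {
        inFlight.add(request);
        inFlightBytes += request.estimatedBytes;
        List<Request> discarded = new ArrayList<>();
        while (inFlightBytes > globalLimitBytes && !inFlight.isEmpty()) {
            Request victim = inFlight.remove(random.nextInt(inFlight.size()));
            inFlightBytes -= victim.estimatedBytes;
            discarded.add(victim);
        }
        return discarded;
    }

    /** Releases the budget of a finished request; no-op if it was discarded. */
    synchronized void complete(Request request) {
        if (inFlight.remove(request)) {
            inFlightBytes -= request.estimatedBytes;
        }
    }

    synchronized long inFlightBytes() {
        return inFlightBytes;
    }
}
```

Throttling would instead block in admit() until enough budget is released, 
which is the "crawl along" behaviour described above.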

> Cassandra Query Request Size Estimator
> --------------------------------------
>
>                 Key: CASSANDRA-8518
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8518
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Cheng Ren
>
> We have been suffering from Cassandra node crashes due to out-of-memory 
> errors for a long time. The heap dump from the most recent crash shows 22 
> native transport request threads, each consuming 3.3% of the heap, for more 
> than 70% in total.
> Heap dump:
> !https://dl-web.dropbox.com/get/attach1.png?_subject_uid=303980955&w=AAAVOoncBoZ5aOPbDg2TpRkUss7B-2wlrnhUAv19b27OUA|height=400,width=600!
> Expanded view of one thread:
> !https://dl-web.dropbox.com/get/Screen%20Shot%202014-12-18%20at%204.06.29%20PM.png?_subject_uid=303980955&w=AACUO4wrbxheRUxv8fwQ9P52T6gBOm5_g9zeIe8odu3V3w|height=400,width=600!
> The Cassandra version we are using (2.0.4) uses MemoryAwareThreadPoolExecutor 
> as the request executor, with a default request size estimator that always 
> returns 1, meaning it limits only the number of requests pushed to the pool. 
> To gain finer-grained control over request handling and better protect our 
> nodes from OOM, we propose implementing a more precise estimator.
> Here are our two cents:
> For update/delete/insert requests: size could be estimated by summing the 
> sizes of all class members.
> For scan queries: the major part of the request is the response, which can 
> be estimated from historical data. For example, if we receive a scan query on 
> a column family (cf) for a certain token range, we record its response size 
> and token range for use in estimating later scan queries on the same cf. 
> For future requests on the same cf, the response size could be calculated as 
> token range * recorded size / recorded token range. The request size should 
> be estimated as (query size + estimated response size).
> We believe what we're proposing here can be useful for other people in the 
> Cassandra community as well. Would you mind giving us feedback? Please 
> let us know if you have any concerns or suggestions regarding this proposal.
> Thanks,
> Cheng
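
The scan estimate proposed in the description can be sketched as below. This 
is only an illustration under stated assumptions: ScanSizeEstimator and its 
methods are hypothetical names, not an existing Cassandra or Netty API; only 
the most recent scan per column family is remembered; and the fallback value 
used before any scan has been recorded is an assumption, since the proposal 
does not say what to do in that case.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the history-based scan size estimate. */
final class ScanSizeEstimator {
    /** Token range width and response size of the last recorded scan. */
    private static final class Sample {
        final double tokenRange;
        final long responseBytes;
        Sample(double tokenRange, long responseBytes) {
            this.tokenRange = tokenRange;
            this.responseBytes = responseBytes;
        }
    }

    private final Map<String, Sample> lastScanByCf = new ConcurrentHashMap<>();
    private final long defaultResponseBytes; // assumed fallback, not in the proposal

    ScanSizeEstimator(long defaultResponseBytes) {
        this.defaultResponseBytes = defaultResponseBytes;
    }

    /** Remembers the response size observed for a scan over tokenRange. */
    void record(String cf, double tokenRange, long responseBytes) {
        if (tokenRange > 0) { // an empty range would make the ratio undefined
            lastScanByCf.put(cf, new Sample(tokenRange, responseBytes));
        }
    }

    /** token range * recorded size / recorded token range */
    long estimateResponseBytes(String cf, double tokenRange) {
        Sample sample = lastScanByCf.get(cf);
        if (sample == null) {
            return defaultResponseBytes;
        }
        return Math.round(tokenRange * sample.responseBytes / sample.tokenRange);
    }

    /** query size + estimated response size */
    long estimateRequestBytes(String cf, long queryBytes, double tokenRange) {
        return queryBytes + estimateResponseBytes(cf, tokenRange);
    }
}
```

For example, after recording a 50000-byte response for a scan over a range of 
width 1000, a scan over a range of width 100 on the same cf is estimated at 
5000 bytes of response. The linear scaling assumes data is spread evenly 
across the token range, which may not hold for skewed tables.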



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
