[ 
https://issues.apache.org/jira/browse/CASSANDRA-7280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367860#comment-14367860
 ] 

Philip Thompson commented on CASSANDRA-7280:
--------------------------------------------

Just to follow up: is this property never respected, or is it only ignored in 
ALLOW FILTERING queries? If it is never respected, I think that's no longer the 
case in trunk, but I'm not sure how it interacts with filtering.

> Hadoop support not respecting cassandra.input.split.size
> --------------------------------------------------------
>
>                 Key: CASSANDRA-7280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7280
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Jeremy Hanna
>
> Long ago (0.7), I tried to set the cassandra.input.split.size property and 
> never got it to be respected.  However, the batch size setting was enough 
> for what I needed, which was to affect the timeouts.
> Now, with the CQL record reader and native paging, users can specify 
> queries, potentially with ALLOW FILTERING clauses.  The input split size is 
> more important here because the server may have to scan through many records 
> to find matching ones.  If the user can effectively set the input split 
> size, that gives a hard limit on how many records it will traverse.
> Currently the property appears to be overridden, perhaps in the 
> client.describe_splits_ex method on the server side.
> It can be argued that users shouldn't be using ALLOW FILTERING clauses in 
> their CQL in the first place.  However, it is still a bug that the input 
> split size is not honored.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
