[
https://issues.apache.org/jira/browse/CASSANDRA-7280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367860#comment-14367860
]
Philip Thompson commented on CASSANDRA-7280:
--------------------------------------------
Just to follow up: is this property never respected, or is it only ignored in
ALLOW FILTERING queries? If it is never respected, I think that's no longer the
case in trunk, but I'm not sure how it interacts with filtering.
> Hadoop support not respecting cassandra.input.split.size
> --------------------------------------------------------
>
> Key: CASSANDRA-7280
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7280
> Project: Cassandra
> Issue Type: Bug
> Components: Hadoop
> Reporter: Jeremy Hanna
>
> Long ago (0.7), I tried to set the cassandra.input.split.size property and
> never really got the job to respect it. However, the batch size was
> useful for what I needed in order to affect the timeouts.
> Now, with the CQL record reader and native paging, users can specify
> queries that potentially use ALLOW FILTERING clauses. The input split size is
> more important there because the server may have to scan through many records
> to find matching ones. If the user can effectively set the input split
> size, that gives a hard limit on how many records each split will traverse.
> Currently something appears to be overriding the property, perhaps the
> client.describe_splits_ex method on the server side.
> It can be argued that users shouldn't be using ALLOW FILTERING clauses in
> their CQL in the first place. However, it is still a bug that the input split
> size is not honored.
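
To make the "hard limit" argument in the description concrete, here is a minimal Java sketch of the arithmetic involved. It is not Cassandra's split logic: the class SplitSizeSketch and its methods are hypothetical, and only the property name cassandra.input.split.size comes from the issue. It assumes the property is meant as an upper bound on rows per split, and shows how one could check whether a set of returned splits honors that bound.

```java
import java.util.Properties;

// Illustrative only: models the expected arithmetic, not Cassandra's actual code.
final class SplitSizeSketch {
    // Property name taken from the issue title.
    static final String SPLIT_SIZE_KEY = "cassandra.input.split.size";

    /** Reads the requested rows-per-split from the job properties. */
    static int requestedSplitSize(Properties jobProps) {
        String raw = jobProps.getProperty(SPLIT_SIZE_KEY);
        if (raw == null) {
            throw new IllegalArgumentException(SPLIT_SIZE_KEY + " is not set");
        }
        int size = Integer.parseInt(raw.trim());
        if (size <= 0) {
            throw new IllegalArgumentException(SPLIT_SIZE_KEY + " must be positive");
        }
        return size;
    }

    /** Number of splits needed so that no split covers more than splitSize rows. */
    static long splitCount(long estimatedRows, int splitSize) {
        if (estimatedRows < 0 || splitSize <= 0) {
            throw new IllegalArgumentException("invalid row count or split size");
        }
        // Ceiling division without floating point.
        return (estimatedRows + splitSize - 1) / splitSize;
    }

    /** True if every returned split stays within the requested upper bound. */
    static boolean honorsSplitSize(long[] rowsPerSplit, int splitSize) {
        for (long rows : rowsPerSplit) {
            if (rows > splitSize) {
                return false;
            }
        }
        return true;
    }
}
```

Under this assumption, a job that requests a split size of 65536 over roughly one million rows should see 16 splits. A result with a single split covering all rows would be the symptom reported here.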