Read timeouts with ALLOW FILTERING turned on
Hi all,

Allow me to rephrase a question I asked last week. I am performing some queries with ALLOW FILTERING and getting consistent read timeouts like the following:

com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ONE (1 responses were required but only 0 replica responded)

These errors occur only during multi-row scans, and only during integration tests on our build server.

I tried to replicate this error by reducing read_request_timeout_in_ms when running Cassandra on my local machine (where I have not seen this error), but that is not working. Are there any other parameters that I need to adjust? I'd feel better if I could at least replicate this failure by reducing read_request_timeout_in_ms, since doing so would mean I actually understand what is going wrong.

Best regards,
Clint
Re: Read timeouts with ALLOW FILTERING turned on
On Tue, Aug 5, 2014 at 10:01 AM, Clint Kelly clint.ke...@gmail.com wrote:

> Allow me to rephrase a question I asked last week. I am performing some queries with ALLOW FILTERING and getting consistent read timeouts like the following:

ALLOW FILTERING should be renamed PROBABLY TIMEOUT in order to properly describe its typical performance. As a general statement, if you have to ALLOW FILTERING, you are probably Doing It Wrong in terms of schema design.

A correctly operated cluster is unlikely to need to increase the default timeouts. If you find yourself needing to do so, you are, again, probably Doing It Wrong.

=Rob
Re: Read timeouts with ALLOW FILTERING turned on
How much did you reduce read_request_timeout_in_ms on your local machine? Read times on a multi-node cluster are higher than on a single machine, because Cassandra has to run the read operation on more servers (so you also have network traffic).

--
Regards,
Sávio S. Teles de Oliveira
voice: +55 62 9136 6996
http://br.linkedin.com/in/savioteles
Master's student in Computer Science - UFG
Software Architect
CUIA Internet Brasil
Re: Read timeouts with ALLOW FILTERING turned on
Hi Rob,

Thanks for your feedback. I understand that use of ALLOW FILTERING is not a best practice. In this case, however, I am building a tool on top of Cassandra that allows users to sometimes do things that are less than optimal. When they try to do expensive queries like this, I'd rather provide a higher limit before timing out, but I can't seem to change the behavior of Cassandra by tweaking any of the parameters in the cassandra.yaml file or in the DataStax Java driver's Cluster object.

FWIW these queries are also in batch jobs where we can tolerate the extra latency.

Thanks for your help!

Best regards,
Clint
Re: Read timeouts with ALLOW FILTERING turned on
Ah, FWIW I was able to reproduce the problem by reducing range_request_timeout_in_ms. This is great since I want to increase the timeout for batch jobs where we scan a large set of rows, but leave the timeout for single-row queries alone.

Best regards,
Clint
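
For reference, the two settings discussed in this thread both live in cassandra.yaml. The fragment below is a minimal sketch, assuming a 2.0-era configuration file: the defaults mentioned in the comments are believed to be the ones shipped with that release line, and the raised range value is purely illustrative, so check the cassandra.yaml that comes with your version before copying anything.

```yaml
# Timeout for single-partition reads. Left at what is believed to be the
# 2.0-era default so that single-row queries keep failing fast.
read_request_timeout_in_ms: 5000

# Timeout for range scans (multi-row reads, including ALLOW FILTERING scans).
# The 2.0-era default is believed to be 10000; 30000 is an illustrative
# value for batch jobs that can tolerate extra latency, not a recommendation.
range_request_timeout_in_ms: 30000
```

Changes to cassandra.yaml are picked up when the node is restarted. The DataStax Java driver also keeps its own client-side read timeout (configured through SocketOptions), so it may need to be raised as well if the client should wait longer than the server-side limit.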
Re: Read timeouts with ALLOW FILTERING turned on
On Tue, Aug 5, 2014 at 11:53 AM, Clint Kelly clint.ke...@gmail.com wrote:

> Ah FWIW I was able to reproduce the problem by reducing range_request_timeout_in_ms. This is great since I want to increase the timeout for batch jobs where we scan a large set of rows, but leave the timeout for single-row queries alone.

You have just explicated (a subset of) the reason the timeouts were broken out.

https://issues.apache.org/jira/browse/CASSANDRA-2819

=Rob