Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Clint Kelly
Hi all,

Allow me to rephrase a question I asked last week.  I am performing some
queries with ALLOW FILTERING and getting consistent read timeouts like the
following:



com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra
timeout during read query at consistency ONE (1 responses were
required but only 0 replica responded)
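
For anyone checking for this failure in integration tests, the message itself carries the useful details. A minimal sketch, assuming the message keeps the wording quoted above; the regular expression and the `parse_timeout` helper are illustrative and not part of the DataStax driver:

```python
import re

# Message text as quoted in this thread, joined onto one line.
MESSAGE = (
    "Cassandra timeout during read query at consistency ONE "
    "(1 responses were required but only 0 replica responded)"
)

# Hypothetical pattern written against the wording above; other driver
# versions may phrase the message differently.
PATTERN = re.compile(
    r"timeout during (?P<kind>\w+) query at consistency (?P<cl>\w+) "
    r"\((?P<required>\d+) responses? were required "
    r"but only (?P<received>\d+) replicas? responded\)"
)


def parse_timeout(message):
    """Extract query kind, consistency level, and response counts, or None."""
    match = PATTERN.search(message)
    if match is None:
        return None
    return {
        "kind": match.group("kind"),
        "consistency": match.group("cl"),
        "required": int(match.group("required")),
        "received": int(match.group("received")),
    }


if __name__ == "__main__":
    print(parse_timeout(MESSAGE))
```

Zero responses out of one required suggests that the replica did not answer within the server's timeout window at all, rather than that some replicas answered and others did not.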


These errors occur only during multi-row scans, and only during integration
tests on our build server.

I tried to see if I could replicate this error by reducing
read_request_timeout_in_ms when I run Cassandra on my local machine
(where I have not seen this error), but that is not working.  Are there any
other parameters that I need to adjust?  I'd feel better if I could at
least replicate this failure by reducing the read_request_timeout_in_ms
(since doing so would mean I actually understand what is going wrong...).

Best regards,
Clint


Re: Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Robert Coli
On Tue, Aug 5, 2014 at 10:01 AM, Clint Kelly clint.ke...@gmail.com wrote:

 Allow me to rephrase a question I asked last week.  I am performing some
 queries with ALLOW FILTERING and getting consistent read timeouts like the
 following:


ALLOW FILTERING should be renamed PROBABLY TIMEOUT in order to properly
describe its typical performance.

As a general statement, if you have to ALLOW FILTERING, you are probably
Doing It Wrong in terms of schema design.

A correctly operated cluster is unlikely to need to increase the default
timeouts. If you find yourself needing to do so, you are, again, probably
Doing It Wrong.

=Rob


Re: Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Sávio S . Teles de Oliveira
How much did you reduce *read_request_timeout_in_ms* on your local machine?
Read queries take longer on a cluster than on a single machine, because the
Cassandra server must run the read operation on more servers (so you have
network traffic).


-- 
Regards,
Sávio S. Teles de Oliveira
voice: +55 62 9136 6996
http://br.linkedin.com/in/savioteles
MSc student in Computer Science - UFG
Software Architect
CUIA Internet Brasil


Re: Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Clint Kelly
Hi Rob,

Thanks for your feedback.  I understand that use of ALLOW FILTERING is
not a best practice.  In this case, however, I am building a tool on
top of Cassandra that allows users to sometimes do things that are
less than optimal.  When they try to do expensive queries like this,
I'd rather provide a higher limit before timing out, but I can't seem
to change the behavior of Cassandra by tweaking any of the parameters
in the cassandra.yaml file or in the DataStax Java driver's Cluster
object.
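
One possible reason driver-side settings appear to have no effect: the ReadTimeoutException quoted earlier is reported by the server, so any client-side socket timeout is a separate timer. A toy model under that assumption; the `first_timeout` function and the millisecond values are hypothetical, chosen only for illustration:

```python
def first_timeout(query_ms, server_timeout_ms, client_timeout_ms):
    """Return which timer fires first for a query needing query_ms to finish.

    Returns "ok" if the query finishes before either timer, otherwise
    "server" or "client" depending on which limit is reached first.
    This is a simplified model, not the driver's actual logic.
    """
    deadline = min(server_timeout_ms, client_timeout_ms)
    if query_ms <= deadline:
        return "ok"
    return "server" if server_timeout_ms <= client_timeout_ms else "client"


if __name__ == "__main__":
    # Raising only the client-side limit does not change the outcome
    # while the server-side limit is still the smaller of the two.
    print(first_timeout(8000, server_timeout_ms=5000, client_timeout_ms=12000))
    print(first_timeout(8000, server_timeout_ms=5000, client_timeout_ms=60000))
```

In this model, a slow scan keeps failing on the server side until the server-side limit that actually governs the query is raised.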

FWIW these queries are also in batch jobs where we can tolerate the
extra latency.

Thanks for your help!

Best regards,
Clint


Re: Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Clint Kelly
Ah FWIW I was able to reproduce the problem by reducing
range_request_timeout_in_ms.  This is great since I want to increase
the timeout for batch jobs where we scan a large set of rows, but
leave the timeout for single-row queries alone.
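
A minimal sketch of the distinction, assuming single-row reads are governed by read_request_timeout_in_ms and multi-row scans by range_request_timeout_in_ms, as the experiment above suggests. The YAML values, the helper names, and the "single_row"/"scan" labels are hypothetical placeholders, not recommendations:

```python
# Illustrative cassandra.yaml fragment; the values are placeholders.
SAMPLE_YAML = """\
read_request_timeout_in_ms: 5000
range_request_timeout_in_ms: 60000
"""


def parse_timeouts(text):
    """Parse flat `key: value` lines into a dict of integer milliseconds."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        key = key.strip()
        if key.endswith("_timeout_in_ms"):
            settings[key] = int(value.strip())
    return settings


def timeout_for(query_kind, settings):
    """Return the (setting name, value) assumed to govern a query kind."""
    key = {
        "single_row": "read_request_timeout_in_ms",
        "scan": "range_request_timeout_in_ms",
    }[query_kind]
    return key, settings[key]


if __name__ == "__main__":
    settings = parse_timeouts(SAMPLE_YAML)
    print(timeout_for("single_row", settings))
    print(timeout_for("scan", settings))
```

With the two settings kept separate, the scan limit can be raised for batch jobs while the single-row limit stays where it is.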

Best regards,
Clint


Re: Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Robert Coli
On Tue, Aug 5, 2014 at 11:53 AM, Clint Kelly clint.ke...@gmail.com wrote:

 Ah FWIW I was able to reproduce the problem by reducing
 range_request_timeout_in_ms.  This is great since I want to increase
 the timeout for batch jobs where we scan a large set of rows, but
 leave the timeout for single-row queries alone.


You have just explicated (a subset of) the reason the timeouts were broken
out.

https://issues.apache.org/jira/browse/CASSANDRA-2819

=Rob