[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

2012-12-01 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13508068#comment-13508068
 ] 

Jonathan Ellis commented on CASSANDRA-4705:
---

Okay, let's leave UpdateSampleLatencies alone (although as a matter of style I'd 
prefer to inline it as an anonymous Runnable).

Thinking more about the core functionality:

- a RetryType of one pre-emptive redundant data read would be a useful 
alternative to ALL.  (If supporting both makes things more complex, I would 
vote for just supporting the single extra read.)  E.g., for a CL.ONE read it 
would perform two data reads; for CL.QUORUM it would perform two data reads and 
a digest read.  Put another way, it would do the same extra data read 
Xpercentile would, but it would do it ahead of the threshold timeout.
- ISTM we should continue to use RDR for normal (non-RR) SR reads, and just 
accept the first data reply that comes back without comparing it to others.  
This makes the most sense to me semantically, and keeps CL.ONE reads 
lightweight.
- I think it's incorrect (again, in the non-RR case) to perform a data read 
against the same host we sent a digest read to.  Consider CL.QUORUM: I send a 
data read to replica X and a digest to replica Y.  X is slow to respond.  Doing 
a data read to Y won't help, since I need both to meet my CL.  I have to do my 
SR read to replica Z, if one exists and is alive.
- We should probably extend this to doing extra digest reads for CL > ONE, when 
we get the data read back quickly but the digest read is slow.
- SR + RR is the tricky part... this is where SR could result in data and 
digests from the same host.  So ideally, we want the ability to compare 
(potentially) multiple data reads, *and* multiple digests, *and* track the 
source for CL purposes, which neither RDR nor RRR is equipped to do.  Perhaps 
we should just force all reads to data reads for SR + RR [or even for all RR 
reads], to simplify this.

Finally,
- millis may be too coarse a grain here, especially for Custom settings.  
Currently an in-memory read will typically be under 2ms and it's quite possible 
we can get that down to 1 if we can purge some of the latency between stages.  
Might as well use micros since Timer gives it to us for free, right?
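
For what it's worth, here is a minimal sketch of the micros idea (assuming 
metrics-core 2.x; the class and method names are made up, not from the attached 
patch): a Timer created with a microsecond duration unit already hands back 
microsecond values from its Snapshot, so the extra precision costs nothing.

{code:java}
// Minimal sketch (assumption: metrics-core 2.x): track read latencies in a
// Timer whose duration unit is microseconds, and read any percentile from it.
import java.util.concurrent.TimeUnit;

import com.yammer.metrics.Metrics;
import com.yammer.metrics.core.Timer;
import com.yammer.metrics.stats.Snapshot;

public class ReadLatencyTracker
{
    private final Timer readLatency =
        Metrics.newTimer(ReadLatencyTracker.class, "ReadLatency",
                         TimeUnit.MICROSECONDS, TimeUnit.SECONDS);

    public void record(long durationNanos)
    {
        readLatency.update(durationNanos, TimeUnit.NANOSECONDS);
    }

    // e.g. thresholdMicros(0.95) for Xpercentile = 95
    public double thresholdMicros(double quantile)
    {
        Snapshot snapshot = readLatency.getSnapshot();
        return snapshot.getValue(quantile); // already in microseconds
    }
}
{code}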

 Speculative execution for CL_ONE
 

 Key: CASSANDRA-4705
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4705
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Attachments: 0001-CASSANDRA-4705.patch, 0001-CASSANDRA-4705-v2.patch


 When read_repair is not 1.0, we send the request to one node for some of the 
 requests. When a node goes down or when a node is too busy, the client has to 
 wait for the timeout before it can retry. 
 It would be nice to watch for latency and execute an additional request to a 
 different node if the response is not received within the average/99th 
 percentile of the response times recorded in the past.
 CASSANDRA-2540 might be able to solve the variance when read_repair is set to 
 1.0.
 1) Maybe we need to use metrics-core to record various percentiles.
 2) Modify ReadCallback.get to execute an additional request speculatively.



[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

2012-11-23 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13503260#comment-13503260
 ] 

Vijay commented on CASSANDRA-4705:
--

Hi Jonathan, Sorry for the delay.

{quote}
Would it make more sense to move getReadLatencyRate and UpdateSampleLatencies 
into SR? That way we could replace case statements with polymorphism.
{quote}
The problem is that the expensive percentile calculation has to be done 
asynchronously using a scheduled TPE. We can avoid the switch by introducing an 
additional SRFactory, which would initialize the TPE whenever the per-CF 
settings change? Let me know.
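
For illustration, the scheduled-TPE idea might look roughly like this 
(hypothetical names, not the patch's actual code): the expensive percentile 
calculation runs periodically in the background and the read path only reads a 
cached value.

{code:java}
// Hypothetical sketch of the scheduled-TPE approach: recompute the
// percentile off the request path and cache it for cheap lookups.
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

import com.yammer.metrics.core.Timer;

public class SpeculativeThresholdUpdater
{
    private static final ScheduledExecutorService updater =
        Executors.newSingleThreadScheduledExecutor();

    private final AtomicLong cachedThreshold = new AtomicLong(Long.MAX_VALUE);

    public SpeculativeThresholdUpdater(final Timer readLatency, final double quantile)
    {
        // the UpdateSampleLatencies idea: refresh the cached percentile
        // periodically instead of computing it on every read
        updater.scheduleWithFixedDelay(new Runnable()
        {
            public void run()
            {
                cachedThreshold.set((long) readLatency.getSnapshot().getValue(quantile));
            }
        }, 100, 100, TimeUnit.MILLISECONDS);
    }

    // cheap lookup on the read path
    public long threshold()
    {
        return cachedThreshold.get();
    }
}
{code}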

{quote}
Why does preprocess return a boolean now?
{quote}
The current patch uses the boolean to indicate whether the processing was done 
or not. It is used by RCB, after the patch, when the coordinator receives more 
than one response from the same host (when SR is on and the actual read 
response arrives at the same time as the speculated response); we should not 
count that response towards the consistency level.
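
To illustrate the point (a hypothetical sketch, not the patch itself): the 
callback could remember which replicas have already been counted, so a second 
reply from the same host is ignored for CL purposes.

{code:java}
// Hypothetical sketch: count each replica at most once toward the
// consistency level, even if it sends more than one response (e.g. the
// original read and the speculated read both complete).
import java.net.InetAddress;
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ResponseCounter
{
    private final Set<InetAddress> counted =
        Collections.newSetFromMap(new ConcurrentHashMap<InetAddress, Boolean>());
    private final AtomicInteger received = new AtomicInteger();

    // returns true if this response should count toward the CL
    public boolean preprocess(InetAddress from)
    {
        if (!counted.add(from))
            return false; // duplicate reply from the same host
        received.incrementAndGet();
        return true;
    }

    public boolean satisfies(int blockFor)
    {
        return received.get() >= blockFor;
    }
}
{code}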

{quote}
How does/should SR interact with RR? Using ALL + RRR
{quote}
Currently we do an additional read to double-check whether we need to write; I 
thought the goal of ALL was to eliminate that and do an additional write 
instead... In most cases it will be a memtable update :)
I can think of 2 options:
1) Just document the ALL case and live with the additional writes; it might not 
be a big issue for most users, and the rest can switch back to the default 
behavior.
2) We can queue the repair mutations, and in the async thread check whether 
there are duplicate mutations pending... if yes, we can just ignore the 
duplicates. This can be done by doing sendRR and adding the CF to be repaired 
to a HashSet (at the cost of some additional memory footprint); a rough sketch 
follows.
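
For illustration only, option 2 might be shaped roughly like this (hypothetical 
class and method names, not code from the attached patch): pending repairs are 
tracked per (CF, key) so duplicates can be dropped before sendRR.

{code:java}
// Hypothetical sketch of option 2: dedupe queued read-repair mutations
// by (column family, row key) before sending them with sendRR.
import java.nio.ByteBuffer;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class RepairMutationDeduper
{
    // (CF name, row key) pairs with a repair mutation already queued;
    // this is the extra memory footprint mentioned above
    private final Set<SimpleImmutableEntry<String, ByteBuffer>> pending =
        Collections.newSetFromMap(
            new ConcurrentHashMap<SimpleImmutableEntry<String, ByteBuffer>, Boolean>());

    // returns true if the caller should go ahead and sendRR this repair
    public boolean shouldSend(String cfName, ByteBuffer key)
    {
        return pending.add(new SimpleImmutableEntry<String, ByteBuffer>(cfName, key.duplicate()));
    }

    // called from the async thread once the repair write has actually been sent
    public void markSent(String cfName, ByteBuffer key)
    {
        pending.remove(new SimpleImmutableEntry<String, ByteBuffer>(cfName, key));
    }
}
{code}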

Should we move this discussion to a different ticket?

Let me know, Thanks!



[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

2012-11-21 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13501877#comment-13501877
 ] 

Jonathan Ellis commented on CASSANDRA-4705:
---

Avro is just used for upgrading from 1.0 schemas, so we shouldn't need to touch 
that anymore.

Would it make more sense to move getReadLatencyRate and UpdateSampleLatencies 
into SR?  That way we could replace case statements with polymorphism.
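
For concreteness, the polymorphism suggestion could be shaped roughly like this 
(an illustrative sketch only; the names are made up): each retry flavor knows 
its own threshold, so callers never switch on the setting.

{code:java}
// Illustrative sketch: one subclass per speculative-retry flavor instead of
// a case statement; each one decides when the extra read should fire.
import java.util.concurrent.TimeUnit;

public abstract class SpeculativeRetry
{
    // how long to wait before firing the extra read; Long.MAX_VALUE means never
    public abstract long thresholdMicros();

    public static final class Never extends SpeculativeRetry
    {
        public long thresholdMicros() { return Long.MAX_VALUE; }
    }

    public static final class Always extends SpeculativeRetry
    {
        public long thresholdMicros() { return 0; } // send the extra read up front
    }

    public static final class CustomMillis extends SpeculativeRetry
    {
        private final long micros;
        public CustomMillis(long millis) { this.micros = TimeUnit.MILLISECONDS.toMicros(millis); }
        public long thresholdMicros() { return micros; }
    }

    public static final class Percentile extends SpeculativeRetry
    {
        // supplier of the asynchronously sampled latency percentile
        public interface LatencySupplier { long latestThresholdMicros(); }

        private final LatencySupplier latencies;
        public Percentile(LatencySupplier latencies) { this.latencies = latencies; }
        public long thresholdMicros() { return latencies.latestThresholdMicros(); }
    }
}
{code}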

Can you split the AbstractReadExecutor refactor out from the speculative 
execution code?  That would make it easier to isolate the changes in review.

Why does preprocess return a boolean now?

How does/should SR interact with RR?  Using ALL + RRR means we're probably 
going to do a lot of unnecessary repair writes in a high-update environment 
(i.e., it would be normal for one replica to be slightly behind the others on a 
read), which is probably not what we want.  It's also unclear to me what happens 
when we use RDR and do an SR when we've also requested extra digests for RR, and 
we get a data read and a digest from the same replica.



[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

2012-10-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13472500#comment-13472500
 ] 

Jonathan Ellis commented on CASSANDRA-4705:
---

So I guess we could support {ALL, Xpercentile, Yms, NONE} where X and Y are 
both doubles?
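
For concreteness, parsing such a setting might look like this (purely an 
illustrative sketch; the class name is made up): ALL and NONE are keywords, 
everything else is a double followed by either "percentile" or "ms".

{code:java}
// Illustrative sketch of parsing a speculative_retry value of the form
// ALL | NONE | <X>percentile | <Y>ms, where X and Y may be doubles.
public final class RetryOption
{
    public enum Kind { ALL, PERCENTILE, CUSTOM_MS, NONE }

    public final Kind kind;
    public final double value; // percentile (0-100) or milliseconds; unused for ALL/NONE

    private RetryOption(Kind kind, double value)
    {
        this.kind = kind;
        this.value = value;
    }

    public static RetryOption parse(String setting)
    {
        String s = setting.trim().toUpperCase();
        if (s.equals("ALL"))
            return new RetryOption(Kind.ALL, 0);
        if (s.equals("NONE"))
            return new RetryOption(Kind.NONE, 0);
        if (s.endsWith("PERCENTILE"))
            return new RetryOption(Kind.PERCENTILE,
                                   Double.parseDouble(s.substring(0, s.length() - "PERCENTILE".length())));
        if (s.endsWith("MS"))
            return new RetryOption(Kind.CUSTOM_MS,
                                   Double.parseDouble(s.substring(0, s.length() - "MS".length())));
        throw new IllegalArgumentException("Unrecognized speculative_retry value: " + setting);
    }
}
{code}

So values like 95percentile, 10.5ms, ALL and NONE would all parse.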



[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

2012-10-09 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13472535#comment-13472535
 ] 

Vijay commented on CASSANDRA-4705:
--

Cool, let me work on the patch soon... 
{quote}
are both doubles?
{quote}
Well, it will be a long, in ms :)



[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

2012-10-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13472552#comment-13472552
 ] 

Jonathan Ellis commented on CASSANDRA-4705:
---

our history has been that sooner or later someone always wants fractional ms, 
but I'm fine w/ long (or int) :)



[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

2012-10-05 Thread Chris Burroughs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470461#comment-13470461
 ] 

Chris Burroughs commented on CASSANDRA-4705:


bq. Looks like metrics-core exposes 75, 95, 97, 99 and 99.9

Reporters have a limited set (i.e., you can't generate new values that will pop 
up in JMX on the fly), but in code you should be able to get at any percentile 
you want: 
https://github.com/codahale/metrics/blob/2.x-maintenance/metrics-core/src/main/java/com/yammer/metrics/stats/Snapshot.java#L54
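
A minimal sketch of that, for reference (the histogram name here is made up):

{code:java}
// Minimal sketch: any quantile can be pulled from a metrics-core 2.x
// Snapshot, not just the fixed set exposed by the reporters.
import com.yammer.metrics.Metrics;
import com.yammer.metrics.core.Histogram;
import com.yammer.metrics.stats.Snapshot;

public class PercentileExample
{
    public static void main(String[] args)
    {
        Histogram latencies = Metrics.newHistogram(PercentileExample.class, "ReadLatency", true);
        for (long micros = 1; micros <= 1000; micros++)
            latencies.update(micros);

        Snapshot snapshot = latencies.getSnapshot();
        System.out.println("p95   = " + snapshot.getValue(0.95));
        System.out.println("p99.9 = " + snapshot.getValue(0.999));
        System.out.println("p42   = " + snapshot.getValue(0.42)); // any quantile works
    }
}
{code}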



[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

2012-10-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13470470#comment-13470470
 ] 

Jonathan Ellis commented on CASSANDRA-4705:
---

Thanks Chris!



[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

2012-09-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13466559#comment-13466559
 ] 

Jonathan Ellis commented on CASSANDRA-4705:
---

Well, we have a pretty short list of possibilities from metrics...  I guess we 
could add auto95, auto97, auto99 options?



[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

2012-09-29 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13466390#comment-13466390
 ] 

Vijay commented on CASSANDRA-4705:
--

I pushed the prototype code into 
https://github.com/Vijay2win/cassandra/commit/62bbabfc41ba8e664eb63ba50110e5f5909b2a87

Looks like metrics-core exposes the 75, 95, 97, 99 and 99.9 percentiles. In my 
tests 75P is too low and 99P is too high to make a difference, whereas 95P 
handles the long tail better (a moving average doesn't make much of a 
difference either). 

I still think we should also support a hard-coded value in addition to the auto :)

Note: speculative_retry still has to be made part of the schema; currently, if 
you want to test it out, it is a code change in CFMetaData.



[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

2012-09-25 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462762#comment-13462762
 ] 

Jonathan Ellis commented on CASSANDRA-4705:
---

I don't like the idea of making users manually specify thresholds.  They will 
usually get it wrong, and we have latency histograms that should let us do a 
better job automagically.

But I could see the value of a setting to allow disabling it when you know your 
CF has a bunch of different query types being thrown at it.  Something like 
speculative_retry = {off, automatic, full} where full is Peter's full data 
reads to each replica.



[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

2012-09-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13461805#comment-13461805
 ] 

Jonathan Ellis commented on CASSANDRA-4705:
---

FTR I'm not sure CL.ONE is going to be substantially easier than generalizing 
to all CL.



[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

2012-09-24 Thread Peter Schuller (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462450#comment-13462450
 ] 

Peter Schuller commented on CASSANDRA-4705:
---

99% based on what time period? If the period is too short, you won't get the 
full impact since you'll pollute the track record. If it's too large, consider 
the traffic increase resulting from a prolonged hiccup. Will you be able to hide 
typical GC pauses? Then you'd better have the window be larger than 250 ms. What 
about full GCs? How do you determine what the p99 is for a node that shares 
multiple replica sets? If a single node goes into full GC, how do you keep 
latency unaffected while still capping the number of backup requests at a 
reasonable number? If you don't cap it, the optimization is more dangerous than 
useful, since it just means you'll fall over in various hard-to-predict emergent 
situations if you expect to take advantage of fewer reads when provisioning your 
cluster. What's an appropriate cap? How do you scale that with RF and 
consistency level? How do you explain this to the person who has to figure out 
how much capacity is needed for a cluster?

In our case, we pretty much run all our clusters with RR turned fully up - not 
necessarily for RR purposes, but for the purpose of more deterministic 
behavior. You don't want things falling over when a replica goes down. If you 
don't have the iops/CPU to handle all replicas processing all requests for a 
replica set, you're at risk of falling over (i.e., you don't scale, because 
failures are common in large clusters) - unless you over-provision, but then 
you might as well go with all data reads to begin with.

I am not arguing against the idea of backup requests, but I *strongly* 
recommend simply going for the trivial and obvious route of full data reads 
*first* and getting the obvious pay-off with no increase in complexity (I would 
even argue it's a *decrease* in complexity in terms of the behavior of the 
system as a whole, especially from the perspective of a human understanding 
emergent cluster behavior) - and then slowly develop something like this, with 
very careful thought to all the edge cases and implications of it.

I'm in favor of long-term *predictable* performance. Full data reads is a very 
very easy way to achieve that, and vastly better latency, in many cases (the 
bandwidth saturation case pretty much being the major exception; CPU savings 
aren't really relevant with Cassandra's model if you expect to survive nodes 
being down). It's also very easy for a human to understand the behavior when 
looking at graphs of system behavior in some event, and trying to predict what 
will happen, or explain what did happen.

I really think the drawbacks of full data reads are being massively 
over-estimated and the implications of lack of data reads massively 
under-estimated.




[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

2012-09-24 Thread Peter Schuller (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462456#comment-13462456
 ] 

Peter Schuller commented on CASSANDRA-4705:
---

Here's a good example of a complexity implication that I just thought of (and 
it's stuff like this I'm worried about w.r.t. complexity): How do you split 
requests into groups within which to do latency profiling? If you don't, you'll 
easily end up having the expensive requests always be processed multiple times, 
because they always hit the backup path (because they are expensive and thus 
latent). So you could very easily eat up all your intended benefit by having 
the very expensive requests take the backup path. Without knowledge of the 
nature of the requests, and since we cannot reliably just assume a homogeneous 
request pattern, you would probably need some non-trivial way of classifying 
requests and relating that classification to the statistics being kept.

In some cases, having it be a per-cf setting might be enough. In other cases 
that's not feasible - for example, maybe you're doing slicing on large rows, and 
maybe it's impossible to determine from an incoming request whether it's 
expensive or not (the range may be large but result in only a single column, for 
example).

What if you don't care about the latency of the legitimately expensive 
requests, but about the cheap ones? And what if those legitimately expensive 
requests consume your 1% (p99), such that none of the cheaper requests are 
subject to backup requests? Now you get none of the benefit, but you still take 
the brunt of the cost you'd have if you just went with full data reads.

I'm sure there are many other concerns I'm not thinking of; this was meant as 
an example of how it can be hard to make this actually work the way it's 
intended.




[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

2012-09-23 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13461464#comment-13461464
 ] 

Brandon Williams commented on CASSANDRA-4705:
-

bq. It would be nice to watch for latency and execute an additional request to 
a different node

Isn't this what the dsnitch does to some degree?



[jira] [Commented] (CASSANDRA-4705) Speculative execution for CL_ONE

2012-09-23 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13461478#comment-13461478
 ] 

Vijay commented on CASSANDRA-4705:
--

No, the dsnitch watches for the latency but doesn't do the latter. It won't 
speculate/execute duplicate requests to another host if the response times are 
> x%. 

I think this patch will be in addition to the dsnitch, something like what 
Jonathan posted in CASSANDRA-2540:

{quote}
I like the approach described in 
http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/people/jeff/Berkeley-Latency-Mar2012.pdf
 of doing backup requests if the original doesn't reply within N% of normal.
{quote}
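
To make the backup-request idea concrete, a rough sketch of the wait-then-retry 
shape (hypothetical; the helper methods here are made up and are not the real 
ReadCallback API):

{code:java}
// Hypothetical sketch of a backup request: wait up to the observed latency
// threshold for the first replica, then fire one extra read at a different
// replica and wait out the remaining rpc timeout.
import java.net.InetAddress;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public abstract class SpeculatingReadCallback
{
    protected final CountDownLatch condition = new CountDownLatch(1);

    // assumed helpers, not real Cassandra methods in this form
    protected abstract void sendRead(InetAddress replica);
    protected abstract Object resolve();

    public Object get(List<InetAddress> replicas, long thresholdMicros, long rpcTimeoutMillis)
        throws InterruptedException
    {
        sendRead(replicas.get(0));

        // normal case: the first replica answers within the threshold
        if (!condition.await(thresholdMicros, TimeUnit.MICROSECONDS) && replicas.size() > 1)
        {
            // too slow: speculate one extra read to a different node
            sendRead(replicas.get(1));
            condition.await(rpcTimeoutMillis, TimeUnit.MILLISECONDS);
        }
        return resolve();
    }
}
{code}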
