[jira] [Comment Edited] (CASSANDRA-14983) Local reads potentially blocking remote reads

2019-04-04 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809691#comment-16809691
 ] 

Benedict edited comment on CASSANDRA-14983 at 4/4/19 10:14 AM:
---

[~molsson] very sorry for taking so long to look at this patch, or to respond.  
Thank you for producing the patch - I am looking forward to reviewing it soon.  
I'm afraid I still don't quite have time - I hope to get to it in the next 
couple of weeks.

The best ways I can think of to break a local request, without modifying the 
underlying codebase, would be one of two possibilities:

1) Register a java-based FileSystemProvider that we can break, but this would 
also require changes to the codebase, to use URIs instead of paths for the 
filesystem interactions.  This might not be too onerous, or might be really 
awful to undertake.  But it would offer us really powerful control over testing 
of filesystem interactions.

2) Use a pre-existing outside-of-JVM approach.  ScyllaDB seem to have written a 
[fault-injecting filesystem for 
Linux|https://www.scylladb.com/2016/05/02/fault-injection-filesystem-cookbook/].
  This would work in CircleCI and on users developing under Linux, which is 
better than nothing, and might be lower effort to introduce, though I'm not 
sure if it would support a delay - and an error might not suffice for this test.

3) Byteman is also an option for a single fault injection, but in my experience 
it is really unpleasant to work with, and costly to maintain.  But this _might_ 
be the lowest effort.

4) You could mess with the local node state, copying the sstable thousands or 
tens-of-thousands of times (and registering the copies with the node's Tracker 
directly, since you can invoke any method you like).

We do also have prior form for modifying the real code to produce errors, but I 
have a preference to avoid this.

I think 4 should be the easiest, so that might be the best bet!


was (Author: benedict):
[~molsson] very sorry for taking so long to look at this patch, or to respond.  
Thank you for producing the patch - I am looking forward to reviewing it soon.  
I'm afraid I still don't quite have time - I hope to get to it in the next 
couple of weeks.

The best way I can think of to break a local request, without modifying the 
underlying codebase, would be one of two possibilities:

1) Register a java-based FileSystemProvider that we can break, but this would 
also require changes to the codebase, to use URIs instead of paths for the 
filesystem interactions.  This might not be too onerous, or might be really 
awful to undertake.  But it would offer us really powerful control over testing 
of filesystem interactions.

2) Use a pre-existing outside-of-JVM approach.  ScyllaDB seem to have written a 
[fault-injecting filesystem for 
Linux|https://www.scylladb.com/2016/05/02/fault-injection-filesystem-cookbook/].
  This would work in CircleCI and on users developing under Linux, which is 
better than nothing, and might be lower effort to introduce.

3) Byteman is also an option for a single fault injection, but in my experience 
it is really unpleasant to work with, and costly to maintain.  But this _might_ 
be the lowest effort.

We do also have prior form for modifying the real code to produce errors, but I 
have a preference to avoid this.

FWIW, I don't think any of these approaches are _strictly_ necessary for 
merging this patch, since none are trivial undertakings.  But one would 
certainly be preferable, and some would have wider positive impacts for the 
project.  They are just suggestions given your question.

> Local reads potentially blocking remote reads
> -
>
> Key: CASSANDRA-14983
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14983
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>Priority: Low
> Attachments: graph_local_read.html, graph_local_read_trunk.html, 
> local_read_trace.log
>
>
> Since CASSANDRA-4718 there is a fast path allowing local requests to continue 
> to [work in the same 
> thread|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/reads/AbstractReadExecutor.java#L157]
>  rather than being sent over to the read stage.
> Based on the comment
> {code:java}
> // We delay the local (potentially blocking) read till the end to avoid 
> stalling remote requests.
> {code}
> it seems like this should be performed last in the chain to avoid blocking 
> remote requests but that does not seem to be the case when the local request 
> is a data request. The digest request(s) are sent after the data requests are 
> sent (and now the transient replica requests as well). When the fast path is 
> used for 

[jira] [Comment Edited] (CASSANDRA-14983) Local reads potentially blocking remote reads

2019-04-04 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809691#comment-16809691
 ] 

Benedict edited comment on CASSANDRA-14983 at 4/4/19 10:10 AM:
---

[~molsson] very sorry for taking so long to look at this patch, or to respond.  
Thank you for producing the patch - I am looking forward to reviewing it soon.  
I'm afraid I still don't quite have time - I hope to get to it in the next 
couple of weeks.

The best way I can think of to break a local request, without modifying the 
underlying codebase, would be one of two possibilities:

1) Register a java-based FileSystemProvider that we can break, but this would 
also require changes to the codebase, to use URIs instead of paths for the 
filesystem interactions.  This might not be too onerous, or might be really 
awful to undertake.  But it would offer us really powerful control over testing 
of filesystem interactions.

2) Use a pre-existing outside-of-JVM approach.  ScyllaDB seem to have written a 
[fault-injecting filesystem for 
Linux|https://www.scylladb.com/2016/05/02/fault-injection-filesystem-cookbook/].
  This would work in CircleCI and on users developing under Linux, which is 
better than nothing, and might be lower effort to introduce.

3) Byteman is also an option for a single fault injection, but in my experience 
it is really unpleasant to work with, and costly to maintain.  But this _might_ 
be the lowest effort.

We do also have prior form for modifying the real code to produce errors, but I 
have a preference to avoid this.

FWIW, I don't think any of these approaches are _strictly_ necessary for 
merging this patch, since none are trivial undertakings.  But one would 
certainly be preferable, and some would have wider positive impacts for the 
project.  They are just suggestions given your question.


was (Author: benedict):
[~molsson] very sorry for taking so long to look at this patch, or to respond.  
Thank you for producing the patch - I am looking forward to reviewing it soon.  
I'm afraid I still don't quite have time - I hope to get to it in the next 
couple of weeks.

The best way I can think of to break a local request, without modifying the 
underlying codebase, would be one of two possibilities:

1) Register a java-based FileSystemProvider that we can break, but this would 
also require changes to the codebase, to use URIs instead of paths for the 
filesystem interactions.  This might not be too onerous, or might be really 
awful to undertake.  But it would offer us really powerful control over testing 
of filesystem interactions.

2) Use a pre-existing outside-of-JVM approach.  ScyllaDB seem to have written a 
[fault-injecting filesystem for 
Linux|https://www.scylladb.com/2016/05/02/fault-injection-filesystem-cookbook/].
  This would work in CircleCI and on users developing under Linux, which is 
better than nothing, and might be lower effort to introduce.

> Local reads potentially blocking remote reads
> -
>
> Key: CASSANDRA-14983
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14983
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>Priority: Low
> Attachments: graph_local_read.html, graph_local_read_trunk.html, 
> local_read_trace.log
>
>
> Since CASSANDRA-4718 there is a fast path allowing local requests to continue 
> to [work in the same 
> thread|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/reads/AbstractReadExecutor.java#L157]
>  rather than being sent over to the read stage.
> Based on the comment
> {code:java}
> // We delay the local (potentially blocking) read till the end to avoid 
> stalling remote requests.
> {code}
> it seems like this should be performed last in the chain to avoid blocking 
> remote requests but that does not seem to be the case when the local request 
> is a data request. The digest request(s) are sent after the data requests are 
> sent (and now the transient replica requests as well). When the fast path is 
> used for local data/transient data requests this will block the next type of 
> request from being sent away until the local read is finished and add 
> additional latency to the request.
> In addition to this it seems like local requests are *always* data requests 
> (might not be a problem), but the log message can say either ["digest" or 
> "data"|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/reads/AbstractReadExecutor.java#L156]
>  as the type of request.
> I have tried to run performance measurements to see the impact of this in 3.0 
> (by moving local requests to the end of ARE#executeAsync()) but I haven't 
> seen any big 

[jira] [Comment Edited] (CASSANDRA-14983) Local reads potentially blocking remote reads

2019-04-04 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809691#comment-16809691
 ] 

Benedict edited comment on CASSANDRA-14983 at 4/4/19 10:07 AM:
---

[~molsson] very sorry for taking so long to look at this patch, or to respond.  
Thank you for producing the patch - I am looking forward to reviewing it soon.  
I'm afraid I still don't quite have time - I hope to get to it in the next 
couple of weeks.

The best way I can think of to break a local request, without modifying the 
underlying codebase, would be one of two possibilities:

1) Register a java-based FileSystemProvider that we can break, but this would 
also require changes to the codebase, to use URIs instead of paths for the 
filesystem interactions.  This might not be too onerous, or might be really 
awful to undertake.  But it would offer us really powerful control over testing 
of filesystem interactions.

2) Use a pre-existing outside-of-JVM approach.  ScyllaDB seem to have written a 
[fault-injecting filesystem for 
Linux|https://www.scylladb.com/2016/05/02/fault-injection-filesystem-cookbook/].
  This would work in CircleCI and on users developing under Linux, which is 
better than nothing, and might be lower effort to introduce.


was (Author: benedict):
[~molsson] very sorry for taking so long to look at this patch, or to respond.  
Thank you for producing the patch - I am looking forward to reviewing it soon.  
I'm afraid I still don't quite have time - I hope to get to it in the next 
couple of weeks.

The best way I can think of to break a local request, without modifying the 
underlying codebase, would be to either register a java-based 
FileSystemProvider that we can break, but this would also require changes to 
the codebase, to use URIs instead of paths for the filesystem interactions.  
This might not be too onerous, or might be really awful to undertake.  But it 
would offer us really powerful control over testing of filesystem interactions.

An alternative might be to simply use a pre-existing outside-of-JVM approach.  
ScyllaDB seem to have written a [fault-injecting filesystem for 
Linux|https://www.scylladb.com/2016/05/02/fault-injection-filesystem-cookbook/].
  This would work in CircleCI and on users developing under Linux, which is 
better than nothing, and might be lower effort to introduce.

> Local reads potentially blocking remote reads
> -
>
> Key: CASSANDRA-14983
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14983
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>Priority: Low
> Attachments: graph_local_read.html, graph_local_read_trunk.html, 
> local_read_trace.log
>
>
> Since CASSANDRA-4718 there is a fast path allowing local requests to continue 
> to [work in the same 
> thread|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/reads/AbstractReadExecutor.java#L157]
>  rather than being sent over to the read stage.
> Based on the comment
> {code:java}
> // We delay the local (potentially blocking) read till the end to avoid 
> stalling remote requests.
> {code}
> it seems like this should be performed last in the chain to avoid blocking 
> remote requests but that does not seem to be the case when the local request 
> is a data request. The digest request(s) are sent after the data requests are 
> sent (and now the transient replica requests as well). When the fast path is 
> used for local data/transient data requests this will block the next type of 
> request from being sent away until the local read is finished and add 
> additional latency to the request.
> In addition to this it seems like local requests are *always* data requests 
> (might not be a problem), but the log message can say either ["digest" or 
> "data"|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/reads/AbstractReadExecutor.java#L156]
>  as the type of request.
> I have tried to run performance measurements to see the impact of this in 3.0 
> (by moving local requests to the end of ARE#executeAsync()) but I haven't 
> seen any big difference yet. I'll continue to run some more tests to see if I 
> can find a use case affected by this.
> Attaching a trace (3.0) where this happens. Reproduction:
>  # Create a three node CCM cluster
>  # Provision data with stress (rf=3)
>  # In parallel:
>  ## Start stress read run
>  ## Run multiple manual read queries in cqlsh with tracing on and 
> local_quorum (as this does not always happen)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: 

[jira] [Comment Edited] (CASSANDRA-14983) Local reads potentially blocking remote reads

2019-02-20 Thread Marcus Olsson (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773237#comment-16773237
 ] 

Marcus Olsson edited comment on CASSANDRA-14983 at 2/20/19 6:26 PM:


I created CASSANDRA-15022 for the SEP executor and have uploaded some patches 
for 3.0, 3.11 and 4.0:
|[3.0|https://github.com/emolsson/cassandra/tree/3.0-CASSANDRA-14983]|[unit 
tests|https://circleci.com/gh/emolsson/cassandra/13]|
|[3.11|https://github.com/emolsson/cassandra/tree/3.11-CASSANDRA-14983]|[unit 
tests|https://circleci.com/gh/emolsson/cassandra/16]|
|[4.0|https://github.com/emolsson/cassandra/tree/4.0-CASSANDRA-14983]|[unit 
tests|https://circleci.com/gh/emolsson/cassandra/19]|

I seem to be having some issues running the tests on 4.0 though. When I tried 
to run the tests on a clean trunk branch locally I got similar issues as the 
ones in the build. But the 3.0 and 3.11 branches seems to pass the unit tests.

A note for the 4.0 patch is that all requests are still performed as full data 
requests. My understanding is that digest request are primarily used to avoid 
sending large amounts of data over the network. If we always perform data 
requests locally we might however keep multiple copies of the data in memory if 
we send data requests to remote nodes as well. If we want to change that 
behavior I think it should be in a separate JIRA ticket.

I also looked into the transient requests and I don't think anything would 
change from a logic perspective by executing the local request as a transient 
request as it is right now. Responses seems to be considered transient by 
checking the replica plan rather than the response. The 
ReadCommand#acceptsTransient()-method only seems to be used for validation 
before starting to execute the request and not affecting the actual request. 
Basically I don't think this is causing us any bugs right now. With that said I 
think the transient status should be kept even for local requests, at least 
from a correctness perspective. WDYT, should the local request be converted to 
a transient request when that is the case?

–

I also looked at the distributed tests but I'm not sure how to delay or lose 
the local requests in a good way without changing the actual code base. Do you 
know of a good place on the nodes read path to hook in to and delay it - I 
can't seem to find any?


was (Author: molsson):
I created CASSANDRA-15022 for the SEP executor and have uploaded some patches 
for 3.0, 3.11 and 4.0:

|[3.0|https://github.com/emolsson/cassandra/tree/3.0-CASSANDRA-14983]|[unit 
tests|https://circleci.com/gh/emolsson/cassandra/13]|
|[3.11|https://github.com/emolsson/cassandra/tree/3.11-CASSANDRA-14983]|[unit 
tests|https://circleci.com/gh/emolsson/cassandra/16]|
|[4.0|https://github.com/emolsson/cassandra/tree/4.0-CASSANDRA-14983]|[unit 
tests|https://circleci.com/gh/emolsson/cassandra/19]|

I seem to be having some issues running the tests on 4.0 though. When I tried 
to run the tests on a clean trunk branch locally I got similar issues as the 
ones in the build. But the 3.0 and 3.11 branches seems to pass the unit tests.

I also looked at the distributed tests but I'm not sure how to delay or lose 
the local requests in a good way without changing the actual code base.

> Local reads potentially blocking remote reads
> -
>
> Key: CASSANDRA-14983
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14983
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination
>Reporter: Marcus Olsson
>Priority: Minor
> Attachments: graph_local_read.html, graph_local_read_trunk.html, 
> local_read_trace.log
>
>
> Since CASSANDRA-4718 there is a fast path allowing local requests to continue 
> to [work in the same 
> thread|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/reads/AbstractReadExecutor.java#L157]
>  rather than being sent over to the read stage.
> Based on the comment
> {code:java}
> // We delay the local (potentially blocking) read till the end to avoid 
> stalling remote requests.
> {code}
> it seems like this should be performed last in the chain to avoid blocking 
> remote requests but that does not seem to be the case when the local request 
> is a data request. The digest request(s) are sent after the data requests are 
> sent (and now the transient replica requests as well). When the fast path is 
> used for local data/transient data requests this will block the next type of 
> request from being sent away until the local read is finished and add 
> additional latency to the request.
> In addition to this it seems like local requests are *always* data requests 
> (might not be a problem), but the log message can say either ["digest" or 
>