[
https://issues.apache.org/jira/browse/CASSANDRA-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14803268#comment-14803268
]
Ariel Weisberg commented on CASSANDRA-7392:
-------------------------------------------
bq. The last two lines of the first paragraph:
Ah right. It's true that, according to the docs (and maybe the JMM, I'm not
sure it addresses lazySet), it only eventually sets the value. To guarantee the
value has already been set (is globally visible) you need some other operation
that blocks flushing the store buffers (which is an incomplete,
implementation-specific way of describing the type of barrier). Even then, that
doesn't guarantee it happens faster (the JMM/docs make no such guarantee); it
just makes guarantees about the ordering of events.
The docs are pretty scary-looking because they are definitely reserving the
right to delay the heck out of things, the same way the JMM says that stores
aren't visible until threads synchronize on the same monitor or volatile field.
I could concede that, according to the JMM/docs, it could be bad to rely on
lazySet for timely propagation. The JMM is conservatively specified to provide
a minimum of guarantees until stronger ones are asked for explicitly
(synchronized, volatile), and that leaves what actually happens open to how far
the compiler can reorder things (usually not very far) and how much the CPU can
buffer (not much, and not for long). I think that is fine for an approximation
like timeouts.
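To illustrate what I mean (a minimal sketch, not code from the branch; the
class and field names are made up), this is the kind of usage where lazySet is
fine for timeouts: the writer gets an ordered store that is eventually visible,
and the reader tolerates a slightly stale value.
{code:java}
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch (hypothetical names): a heartbeat published with lazySet.
// lazySet only guarantees the store isn't reordered with earlier writes;
// it does not guarantee immediate visibility to the reader, which is fine
// for an approximate timeout check.
public class LazySetTimeoutSketch
{
    private static final AtomicLong lastProgressMillis = new AtomicLong(System.currentTimeMillis());

    static void onProgress()
    {
        // ordered store: cheaper than a volatile write, eventually visible
        lastProgressMillis.lazySet(System.currentTimeMillis());
    }

    static boolean timedOut(long timeoutMillis)
    {
        // may read a slightly stale value; at worst the timeout fires a
        // little late, which is acceptable for an approximation
        return System.currentTimeMillis() - lastProgressMillis.get() > timeoutMillis;
    }
}
{code}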
bq. Intel Cache coherence protocol (MESI/MESIF)
I think this is another one of those in-practice things. In practice, all
shared-memory CPUs we have to care about are cache coherent and do something
like MESI, where you can't read from an invalidated cache line (changes to
cache lines propagate immediately). In practice (there is that word again), the
bookkeeping needed to make use of invalidated cache lines in a shared-memory
system is probably daunting, and that's why it isn't done. Your program would
have to be explicit about which loads are safe to do against an invalidated
cache line, or it would have to be inferred somehow from the memory model of
the language.
And that is basically how I arrive at a certain set of assumptions about what
"eventually" means in lazySet. Also, at some point Martin Thompson mentioned
talking to an Intel engineer who said that store buffers always drain as fast
as they can. It's probably buried in the Mechanical Sympathy Google group or
his blog.
* [This comment seems out of
place?|https://github.com/apache/cassandra/compare/trunk...stef1927:7392-3.0#diff-dbafe458cfd36b99995c24a11de1864eR34]
* [What is this change to Slices? Is it fixing an NPE? I am guessing you found
a few things making heavier use of
toCQLString.|https://github.com/apache/cassandra/compare/trunk...stef1927:7392-3.0#diff-f6989cedf51b3a01860712cdd32d3a1aR741]
* [This is random now so the comment is out of
date|https://github.com/apache/cassandra/compare/trunk...stef1927:7392-3.0#diff-e06002c30313f8ead63ee472617d1b10R66]
* If only a fixed number of timed-out queries are reported, how about only
storing references to the first N (see the first sketch after this list)?
Allocating an ArrayList per query doesn't make much sense if aggregation
doesn't really work yet, although it is necessary for counting identical
queries.
* [Check if debug is enabled before formatting the string and
logging?|https://github.com/apache/cassandra/compare/trunk...stef1927:7392-3.0#diff-e06002c30313f8ead63ee472617d1b10R147]
(see the second sketch after this list)
* [This is set to
30|https://github.com/apache/cassandra/compare/trunk...stef1927:7392-3.0#diff-e06002c30313f8ead63ee472617d1b10R147]
* I think the utests would be closer to trunk if you rebased. The dtests will
get a lot worse though.
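On the point about only storing the first N, something like this rough sketch
is what I have in mind (the class, field, and method names are hypothetical,
not from the branch):
{code:java}
import java.util.ArrayList;
import java.util.List;

// Rough sketch with made-up names: keep references to at most maxReported
// timed-out queries and just count the rest, rather than allocating a list
// per query.
public class TimedOutQueries
{
    private final int maxReported;
    private final List<String> firstQueries;
    private long totalTimedOut;

    public TimedOutQueries(int maxReported)
    {
        this.maxReported = maxReported;
        this.firstQueries = new ArrayList<>(maxReported);
    }

    public synchronized void record(String cql)
    {
        totalTimedOut++;
        if (firstQueries.size() < maxReported)
            firstQueries.add(cql);
    }

    public synchronized String report()
    {
        return totalTimedOut + " queries timed out, first " + firstQueries.size() + ": " + firstQueries;
    }
}
{code}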
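And for the debug logging point, the usual guard (or slf4j parameterized
logging) avoids building the message when debug is off. Again just a sketch;
the class and method names are made up:
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class SlowQueryLoggingSketch
{
    private static final Logger logger = LoggerFactory.getLogger(SlowQueryLoggingSketch.class);

    static void logSlowQuery(String cql, long elapsedMillis)
    {
        // only build the message when debug is enabled; parameterized
        // logging ("{} took {} ms") achieves the same effect
        if (logger.isDebugEnabled())
            logger.debug(cql + " took " + elapsedMillis + " ms");
    }
}
{code}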
> Abort in-progress queries that time out
> ---------------------------------------
>
> Key: CASSANDRA-7392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7392
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Stefania
> Priority: Critical
> Fix For: 3.x
>
>
> Currently we drop queries that time out before we get to them (because node
> is overloaded) but not queries that time out while being processed.
> (Particularly common for index queries on data that shouldn't be indexed.)
> Adding the latter and logging when we have to interrupt one gets us a poor
> man's "slow query log" for free.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)