[jira] [Commented] (CASSANDRA-7392) Abort in-progress queries that time out

Stefania (JIRA) Mon, 28 Sep 2015 21:03:22 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934590#comment-14934590
 ]


Stefania commented on CASSANDRA-7392:
-------------------------------------

bq. Use a dedicated thread to update the timestamp so it isn't impacted by 
other activities

bq. I was going to suggest using the thread used by 
NanoTimeToCurrentTimeMillis, so make it an SES and schedule the work there. 
However I'm not even sure why that activity deserved its own thread. I think 
there was nothing available in some version of C*, but now it could just use 
ScheduledExecutors. So maybe just a dedicated thread for updating 
ApproximateTime. I believe approximate time will find more traction over time 
so it should be reasonably accurate when possible.

I've introduced a new periodic SES for fast jobs (sub-microsecond) and moved 
{{ApproximateTime}} and {{NanoTimeToCurrentTimeMillis}} to it.


bq. I think the timestamp field in ApproximateTime needs to be volatile.

OK
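
For illustration only, here is a minimal sketch of the idea discussed above: a cached timestamp refreshed by a dedicated scheduled thread and published through a volatile field. The class name, thread name, and the 10 ms refresh interval are assumptions made for the example; they are not taken from the patch.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; the class, field, and thread names are assumptions,
// not the ones used in the patch.
final class ApproximateClockSketch
{
    // Assumed refresh interval; the patch may use a different precision.
    static final long PRECISION_MILLIS = 10;

    // volatile: written by the updater thread, read by any other thread.
    private static volatile long almostNow = System.currentTimeMillis();

    // Dedicated single-threaded scheduler, so the refresh is not delayed by
    // unrelated tasks. The thread is a daemon so it never blocks JVM exit.
    private static final ScheduledExecutorService UPDATER =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "approximate-clock-sketch");
            t.setDaemon(true);
            return t;
        });

    static
    {
        UPDATER.scheduleAtFixedRate(() -> almostNow = System.currentTimeMillis(),
                                    PRECISION_MILLIS, PRECISION_MILLIS, TimeUnit.MILLISECONDS);
    }

    private ApproximateClockSketch() {}

    // Cheap read of the cached time; may lag the real clock by roughly one interval.
    static long currentTimeMillis()
    {
        return almostNow;
    }
}
```

Without volatile, a reader thread would have no guarantee of ever observing the updater's writes.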

bq. Several properties don't have the "cassandra." prefix

Thanks, I accidentally dropped them during the refactoring.

bq. By polling the queue when not reporting you are increasing the bound on the 
number of retained failures and resources pinned by this reporting since 
aggregation doesn't really aggregate yet. I would just drain the queue when 
logging.

OK

bq. I think you want a count of operations that were truncated instead of a 
boolean so you can log the count.

OK

bq. Offering into the queue returns a boolean and doesn't throw, which style 
wise seems a little nicer, but that is bike shedding.

OK
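
As a rough sketch of the two points above (a non-throwing offer() and a count of dropped operations rather than a boolean), something like the following would work. All names here are hypothetical and the bounded queue type is an assumption; the patch may be structured differently.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; names and the choice of queue are assumptions.
final class FailureQueueSketch
{
    private final BlockingQueue<String> failures;

    // Number of operations that did not fit in the queue since the last report.
    private final AtomicLong dropped = new AtomicLong();

    FailureQueueSketch(int capacity)
    {
        failures = new ArrayBlockingQueue<>(capacity);
    }

    // offer() returns false when the queue is full instead of throwing,
    // so a full queue is handled as a normal branch rather than an exception.
    void report(String operation)
    {
        if (!failures.offer(operation))
            dropped.incrementAndGet();
    }

    // Read and reset the counter so each log message covers one interval.
    long getAndResetDropped()
    {
        return dropped.getAndSet(0);
    }

    int pending()
    {
        return failures.size();
    }
}
```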

bq. More bike shedding, when aggregating I would just allocate the map each 
time rather than clear it.

This is done now: since we only drain when reporting, a new map is created only 
during reporting.

bq. I think you should sync logging to the debug log and logging info level to 
the regular log. Then in the regular log print a count of how many operations 
timed out since the last time you logged. That way it is easy to map between 
the two when looking at timestamps.

I've added the number of operations and the interval, and made the two messages 
partially identical. Is this what you meant by "sync"? 
Bear in mind, however, that the no-spam logger will only log once every 15 
minutes.

bq. I don't think this is a correct average calculation. You want a sum and a 
count. It didn't work for the simple example I did by hand.

Done.

bq. More bike shedding, you can implement min and max as "oldValue = 
Math.min(oldValue, nextMeasurement)".

OK
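
A minimal sketch of the aggregation suggested in the last two points, keeping a sum and a count for the average and using Math.min / Math.max for the extremes. The class is hypothetical and not thread-safe; it assumes a single reporting thread does the aggregation.

```java
// Hypothetical sketch; not the aggregation class from the patch.
// Not thread-safe: assumes a single reporting thread.
final class TimingStatsSketch
{
    private long count;
    private long sum;
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    void add(long measurement)
    {
        count++;
        sum += measurement;
        min = Math.min(min, measurement);
        max = Math.max(max, measurement);
    }

    // Average derived from sum and count, computed only when it is needed.
    // Integer division; returns 0 when nothing has been recorded.
    long average()
    {
        return count == 0 ? 0 : sum / count;
    }

    long min() { return min; }
    long max() { return max; }
    long count() { return count; }
}
```

Keeping the sum and count separately avoids the error that creeps in when an average is updated incrementally from a previous average.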

bq. Can you humor me and for Monitorable boolean checks rename to isXYZ and for 
things that might change it leave as is?

Sure, done.

bq. I think failedAt is unused now?

No, we still need it when adding a timeout to the same failed operation.

bq. If we use approximate time for timeouts can we also use it for setting the 
construction time?

I believe we can; however, this changes existing functionality, since the 
construction time is used by the existing logging of all dropped messages.

bq. More bike shedding. The idiom for polling a thread-safe queue is to skip 
isEmpty() and call poll(), checking for null, to avoid extra lock acquisitions 
on the queue (assuming the queue takes a lock). Some queues do have cheap(er) 
isEmpty() calls.

OK
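
The polling idiom above, combined with allocating a fresh map on each drain, could look roughly like this. It is a sketch with hypothetical names, using operation names as plain strings for simplicity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch; not the reporting code from the patch.
final class DrainSketch
{
    private DrainSketch() {}

    // Drains the queue and counts occurrences per operation name.
    // poll() returns null once the queue is empty, so no isEmpty() call is needed.
    static Map<String, Integer> drainAndCount(Queue<String> queue)
    {
        // A new map per drain, rather than clearing a shared one.
        Map<String, Integer> counts = new HashMap<>();
        String operation;
        while ((operation = queue.poll()) != null)
            counts.merge(operation, 1, Integer::sum);
        return counts;
    }
}
```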

> Abort in-progress queries that time out
> ---------------------------------------
>
>                 Key: CASSANDRA-7392
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7392
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Stefania
>            Priority: Critical
>             Fix For: 3.x
>
>
> Currently we drop queries that time out before we get to them (because node 
> is overloaded) but not queries that time out while being processed.  
> (Particularly common for index queries on data that shouldn't be indexed.)  
> Adding the latter and logging when we have to interrupt one gets us a poor 
> man's "slow query log" for free.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
