[jira] [Commented] (CASSANDRA-15066) Improvements to Internode Messaging

Benedict (JIRA) Tue, 11 Jun 2019 12:09:15 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-15066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16861370#comment-16861370
 ]


Benedict commented on CASSANDRA-15066:
--------------------------------------

Thanks for even more review feedback!  I've pushed some modifications that you 
can take a look at, addressing most of your concerns.  Some comments / 
questions:

bq. in releaseSpace i’d leave a comment why thread is unparked but waiting is 
not set to null (parkUntilFlushed releases it)
Rather than this, I have made absolutely explicit in the class description, and 
on each variable, that every variable is updated by only one thread, and which 
thread this is.  Does this work for you?

bq. in resumeEpochSampling and resumeNowSampling we can use 
scheduleWithFixedDelay(task, 0, … ) instead of running a task upfront
I don't think so, because we want to be certain that after the method is 
invoked (particularly in the constructor) the relevant value has been updated - 
particularly for {{now}} and for the very first translation call

bq. looks like monotonic clock implementation might race. each operation is 
synchronized but their combination isn’t:
Good catch. This method should probably have been marked 
{{@VisisbleForTesting}}, but you're right that it could have been misused in 
future.

bq. In OutboundConnection, there seems to be a race condition between between 
promiseToExecuteLater(); requestConnect().addListener(f -> executeAgain()); and 
maybeExecuteAgain();. 
Really excellent catch, but I think this is much simpler than you think - it's 
a (pretty serious) typo / mistake in {{executeAgain}}, which has its condition 
inverted.  I'm a bit distracted today, but I'm pretty sure this fixes it (and I 
think it's caused because the condition is inverted from the outer condition, 
so I've inverted the inner condition to match, instead of the order of ternary 
operands)

bq. Short question: How is the approximate clock implementation with sampling 
better than just using a regular clock? 

Just to absolutely clarify, this feature existed already, I have just packaged 
it to permit disabling its use as others have also raised questions about its 
purpose.  The only point is to reduce the cost of the very frequent clock 
accesses we perform.  Ideally we would probably replace it with direct use of 
RDTSC.  Either way, this simply means that this patch's usage of clocks is 
cleanly separated into those we require to be precise, and those we permit to 
be approximate, and lets us override the implementation.  We can use 
{{System.nanoTime}} in all cases, and in fact see if there is even any 
measurable impact.

bq. Is the intention to normalize clock calls to yield the epoch timestamp or 
the intention is to improve performance by doing so periodically?
bq. Should error in approximate time be an absolute value? Especially since we 
seem to compare two error values later. However, it seems it has to be the case 
anyways

Could you expand a little on these questions?

bq. Also, what are we going to do with all the TODOs?.. Should we create 
follow-up tickets for them?

I've been periodically auditing them, and yes I will file follow-up Jira for 
any that seem to warrant it (in general I think it's OK to leave some TODOs in 
place without Jira, for the next maintainer)

Thanks again for this and all of your other feedback over the past few months!

> Improvements to Internode Messaging
> -----------------------------------
>
>                 Key: CASSANDRA-15066
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15066
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Messaging/Internode
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: High
>             Fix For: 4.0
>
>         Attachments: 20k_backfill.png, 60k_RPS.png, 
> 60k_RPS_CPU_bottleneck.png, backfill_cass_perf_ft_msg_tst.svg, 
> baseline_patch_vs_30x.png, increasing_reads_latency.png, 
> many_reads_cass_perf_ft_msg_tst.svg
>
>
> CASSANDRA-8457 introduced asynchronous networking to internode messaging, but 
> there have been several follow-up endeavours to improve some semantic issues. 
>  CASSANDRA-14503 and CASSANDRA-13630 are the latest such efforts, and were 
> combined some months ago into a single overarching refactor of the original 
> work, to address some of the issues that have been discovered.  Given the 
> criticality of this work to the project, we wanted to bring some more eyes to 
> bear to ensure the release goes ahead smoothly.  In doing so, we uncovered a 
> number of issues with messaging, some of which long standing, that we felt 
> needed to be addressed.  This patch widens the scope of CASSANDRA-14503 and 
> CASSANDRA-13630 in an effort to close the book on the messaging service, at 
> least for the foreseeable future.
> The patch includes a number of clarifying refactors that touch outside of the 
> {{net.async}} package, and a number of semantic changes to the {{net.async}} 
> packages itself.  We believe it clarifies the intent and behaviour of the 
> code while improving system stability, which we will outline in comments 
> below.
> https://github.com/belliottsmith/cassandra/tree/messaging-improvements



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (CASSANDRA-15066) Improvements to Internode Messaging

Reply via email to