Robert Stupp created CASSANDRA-13034:
----------------------------------------
Summary: Move to FastThreadLocalThread and FastThreadLocal
Key: CASSANDRA-13034
URL: https://issues.apache.org/jira/browse/CASSANDRA-13034
Project: Cassandra
Issue Type: Bug
Reporter: Robert Stupp
Assignee: Robert Stupp
(Supersedes/includes CASSANDRA-13033 for 3.X & trunk)
We still use {{ThreadLocal}} in a couple of places, so I was curious how much
faster {{FastThreadLocal}} is compared to {{ThreadLocal}}. A microbenchmark shows
that {{FastThreadLocal}} has a runtime of ~2.7ns and {{ThreadLocal}} of ~4.7ns,
which makes {{ThreadLocal}} about 2ns slower (EDIT: subtracted baseline).
However, looking at the implementations, {{ThreadLocal}} appears to perform more
dependent pointer loads than {{FastThreadLocal}}. The CPU cache misses these
loads can cause are not reflected in the artificial benchmark below, so the
difference under a real workload may be larger.
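
To illustrate the point about dependent pointer loads, here is a minimal sketch of an index-based thread-local. It is not the actual {{FastThreadLocal}} code; {{SlotThread}}, {{SlotLocal}} and {{SlotLocalDemo}} are hypothetical names used only for this example. The idea it shows: each variable gets a fixed array index once, and a dedicated thread subclass carries the array, so a read is a field load plus an array load rather than a hash-table probe. Threads that are not of the dedicated type fall back to a plain {{ThreadLocal}}.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical thread subclass that carries the per-thread slot array.
final class SlotThread extends Thread {
    Object[] slots = new Object[8];

    SlotThread(Runnable task) {
        super(task);
    }
}

// Hypothetical index-based thread-local: one fixed array index per variable.
final class SlotLocal<T> {
    private static final AtomicInteger NEXT_INDEX = new AtomicInteger();

    private final int index = NEXT_INDEX.getAndIncrement();
    private final Supplier<T> initial;
    private final ThreadLocal<T> fallback; // used on threads that are not SlotThreads

    SlotLocal(Supplier<T> initial) {
        this.initial = initial;
        this.fallback = ThreadLocal.withInitial(initial);
    }

    @SuppressWarnings("unchecked")
    T get() {
        Thread current = Thread.currentThread();
        if (!(current instanceof SlotThread)) {
            return fallback.get(); // slow path: regular ThreadLocal lookup
        }
        SlotThread thread = (SlotThread) current;
        Object[] slots = thread.slots;
        if (index >= slots.length) {
            // Grow the array; only the owning thread ever touches it.
            slots = Arrays.copyOf(slots, Math.max(index + 1, slots.length * 2));
            thread.slots = slots;
        }
        Object value = slots[index];
        if (value == null) {
            // Simplification: null doubles as the "unset" marker, so a supplier
            // that returns null is invoked again on every call.
            value = initial.get();
            slots[index] = value;
        }
        return (T) value;
    }
}

final class SlotLocalDemo {
    static final SlotLocal<StringBuilder> BUFFER = new SlotLocal<>(StringBuilder::new);

    // True if each thread sees a stable instance that differs from the other thread's.
    static boolean isolatedPerThread() {
        StringBuilder[] seen = new StringBuilder[2];
        boolean[] stable = new boolean[2];
        Thread[] threads = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int id = i;
            threads[i] = new SlotThread(() -> {
                seen[id] = BUFFER.get();
                stable[id] = seen[id] == BUFFER.get();
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return stable[0] && stable[1] && seen[0] != seen[1];
    }

    public static void main(String[] args) {
        System.out.println(isolatedPerThread());
    }
}
```

The fast path only applies on the dedicated thread type, which is presumably why the issue title names the thread class alongside the thread-local class.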
The patch migrates all {{Thread}} instances (except a few in tests) to
{{FastThreadLocalThread}} and all {{ThreadLocal}} instances to {{FastThreadLocal}}.
{code:title=FastThreadLocalBench with 4 threads on 4 core CPU}
[java] FastThreadLocalBench.baseline           2  avgt  5  3.023 ± 0.081  ns/op
[java] FastThreadLocalBench.fastThreadLocal    2  avgt  5  5.610 ± 0.154  ns/op
[java] FastThreadLocalBench.fastThreadLocal    4  avgt  5  5.653 ± 0.042  ns/op
[java] FastThreadLocalBench.fastThreadLocal    8  avgt  5  5.763 ± 0.588  ns/op
[java] FastThreadLocalBench.fastThreadLocal   12  avgt  5  5.673 ± 0.117  ns/op
[java] FastThreadLocalBench.threadLocal        2  avgt  5  7.708 ± 0.723  ns/op
[java] FastThreadLocalBench.threadLocal        4  avgt  5  7.604 ± 0.059  ns/op
[java] FastThreadLocalBench.threadLocal        8  avgt  5  7.629 ± 0.080  ns/op
[java] FastThreadLocalBench.threadLocal       12  avgt  5  7.858 ± 0.483  ns/op
{code}
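
For reference, the baseline subtraction mentioned in the description can be reproduced from the rows above. The sketch below uses the scores for parameter value 2 only; {{BaselineAdjust}} is a hypothetical helper name, and the constants are copied from the benchmark output. The net results come out at roughly 2.6 ns/op and 4.7 ns/op, a gap of about 2 ns.

```java
// Hypothetical helper; scores (ns/op) are copied from the benchmark output above.
final class BaselineAdjust {
    static final double BASELINE = 3.023;
    static final double FAST_THREAD_LOCAL = 5.610;
    static final double THREAD_LOCAL = 7.708;

    // Net cost of an access once the benchmark's baseline overhead is removed.
    static double net(double rawScore) {
        return rawScore - BASELINE;
    }

    public static void main(String[] args) {
        double fast = net(FAST_THREAD_LOCAL);
        double plain = net(THREAD_LOCAL);
        System.out.printf("fastThreadLocal net: %.3f ns/op%n", fast);
        System.out.printf("threadLocal net:     %.3f ns/op%n", plain);
        System.out.printf("difference:          %.3f ns/op%n", plain - fast);
    }
}
```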