Robert Stupp created CASSANDRA-13034:
----------------------------------------

             Summary: Move to FastThreadLocalThread and FastThreadLocal
                 Key: CASSANDRA-13034
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13034
             Project: Cassandra
          Issue Type: Bug
            Reporter: Robert Stupp
            Assignee: Robert Stupp

(Supersedes/includes CASSANDRA-13033 for 3.X & trunk)

We still use {{ThreadLocal}} in a couple of places, so I was curious how much faster {{FastThreadLocal}} is compared to {{ThreadLocal}}.

A microbenchmark shows that {{FastThreadLocal}} has a runtime of ~2.7ns and {{ThreadLocal}} of ~4.7ns, so {{ThreadLocal}} is about 2ns slower (EDIT: subtracted baseline). However, looking at the implementations, {{ThreadLocal}} seems to have more dependent pointer loads than {{FastThreadLocal}}. The resulting cost (CPU cache misses) is not reflected in the artificial benchmark below.

The patch migrates all {{Thread}} instances (except a few in tests) to {{FastThreadLocalThread}} and all {{ThreadLocal}} instances to {{FastThreadLocal}}.

{code:title=FastThreadLocalBench with 4 threads on 4 core CPU}
     [java] FastThreadLocalBench.baseline            2  avgt    5  3.023 ± 0.081  ns/op
     [java] FastThreadLocalBench.fastThreadLocal     2  avgt    5  5.610 ± 0.154  ns/op
     [java] FastThreadLocalBench.fastThreadLocal     4  avgt    5  5.653 ± 0.042  ns/op
     [java] FastThreadLocalBench.fastThreadLocal     8  avgt    5  5.763 ± 0.588  ns/op
     [java] FastThreadLocalBench.fastThreadLocal    12  avgt    5  5.673 ± 0.117  ns/op
     [java] FastThreadLocalBench.threadLocal         2  avgt    5  7.708 ± 0.723  ns/op
     [java] FastThreadLocalBench.threadLocal         4  avgt    5  7.604 ± 0.059  ns/op
     [java] FastThreadLocalBench.threadLocal         8  avgt    5  7.629 ± 0.080  ns/op
     [java] FastThreadLocalBench.threadLocal        12  avgt    5  7.858 ± 0.483  ns/op
{code}
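
For reference, the "subtracted baseline" figures quoted in the description can be reproduced from the raw scores in the benchmark output. The sketch below is only illustrative arithmetic: the class name {{BaselineAdjust}} and its helper methods are hypothetical and not part of the patch, and it uses the rows with parameter value 2 as a representative sample.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of the patch: subtracts the benchmark
// baseline from the raw scores listed in the FastThreadLocalBench output.
public class BaselineAdjust {
    // FastThreadLocalBench.baseline score from the issue, in ns/op.
    static final double BASELINE_NS = 3.023;

    /** Net cost of one operation after removing the measurement overhead. */
    static double net(double rawNs, double baselineNs) {
        return rawNs - baselineNs;
    }

    /** How much slower (in ns/op) the second raw score is than the first. */
    static double gap(double fastRawNs, double slowRawNs) {
        // The baseline cancels out, so the gap is the same before and after subtraction.
        return slowRawNs - fastRawNs;
    }

    public static void main(String[] args) {
        // Raw scores copied from the rows with parameter value 2.
        Map<String, Double> raw = new LinkedHashMap<>();
        raw.put("fastThreadLocal", 5.610);
        raw.put("threadLocal", 7.708);

        for (Map.Entry<String, Double> e : raw.entrySet()) {
            System.out.printf(Locale.ROOT, "%-16s net %.3f ns/op%n",
                    e.getKey(), net(e.getValue(), BASELINE_NS));
        }
        System.out.printf(Locale.ROOT, "gap              %.3f ns/op%n",
                gap(raw.get("fastThreadLocal"), raw.get("threadLocal")));
    }
}
```

The net values come out close to the ~2.7ns and ~4.7ns mentioned in the description, and the gap is roughly 2ns regardless of whether the baseline is subtracted.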