On 12/06/2012 09:38 PM, Peter Levart wrote:
On 12/06/2012 08:08 PM, Remi Forax wrote:
On 12/06/2012 08:01 PM, Peter Levart wrote:
There's a quick trick that guarantees inlining of get/set/remove:
public static class FastThreadLocal<T> extends ThreadLocal<T> {
@Override
public final T get() { return super.get(); }
@Override
public final void set(T value) { super.set(value); }
@Override
public final void remove() { super.remove(); }
}
....just use static type FastThreadLocal everywhere in code.
I tried it and it works.
No, there is no way to have such a guarantee. Here it works either
because the only ThreadLocal class you load is FastThreadLocal, or
because the VM has profiled the call site and seen that you only use
FastThreadLocal for that specific instruction.
Nothing is certain but death and taxes, I agree.
But think deeper, Remi!
I try, I try; I have trouble understanding what you mean.
How do you explain the following test:
public class ThreadLocalTest {
static class Int { int value; }
static class TL0 extends ThreadLocal<Int> {}
static class TL1 extends ThreadLocal<Int> { public Int get() { return super.get(); } }
static class TL2 extends ThreadLocal<Int> { public Int get() { return super.get(); } }
static class TL3 extends ThreadLocal<Int> { public Int get() { return super.get(); } }
static class TL4 extends ThreadLocal<Int> { public Int get() { return super.get(); } }
static long doTest(ThreadLocal<Int> tl) {
long t0 = System.nanoTime();
for (int i = 0; i < 100000000; i++)
tl.get().value++; // callsite 1
return System.nanoTime() - t0;
}
Here, tl.get() (callsite 1) is called with TL0, TL1, TL2, TL3 and TL4,
so tl.get() is a polymorphic call.
static long doTest(FastThreadLocal<Int> tl) {
long t0 = System.nanoTime();
for (int i = 0; i < 100000000; i++)
tl.get().value++; // callsite 2
return System.nanoTime() - t0;
}
If TL0 is defined as a FastThreadLocal, this doTest is called, and here
tl.get() (callsite 2) is only ever called with a FastThreadLocal, so the
call is inlined.
static long test0(ThreadLocal<Int> tl) {
if (tl instanceof FastThreadLocal)
return doTest((FastThreadLocal<Int>)tl);
else
return doTest(tl);
}
static void test(ThreadLocal<Int> tl) {
tl.set(new Int());
System.out.print(tl.getClass().getName() + ":");
for (int i = 0; i < 8; i++)
System.out.print(" " + test0(tl));
System.out.println();
}
public static void main(String[] args) {
TL0 tl0 = new TL0();
test(tl0);
test(new TL1());
test(new TL2());
test(new TL3());
test(new TL4());
test(tl0);
}
}
Which prints the following (demonstrating almost 2x slowdown of TL0 -
last line compared to first):
test.ThreadLocalTest$TL0: 342716421 326105315 300744544 300654890
300726346 300752009 300700781 300735651
test.ThreadLocalTest$TL1: 321424139 312128166 312173383 312125203
312142144 312150949 316760957 313393554
test.ThreadLocalTest$TL2: 525661886 524169413 524184405 524215685
524162050 524400364 524174966 454370228
test.ThreadLocalTest$TL3: 472042229 471071328 464387909 468047355
464795171 464466481 464449567 464365974
test.ThreadLocalTest$TL4: 459651686 454142365 454129481 454180718
454217277 454109611 454119988 456978405
test.ThreadLocalTest$TL0: 582252322 582773455 582612509 582753610
582626360 582852195 582805654 582598285
Now with a simple change of:
static class TL0 extends FastThreadLocal<Int> {}
...the same test prints:
test.ThreadLocalTest$TL0: 330722181 325823711 301171182 309992192
321868979 308111417 303806979 300612033
test.ThreadLocalTest$TL1: 330263857 326448062 300607081 300575641
307442821 300616794 300548457 303462898
test.ThreadLocalTest$TL2: 319627165 311309477 311465815 311279612
311294427 311315803 311470291 311293823
test.ThreadLocalTest$TL3: 526849874 524209792 524421574 524166747
524396011 524163313 524395641 524165429
test.ThreadLocalTest$TL4: 464963126 455172216 455466304 455245487
455368318 455093735 455125038 455317375
test.ThreadLocalTest$TL0: 300472239 300695398 300480230 303459397
300451419 300679904 300445717 300451166
And that's very repeatable! Try it for yourself (on JDK8 of course).
Regards, Peter
So to summarize, the tl.get() in doTest(ThreadLocal) is polymorphic and
the one in doTest(FastThreadLocal) is monomorphic, thus inlinable.
It has nothing to do with the fact that FastThreadLocal overrides get,
set and remove.
You can comment out the methods inside FastThreadLocal to see that it
changes nothing.
regards,
Rémi
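
The experiment suggested above (removing the overrides from
FastThreadLocal) can be sketched as a reduced variant of the test. This
is only an illustration under stated assumptions: the names
CallSiteSketch, Counter, EmptyFastThreadLocal, runShared, runDedicated
and dispatch are hypothetical and do not appear in the original test,
and the iteration count is arbitrary. No particular timings are claimed;
they depend on the JVM and the machine.

```java
public class CallSiteSketch {
    // Mutable holder, mirroring the Int class in the original test.
    static final class Counter { int value; }

    // Hypothetical stand-in for FastThreadLocal with NO overrides at all.
    static class EmptyFastThreadLocal<T> extends ThreadLocal<T> {}

    static class TL0 extends EmptyFastThreadLocal<Counter> {}
    static class TL1 extends ThreadLocal<Counter> {
        @Override public Counter get() { return super.get(); }
    }
    static class TL2 extends ThreadLocal<Counter> {
        @Override public Counter get() { return super.get(); }
    }

    // Shared call site: reached with every receiver that is not an
    // EmptyFastThreadLocal, so its type profile sees several classes.
    static int runShared(ThreadLocal<Counter> tl, int iterations) {
        for (int i = 0; i < iterations; i++)
            tl.get().value++;
        return tl.get().value;
    }

    // Dedicated call site: only ever reached with EmptyFastThreadLocal
    // instances, so its type profile is not mixed with TL1/TL2.
    static int runDedicated(EmptyFastThreadLocal<Counter> tl, int iterations) {
        for (int i = 0; i < iterations; i++)
            tl.get().value++;
        return tl.get().value;
    }

    // Same routing idea as test0 in the original test.
    static int dispatch(ThreadLocal<Counter> tl, int iterations) {
        if (tl instanceof EmptyFastThreadLocal)
            return runDedicated((EmptyFastThreadLocal<Counter>) tl, iterations);
        return runShared(tl, iterations);
    }

    static void time(ThreadLocal<Counter> tl, int iterations) {
        tl.set(new Counter());
        long t0 = System.nanoTime();
        dispatch(tl, iterations);
        System.out.println(tl.getClass().getSimpleName() + ": "
                + (System.nanoTime() - t0) + " ns");
    }

    public static void main(String[] args) {
        int n = 10_000_000; // arbitrary, smaller than in the original test
        TL0 tl0 = new TL0();
        time(tl0, n);
        time(new TL1(), n);
        time(new TL2(), n);
        time(tl0, n); // compare with the first line
    }
}
```

If the explanation above holds, the last TL0 line should stay close to
the first one even though EmptyFastThreadLocal overrides nothing. On
HotSpot, running with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining
is one way to check which calls the JIT actually inlined.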