On 17 December 2013 07:53, Michael Hudson-Doyle
<michael.hud...@linaro.org> wrote:
> Michael Hudson-Doyle <michael.hud...@linaro.org> writes:
>
>> Michael Hudson-Doyle <michael.hud...@linaro.org> writes:
>>
>>> Will Newton <will.new...@linaro.org> writes:
>>>
>>>> On 16 December 2013 03:36, Michael Hudson-Doyle
>>>> <michael.hud...@linaro.org> wrote:
>>>>> Michael Hudson-Doyle <michael.hud...@linaro.org> writes:
>>>>>
>>>>>> Aaah, you might be onto something there.  I built myself a cross gcc-4.8
>>>>>> today and it appeared to compile things correctly (I didn't actually get
>>>>>> to run it, but the objdump poking looked right) and I got a bit worried
>>>>>> that this was all down to some cosmic ray / corruption when I first
>>>>>> compiled it.  But, the scripts I cargo culted just use compile binutils
>>>>>> from git tip, so if the bug is in binutils...
>>>>>
>>>>> So I still don't know what's going on, exactly, but I have a debug build
>>>>> of binutils now and some clues.  It still only happens on real hardware,
>>>>> not cross compiling on my laptop, but I think I have an idea as to why.
>>>>> This might be complete crack, but anyway.
>>>>>
>>>>> I think it's to do with the order of things within the GOT.
>>>>>
>>>>> When I cross compile, sort the relocations by address, then count up the
>>>>> number of relocations of each type, it looks like this:
>>>>>
>>>>> $ objdump -C -R build/linux2/*/mongo/base/counter_test | LC_ALL=C sort | 
>>>>> cut -d' ' -f 2 | uniq -c
>>>>>       4
>>>>>     496 R_AARCH64_GLOB_DAT
>>>>>       1 R_AARCH64_TLS_TPREL64
>>>>>     103 R_AARCH64_GLOB_DAT
>>>>>     305 R_AARCH64_JUMP_SLOT
>>>>>      12 R_AARCH64_COPY
>>>>>       1 RELOCATION
>>>>>       2
>>>>>
>>>>> In this case, the code and the relocation agree on where the thread
>>>>> local variable is.
>>>>>
>>>>> When I compile natively, it looks like this:
>>>>>
>>>>> (t-mwhudson)ubuntu@arm64:~/src/mongo$ objdump -C -R 
>>>>> build/linux2/*/mongo/base/counter_test | LC_ALL=C sort | cut -d' ' -f 2 | 
>>>>> uniq -c
>>>>>       4
>>>>>     295 R_AARCH64_JUMP_SLOT
>>>>>     496 R_AARCH64_GLOB_DAT
>>>>>       1 R_AARCH64_TLS_TPREL64
>>>>>     104 R_AARCH64_GLOB_DAT
>>>>>      12 R_AARCH64_COPY
>>>>>       1 RELOCATION
>>>>>       2
>>>>>
>>>>> And the code and the relocation disagree on where the thread local
>>>>> variable is -- by 298 * sizeof(void*).  Which is almost (but I admit,
>>>>> not exactly) the number of JUMP_SLOTs that are, in this case, before the
>>>>> TLS variable in the GOT.  When I compiled in a different way, there were
>>>>> only 160 JUMP_SLOTs before the TLS reloc, and the code and relocation
>>>>> disagreed by 163 slots.
>>>>>
>>>>> So is it possible somehow that the GOT has these JUMP_SLOTs inserted
>>>>> into it after the relocation for the TLS has been written out?  I don't
>>>>> really see how but maybe this rings a bell...
>>>>
>>>> Indeed it does. ;-)
>>>>
>>>> A similar issue was caused by commit
>>>> 692e2b8bcdd8325ebfbe1daace87100d53d15ad6 (which adds ifunc support to
>>>> the aarch64 ld backend) but was intended to be fixed by the rework of
>>>> the same code in 1419bbe5712af8a43f1d63feb32687649563426d. However I
>>>> was never actually able to reproduce the failure case (I saw binaries
>>>> that were broken so I know it could happen) so the fix was somewhat
>>>> speculative. Hence I am very interested in finding a reproducible case
>>>> where this GOT entry misordering happens!
>>>
>>> I'm possibly doing something wrong, but I've tried to try compiling the
>>> suspect binary with both binutils git tip and the commit before
>>> 692e2b8bc but both had the problem.  So I guess it's something else, or
>>> I wasn't testing what I thought I was testing.
>>
>> Argh, I wasn't testing what I thought I was testing... trying again.
>
> Ah... found it!  This is the code that determines the offset to patch
> into the code (elfnn-aarch64.c line 3845):
>
>       value = (symbol_got_offset (input_bfd, h, r_symndx)
>                + globals->root.sgot->output_section->vma
>                + globals->root.sgot->output_section->output_offset);
>
> and this is the code that determines the offset as written into the
> relocation (elfnn-aarch64.c line 4248):
>
>               off = symbol_got_offset (input_bfd, h, r_symndx);
>               ...
>                   rela.r_offset = globals->root.sgot->output_section->vma +
>                     globals->root.sgot->output_offset + off;
>
> Can you see the difference?  The former is
> "root.sgot->output_section->output_offset", the latter is
> "root.sgot->output_offset".

Yes, that does look a bit odd.

> This suggests the rather obvious attached patch.  I haven't tested this
> exact patch, but its an obvious translation from a patch to
> 692e2b8bcdd8325ebfbe1daace87100d53d15ad6^ which does work.  I also
> haven't tested the second hunk at all, but it seems plausible...

Thanks for you analysis, the fix does look plausible indeed. ;-)

Have you verified it fixes the problem you were seeing?

I'm about to disappear to sunnier climes for three weeks but I'll
definitely look at it when I get back. I've added Marcus to CC in case
he isn't reading this list.

-- 
Will Newton
Toolchain Working Group, Linaro

_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain

Reply via email to