Re: [Qemu-devel] [PATCH] tcg/aarch64: Use ADR for shorter jumps
On Wed, Jul 12, 2017 at 7:08 PM, Richard Hendersonwrote: > On 07/12/2017 12:14 PM, Pranith Kumar wrote: >> >> Use ADR instruction for shorter jumps. >> >> I was going through rth's email and realized that I should have done >> this the first time. >> >> Signed-off-by: Pranith Kumar >> --- >> tcg/aarch64/tcg-target.inc.c | 14 +- >> 1 file changed, 9 insertions(+), 5 deletions(-) >> >> diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c >> index 04bc369a92..5121ebc1a1 100644 >> --- a/tcg/aarch64/tcg-target.inc.c >> +++ b/tcg/aarch64/tcg-target.inc.c >> @@ -886,12 +886,16 @@ void aarch64_tb_set_jmp_target(uintptr_t jmp_addr, >> uintptr_t addr) >> i1 = I3206_B | ((offset >> 2) & 0x3ff); >> i2 = NOP; >> } else { >> -offset = (addr >> 12) - (jmp_addr >> 12); >> +if (offset == sextract64(offset, 0, 21)) { >> +i1 = I3406_ADR; >> +i2 = NOP; >> +} else { > > > This is dead code because it is covered by the direct jump above. > B has a 26-bit range, whereas ADR has a 21-bit range. Hmm, yep. And I guess it doesn't make sense to use ADR for short jumps because this will be a 2 instruction jump vs 1 instruction for direct branch. Thanks, -- Pranith
Re: [Qemu-devel] [PATCH] tcg/aarch64: Use ADR for shorter jumps
On 07/12/2017 12:14 PM, Pranith Kumar wrote: Use ADR instruction for shorter jumps. I was going through rth's email and realized that I should have done this the first time. Signed-off-by: Pranith Kumar--- tcg/aarch64/tcg-target.inc.c | 14 +- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c index 04bc369a92..5121ebc1a1 100644 --- a/tcg/aarch64/tcg-target.inc.c +++ b/tcg/aarch64/tcg-target.inc.c @@ -886,12 +886,16 @@ void aarch64_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr) i1 = I3206_B | ((offset >> 2) & 0x3ff); i2 = NOP; } else { -offset = (addr >> 12) - (jmp_addr >> 12); +if (offset == sextract64(offset, 0, 21)) { +i1 = I3406_ADR; +i2 = NOP; +} else { This is dead code because it is covered by the direct jump above. B has a 26-bit range, whereas ADR has a 21-bit range. r~
[Qemu-devel] [PATCH] tcg/aarch64: Use ADR for shorter jumps
Use ADR instruction for shorter jumps. I was going through rth's email and realized that I should have done this the first time. Signed-off-by: Pranith Kumar--- tcg/aarch64/tcg-target.inc.c | 14 +- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c index 04bc369a92..5121ebc1a1 100644 --- a/tcg/aarch64/tcg-target.inc.c +++ b/tcg/aarch64/tcg-target.inc.c @@ -886,12 +886,16 @@ void aarch64_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr) i1 = I3206_B | ((offset >> 2) & 0x3ff); i2 = NOP; } else { -offset = (addr >> 12) - (jmp_addr >> 12); +if (offset == sextract64(offset, 0, 21)) { +i1 = I3406_ADR; +i2 = NOP; +} else { +offset = (addr >> 12) - (jmp_addr >> 12); -/* patch ADRP */ -i1 = I3406_ADRP | (offset & 3) << 29 | (offset & 0x1c) << (5 - 2) | rd; -/* patch ADDI */ -i2 = I3401_ADDI | rt << 31 | (addr & 0xfff) << 10 | rd << 5 | rd; +i1 = I3406_ADRP; +i2 = I3401_ADDI | rt << 31 | (addr & 0xfff) << 10 | rd << 5 | rd; +} +i1 |= (offset & 3) << 29 | (offset & 0x1c) << (5 - 2) | rd; } pair = (uint64_t)i2 << 32 | i1; atomic_set((uint64_t *)jmp_addr, pair); -- 2.13.0