Re: [EXTERNAL] Re: [PULL 18/23] accel/tcg: re-factor non-RAM execution code

2021-04-16 Thread Cédric Le Goater
On 4/16/21 11:14 AM, Alex Bennée wrote:
> 
> Cédric Le Goater  writes:
> 
>> On 4/15/21 7:34 PM, Peter Maydell wrote:
>>> On Thu, 15 Apr 2021 at 18:18, Cédric Le Goater  wrote:

 On 4/15/21 5:55 PM, Philippe Mathieu-Daudé wrote:
> On 4/15/21 4:54 PM, Peter Maydell wrote:
>> On Thu, 15 Apr 2021 at 15:32, Alex Bennée  wrote:
>>> --8<---cut here---start->8---
>>> accel/tcg: avoid re-translating one-shot instructions
>>>
>>> By definition a single instruction is capable of being an IO
>>> instruction. This avoids a problem of triggering a cpu_io_recompile on
>>> a non-cached translation which would only do exactly this anyway.
>>>
>>> Signed-off-by: Alex Bennée 
>>>
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>> accel/tcg/translate-all.c | 2 +-
>>>
>>> modified   accel/tcg/translate-all.c
>>> @@ -1863,7 +1863,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>>>
>>>  if (phys_pc == -1) {
>>>  /* Generate a one-shot TB with 1 insn in it */
>>> -cflags = (cflags & ~CF_COUNT_MASK) | 1;
>>> +cflags = (cflags & ~CF_COUNT_MASK) | CF_LAST_IO | 1;
>>>  }
>>>
>>>  max_insns = cflags & CF_COUNT_MASK;
>>> --8<---cut here---end--->8---
>>
>> Yes, this fixes the problem. Do we want to put this in for 6.0? My
>> feeling is that executing from non-RAM is pretty niche, so maybe
>> if we need an rc4 anyway, but this isn't important enough to cause an
>> rc4 itself.
>
> Isn't it the default for Aspeed machines (with U-Boot)? (Cc'ing Cédric).

 You need to set the 'execute-in-place' machine option to load/execute the
 instructions from the AHB window of CE0. It's not on by default because
 boot can be really slow with some recent u-boot which heavily trash the 
 TBs.

 But this seems to work fine with -rc3.
>>>
>>> Triggering the bug requires both execute-in-place and -icount -- did
>>> you test with -icount enabled?
>>
>> It crashes.
> 
> 
> Without the above patch? I've re-posted as a proper patch here:
> 
>   Subject: [RFC PATCH] accel/tcg: avoid re-translating one-shot instructions
>   Date: Thu, 15 Apr 2021 17:24:53 +0100
>   Message-Id: <20210415162454.22056-1-alex.ben...@linaro.org>
> 


This patch does not fix the crash for the aspeed machines.

C.



Re: [EXTERNAL] Re: [PULL 18/23] accel/tcg: re-factor non-RAM execution code

2021-04-16 Thread Alex Bennée


Cédric Le Goater  writes:

> On 4/15/21 7:34 PM, Peter Maydell wrote:
>> On Thu, 15 Apr 2021 at 18:18, Cédric Le Goater  wrote:
>>>
>>> On 4/15/21 5:55 PM, Philippe Mathieu-Daudé wrote:
 On 4/15/21 4:54 PM, Peter Maydell wrote:
> On Thu, 15 Apr 2021 at 15:32, Alex Bennée  wrote:
>> --8<---cut here---start->8---
>> accel/tcg: avoid re-translating one-shot instructions
>>
>> By definition a single instruction is capable of being an IO
>> instruction. This avoids a problem of triggering a cpu_io_recompile on
>> a non-cached translation which would only do exactly this anyway.
>>
>> Signed-off-by: Alex Bennée 
>>
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> accel/tcg/translate-all.c | 2 +-
>>
>> modified   accel/tcg/translate-all.c
>> @@ -1863,7 +1863,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>>
>>  if (phys_pc == -1) {
>>  /* Generate a one-shot TB with 1 insn in it */
>> -cflags = (cflags & ~CF_COUNT_MASK) | 1;
>> +cflags = (cflags & ~CF_COUNT_MASK) | CF_LAST_IO | 1;
>>  }
>>
>>  max_insns = cflags & CF_COUNT_MASK;
>> --8<---cut here---end--->8---
>
> Yes, this fixes the problem. Do we want to put this in for 6.0? My
> feeling is that executing from non-RAM is pretty niche, so maybe
> if we need an rc4 anyway, but this isn't important enough to cause an
> rc4 itself.

 Isn't it the default for Aspeed machines (with U-Boot)? (Cc'ing Cédric).
>>>
>>> You need to set the 'execute-in-place' machine option to load/execute the
>>> instructions from the AHB window of CE0. It's not on by default because
>>> boot can be really slow with some recent u-boot which heavily trash the TBs.
>>>
>>> But this seems to work fine with -rc3.
>> 
>> Triggering the bug requires both execute-in-place and -icount -- did
>> you test with -icount enabled?
>
> It crashes.


Without the above patch? I've re-posted as a proper patch here:

  Subject: [RFC PATCH] accel/tcg: avoid re-translating one-shot instructions
  Date: Thu, 15 Apr 2021 17:24:53 +0100
  Message-Id: <20210415162454.22056-1-alex.ben...@linaro.org>

-- 
Alex Bennée



Re: [EXTERNAL] Re: [PULL 18/23] accel/tcg: re-factor non-RAM execution code

2021-04-16 Thread Cédric Le Goater
On 4/15/21 7:34 PM, Peter Maydell wrote:
> On Thu, 15 Apr 2021 at 18:18, Cédric Le Goater  wrote:
>>
>> On 4/15/21 5:55 PM, Philippe Mathieu-Daudé wrote:
>>> On 4/15/21 4:54 PM, Peter Maydell wrote:
 On Thu, 15 Apr 2021 at 15:32, Alex Bennée  wrote:
> --8<---cut here---start->8---
> accel/tcg: avoid re-translating one-shot instructions
>
> By definition a single instruction is capable of being an IO
> instruction. This avoids a problem of triggering a cpu_io_recompile on
> a non-cached translation which would only do exactly this anyway.
>
> Signed-off-by: Alex Bennée 
>
> 1 file changed, 1 insertion(+), 1 deletion(-)
> accel/tcg/translate-all.c | 2 +-
>
> modified   accel/tcg/translate-all.c
> @@ -1863,7 +1863,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>
>  if (phys_pc == -1) {
>  /* Generate a one-shot TB with 1 insn in it */
> -cflags = (cflags & ~CF_COUNT_MASK) | 1;
> +cflags = (cflags & ~CF_COUNT_MASK) | CF_LAST_IO | 1;
>  }
>
>  max_insns = cflags & CF_COUNT_MASK;
> --8<---cut here---end--->8---

 Yes, this fixes the problem. Do we want to put this in for 6.0? My
 feeling is that executing from non-RAM is pretty niche, so maybe
 if we need an rc4 anyway, but this isn't important enough to cause an
 rc4 itself.
>>>
>>> Isn't it the default for Aspeed machines (with U-Boot)? (Cc'ing Cédric).
>>
>> You need to set the 'execute-in-place' machine option to load/execute the
>> instructions from the AHB window of CE0. It's not on by default because
>> boot can be really slow with some recent u-boot which heavily trash the TBs.
>>
>> But this seems to work fine with -rc3.
> 
> Triggering the bug requires both execute-in-place and -icount -- did
> you test with -icount enabled?

It crashes.

Thanks,

C. 

$ qemu-system-arm -M romulus-bmc,execute-in-place=true -icount auto -drive 
file=./flash-romulus,format=raw,if=mtd  -serial mon:stdio
qemu: fatal: cpu_io_recompile: could not find TB for pc=0x7efbcc001992
R00=0005107a R01= R02= R03=
R04=0350 R05= R06= R07=
R08= R09= R10= R11=
R12= R13= R14=0350 R15=0c70
PSR=41d3 -Z-- A S svc32
s00= s01= d00=
s02= s03= d01=
s04= s05= d02=
s06= s07= d03=
s08= s09= d04=
s10= s11= d05=
s12= s13= d06=
s14= s15= d07=
s16= s17= d08=
s18= s19= d09=
s20= s21= d10=
s22= s23= d11=
s24= s25= d12=
s26= s27= d13=
s28= s29= d14=
s30= s31= d15=
FPSCR: 
Aborted (core dumped)



Re: [EXTERNAL] Re: [PULL 18/23] accel/tcg: re-factor non-RAM execution code

2021-04-15 Thread Cédric Le Goater
On 4/15/21 5:55 PM, Philippe Mathieu-Daudé wrote:
> On 4/15/21 4:54 PM, Peter Maydell wrote:
>> On Thu, 15 Apr 2021 at 15:32, Alex Bennée  wrote:
>>> --8<---cut here---start->8---
>>> accel/tcg: avoid re-translating one-shot instructions
>>>
>>> By definition a single instruction is capable of being an IO
>>> instruction. This avoids a problem of triggering a cpu_io_recompile on
>>> a non-cached translation which would only do exactly this anyway.
>>>
>>> Signed-off-by: Alex Bennée 
>>>
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>> accel/tcg/translate-all.c | 2 +-
>>>
>>> modified   accel/tcg/translate-all.c
>>> @@ -1863,7 +1863,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>>>
>>>  if (phys_pc == -1) {
>>>  /* Generate a one-shot TB with 1 insn in it */
>>> -cflags = (cflags & ~CF_COUNT_MASK) | 1;
>>> +cflags = (cflags & ~CF_COUNT_MASK) | CF_LAST_IO | 1;
>>>  }
>>>
>>>  max_insns = cflags & CF_COUNT_MASK;
>>> --8<---cut here---end--->8---
>>
>> Yes, this fixes the problem. Do we want to put this in for 6.0? My
>> feeling is that executing from non-RAM is pretty niche, so maybe
>> if we need an rc4 anyway, but this isn't important enough to cause an
>> rc4 itself.
> 
> Isn't it the default for Aspeed machines (with U-Boot)? (Cc'ing Cédric).

You need to set the 'execute-in-place' machine option to load/execute the
instructions from the AHB window of CE0. It's not on by default because 
boot can be really slow with some recent u-boot which heavily trash the TBs.

But this seems to work fine with -rc3. 

C. 



Re: [EXTERNAL] Re: [PULL 18/23] accel/tcg: re-factor non-RAM execution code

2021-04-15 Thread Peter Maydell
On Thu, 15 Apr 2021 at 18:18, Cédric Le Goater  wrote:
>
> On 4/15/21 5:55 PM, Philippe Mathieu-Daudé wrote:
> > On 4/15/21 4:54 PM, Peter Maydell wrote:
> >> On Thu, 15 Apr 2021 at 15:32, Alex Bennée  wrote:
> >>> --8<---cut here---start->8---
> >>> accel/tcg: avoid re-translating one-shot instructions
> >>>
> >>> By definition a single instruction is capable of being an IO
> >>> instruction. This avoids a problem of triggering a cpu_io_recompile on
> >>> a non-cached translation which would only do exactly this anyway.
> >>>
> >>> Signed-off-by: Alex Bennée 
> >>>
> >>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>> accel/tcg/translate-all.c | 2 +-
> >>>
> >>> modified   accel/tcg/translate-all.c
> >>> @@ -1863,7 +1863,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
> >>>
> >>>  if (phys_pc == -1) {
> >>>  /* Generate a one-shot TB with 1 insn in it */
> >>> -cflags = (cflags & ~CF_COUNT_MASK) | 1;
> >>> +cflags = (cflags & ~CF_COUNT_MASK) | CF_LAST_IO | 1;
> >>>  }
> >>>
> >>>  max_insns = cflags & CF_COUNT_MASK;
> >>> --8<---cut here---end--->8---
> >>
> >> Yes, this fixes the problem. Do we want to put this in for 6.0? My
> >> feeling is that executing from non-RAM is pretty niche, so maybe
> >> if we need an rc4 anyway, but this isn't important enough to cause an
> >> rc4 itself.
> >
> > Isn't it the default for Aspeed machines (with U-Boot)? (Cc'ing Cédric).
>
> You need to set the 'execute-in-place' machine option to load/execute the
> instructions from the AHB window of CE0. It's not on by default because
> boot can be really slow with some recent u-boot which heavily trash the TBs.
>
> But this seems to work fine with -rc3.

Triggering the bug requires both execute-in-place and -icount -- did
you test with -icount enabled?

thanks
-- PMM