Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/3] ppc: complete the new HV mode

2016-06-06 Thread Cédric Le Goater
On 06/06/2016 06:17 AM, Benjamin Herrenschmidt wrote:
> On Sun, 2016-06-05 at 19:41 +0200, Cédric Le Goater wrote:
>>  
>> Here is a fix I think. Could you give it a try ? 
> 
> This is somewhat wrong...
> 
>> commit cd0c6f473532 ('ppc: Do some batching of TCG tlb flushes')
>> introduced an optimisation to flush TLBs only when a context
>> synchronizing event is reached (interrupt, rfi). This was done for
>> ppc64 but 32bit was forgotten on the way.
> 
> No it didn't. That commit only delays flushes on ppc64. ppc32 is
> unaffected, unless I missed something. IE. It will delay flushes caused
> by slb instructions (which don't exist on 32-bit)
> and ppc_tlb_invalidate_one() only in the 64-bit cases.
> 
> Also what your patch does in practice is not really change that, though
> you seem to try to somewhat extend the batching to 32-bit (but
> incompletely), you also introduce something which effectively reverts
> part of 9fb044911444fdd09f5f072ad0ca269d7f8b841d (split I/D mode).
> 
> I think that's more what's "fixing" your problem, ie, the flush in
> IR/DR changes. However it shouldn't be needed.

OK. I thought that was needed because of what the 32b specs say in 
"Synchronization Requirements for Special Registers and for Lookaside 
Buffers", a "Context-synchronizing instruction" is required after a 
mtmsr({I,D}R) and also because changing IR can break implicit branching.

But I might just misunderstand all of it as I am discovering.

> I suspect all of that is papering over another bug somewhere else which
> got exposed by the split I/D mode, since we no longer over-flush on
> transitions to/from real-mode. So we must be missing flushes elsewhere,
> possibly some G3 specific stuff, or there always was some kind of bug
> in the TLB flushing on 32-bit that got somewhat masked by the over-
> flushing we used to do.

>From what I see, darwin loops on tlbie :

0x000952fc:  mtctr   r0
0x00095300:  tlbie   r6
0x00095304:  addir6,r6,4096
0x00095308:  bdnz+   0x95300
0x0009530c:  mtcrf   128,r11
0x00095310:  sync
0x00095314:  eieio
0x00095318:  bns-0x95328

and this is done on the G4, but not necessarily on the G3, it depends on
r11 which contains some bits of SPRG2 :

  0x0009531c:  tlbsync
  0x00095320:  sync
  0x00095324:  isync

HID0 is also read and written to but to control cache bits.

> I need a repro-case.

Booting the darwin CD is enough.

Cheers,

C. 
 
> Cheers,
> Ben.
> 
>> Tested on mac99 and g3beige with
>>
>> qemu-system-ppc -cdrom darwinppc-602.cdr -boot d
>>
>> Signed-off-by: Cédric Le Goater 
>> ---
>>
>>  I think the hunk in powerpc_excp() is needed if we don't generate a
>>  context synchronizing event. what is best to do ?
>>
>>  target-ppc/cpu.h |2 +-
>>  target-ppc/excp_helper.c |   10 ++
>>  target-ppc/helper_regs.h |9 -
>>  target-ppc/translate.c   |2 +-
>>  4 files changed, 20 insertions(+), 3 deletions(-)
>>
>> Index: qemu-dgibson-for-2.7.git/target-ppc/translate.c
>> ===
>> --- qemu-dgibson-for-2.7.git.orig/target-ppc/translate.c
>> +++ qemu-dgibson-for-2.7.git/target-ppc/translate.c
>> @@ -3290,7 +3290,7 @@ static void gen_eieio(DisasContext *ctx)
>>  {
>>  }
>>  
>> -#if !defined(CONFIG_USER_ONLY) && defined(TARGET_PPC64)
>> +#if !defined(CONFIG_USER_ONLY)
>>  static inline void gen_check_tlb_flush(DisasContext *ctx)
>>  {
>>  TCGv_i32 t = tcg_temp_new_i32();
>> Index: qemu-dgibson-for-2.7.git/target-ppc/cpu.h
>> ===
>> --- qemu-dgibson-for-2.7.git.orig/target-ppc/cpu.h
>> +++ qemu-dgibson-for-2.7.git/target-ppc/cpu.h
>> @@ -958,9 +958,9 @@ struct CPUPPCState {
>>  /* PowerPC 64 SLB area */
>>  ppc_slb_t slb[MAX_SLB_ENTRIES];
>>  int32_t slb_nr;
>> +#endif
>>  /* tcg TLB needs flush (deferred slb inval instruction
>> typically) */
>>  uint32_t tlb_need_flush;
>> -#endif
>>  /* segment registers */
>>  hwaddr htab_base;
>>  /* mask used to normalize hash value to PTEG index */
>> Index: qemu-dgibson-for-2.7.git/target-ppc/helper_regs.h
>> ===
>> --- qemu-dgibson-for-2.7.git.orig/target-ppc/helper_regs.h
>> +++ qemu-dgibson-for-2.7.git/target-ppc/helper_regs.h
>> @@ -121,6 +121,13 @@ static inline int hreg_store_msr(CPUPPCS
>>  }
>>  if (((value >> MSR_IR) & 1) != msr_ir ||
>>  ((value >> MSR_DR) & 1) != msr_dr) {
>> +/* A change of the instruction relocation bit in the MSR can
>> + * cause an implicit branch in the address space. This
>> + * requires a tlb flush.
>> + */
>> +if (env->mmu_model & POWERPC_MMU_32B) {
>> +env->tlb_need_flush = 1;
>> +}
>>  cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
>>  }
>>  if ((env->mmu_model & P

Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/3] ppc: complete the new HV mode

2016-06-06 Thread Benjamin Herrenschmidt
On Mon, 2016-06-06 at 17:04 +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2016-06-06 at 07:29 +0100, Mark Cave-Ayland wrote:
> > 
> > 
> > The best reproducer is to run from David's ppc-for-2.7 branch with
> > the above patch applied manually and then try booting the following
> > ISOs which now panic on boot with the split I/D MMU mode enabled:
> So at least HelenOS is fixed by this:

Note that Cedric is right, the TLB flush batching isn't done for 32-
bit, and we could do it, it would probably improve performances, but it
needs to be done properly :-) I'll look into it next.

> --- a/target-ppc/mmu_helper.c
> +++ b/target-ppc/mmu_helper.c
> @@ -1969,6 +1969,11 @@ void ppc_tlb_invalidate_one(CPUPPCState *env,
> target_ulong addr)
>  /* XXX: this case should be optimized,
>   * giving a mask to tlb_flush_page
>   */
> +/* This is broken, some CPUs invalidate a whole congruence
> + * class on an even smaller subset of bits and some OSes
> take
> + * advantage of this. Just blow the whole thing away.
> + */
> +#if 0
>  tlb_flush_page(cs, addr | (0x0 << 28));
>  tlb_flush_page(cs, addr | (0x1 << 28));
>  tlb_flush_page(cs, addr | (0x2 << 28));
> @@ -1985,6 +1990,9 @@ void ppc_tlb_invalidate_one(CPUPPCState *env,
> target_ulong addr)
>  tlb_flush_page(cs, addr | (0xD << 28));
>  tlb_flush_page(cs, addr | (0xE << 28));
>  tlb_flush_page(cs, addr | (0xF << 28));
> +#else
> +tlb_flush(cs, 1);
> +#endif
>  break;
>  #if defined(TARGET_PPC64)
>  case POWERPC_MMU_64B:
> 
> I'll check Darwin...
> 
> Cheers,
> Ben.


Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/3] ppc: complete the new HV mode

2016-06-06 Thread Benjamin Herrenschmidt
On Mon, 2016-06-06 at 07:29 +0100, Mark Cave-Ayland wrote:
> 
> The best reproducer is to run from David's ppc-for-2.7 branch with
> the above patch applied manually and then try booting the following
> ISOs which now panic on boot with the split I/D MMU mode enabled:

So at least HelenOS is fixed by this:

--- a/target-ppc/mmu_helper.c
+++ b/target-ppc/mmu_helper.c
@@ -1969,6 +1969,11 @@ void ppc_tlb_invalidate_one(CPUPPCState *env, 
target_ulong addr)
 /* XXX: this case should be optimized,
  * giving a mask to tlb_flush_page
  */
+/* This is broken, some CPUs invalidate a whole congruence
+ * class on an even smaller subset of bits and some OSes take
+ * advantage of this. Just blow the whole thing away.
+ */
+#if 0
 tlb_flush_page(cs, addr | (0x0 << 28));
 tlb_flush_page(cs, addr | (0x1 << 28));
 tlb_flush_page(cs, addr | (0x2 << 28));
@@ -1985,6 +1990,9 @@ void ppc_tlb_invalidate_one(CPUPPCState *env, 
target_ulong addr)
 tlb_flush_page(cs, addr | (0xD << 28));
 tlb_flush_page(cs, addr | (0xE << 28));
 tlb_flush_page(cs, addr | (0xF << 28));
+#else
+tlb_flush(cs, 1);
+#endif
 break;
 #if defined(TARGET_PPC64)
 case POWERPC_MMU_64B:

I'll check Darwin...

Cheers,
Ben.



Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/3] ppc: complete the new HV mode

2016-06-05 Thread Mark Cave-Ayland
On 06/06/16 05:20, Benjamin Herrenschmidt wrote:

> On Mon, 2016-06-06 at 13:55 +1000, Benjamin Herrenschmidt wrote:
>>
>> I'm not sure that 32-bit patch is correct. We shouldn't have to flush
>> on IR/DR transitions at all, that's the whole point of the split I/D
>> code.
>>
>> I think something else is wrong.
> 
> Note: With whatever's in this branch, OpenBIOS won't start at all

Yeah, there's a patch currently pending upstream to fix the serial port
which hangs OpenBIOS upon startup:
https://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg00035.html.

The best reproducer is to run from David's ppc-for-2.7 branch with the
above patch applied manually and then try booting the following ISOs
which now panic on boot with the split I/D MMU mode enabled:

Darwin/ppc: https://opensource.apple.com/static/iso/darwinppc-602.cdr.gz
HelenOS:http://www.helenos.org/releases/HelenOS-0.6.0-ppc32.iso

I've seen some DMA coherence issues related to macio in some of my local
tests and so I'm also testing with my beta patch to switch macio over to
use the DMA helpers here:
https://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg04907.html.


ATB,

Mark.




Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/3] ppc: complete the new HV mode

2016-06-05 Thread Benjamin Herrenschmidt
On Mon, 2016-06-06 at 13:55 +1000, Benjamin Herrenschmidt wrote:
> 
> I'm not sure that 32-bit patch is correct. We shouldn't have to flush
> on IR/DR transitions at all, that's the whole point of the split I/D
> code.
> 
> I think something else is wrong.

Note: With whatever's in this branch, OpenBIOS won't start at all

Cheers,
Ben.



Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/3] ppc: complete the new HV mode

2016-06-05 Thread Benjamin Herrenschmidt
On Sun, 2016-06-05 at 19:41 +0200, Cédric Le Goater wrote:
> 
> Here is a fix I think. Could you give it a try ? 

This is somewhat wrong...

> commit cd0c6f473532 ('ppc: Do some batching of TCG tlb flushes')
> introduced an optimisation to flush TLBs only when a context
> synchronizing event is reached (interrupt, rfi). This was done for
> ppc64 but 32bit was forgotten on the way.

No it didn't. That commit only delays flushes on ppc64. ppc32 is
unaffected, unless I missed something. IE. It will delay flushes caused
by slb instructions (which don't exist on 32-bit)
and ppc_tlb_invalidate_one() only in the 64-bit cases.

Also what your patch does in practice is not really change that, though
you seem to try to somewhat extend the batching to 32-bit (but
incompletely), you also introduce something which effectively reverts
part of 9fb044911444fdd09f5f072ad0ca269d7f8b841d (split I/D mode).

I think that's more what's "fixing" your problem, ie, the flush in
IR/DR changes. However it shouldn't be needed.

I suspect all of that is papering over another bug somewhere else which
got exposed by the split I/D mode, since we no longer over-flush on
transitions to/from real-mode. So we must be missing flushes elsewhere,
possibly some G3 specific stuff, or there always was some kind of bug
in the TLB flushing on 32-bit that got somewhat masked by the over-
flushing we used to do.

I need a repro-case.

Cheers,
Ben.

> Tested on mac99 and g3beige with
> 
> qemu-system-ppc -cdrom darwinppc-602.cdr -boot d
> 
> Signed-off-by: Cédric Le Goater 
> ---
> 
>  I think the hunk in powerpc_excp() is needed if we don't generate a
>  context synchronizing event. what is best to do ?
> 
>  target-ppc/cpu.h |2 +-
>  target-ppc/excp_helper.c |   10 ++
>  target-ppc/helper_regs.h |9 -
>  target-ppc/translate.c   |2 +-
>  4 files changed, 20 insertions(+), 3 deletions(-)
> 
> Index: qemu-dgibson-for-2.7.git/target-ppc/translate.c
> ===
> --- qemu-dgibson-for-2.7.git.orig/target-ppc/translate.c
> +++ qemu-dgibson-for-2.7.git/target-ppc/translate.c
> @@ -3290,7 +3290,7 @@ static void gen_eieio(DisasContext *ctx)
>  {
>  }
>  
> -#if !defined(CONFIG_USER_ONLY) && defined(TARGET_PPC64)
> +#if !defined(CONFIG_USER_ONLY)
>  static inline void gen_check_tlb_flush(DisasContext *ctx)
>  {
>  TCGv_i32 t = tcg_temp_new_i32();
> Index: qemu-dgibson-for-2.7.git/target-ppc/cpu.h
> ===
> --- qemu-dgibson-for-2.7.git.orig/target-ppc/cpu.h
> +++ qemu-dgibson-for-2.7.git/target-ppc/cpu.h
> @@ -958,9 +958,9 @@ struct CPUPPCState {
>  /* PowerPC 64 SLB area */
>  ppc_slb_t slb[MAX_SLB_ENTRIES];
>  int32_t slb_nr;
> +#endif
>  /* tcg TLB needs flush (deferred slb inval instruction
> typically) */
>  uint32_t tlb_need_flush;
> -#endif
>  /* segment registers */
>  hwaddr htab_base;
>  /* mask used to normalize hash value to PTEG index */
> Index: qemu-dgibson-for-2.7.git/target-ppc/helper_regs.h
> ===
> --- qemu-dgibson-for-2.7.git.orig/target-ppc/helper_regs.h
> +++ qemu-dgibson-for-2.7.git/target-ppc/helper_regs.h
> @@ -121,6 +121,13 @@ static inline int hreg_store_msr(CPUPPCS
>  }
>  if (((value >> MSR_IR) & 1) != msr_ir ||
>  ((value >> MSR_DR) & 1) != msr_dr) {
> +/* A change of the instruction relocation bit in the MSR can
> + * cause an implicit branch in the address space. This
> + * requires a tlb flush.
> + */
> +if (env->mmu_model & POWERPC_MMU_32B) {
> +env->tlb_need_flush = 1;
> +}
>  cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
>  }
>  if ((env->mmu_model & POWERPC_MMU_BOOKE) &&
> @@ -151,7 +158,7 @@ static inline int hreg_store_msr(CPUPPCS
>  return excp;
>  }
>  
> -#if !defined(CONFIG_USER_ONLY) && defined(TARGET_PPC64)
> +#if !defined(CONFIG_USER_ONLY)
>  static inline void check_tlb_flush(CPUPPCState *env)
>  {
>  CPUState *cs = CPU(ppc_env_get_cpu(env));
> Index: qemu-dgibson-for-2.7.git/target-ppc/excp_helper.c
> ===
> --- qemu-dgibson-for-2.7.git.orig/target-ppc/excp_helper.c
> +++ qemu-dgibson-for-2.7.git/target-ppc/excp_helper.c
> @@ -709,6 +709,16 @@ static inline void powerpc_excp(PowerPCC
>  }
>  }
>  #endif
> +if (((new_msr >> MSR_IR) & 1) != msr_ir ||
> +((new_msr >> MSR_DR) & 1) != msr_dr) {
> +/* A change of the instruction relocation bit in the MSR can
> + * cause an implicit branch in the address space. This
> + * requires a tlb flush.
> + */
> +if (env->mmu_model & POWERPC_MMU_32B) {
> +env->tlb_need_flush = 1;
> +}
> +}
>  /* We don't use hreg_store_msr here as already have treated
>   * any special ca