[gem5-users] Re: DerivO3CPU panic: initiateAcc not defined: ROB fills and locks up

2023-01-06 Thread Eliot Moss via gem5-users

On 1/6/2023 1:27 PM, Jason Lowe-Power via gem5-users wrote:

Hi Eliot,

Unfortunately, I don't have a direct answer for you. However, I want to say that I appreciate you 
keeping the mailing list updated with your progress!


Thank you for the encouragement, Jason!

At this point I probably need to set it aside for a bit.
I see two ways forward.  One is to port the rest of my
changes forward into release 22 and see if the problem
still occurs.  Another is to add more instrumentation
(debug flags, or debug prints) that can help narrow
down the sequence of events and what it is that is not
happening to allow that micro-op to run.
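
A minimal sketch of the instrumentation route, assuming a trace has been captured with gem5's `--debug-flags` option (for example `ROB,Commit,LSQUnit`, if those flag names are unchanged in the release in use) and that each line has the usual `tick: object: message` shape. The helper name `last_events` and the sample lines are made up for illustration; real messages will differ. It reports the last time each pipeline object mentioned a given sequence number, which is one way to see where a stuck micro-op stops making progress.

```python
import re

# gem5 DPRINTF lines generally look like "<tick>: <object name>: <message>".
TRACE_LINE = re.compile(r"^\s*(\d+): ([^:]+): (.*)$")


def last_events(lines, needle):
    """Return {object_name: (tick, message)} for the last line from each
    object whose message contains `needle` (e.g. "[sn:42]")."""
    last = {}
    for line in lines:
        m = TRACE_LINE.match(line)
        if m and needle in m.group(3):
            last[m.group(2)] = (int(m.group(1)), m.group(3))
    return last


# Invented sample lines; only the "tick: object: message" layout is assumed.
sample = [
    "1000: system.cpu.rob: [sn:42] inserted (made-up message)",
    "1500: system.cpu.iew.lsq.thread0: [sn:42] translation deferred (made-up message)",
    "1600: system.cpu.rob: [sn:43] inserted (made-up message)",
    "9000: system.cpu.commit: [sn:42] not ready at ROB head (made-up message)",
]

result = last_events(sample, "[sn:42]")
for obj, (tick, msg) in sorted(result.items(), key=lambda kv: kv[1][0]):
    print(f"{tick:>8}  {obj}: {msg}")
```

An object that falls silent about the sequence number early in the trace is a candidate for the event that never happens.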

I don't *think* it has to do with my changes to the
caches, which mostly have to do with bulk cleaning
operations such as wbnoinvd, but it's hard to rule out
absolutely that I caused some collateral damage that
is revealed by this behavior.

Best - Eliot

___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org


[gem5-users] Re: DerivO3CPU panic: initiateAcc not defined: ROB fills and locks up

2023-01-06 Thread Jason Lowe-Power via gem5-users
Hi Eliot,

Unfortunately, I don't have a direct answer for you. However, I want to say
that I appreciate you keeping the mailing list updated with your progress!

Cheers,
Jason



[gem5-users] Re: DerivO3CPU panic: initiateAcc not defined: ROB fills and locks up

2023-01-06 Thread Eliot Moss via gem5-users

On 1/4/2023 11:51 PM, Eliot Moss via gem5-users wrote:
So, what I have found is that the bad micro-op is coming from trying to execute the micro-ops of an 
INT3 macro-instruction.  The end of the sequence consists of the micro-ops:


andi t0, t5, 0x1
br 0x803d
br 0x80b8

followed by a bunch of "panic" micro-ops.  t5 holds an m5 register,
where the low bit supposedly indicates whether we are in long mode.

The br micro-ops branch into long sequences of micro-ops in the "ROM".


I have found out some more things about this issue.

- The macro instruction is INT_I (my mistake in saying INT3), but the
  micro-ops are almost exactly the same.

- My original thought about a load instruction *is* connected somehow.  Here's
  the big-picture sequence of events:

  1) A garden-variety load from memory (mov reg <- offset(reg)) gets stuck at
  the head of the ROB.  It was originally deferred because a page table walk
  was necessary to resolve the virtual address of the load.

  2) The ROB fills (all 192 entries).

  3) The panic happens.
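
The sequence above can be illustrated with a toy model of in-order commit. This is only a sketch under simplifying assumptions (every instruction except one is ready the cycle after dispatch, and the widths are arbitrary); it is not gem5's O3 code, and `simulate_rob` is a made-up name.

```python
from collections import deque


def simulate_rob(capacity, stuck_seq, cycles, dispatch_width=8, commit_width=8):
    """Toy ROB with in-order commit. Instruction `stuck_seq` never becomes
    ready. Returns (cycle at which the ROB first fills behind the stuck
    head, or None; number of instructions committed)."""
    rob = deque()
    next_seq = 0
    committed = 0
    full_at = None
    for cycle in range(cycles):
        # Commit from the head, in order; a not-ready head blocks everything.
        n = 0
        while rob and n < commit_width and rob[0] != stuck_seq:
            rob.popleft()
            committed += 1
            n += 1
        # Dispatch new instructions while there is room.
        n = 0
        while len(rob) < capacity and n < dispatch_width:
            rob.append(next_seq)
            next_seq += 1
            n += 1
        if full_at is None and len(rob) == capacity and rob[0] == stuck_seq:
            full_at = cycle
    return full_at, committed


results = {size: simulate_rob(size, stuck_seq=10, cycles=200)
           for size in (64, 192, 500)}
for size, (full_at, committed) in results.items():
    print(f"capacity={size}: full at cycle {full_at}, committed {committed}")
```

In this model the ROB fills behind the stuck head for any capacity; a larger ROB only delays the point at which it fills.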


So I tried adjusting the size of the ROB, just to see what would happen.  When
I increased it from 192 to 500, a panic still happened.  I guess that if an
instruction remains stuck at the head of the ROB forever, the ROB fills and
then somehow causes the panic.

When I *decreased* the ROB size from 192 to 64, the program worked.
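
For reference, the ROB size of the O3 model is set through the `numROBEntries` parameter, whose default of 192 matches the figure above. A minimal sketch of the three settings tried follows; `set_rob_entries` is a hypothetical helper, and a `SimpleNamespace` stands in for the CPU object so the snippet runs outside gem5. In a real config script the object would be the `DerivO3CPU` instance.

```python
from types import SimpleNamespace


def set_rob_entries(cpu, entries):
    """Set the ROB size on a CPU object before the system is instantiated."""
    if entries < 1:
        raise ValueError("the ROB needs at least one entry")
    cpu.numROBEntries = entries
    return cpu


# Stand-in for a DerivO3CPU instance (assumption: used only so this runs
# without gem5 installed).
cpu = SimpleNamespace(numROBEntries=192)

# Outcomes as reported in this message.
observed = {64: "worked", 192: "panic", 500: "panic"}
for size, outcome in observed.items():
    set_rob_entries(cpu, size)
    print(f"numROBEntries={cpu.numROBEntries}: {outcome}")
```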

I am inclined to infer that there is (was) a bug in the O3 interactions that
are supposed to make the load micro-op fully ready, which leaves it stuck at
the head of the ROB.

What I wonder is whether any similar ROB lock-up behavior has been found and
fixed since 21.0.0.0.  There have been a lot of textual changes, but many had
to do with names and such and did not really change what the code *does*.  I
am hoping someone out there can confirm one way or another whether this may
have been found and fixed already if I can manage to move the rest of my
changes forward to a newer release.

Best - Eliot