Re: [m5-dev] X86 FS regression

Steve Reinhardt Tue, 23 Nov 2010 07:35:51 -0800

I definitely agree that putting a bus between the CPU and L1 and plugging
the table walker in there is the best way to figure out if this is really
the problem (and I expect it is).


I'm not sure whether it's the right long-term answer, though.  We also need to
consider how this works with Ruby.
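
As a rough illustration of the failure mode Gabe describes below, here is a toy model of a write-back L1. It is not m5 code: every class, function, address, and value in it is a made-up stand-in, and it ignores coherence, Ruby, and timing entirely. It only sketches why a walker attached below the L1 could read all 0s for an entry the CPU has already written, while a walker attached to a bus on the CPU side of the L1 would see the entry.

```python
# Toy model only: these classes are hypothetical and are not m5 SimObjects.

class Memory:
    """Backing store; addresses that were never written read as 0."""
    def __init__(self):
        self.data = {}

    def read(self, addr):
        return self.data.get(addr, 0)

    def write(self, addr, value):
        self.data[addr] = value


class WriteBackL1:
    """Write-back cache: stores stay here until the line is evicted."""
    def __init__(self, below):
        self.below = below
        self.lines = {}  # addr -> (value, dirty)

    def read(self, addr):
        if addr not in self.lines:
            # Miss: fill from the level below and mark the line clean.
            self.lines[addr] = (self.below.read(addr), False)
        return self.lines[addr][0]

    def write(self, addr, value):
        # The store is held in the cache and marked dirty.
        self.lines[addr] = (value, True)

    def evict(self, addr):
        value, dirty = self.lines.pop(addr)
        if dirty:
            self.below.write(addr, value)


class Bus:
    """Stand-in for a bus between the CPU side and the L1."""
    def __init__(self, target):
        self.target = target

    def read(self, addr):
        return self.target.read(addr)

    def write(self, addr, value):
        self.target.write(addr, value)


def walk(port, pte_addr):
    """A 'table walker' that reads one last-level entry through a port."""
    return port.read(pte_addr)


PTE_ADDR = 0x1000       # made-up address of a last-level entry
PTE_VALUE = 0x00ABC067  # made-up non-zero entry

mem = Memory()
l1 = WriteBackL1(mem)
cpu_side_bus = Bus(l1)

# The CPU writes a page table entry; it sits dirty in the L1.
cpu_side_bus.write(PTE_ADDR, PTE_VALUE)

below_l1 = walk(mem, PTE_ADDR)           # walker attached below the L1
above_l1 = walk(cpu_side_bus, PTE_ADDR)  # walker attached to the CPU-side bus

print(hex(below_l1), hex(above_l1))

# Once the dirty line is written back, the walker below the L1 sees it too.
l1.evict(PTE_ADDR)
after_evict = walk(mem, PTE_ADDR)
```

In this sketch the walker below the L1 only sees the entry after the dirty line has been evicted, which matches Gabe's suspicion. It says nothing about whether the real caches would handle the walker's requests correctly.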

Steve

On Tue, Nov 23, 2010 at 3:29 AM, Gabe Black <gbl...@eecs.umich.edu> wrote:

> I think I may have just now. I've fixed a few issues, and am now getting
> to the point where something that should be in the pagetables is causing
> a page fault. I found where the table walker is walking the tables for
> this particular access, and the last level entry is all 0s. There could
> be a number of reasons this is all 0s, but since the main difference
> other than timing between this and a working configuration is the
> presence of caches and we've identified a potential issue there, I'm
> inclined to suspect the actual page table entry is still in the L1 and
> hasn't been evicted out to memory yet.
>
> To fix this, is the best solution to add a bus below the CPU for all the
> connections that need to go to the L1? I'm assuming they'd all go into
> the dcache since they're more data-ey and that keeps the icache read
> only (ignoring SMC issues), and the dcache is probably servicing lower
> bandwidth normally. It also seems a little strange that this type of
> configuration is going on in the BaseCPU.py SimObject python file and
> not a configuration file, but I could be convinced there's a reason.
> Even if this isn't really a "fix" or the "right thing" to do, I'd still
> like to try it temporarily at least to see if it corrects the problem
> I'm seeing.
>
> Gabe
>
> Ali Saidi wrote:
> >
> > I haven't seen any strange behavior yet. That isn't to say it's not
> > going to cause an issue in the future, but we've taken many a tlb miss
> > and it hasn't fallen over yet.
> >
> > Ali
> >
> > On Mon, 22 Nov 2010 13:08:13 -0800, Steve Reinhardt <ste...@gmail.com>
> > wrote:
> >
> >> Yeah, I just got around to reading this thread and that was the point
> >> I was going to make... the L1 cache effectively serves as a
> >> translator between the CPU's word-size read & write requests and the
> >> coherent block-level requests that get snooped.  If you attach a
> >> CPU-like device (such as the table walker) directly to an L2, the
> >> CPU-like accesses that go to the L2 will get sent to the L1s but I'm
> >> not sure they'll be handled correctly.  Not that they fundamentally
> >> couldn't, this just isn't a configuration we test so it's likely that
> >> there are problems... for example, the L1 may try to hand ownership
> >> to the requester but the requester won't recognize that and things
> >> will break.
> >>
> >> Steve
> >>
> >> On Mon, Nov 22, 2010 at 12:00 PM, Gabe Black <gbl...@eecs.umich.edu> wrote:
> >>
> >>     What happens if an entry is in the L1 but not the L2?
> >>
> >>     Gabe
> >>
> >>     Ali Saidi wrote:
> >>     > Between the L1 and L2 caches seems like a good place to me. The
> >>     > caches can cache page table entries; otherwise a TLB miss would
> >>     > be even more expensive than it is. The L1 isn't normally used for
> >>     > such things since it would get polluted (look at why SPARC has a
> >>     > load-128-bits-from-L2, do-not-allocate-into-L1 instruction).
> >>     >
> >>     > Ali
> >>     >
> >>     > On Nov 22, 2010, at 4:27 AM, Gabe Black wrote:
> >>     >
> >>     >
> >>     >>    For anybody waiting for an x86 FS regression (yes, I know, you
> >>     >> can all hardly wait, but don't let this spoil your Thanksgiving)
> >>     >> I'm getting closer to having it working, but I've discovered some
> >>     >> issues with the mechanisms behind the --caches flag with fs.py and
> >>     >> x86. I'm surprised I never thought to try it before. It also brings
> >>     >> up some questions about where the table walkers should be hooked up
> >>     >> in x86 and ARM. Currently it's after the L1, if any, but before the
> >>     >> L2, if any, which seems wrong to me. Also caches don't seem to
> >>     >> propagate requests upwards to the CPUs which may or may not be an
> >>     >> issue. I'm still looking into that.
> >>     >>
> >>     >> Gabe
> >>     >> _______________________________________________
> >>     >> m5-dev mailing list
> >>     >> m5-dev@m5sim.org
> >>     >> http://m5sim.org/mailman/listinfo/m5-dev
> >>     >>
> >>     >>
> >>     >
> >>     >
> >>
> >>
> >>
> >
> >
> >
>
>

