I couldn't say exactly; so much in M5 has changed since that paper that I
can't give an exact number. But I'd imagine that if you take whatever
instructions/second you're getting in the detailed run and make the
checkpoint run at the matching speed, considering it's 1 IPC, you'd
probably be in the right range.
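
As a rough sketch of that arithmetic: since the checkpoint run retires
about one instruction per cycle, its clock in Hz should roughly equal the
instructions/second seen in the detailed run. The function name and the
0.25 IPC figure below are made up for illustration, not measurements from
this thread.

```python
def matched_checkpoint_clock_hz(detailed_clock_hz: float, detailed_ipc: float) -> float:
    """Clock for a 1-IPC CPU that retires instructions at the same
    rate as the detailed CPU (hypothetical helper, not part of M5)."""
    if detailed_clock_hz <= 0 or detailed_ipc <= 0:
        raise ValueError("clock and IPC must be positive")
    # Instructions per second achieved by the detailed run.
    detailed_ips = detailed_clock_hz * detailed_ipc
    # The checkpoint CPU is assumed to retire 1 instruction per cycle,
    # so its clock frequency equals the target instruction rate.
    checkpoint_ipc = 1.0
    return detailed_ips / checkpoint_ipc


# Assumed example: 3GHz detailed server with a measured IPC of 0.25.
clock = matched_checkpoint_clock_hz(3e9, 0.25)
print(f"{clock / 1e6:.0f} MHz")  # prints "750 MHz"
```

Plug in the IPC your own detailed run reports; a lower IPC gives a
proportionally lower checkpoint clock.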

Lisa

On Wed, Jan 28, 2009 at 8:26 PM, Rick Strong <[email protected]> wrote:

> The clock rate on the server is only set to 1GHz on the checkpoint run
> (as opposed to 3GHz for the detailed simulation). How slow should it be
> set? Are we talking nearer to 250MHz?
>
> Thanks,
> -Rick
>
> Ali Saidi wrote:
> > That's almost certainly what is happening. Different packets are
> > trying to be sent, both originating from the kernel. This isn't a
> > device bug. It's exactly what that paper described. The delay observed
> > by the server has changed so dramatically that a retransmit is
> > occurring, because the ack didn't arrive within twice the round trip
> > latency. You should add some latency to the ethernet link, and drive
> > the server with a slower CPU during the checkpoint run. That will
> > normally fix the problem.
> >
> > Ali
> >
> >
> >
> >
> > On Jan 28, 2009, at 5:59 PM, Rick Strong wrote:
> >
> >
> >> This is interesting. Thanks for the link.
> >>
> >> -Rick
> >>
> >> Lisa Hsu wrote:
> >>
> >>> Your description that it only occurs when you switch to a timing sim
> >>> makes me think of this (not to toot my own horn or anything):
> >>>
> >>> http://www.eecs.umich.edu/~hsul/pubs/mobs05.pdf
> >>>
> >>> Just throwing that out as a possibility.  You might want to "slow
> >>> down" your checkpoint dropping run so that it's not so disruptive
> >>> when you switch over to timing.
> >>>
> >>> Lisa
> >>>
> >>> On Wed, Jan 28, 2009 at 5:32 PM, Rick Strong <[email protected]> wrote:
> >>>
> >>>    I have posted tar.gz files that include EthernetAll output (in
> >>>    ethernet_all.trace) @ http://rickshin.ucsd.edu. Once you have a
> >>>    chance, it would be great if you could take a look at the trace
> >>>    and figure out what is wrong.
> >>>
> >>>    I ended up restoring from the checkpoint for two runs. One run
> >>>    stays in atomic mode while the other switches to timing and
> >>>    detailed. The run that stays in atomic mode works fine. This leads
> >>>    me to believe that the checkpoint restore mechanism is fine. The
> >>>    fault likely lies in the switch to timing or detailed mode.
> >>>
> >>>    The differences between the runs:
> >>>
> >>>    (1) m5out-atomic-run-aftercheckpoint.tar.gz is a run that stays in
> >>>    atomic mode (no switching to timing mode)
> >>>
> >>>    (2) m5out-timing-run-aftercheckpoint.tar.gz is a run that
> >>>    switches to timing and then to detailed mode.
> >>>
> >>>
> >>>    Thanks and good luck,
> >>>
> >>>    -Rick
> >>>
> >>>    Ali Saidi wrote:
> >>>
> >>>> Looking at the trace, it appears as though you just restored from a
> >>>> checkpoint. Is this the case? If so, what does the checkpoint
> >>>> dropping run do after that checkpoint is created? It's so early in
> >>>> the trace that I would guess it's a serialization bug, particularly
> >>>> in the TSO code. However, I looked quickly at the code and nothing
> >>>> seemed to jump out at me. If you can provide me with an EthernetAll
> >>>> trace from the checkpoint run and from the restored run I can work
> >>>> on figuring out what the problem is.
> >>>>
> >>>> Ali
> >>>>
> >>>>
> >>>>
> >>>> On Jan 28, 2009, at 2:53 AM, Rick Strong wrote:
> >>>>
> >>>>
> >>>>
> >>>>>> There are three possibilities here:
> >>>>>> a) A kernel bug
> >>>>>> b) a device model/driver bug
> >>>>>> c) a checkpointing bug (as it relates to (b))
> >>>>>>
> >>>>>> What kernel version are you using? Could you put the ethernet
> >>>>>> trace somewhere so I could look at it?
> >>>>>>
> >>>>>> Ali
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>> I am using kernel 2.6.18 with M5 patches.
> >>>>>
> >>>>> I have put the ethernet traces up at http://rickshin.ucsd.edu.
> >>>>> It is the only link. If you take a look, let me know what you think.
> >>>>>
> >>>>> Best,
> >>>>> -Rick
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> m5-dev mailing list
> >>>>> [email protected] <mailto:[email protected]>
> >>>>> http://m5sim.org/mailman/listinfo/m5-dev
> >>>>>
> >>>>>
> >>>>>