FYI, I don't know about the specweb95 benchmark Rick mentioned he was using,
but I was poking around with our surge-specweb benchmark that ships with M5
and got the same 404 problems.  I noticed we don't have any hooks to fileset
images in our tree, which seemed an obvious source of the problem.  I
added a specweb-fileset disk to my sim and my 404s went away.  I poked
around our images, old and new, and saw no references at all to a
mod_specweb99.so, so I don't know what that was about.

Lisa

On Wed, Feb 4, 2009 at 5:09 PM, Rick Strong <[email protected]> wrote:

> Lisa Hsu wrote:
> > Rick,
> >
> > 1) Just to follow up, did you figure out a good speed for the
> > checkpoint producing run?
> I guess my situation is a bit complicated, as I am using asymmetric
> hardware (many configurations) at different frequencies and complexities.
> I confirmed that there was indeed a slowdown caused by a packet
> that timed out on the server side. Your advice was on target. Thanks. My
> solution was to add a script that tries different ethernet link delays,
> from 100us to 10ms, to find a point where all the hardware
> configurations became more stable.
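
A sweep like the one described above could be sketched roughly as follows. This is only an illustration: the helper names, the number of points, and the log spacing are assumptions, not anything taken from Rick's actual script, and feeding each value into the simulated ethernet link is left out.

```python
def link_delay_sweep(start_us=100.0, stop_us=10_000.0, points=5):
    """Return log-spaced link delays in microseconds, endpoints included.

    Hypothetical helper: the thread only gives the 100us to 10ms range;
    the spacing and the number of points are assumptions.
    """
    if points < 2:
        raise ValueError("need at least two points to cover both endpoints")
    ratio = stop_us / start_us
    delays = [start_us * ratio ** (i / (points - 1)) for i in range(points)]
    # Pin the endpoints so rounding in the power computation cannot move them.
    delays[0], delays[-1] = start_us, stop_us
    return delays


def format_delay(delay_us):
    """Format a delay in microseconds as a string such as '100us' or '10ms'."""
    if delay_us >= 1000 and delay_us % 1000 == 0:
        return f"{int(delay_us // 1000)}ms"
    return f"{round(delay_us)}us"


if __name__ == "__main__":
    # One simulation would be launched per delay; here we only list them.
    for delay in link_delay_sweep():
        print(format_delay(delay))
```

Log spacing is used because the range spans two orders of magnitude; a linear sweep would put almost every point near the 10ms end.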
> >
> > 2) Did you ever find why everything was 404'ed?
> This issue has not been resolved.  The initial suggestion was that the
> problem was caused by a bad mod_specweb99.so.
>
> However, talking with Ali, it appears that Apache was never fixed. I opted
> to use the version that sends 404 responses, as it still generates requests
> and transfers (albeit requiring more explanation if used). The last
> conversation with Ali suggested that he was moving towards lighttpd, but
> that licensing issues prevent its general release to the M5 community. A
> snippet is copied below.
>
>
>  "I created a lighttpd one that transfers large files only, but it never
> made it into M5 (nor can it because of licensing issues from the
> components it's based on). I never fixed apache."
>
>
>
> Ali
> >
> > Lisa
> >
> > On Wed, Jan 28, 2009 at 9:44 PM, Lisa Hsu <[email protected]> wrote:
> >
> >     I couldn't say exactly; too much in M5 has changed since that paper
> >     to give an exact number. But I'd imagine that if you take whatever
> >     instructions/second you're getting in the detailed run and make
> >     the checkpoint run the matching speed, considering its 1 IPC,
> >     you'd probably be in the right range.
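
As a rough illustration of that rule of thumb: instructions/second is clock times IPC, so if the checkpoint run retires about one instruction per cycle, matching the detailed run's instruction rate means scaling the detailed clock by the detailed run's IPC. The function name and the IPC value below are made up for illustration and are not measurements from anyone's run.

```python
def checkpoint_clock_hz(detailed_clock_hz, detailed_ipc, checkpoint_ipc=1.0):
    """Clock for the checkpoint run that matches the detailed run's
    instructions per second.

    instructions/second = clock * IPC, so
    checkpoint_clock = detailed_clock * detailed_ipc / checkpoint_ipc.
    """
    if detailed_clock_hz <= 0 or detailed_ipc <= 0 or checkpoint_ipc <= 0:
        raise ValueError("clock and IPC values must be positive")
    return detailed_clock_hz * detailed_ipc / checkpoint_ipc


if __name__ == "__main__":
    # Assumed example: a 3GHz detailed server with a hypothetical IPC of 0.25.
    hz = checkpoint_clock_hz(3e9, 0.25)
    print(f"{hz / 1e6:.0f} MHz")
```

The detailed IPC has to come from your own detailed-mode stats; the 0.25 here is a placeholder.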
> >
> >     Lisa
> >
> >
> >     On Wed, Jan 28, 2009 at 8:26 PM, Rick Strong <[email protected]> wrote:
> >
> >         The clock rate on the server is only set to 1GHz on the
> >         checkpoint run (as opposed to 3GHz for the detailed
> >         simulation). How slow should it be set? Are we talking
> >         nearer to 250MHz?
> >
> >         Thanks,
> >         -Rick
> >
> >         Ali Saidi wrote:
> >         > That's almost certainly what is happening. Different packets
> >         > are trying to be sent, both originating from the kernel. This
> >         > isn't a device bug. It's exactly what that paper described. The
> >         > delay observed by the server has changed so dramatically that a
> >         > retransmit is occurring, since the ack didn't arrive within
> >         > twice the round-trip latency. You should add some latency to
> >         > the ethernet link, and drive the server with a slower CPU
> >         > during the checkpoint run. That will normally fix the problem.
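
A toy model of the effect described above, using the simplification stated in this thread (a retransmit fires when the ack has not arrived within twice the round-trip latency the sender has come to expect). Real TCP retransmission timers are more involved, and every name and number here is an illustrative assumption.

```python
def rtt_us(link_delay_us, server_processing_us):
    """Round trip: the link is crossed twice, plus server processing time."""
    return 2 * link_delay_us + server_processing_us


def retransmits(expected_rtt_us, observed_ack_delay_us, factor=2.0):
    """True if the ack arrives later than factor * the expected RTT."""
    return observed_ack_delay_us > factor * expected_rtt_us


if __name__ == "__main__":
    # Hypothetical numbers: the server takes 5us per request in the fast
    # checkpoint run and 50us once the simulation switches to detailed mode.
    for link_delay in (1, 100):
        before = rtt_us(link_delay, 5)    # RTT learned during the checkpoint run
        after = rtt_us(link_delay, 50)    # ack delay seen after the switch
        print(link_delay, before, after, retransmits(before, after))
```

With a tiny link delay the server's processing time dominates the round trip, so the jump after the switch blows past the threshold. With a larger link delay the same jump is a small fraction of the round trip. Slowing the checkpoint CPU works from the other side, by raising the "before" processing time closer to the "after" one.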
> >         >
> >         > Ali
> >         >
> >         >
> >         >
> >         >
> >         > On Jan 28, 2009, at 5:59 PM, Rick Strong wrote:
> >         >
> >         >
> >         >> This is interesting. Thanks for the link.
> >         >>
> >         >> -Rick
> >         >>
> >         >> Lisa Hsu wrote:
> >         >>
> >         >>> Your description that it only occurs when you switch to a
> >         >>> timing sim makes me think of this (not to toot my own horn
> >         >>> or anything):
> >         >>>
> >         >>> http://www.eecs.umich.edu/~hsul/pubs/mobs05.pdf
> >         >>>
> >         >>> Just throwing that out as a possibility.  You might want to
> >         >>> "slow down" your checkpoint dropping run so that it's not so
> >         >>> disruptive when you switch over to timing.
> >         >>>
> >         >>> Lisa
> >         >>>
> >         >>> On Wed, Jan 28, 2009 at 5:32 PM, Rick Strong <[email protected]> wrote:
> >         >>>
> >         >>>    I have posted tar.gz files that include EthernetAll output
> >         >>>    (in ethernet_all.trace) @ http://rickshin.ucsd.edu. Once you
> >         >>>    have a chance, if you could take a look at the trace and
> >         >>>    figure out what is wrong, that would be great.
> >         >>>
> >         >>>    I ended up restoring from the checkpoint for two runs. One
> >         >>>    run stays in atomic mode while the other switches to timing
> >         >>>    and detailed. The run that stays in atomic mode works fine.
> >         >>>    This leads me to believe that the checkpoint restore
> >         >>>    mechanism is fine. The fault likely lies in the switch to
> >         >>>    timing or detailed mode.
> >         >>>
> >         >>>    The differences between the runs:
> >         >>>
> >         >>>    (1) m5out-atomic-run-aftercheckpoint.tar.gz is a run that
> >         >>>    stays in atomic mode (no switching to timing mode)
> >         >>>
> >         >>>    (2) m5out-timing-run-aftercheckpoint.tar.gz is a run that
> >         >>>    switches to timing and then to detailed mode.
> >         >>>
> >         >>>
> >         >>>    Thanks and good luck,
> >         >>>
> >         >>>    -Rick
> >         >>>
> >         >>>    Ali Saidi wrote:
> >         >>>
> >         >>>> Looking at the trace, it appears as though you just restored
> >         >>>> from a checkpoint. Is this the case? If so, what does the
> >         >>>> checkpoint dropping run do after that checkpoint is created?
> >         >>>> It's so early in the trace that I would guess it's a
> >         >>>> serialization bug, particularly in the TSO code. However, I
> >         >>>> looked quickly at the code and nothing seemed to jump out at
> >         >>>> me. If you can provide me with an EthernetAll trace from the
> >         >>>> checkpoint run and from the restored run I can work on
> >         >>>> figuring out what the problem is.
> >         >>>>
> >         >>>> Ali
> >         >>>>
> >         >>>>
> >         >>>>
> >         >>>> On Jan 28, 2009, at 2:53 AM, Rick Strong wrote:
> >         >>>>
> >         >>>>
> >         >>>>
> >         >>>>>> There are three possibilities here:
> >         >>>>>> a) A kernel bug
> >         >>>>>> b) a device model/driver bug
> >         >>>>>> c) a checkpointing bug (as it relates to (b))
> >         >>>>>>
> >         >>>>>> What kernel version are you using? Could you put the
> >         >>>>>> ethernet trace somewhere so I could look at it?
> >         >>>>>>
> >         >>>>>> Ali
> >         >>>>>>
> >         >>>>>>
> >         >>>>>>
> >         >>>>> I am using the 2.6.18 kernel with M5 patches.
> >         >>>>>
> >         >>>>> I have put the ethernet traces up at http://rickshin.ucsd.edu.
> >         >>>>> It is the only link. If you take a look, let me know what you
> >         >>>>> think.
> >         >>>>>
> >         >>>>> Best,
> >         >>>>> -Rick
> >         >>>>>
> >         >>>>>
> >         >>>>> _______________________________________________
> >         >>>>> m5-dev mailing list
> >         >>>>> [email protected] <mailto:[email protected]>
> >         <mailto:[email protected] <mailto:[email protected]>>
> >         >>>>> http://m5sim.org/mailman/listinfo/m5-dev
> >         >>>>>
> >         >>>>>
> >         >>>>>