Lisa Hsu wrote:
> FYI, I don't know about the specweb95 benchmark Rick mentioned he was 
> using, but I was poking around with our surge-specweb benchmark that 
> ships with M5 and got the same 404 problems.  I noticed we don't have 
> any hooks to fileset images in our tree, which seemed to be an obvious 
> source of a problem.  I added a specweb-fileset disk to my sim and my 
> 404's went away.
I actually have access to these files (and am mounting them) but still 
have a problem. I was using bigfiles-fileset.img, but maybe I should 
switch back to specweb-fileset.img. I will run a test with 
specweb-fileset to see whether everything works.
> I poked around our images, old and new, and saw no references at all 
> to a mod_specweb99.so, so I don't know what that was about.
I was told this in a conversation, but it may or may not be true. I 
wouldn't worry about that too much.

Thanks,
-Rick
>
> Lisa
>
> On Wed, Feb 4, 2009 at 5:09 PM, Rick Strong <[email protected]> wrote:
>
>     Lisa Hsu wrote:
>     > Rick,
>     >
>     > 1) Just to follow up, did you figure out a good speed for the
>     > checkpoint producing run?
>     I guess my situation is a bit complicated as I am using asymmetric
>     hardware (many configurations) at different frequencies and
>     complexity. I confirmed that there was indeed a slowdown caused by a
>     packet that timed out on the server side. Your advice was on target.
>     Thanks. My solution was to add a script that tries different ethernet
>     link delays from 100us to 10ms to find a point where greater stability
>     occurred across all the hardware configurations.
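
The sweep script itself is not included in this thread, so the following is only a minimal sketch of how the candidate delays between 100us and 10ms could be generated. The function names, the log spacing, and the choice of five points are all assumptions made for illustration; how each value is then handed to the simulated ethernet link is left out on purpose.

```python
def link_delay_sweep(start_us=100.0, stop_us=10_000.0, points=5):
    """Return log-spaced link delays in microseconds, endpoints included.

    Hypothetical helper: the spacing and the number of points are assumptions,
    only the 100us..10ms range comes from the message above.
    """
    if points < 2:
        raise ValueError("need at least two points")
    ratio = (stop_us / start_us) ** (1.0 / (points - 1))
    return [start_us * ratio ** i for i in range(points)]


def format_delay(delay_us):
    """Render a delay as a string with a 'us' or 'ms' suffix."""
    if delay_us >= 1000.0:
        return f"{delay_us / 1000.0:g}ms"
    return f"{delay_us:g}us"


# Round to whole microseconds before formatting to hide float noise.
delays = [format_delay(round(d)) for d in link_delay_sweep()]
print(delays)
```

Each string would then be tried as the link delay for one checkpoint-producing run per hardware configuration, keeping the smallest delay that is stable across all of them.
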
>     >
>     > 2) Did you ever find why everything was 404'ed?
>     This issue has not been resolved.  The initial suggestion was that the
>     problem was caused by:
>
>     bad
>     > mod_specweb99.so
>
>     However, after talking with Ali, it appears that Apache was never
>     fixed. I opted to use the version that sends 404 responses, as it
>     still generates requests and transfers (albeit requiring more
>     explanation if used). The last conversation with Ali suggested
>     that he was moving towards lighttpd, but that license issues
>     prevent its general release to the M5 community. A snippet is
>     copied below.
>
>
>      "I created a lighttpd one that transfers large files only, but it
>     never made it into M5 (nor can it because of licensing issues from the
>     components it's based on). I never fixed apache."
>
>
>
>     Ali
>     >
>     > Lisa
>     >
>     > On Wed, Jan 28, 2009 at 9:44 PM, Lisa Hsu <[email protected]> wrote:
>     >
>     >     I couldn't say exactly; so much in M5 has changed since that
>     >     paper that I can't give an exact number. But I'd imagine that,
>     >     whatever instructions/second you're getting in the detailed run,
>     >     if you make the checkpoint run at the matching speed, considering
>     >     its 1 IPC, you'd probably be in the right range.
>     >
>     >     Lisa
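
As a rough worked example of that rule of thumb: match the checkpoint run's instructions/second to what the detailed run achieves, given that the checkpoint run executes at 1 IPC. The function name and the IPC value of 0.25 below are purely hypothetical; no IPC was reported in this thread, and only the 3GHz detailed clock comes from the messages.

```python
def checkpoint_clock_hz(detailed_clock_hz, detailed_ipc, checkpoint_ipc=1.0):
    """Clock rate that gives the checkpoint run the same instructions/second
    as the detailed run.

    detailed_ipc is whatever the detailed simulation actually achieves;
    checkpoint_ipc defaults to 1, as stated for the checkpoint run.
    """
    detailed_ips = detailed_clock_hz * detailed_ipc
    return detailed_ips / checkpoint_ipc


# Hypothetical numbers: a 3 GHz detailed run that sustains an IPC of 0.25.
clock = checkpoint_clock_hz(3e9, 0.25)
print(f"{clock / 1e6:g} MHz")
```

With a lower measured IPC the matching checkpoint clock drops proportionally, so the right value depends on the workload and hardware configuration.
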
>     >
>     >
>     >     On Wed, Jan 28, 2009 at 8:26 PM, Rick Strong <[email protected]> wrote:
>     >
>     >         The clock rate on the server is only set to 1GHz on the
>     >         checkpoint run (as opposed to 3GHz for the detailed
>     >         simulation). How slow should it be set? Are we talking
>     >         nearer to 250MHz?
>     >
>     >         Thanks,
>     >         -Rick
>     >
>     >         Ali Saidi wrote:
>     >         > That's almost certainly what is happening. Different packets
>     >         > are trying to be sent, both originating from the kernel. This
>     >         > isn't a device bug. It's exactly what that paper described. The
>     >         > delay observed by the server has changed so dramatically that a
>     >         > retransmit is occurring because the ack didn't arrive within
>     >         > twice the round-trip latency. You should add some latency to
>     >         > the ethernet link, and drive the server with a slower CPU
>     >         > during the checkpoint run. That will normally fix the problem.
>     >         >
>     >         > Ali
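
To see why both suggestions help, here is a toy model that follows the "twice the round-trip latency" wording above. It is a deliberate simplification and not how the kernel's TCP retransmission timer is actually computed; the function name and all the numbers are hypothetical.

```python
def spurious_retransmit(link_delay_us, service_checkpoint_us, service_detailed_us):
    """Toy check: does the round trip seen after switching CPU models exceed
    twice the round trip learned during the checkpoint run?

    A round trip is modeled as two link traversals plus the time the peer
    needs to process the packet (the service time). All values in microseconds.
    """
    rtt_learned = 2 * link_delay_us + service_checkpoint_us
    rtt_after_switch = 2 * link_delay_us + service_detailed_us
    return rtt_after_switch > 2 * rtt_learned


# Fast checkpoint run, short link: the detailed run looks like a lost ack.
print(spurious_retransmit(1, 10, 100))    # True
# More link latency shrinks the relative change in round-trip time.
print(spurious_retransmit(100, 10, 100))  # False
# A slower checkpoint CPU brings the learned round trip closer to the detailed one.
print(spurious_retransmit(1, 60, 100))    # False
```

In this model, extra link latency adds the same constant to both round trips, and a slower checkpoint CPU raises the learned round trip, so either change reduces the ratio between the two.
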
>     >         >
>     >         >
>     >         >
>     >         >
>     >         > On Jan 28, 2009, at 5:59 PM, Rick Strong wrote:
>     >         >
>     >         >
>     >         >> This is interesting. Thanks for the link.
>     >         >>
>     >         >> -Rick
>     >         >>
>     >         >> Lisa Hsu wrote:
>     >         >>
>     >         >>> Your description that it only occurs when you switch to a
>     >         >>> timing sim makes me think of this (not to toot my own horn
>     >         >>> or anything):
>     >         >>>
>     >         >>> http://www.eecs.umich.edu/~hsul/pubs/mobs05.pdf
>     >         >>>
>     >         >>> Just throwing that out as a possibility.  You might want to
>     >         >>> "slow down" your checkpoint dropping run so that it's not so
>     >         >>> disruptive when you switch over to timing.
>     >         >>>
>     >         >>> Lisa
>     >         >>>
>     >         >>> On Wed, Jan 28, 2009 at 5:32 PM, Rick Strong <[email protected]> wrote:
>     >         >>>
>     >         >>>    I have posted tar.gz files that include EthernetAll
>     >         >>>    output (in ethernet_all.trace) @ http://rickshin.ucsd.edu.
>     >         >>>    Once you have a chance, if you could take a look at the
>     >         >>>    trace and figure out what is wrong, that would be great.
>     >         >>>
>     >         >>>    I ended up restoring from the checkpoint for two runs. One
>     >         >>>    run stays in atomic mode while the other switches to timing
>     >         >>>    and detailed. The run that stays in atomic mode works fine.
>     >         >>>    This leads me to believe that the checkpoint restore
>     >         >>>    mechanism is fine. The fault likely lies in the switch to
>     >         >>>    timing or detailed mode.
>     >         >>>
>     >         >>>    The differences between the runs:
>     >         >>>
>     >         >>>    (1) m5out-atomic-run-aftercheckpoint.tar.gz is a run that
>     >         >>>    stays in atomic mode (no switching to timing mode)
>     >         >>>
>     >         >>>    (2) m5out-timing-run-aftercheckpoint.tar.gz is a run that
>     >         >>>    switches to timing and then to detailed mode.
>     >         >>>
>     >         >>>
>     >         >>>    Thanks and good luck,
>     >         >>>
>     >         >>>    -Rick
>     >         >>>
>     >         >>>    Ali Saidi wrote:
>     >         >>>
>     >         >>>> Looking at the trace, it appears as though you just restored
>     >         >>>> from a checkpoint. Is this the case? If so, what does the
>     >         >>>> checkpoint dropping run do after that checkpoint is created?
>     >         >>>> It's so early in the trace that I would guess it's a
>     >         >>>> serialization bug, particularly in the TSO code. However, I
>     >         >>>> looked quickly at the code and nothing seemed to jump out at
>     >         >>>> me. If you can provide me with an EthernetAll trace from the
>     >         >>>> checkpoint run and from the restored run I can work on
>     >         >>>> figuring out what the problem is.
>     >         >>>>
>     >         >>>> Ali
>     >         >>>>
>     >         >>>>
>     >         >>>>
>     >         >>>> On Jan 28, 2009, at 2:53 AM, Rick Strong wrote:
>     >         >>>>
>     >         >>>>
>     >         >>>>
>     >         >>>>>> There are three possibilities here:
>     >         >>>>>> a) A kernel bug
>     >         >>>>>> b) a device model/driver bug
>     >         >>>>>> c) a checkpointing bug (as it relates to (b))
>     >         >>>>>>
>     >         >>>>>> What kernel version are you using? Could you put the
>     >         >>>>>> ethernet trace somewhere so I could look at it?
>     >         >>>>>>
>     >         >>>>>> Ali
>     >         >>>>>>
>     >         >>>>>>
>     >         >>>>>>
>     >         >>>>> I am using kernel 2.6.18 with M5 patches.
>     >         >>>>>
>     >         >>>>> I have put the ethernet traces up at
>     >         >>>>> http://rickshin.ucsd.edu. It is the only link. If you take
>     >         >>>>> a look, let me know what you think.
>     >         >>>>>
>     >         >>>>> Best,
>     >         >>>>> -Rick
>     >         >>>>>
>     >         >>>>>
>     >         >>>>> _______________________________________________
>     >         >>>>> m5-dev mailing list
>     >         >>>>> [email protected]
>     >         >>>>> http://m5sim.org/mailman/listinfo/m5-dev
>     >         >>>>>
>     >         >>>>>
>     >         >>>>>

_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
