Lisa Hsu wrote:
> FYI, I don't know about the specweb95 benchmark Rick mentioned he was
> using, but I was poking around with our surge-specweb benchmark that
> ships with M5 and got the same 404 problems. I noticed we don't have
> any hooks to fileset images in our tree, which seemed to be an obvious
> source of a problem. I added a specweb-fileset disk to my sim and my
> 404's went away.

I actually have access to these files (and am mounting them) but still have a problem. I was using bigfiles-fileset.img, but maybe I should switch back to specweb-fileset.img. I will run a test with specweb-fileset to see if everything works.

> I poked around our images, old and new, and saw no references at all
> to a mod_specweb99.so, so I don't know what that was about.

I was told this in a conversation, but it may or may not be true. I wouldn't worry about that too much.

Thanks,
-Rick

> Lisa
>
> On Wed, Feb 4, 2009 at 5:09 PM, Rick Strong <[email protected]> wrote:
>
> Lisa Hsu wrote:
> > Rick,
> >
> > 1) Just to follow up, did you figure out a good speed for the
> > checkpoint producing run?
> I guess my situation is a bit complicated as I am using asymmetric
> hardware (many configurations) at different frequencies and
> complexity. I confirmed that there indeed was a slowdown happening
> due to a packet that timed out on the server side. Your advice was
> on target. Thanks. My solution was to add a script with different
> ethernet link delays from 100us to 10ms and to find a point where
> greater stability occurred across all the hardware configurations.
> >
> > 2) Did you ever find why everything was 404'ed?
> This issue has not been resolved. The initial suggestion was that
> the problem was caused by:
> > bad
> > mod_specweb99.so
> However, talking with Ali, it appears that Apache was not fixed. I
> opted to use the version that sends 404 responses as it still
> has requests and transferring (albeit requiring greater
> description if used). The last conversation with Ali suggested
> that he was moving towards lighttpd, but that license issues
> prevent its general release to the M5 community. A snippet is
> copied below.
>
> "I created a lighttpd one that transfers large files only, but it
> never made it into M5 (nor can it because of licensing issues from
> the components it's based on). I never fixed apache."
> >
> > Ali
> >
> > Lisa
> >
> > On Wed, Jan 28, 2009 at 9:44 PM, Lisa Hsu <[email protected]> wrote:
> >
> > I couldn't say exactly; too much in M5 has changed since that paper
> > to give an exact number, but I'd imagine whatever
> > instructions/second you're getting in the detailed, if you make
> > the checkpoint run the appropriate speed considering its 1 IPC,
> > you'd probably be in the right range.
> >
> > Lisa
> >
> > On Wed, Jan 28, 2009 at 8:26 PM, Rick Strong <[email protected]> wrote:
> >
> > The clock rate on the server is only set to 1GHz on the checkpoint run
> > (as opposed to 3GHz for the detailed simulation). How slow should it be
> > set? Are we talking nearer to 250MHz?
> >
> > Thanks,
> > -Rick
> >
> > Ali Saidi wrote:
> > > That's almost certainly what is happening. Different packets are
> > > trying to be sent, both originating from the kernel. This isn't a
> > > device bug. It's exactly what that paper described. The delay observed
> > > by the server has changed so dramatically that a retransmit is occurring
> > > because the ack didn't arrive within twice the round-trip latency.
> > > You should add some latency to the ethernet link, and drive the server
> > > with a slower CPU during the checkpoint run. That will normally fix
> > > the problem.
> > >
> > > Ali
> > >
> > > On Jan 28, 2009, at 5:59 PM, Rick Strong wrote:
> > >
> > >> This is interesting. Thanks for the link.
> > >>
> > >> -Rick
> > >>
> > >> Lisa Hsu wrote:
> > >>
> > >>> Your description that it only occurs when you switch to a timing sim
> > >>> makes me think of this (not to toot my own horn or anything):
> > >>>
> > >>> http://www.eecs.umich.edu/~hsul/pubs/mobs05.pdf
> > >>>
> > >>> Just throwing that out as a possibility. You might want to "slow
> > >>> down" your checkpoint dropping run so that it's not so disruptive
> > >>> when you switch over to timing.
> > >>>
> > >>> Lisa
> > >>>
> > >>> On Wed, Jan 28, 2009 at 5:32 PM, Rick Strong <[email protected]> wrote:
> > >>>
> > >>> I have posted tar.gz files that include EthernetAll output (in
> > >>> ethernet_all.trace) @ http://rickshin.ucsd.edu. Once you have a
> > >>> chance, if you could take a look at the trace and figure out what
> > >>> is wrong, that would be great.
> > >>>
> > >>> I ended up restoring from the checkpoint for two runs. One run
> > >>> stays in atomic mode while the other switches to timing and
> > >>> detailed. The run that stays in atomic mode works fine. This leads
> > >>> me to believe that the checkpoint restore mechanism is fine. The
> > >>> fault likely lies in the switching to timing or detailed mode.
> > >>>
> > >>> The differences between the runs:
> > >>>
> > >>> (1) m5out-atomic-run-aftercheckpoint.tar.gz is a run that stays in
> > >>> atomic mode (no switching to timing mode)
> > >>>
> > >>> (2) m5out-timing-run-aftercheckpoint.tar.gz is a run that
> > >>> switches to timing and then to detailed mode.
> > >>>
> > >>> Thanks and good luck,
> > >>>
> > >>> -Rick
> > >>>
> > >>> Ali Saidi wrote:
> > >>>
> > >>>> Looking at the trace, it appears as though you just restored from a
> > >>>> checkpoint. Is this the case? If so, what does the checkpoint
> > >>>> dropping run do after that checkpoint is created? It's so early in
> > >>>> the trace that I would guess it's a serialization bug, particularly
> > >>>> in the TSO code. However, I looked quickly at the code and nothing
> > >>>> seemed to jump out at me. If you can provide me with an EthernetAll
> > >>>> trace from the checkpoint run and from the restored run I can work
> > >>>> on figuring out what the problem is.
> > >>>>
> > >>>> Ali
> > >>>>
> > >>>> On Jan 28, 2009, at 2:53 AM, Rick Strong wrote:
> > >>>>
> > >>>>>> There are three possibilities here:
> > >>>>>> a) A kernel bug
> > >>>>>> b) a device model/driver bug
> > >>>>>> c) a checkpointing bug (as it relates to (b))
> > >>>>>>
> > >>>>>> What kernel version are you using? Could you put the ethernet
> > >>>>>> trace somewhere so I could look at it?
> > >>>>>>
> > >>>>>> Ali
> > >>>>>>
> > >>>>> I am using the kernel 2.6.18 with M5 patches.
> > >>>>>
> > >>>>> I have put the ethernet traces up at http://rickshin.ucsd.edu.
> > >>>>> It is the only link. If you take a look, let me know what you think.
> > >>>>>
> > >>>>> Best,
> > >>>>> -Rick
> > >>>>>
> > >>>>> _______________________________________________
> > >>>>> m5-dev mailing list
> > >>>>> [email protected]
> > >>>>> http://m5sim.org/mailman/listinfo/m5-dev
