Yes, many of the things I do for gem5 can be very frustrating. A hack that works is not necessarily any better than code that doesn't work because it will have to be maintained and worked with/around for a long time. It can cause more damage than what it's supposed to be fixing, and probably will.
Gabe

On 12/30/11 04:03, Korey Sewell wrote:
> I can't vouch for reading all the emails but I have gone through this whole
> thread (which dates back to Nov. 29th).
>
> Also, I'm not all the way familiar with x86 so maybe this excludes me from
> understanding the problem at the detailed level, but I think I am starting
> to get a good grasp of the general squashing problem here (basically
> maintaining squash state through exception events).
>
> My concern is that if you don't literally "fix" the problem first, you can
> get caught up in the minutiae of making this big grand sweeping change and
> then have no good way to say if "the fix" fixes anything in the first place.
>
> If Nilay or anyone could get something to the reviewboard that worked, hack
> or not, then that would be a good step toward making the "clean" change
> that I think you're referring to, Gabe. We don't have to commit the code, but
> on a 1st pass working is better than "not working", right? :)
>
> (Gabe, I do understand it can be frustrating explaining the same things
> over/over again.)
>
> On Fri, Dec 30, 2011 at 3:48 AM, Gabe Black <[email protected]> wrote:
>
>> If you read my emails the problem would already be identified and
>> understood, because I did that weeks or even months ago and explained it
>> multiple times. A hack fix is not ok. A hack fix is why this is still
>> broken in the first place. That's also something I explained in my emails.
>>
>> Gabe
>>
>> On 12/30/11 02:50, Korey Sewell wrote:
>>> I agree with you Gabe that the squashing mechanism could be cleaned up.
>>>
>>> But I'd also suggest that Nilay should understand/identify the problem
>>> first and then implement a first-pass fix without a big squashing revamp
>>> (if possible).
>>>
>>> That way, when we (nilay, you, me, whoever) get to revamping the squash
>>> code, there is at least a set test case and trace we can use to debug with.
>>> On Fri, Dec 30, 2011 at 2:30 AM, Gabe Black <[email protected]> wrote:
>>>> On 12/05/11 05:24, Gabe Black wrote:
>>>>> On 12/03/11 13:02, Nilay Vaish wrote:
>>>>>> On Wed, 30 Nov 2011, Gabriel Michael Black wrote:
>>>>>>
>>>>>>> That may be the same thing that's happening with Ali's branch
>>>>>>> predictor patch. With Ruby execution changes enough to hit one of the
>>>>>>> broken squashing cases. The Ruby integration is probably working.
>>>>>>>
>>>>>>> Gabe
>>>>>>>
>>>>>>> Quoting Nilay Vaish <[email protected]>:
>>>>>>>
>>>>>>>> Gabe, when I boot FS with O3 CPU and Ruby, I get the following
>>>>>>>> output on the terminal of the simulated system.
>>>>>>>>
>>>>>>>> EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
>>>>>>>> VFS: Mounted root (ext2 filesystem).
>>>>>>>> Freeing unused kernel memory: 232k freed
>>>>>>>> init[1]: segfault at ffffffff802095c0 rip ffffffff802095c8 rsp
>>>>>>>> 00007fff38fa81b8 error 15
>>>>>>>> init[1]: segfault at ffffffff802095c0 rip ffffffff802095c8 rsp
>>>>>>>> 00007fff38fa81b8 error 15
>>>>>>>>
>>>>>>>> The segfault message keeps appearing. Do you know why this might be
>>>>>>>> happening?
>>>>>>>>
>>>>>> Gabe, how can I confirm this? Is there something that I can do to
>>>>>> resolve the problem with branch prediction?
>>>>>>
>>>>>> Thanks
>>>>>> Nilay
>>>>>> _______________________________________________
>>>>>> gem5-dev mailing list
>>>>>> [email protected]
>>>>>> http://m5sim.org/mailman/listinfo/gem5-dev
>>>>> I'm fairly confident that's what's going on. The stack address is user
>>>>> space and the instruction pointer is in kernel space. The page fault is
>>>>> from near the ip, and the error code is 15 which means, if I'm not
>>>>> mistaken, a permission problem on fetch. You can't easily fix the
>>>>> problem, but if you want to get started the first step would be to clean
>>>>> up the squashing mechanisms in O3 like I brought up in that email a
>>>>> while ago.
>>>>> The real problem is that squashing doesn't always preserve
>>>>> enough state (the macroop instance specifically) in all situations, and
>>>>> that the squashing stuff is too ad-hoc and all over the place to really
>>>>> fix it correctly and know that it's correct. I'd thought I fixed it
>>>>> before when I fixed one particular squash path, but obviously I didn't
>>>>> get it all.
>>>>>
>>>>> Gabe
>>>> What was unclear about this email and the ones before it? Did you not
>>>> believe me for some reason? You've spent about a month partially
>>>> rediscovering what I explained in them. I've already said how this needs
>>>> to be fixed.
>>>>
>>>> Gabe
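
For reference, the "error 15" in the quoted segfault output can be decoded bit by bit. The following is a minimal sketch, not gem5 code. It assumes the kernel prints the page-fault error code in hexadecimal (so "15" is read as 0x15) and uses the x86 page-fault error code bit layout; the names PF_BITS and decode_page_fault_error are made up for illustration.

```python
# Sketch only: decode an x86 page-fault error code into readable properties.
# Each entry is (bit mask, meaning when set, meaning when clear or None).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "not a write"),
    (0x04, "user mode", "supervisor mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]


def decode_page_fault_error(code):
    """Return a list of human-readable properties for an error code."""
    props = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            props.append(if_set)
        elif if_clear is not None:
            props.append(if_clear)
    return props


# Assumption: the "15" in the kernel log line is hexadecimal, i.e. 0x15.
print(decode_page_fault_error(0x15))
```

Under that assumption the set bits are 0, 2 and 4: a protection violation, raised from user mode, on an instruction fetch. That agrees with the reading in the thread of a permission problem on fetch, with a user-space stack pointer and an instruction pointer in kernel space.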
