So what is the relatively good way to make this work in the short term? A bus? What about the slightly better version? I suppose a small cache might be ok and probably somewhat realistic.
Thanks,
Ali

On Tue, 23 Nov 2010 08:15:01 -0800, Steve Reinhardt wrote:

And even though I do think it could be made to work, I'm not sure it would be easy or a good idea. There are a lot of corner cases to worry about, especially for writes, since you'd have to actually buffer the write data somewhere as opposed to just remembering that so-and-so has requested an exclusive copy.

Actually as I think about it, that might be the case that's breaking now... if the L1 has an exclusive copy and then it snoops a write (and not a read-exclusive), I'm guessing it will just invalidate its copy, losing the modifications. I wouldn't be terribly surprised if reads are working OK (the L1 should snoop those and respond if it's the owner), and of course it's all OK if the L1 doesn't have a copy of the block.

So maybe there is a relatively easy way to make this work, but figuring out whether that's true and then testing it is still a non-trivial amount of effort.

Steve

On Tue, Nov 23, 2010 at 7:57 AM, Steve Reinhardt wrote:

No, when the L2 receives a request it assumes the L1s above it have already been snooped, which is true since the request came in on the bus that the L1s snoop. The issue is that caches don't necessarily behave correctly when non-cache-block requests come in through their mem-side (snoop) port and not through their cpu-side (request) port. I'm guessing this could be made to work, I'd just be very surprised if it does right now, since the caches weren't designed to deal with this case and aren't tested this way.

Steve

On Tue, Nov 23, 2010 at 7:50 AM, Ali Saidi wrote:

Does it? Shouldn't the l2 receive the request, ask for the block and end up snooping the l1s?

Ali

On Tue, 23 Nov 2010 07:30:00 -0800, Steve Reinhardt wrote:

The point is that connecting between the L1 and L2 induces the same problems wrt the L1 that connecting directly to memory induces wrt the whole cache hierarchy.
You're just statistically more likely to get away with it in the former case because the L1 is smaller.

Steve

On Tue, Nov 23, 2010 at 7:16 AM, Ali Saidi wrote:

Where are you connecting the table walker? If it's between the l1 and l2, my guess is that it will work. If it is to the memory bus, then yes, memory is just responding without the help of a cache, and this could be the reason.

Ali

On Tue, 23 Nov 2010 06:29:20 -0500, Gabe Black wrote:

I think I may have just now. I've fixed a few issues, and am now getting to the point where something that should be in the pagetables is causing a page fault. I found where the table walker is walking the tables for this particular access, and the last level entry is all 0s. There could be a number of reasons this is all 0s, but since the main difference other than timing between this and a working configuration is the presence of caches and we've identified a potential issue there, I'm inclined to suspect the actual page table entry is still in the L1 and hasn't been evicted out to memory yet.

To fix this, is the best solution to add a bus below the CPU for all the connections that need to go to the L1? I'm assuming they'd all go into the dcache since they're more data-ey and that keeps the icache read only (ignoring SMC issues), and the dcache is probably servicing lower bandwidth normally. It also seems a little strange that this type of configuration is going on in the BaseCPU.py SimObject python file and not a configuration file, but I could be convinced there's a reason. Even if this isn't really a "fix" or the "right thing" to do, I'd still like to try it temporarily at least to see if it corrects the problem I'm seeing.

Gabe

Ali Saidi wrote:

I haven't seen any strange behavior yet. That isn't to say it's not going to cause an issue in the future, but we've taken many a tlb miss and it hasn't fallen over yet.
Ali

On Mon, 22 Nov 2010 13:08:13 -0800, Steve Reinhardt wrote:

Yea, I just got around to reading this thread and that was the point I was going to make... the L1 cache effectively serves as a translator between the CPU's word-size read & write requests and the coherent block-level requests that get snooped. If you attach a CPU-like device (such as the table walker) directly to an L2, the CPU-like accesses that go to the L2 will get sent to the L1s but I'm not sure they'll be handled correctly. Not that they fundamentally couldn't, this just isn't a configuration we test so it's likely that there are problems... for example, the L1 may try to hand ownership to the requester but the requester won't recognize that and things will break.

Steve

On Mon, Nov 22, 2010 at 12:00 PM, Gabe Black wrote:

What happens if an entry is in the L1 but not the L2?

Gabe

Ali Saidi wrote:

> Between the l1 and l2 caches seems like a good place to me. The caches
> can cache page table entries, otherwise a tlb miss would be even more
> expensive than it is. The l1 isn't normally used for such things since
> it would get polluted (look at why sparc has a "load 128 bits from l2,
> do not allocate into l1" instruction).
>
> Ali
>
> On Nov 22, 2010, at 4:27 AM, Gabe Black wrote:
>
>> For anybody waiting for an x86 FS regression (yes, I know, you can
>> all hardly wait, but don't let this spoil your Thanksgiving) I'm getting
>> closer to having it working, but I've discovered some issues with the
>> mechanisms behind the --caches flag with fs.py and x86. I'm surprised I
>> never thought to try it before. It also brings up some questions about
>> where the table walkers should be hooked up in x86 and ARM. Currently
>> it's after the L1, if any, but before the L2, if any, which seems wrong
>> to me. Also caches don't seem to propagate requests upwards to the CPUs
>> which may or may not be an issue. I'm still looking into that.
>>
>> Gabe
_______________________________________________ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
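
Editorial illustration of the two failure modes discussed in the thread. This is a toy Python model, not M5 code: `Memory`, `WriteBackL1`, `direct_read`, `direct_write`, the 64-byte block size, and the addresses are all hypothetical names and values invented for the sketch. It assumes a single write-back L1 and a table walker that bypasses it. The first function shows the walker reading an all-zero page table entry because the real entry is still dirty in the L1. The second shows the behavior Steve guesses at, where an L1 that snoops a partial write simply invalidates its dirty block and loses the CPU's modification.

```python
BLOCK = 64  # hypothetical cache block size in bytes


def split(addr):
    """Return (block base address, offset within the block)."""
    return addr - addr % BLOCK, addr % BLOCK


class Memory:
    """Toy backing store; untouched blocks read as all zeros."""

    def __init__(self):
        self.blocks = {}

    def read_block(self, baddr):
        return bytearray(self.blocks.get(baddr, bytes(BLOCK)))

    def write_block(self, baddr, data):
        self.blocks[baddr] = bytearray(data)


class WriteBackL1:
    """Toy write-back cache: dirty data stays here until evicted."""

    def __init__(self, mem):
        self.mem = mem
        self.lines = {}  # block base address -> [data, dirty flag]

    def _fill(self, baddr):
        if baddr not in self.lines:
            self.lines[baddr] = [self.mem.read_block(baddr), False]
        return self.lines[baddr]

    def read(self, addr, size):
        baddr, off = split(addr)
        return bytes(self._fill(baddr)[0][off:off + size])

    def write(self, addr, data):
        baddr, off = split(addr)
        line = self._fill(baddr)
        line[0][off:off + len(data)] = data
        line[1] = True

    def snoop_partial_write(self, addr):
        # Assumed naive handling: drop the block even if it is dirty.
        self.lines.pop(split(addr)[0], None)

    def evict_all(self):
        # Write dirty blocks back to memory, then empty the cache.
        for baddr, (data, dirty) in self.lines.items():
            if dirty:
                self.mem.write_block(baddr, data)
        self.lines.clear()


def direct_read(mem, addr, size):
    """Word-size read that bypasses the L1 (walker wired below it)."""
    baddr, off = split(addr)
    return bytes(mem.read_block(baddr)[off:off + size])


def direct_write(mem, addr, data):
    """Word-size write that bypasses the L1."""
    baddr, off = split(addr)
    block = mem.read_block(baddr)
    block[off:off + len(data)] = data
    mem.write_block(baddr, block)


def stale_walk_demo():
    mem = Memory()
    l1 = WriteBackL1(mem)
    pte = (0xDEADBEEF).to_bytes(8, "little")
    l1.write(0x1000, pte)                    # OS writes the PTE through the L1
    bypass = direct_read(mem, 0x1000, 8)     # walker bypassing the L1
    through = l1.read(0x1000, 8)             # walker sharing the L1's cpu side
    l1.evict_all()
    after_evict = direct_read(mem, 0x1000, 8)
    return bypass, through, after_evict


def lost_write_demo():
    mem = Memory()
    l1 = WriteBackL1(mem)
    l1.write(0x2000, b"\x11" * 8)            # CPU dirties a word in the L1
    direct_write(mem, 0x2008, b"\x22" * 8)   # walker writes a neighbouring word
    l1.snoop_partial_write(0x2008)           # L1 invalidates its dirty copy
    return l1.read(0x2000, 8), l1.read(0x2008, 8)


if __name__ == "__main__":
    print(stale_walk_demo())
    print(lost_write_demo())
```

In this model, routing the walker through the same cache the CPU uses (the "bus below the CPU" idea) corresponds to calling `l1.read` instead of `direct_read`, which is why the second value returned by `stale_walk_demo` is the real entry. Whether the actual M5 caches behave like `snoop_partial_write` is exactly the open question in the thread, so treat that method as a guess rather than a description of the simulator.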