The netperf test is running. It was good that Ali brought up this email from 5-6 years back. The very first error I encountered was due to the decode cache being shared. For the time being I have taken the easy way out and made the cache a per-decoder object, instead of it being shared amongst the decoders. Apart from that I really did not face any problems.
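The per-decoder change can be sketched roughly as below. This is a minimal illustration, not gem5's actual Decoder or decode-cache code; the StaticInst stand-in and the "decode" logic here are placeholders. The point is just that each Decoder instance owns its own cache, so decoders driven from different event queues never touch the same map.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for gem5's StaticInst.
struct StaticInst { std::string mnemonic; };

class Decoder {
  public:
    // Each Decoder owns its own cache, so two decoders running on
    // different event queues never share state.
    const StaticInst *decode(uint32_t machInst) {
        auto it = cache.find(machInst);
        if (it != cache.end())
            return &it->second;          // hit: reuse the memoized decode
        // Miss: "decode" (placeholder logic) and memoize the result.
        StaticInst inst{(machInst & 1) ? "odd_op" : "even_op"};
        return &cache.emplace(machInst, inst).first->second;
    }
  private:
    // Previously a single cache shared amongst decoders; now per-object.
    std::unordered_map<uint32_t, StaticInst> cache;
};
```

Note that pointers into an unordered_map stay valid across rehashes, so handing out `&it->second` is safe as long as the entry is never erased.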
The etherlink object was provided with the sim objects at its two ends. When the etherlink object needs to schedule the event that moves a packet from one end to the other, it schedules the event via the sim object at the other end. The code in the event queue class then decides which queue the event should be scheduled on. One thing that does not seem to be working correctly right now is the automatic deletion of the per-queue events that form a global event; we can handle that in due course. Apart from that, I think it is working fine.

--
Nilay

On Fri, February 1, 2013 3:14 pm, Steve Reinhardt wrote:
> Forwarding this thread to the dev list since others in the community
> might be interested (both in the technical issues and in the fact that
> someone is working in this direction).
>
> Steve
>
>
> On Fri, Feb 1, 2013 at 9:01 AM, Steve Reinhardt <ste...@gmail.com> wrote:
>
>> Glad you found this, Ali.
>>
>> As far as the decode cache: a great big lock is certainly adequate just
>> to bring things up and get it working. In the longer term, we should
>> take advantage of the fact that the decode cache is read-mostly (or at
>> least it should be... if it's not, we have bigger problems) to do
>> something more intelligent. I'm guessing it would be possible to make
>> the decode cache lock-free using cmpxchg; if not, some sort of
>> medium-grain multiple-reader-single-writer locking scheme could also
>> work. But those optimizations should be left for later; I just wanted
>> to bring them up now for the record while I was thinking of them. In
>> particular, I think making the decode cache per-thread is the wrong way
>> to go.
>>
>> Steve
>>
>>
>> On Fri, Feb 1, 2013 at 8:31 AM, Ali Saidi <sa...@umich.edu> wrote:
>>
>>> Hi Nilay,
>>>
>>> I finally found an email which I've been looking for since the last
>>> email you sent about running multiple systems in gem5. An undergrad
>>> named Miles got two systems running in gem5 (in 2007).
>>> None of the diffs are useful at this point, since everything has
>>> changed, but in the process he did identify the areas he had to lock
>>> around to make multiple systems work. I'm not sure if you've gotten
>>> past this point yet, but these are the areas he identified and
>>> "fixed." The fix was just a great-big-lock around each of them, which
>>> for the decode cache really hurt performance.
>>>
>>> FastAlloc: gone, so no problem, and tcmalloc at least is thread-safe.
>>>
>>> RefCount: I'm not sure if this is still a problem or not. If the
>>> pointers you're going to exchange are reference counted, they could
>>> be. Certainly another issue (see below) is refcounting of
>>> instructions. This might be the biggest reason to move toward C++11
>>> pointers. Miles ended up using gcc intrinsics
>>> (__atomic_compare_exchange() on the incref/decref members), although
>>> there are now std::atomic and __atomic_fetch_add(), which are probably
>>> more useful than having to write a while loop around the compare and
>>> exchange.
>>>
>>> Stream output (e.g. DPRINTFs from multiple threads).
>>>
>>> Decode cache: since it can be shared across threads (perhaps it
>>> shouldn't be, or maybe it should be), and the STL structures aren't
>>> thread-safe by default.
>>>
>>> Thanks,
>>>
>>> Ali
>>>
>>> _______________________________________________
>>> gem5-dev mailing list
>>> gem5-dev@gem5.org
>>> http://m5sim.org/mailman/listinfo/gem5-dev
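For the record, the atomic incref/decref idea Ali describes can be sketched with C++11 atomics, which avoids the hand-written compare-and-exchange loop entirely. This is a minimal illustration under assumed names, not gem5's actual RefCount.hh:

```cpp
#include <atomic>
#include <cassert>

// Sketch of a thread-safe reference count. The class and member names
// are hypothetical; gem5's real RefCounted base class differs.
class RefCounted {
  public:
    void incref() {
        // A single atomic read-modify-write; no CAS retry loop needed.
        count.fetch_add(1, std::memory_order_relaxed);
    }
    bool decref() {
        // Returns true when the last reference is dropped. acq_rel
        // ordering makes the owner's prior writes visible to whichever
        // thread ends up destroying the object.
        return count.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }
    int refs() const { return count.load(std::memory_order_relaxed); }
  private:
    std::atomic<int> count{0};
};
```

`fetch_add`/`fetch_sub` compile down to the same hardware primitives as the GCC `__atomic_fetch_add()` intrinsic, so switching from the intrinsics to `std::atomic` is mostly a portability and readability win.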