Re: [m5-dev] Parallel M5

2008-06-30 Thread Gabe Black
Yes. CMPXCHG on page 98 of volume 3 of the AMD manuals. It says it supports the lock prefix so I'm assuming it's not otherwise atomic. Gabe > non-blocking update (has x86 added a compare-and-swap yet?).
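For readers following the compare-and-swap discussion, here is a minimal sketch of a non-blocking insert built on one. It uses std::atomic from modern C++, which was not available when this thread was written; on x86, compilers typically lower compare_exchange to a lock-prefixed CMPXCHG. The Node and LockFreeList names are hypothetical and are not M5 code.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical list node; not an M5 type.
struct Node {
    int value;
    Node *next;
};

// Push-only singly linked list. Insertion never takes a lock: it retries
// the compare-and-swap until no other thread has changed head in between.
class LockFreeList {
  public:
    void push(Node *n) {
        Node *old = head.load(std::memory_order_relaxed);
        do {
            // On failure, compare_exchange_weak reloads 'old' with the
            // current head, so the link is recomputed on every retry.
            n->next = old;
        } while (!head.compare_exchange_weak(old, n,
                                             std::memory_order_release,
                                             std::memory_order_relaxed));
    }

    // Walk the list and add up the values (readers take no lock either).
    int sum() const {
        int total = 0;
        for (Node *p = head.load(std::memory_order_acquire); p; p = p->next)
            total += p->value;
        return total;
    }

    const Node *front() const { return head.load(std::memory_order_acquire); }

  private:
    std::atomic<Node *> head{nullptr};
};
```

Because nothing is ever removed, this sketch sidesteps the ABA problem that a lock-free pop would have to deal with.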

Re: [m5-dev] Parallel M5

2008-06-30 Thread nathan binkert
> OK, that makes more sense now. Still seems like in the long term the > right thing is to use a data structure that supports multiple readers > with either per-bucket locks, a reader/writer lock, or some sort of > non-blocking update (has x86 added a compare-and-swap yet?). They've had cmpxchg s

Re: [m5-dev] Parallel M5

2008-06-30 Thread Steve Reinhardt
On Mon, Jun 30, 2008 at 2:35 PM, Ali Saidi <[EMAIL PROTECTED]> wrote: > > On Jun 30, 2008, at 5:15 PM, Steve Reinhardt wrote: > >> On Mon, Jun 30, 2008 at 9:11 AM, Ali Saidi <[EMAIL PROTECTED]> wrote: >>> The FastAlloc pools and StaticInst cache should clearly be >>> duplicated. >> >> Why would you

Re: [m5-dev] Parallel M5

2008-06-30 Thread Ali Saidi
On Jun 30, 2008, at 5:15 PM, Steve Reinhardt wrote: > On Mon, Jun 30, 2008 at 9:11 AM, Ali Saidi <[EMAIL PROTECTED]> wrote: >> The FastAlloc pools and StaticInst cache should clearly be >> duplicated. > > Why would you want to duplicate the StaticInst cache? It's a > read-mostly structure so y

Re: [m5-dev] Parallel M5

2008-06-30 Thread Steve Reinhardt
On Mon, Jun 30, 2008 at 9:11 AM, Ali Saidi <[EMAIL PROTECTED]> wrote: > The FastAlloc pools and StaticInst cache should clearly be duplicated. Why would you want to duplicate the StaticInst cache? It's a read-mostly structure so you'd only have to lock on a miss/insert, and having a larger shared
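A rough sketch of the "lock only on a miss/insert" idea for a read-mostly shared cache, assuming a reader/writer lock. It uses std::shared_mutex from C++17, which postdates this thread, and the SharedDecodeCache and DecodedInst names are hypothetical stand-ins, not the real StaticInst cache.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <unordered_map>

// Hypothetical stand-in for a decoded instruction.
struct DecodedInst {
    std::uint32_t machInst;
};

class SharedDecodeCache {
  public:
    std::shared_ptr<const DecodedInst> lookup(std::uint32_t machInst) {
        {
            // Hit path: many readers may hold the shared lock at once.
            std::shared_lock<std::shared_mutex> rd(mtx);
            auto it = cache.find(machInst);
            if (it != cache.end())
                return it->second;
        }
        // Miss path: take the exclusive lock, then re-check, because
        // another thread may have inserted the entry in the meantime.
        std::unique_lock<std::shared_mutex> wr(mtx);
        auto &slot = cache[machInst];
        if (!slot) {
            slot = std::make_shared<DecodedInst>(DecodedInst{machInst});
            ++misses;
        }
        return slot;
    }

    std::size_t missCount() const {
        std::shared_lock<std::shared_mutex> rd(mtx);
        return misses;
    }

  private:
    mutable std::shared_mutex mtx;
    std::unordered_map<std::uint32_t,
                       std::shared_ptr<const DecodedInst>> cache;
    std::size_t misses = 0;
};
```

Readers still pay for acquiring the shared lock on every hit, so whether this beats per-thread duplication would have to be measured.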

Re: [m5-dev] Parallel M5

2008-06-30 Thread Steve Reinhardt
On Mon, Jun 30, 2008 at 12:30 PM, nathan binkert <[EMAIL PROTECTED]> wrote: >> Another option would be to have a boolean that said if it needed to be >> atomic or not. If you knew that the object wouldn't span threads it >> wouldn't need to be set or perhaps at the point when it changed >> threads
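One way the quoted "boolean that says whether it needs to be atomic" idea could look, as a sketch only. The class layout, the markShared() hook, and the use of std::atomic are assumptions for illustration; this is not M5's RefCounted.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical reference-count base. While the object stays on one thread
// the count is updated with plain loads and stores; once it is marked as
// shared, updates switch to atomic read-modify-write operations.
class RefCounted {
  public:
    explicit RefCounted(bool spansThreads = false)
        : atomicMode(spansThreads) {}

    // Assumed to be called before the object is handed to another thread.
    void markShared() { atomicMode = true; }

    void incref() {
        if (atomicMode) {
            count.fetch_add(1, std::memory_order_relaxed);
        } else {
            // Relaxed load + store avoids the locked instruction.
            count.store(count.load(std::memory_order_relaxed) + 1,
                        std::memory_order_relaxed);
        }
    }

    // Returns true when the last reference has been dropped.
    bool decref() {
        if (atomicMode)
            return count.fetch_sub(1, std::memory_order_acq_rel) == 1;
        int remaining = count.load(std::memory_order_relaxed) - 1;
        count.store(remaining, std::memory_order_relaxed);
        return remaining == 0;
    }

    int refs() const { return count.load(std::memory_order_relaxed); }
    bool isShared() const { return atomicMode; }

  private:
    std::atomic<int> count{0};
    bool atomicMode;
};
```

The extra branch on every incref/decref is the cost of this approach; the sketch also assumes the flag is flipped while only one thread can see the object.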

Re: [m5-dev] Parallel M5

2008-06-30 Thread nathan binkert
> Stats Nodes are refcounted, I don't know enough about how that works, > but it seems like there could be a problem if you had a statistics > that spanned threads. Ah, those objects are for creating formulas and don't really get passed around. > Another option would be to have a boolean that said

Re: [m5-dev] Parallel M5

2008-06-30 Thread Ali Saidi
On Jun 30, 2008, at 2:03 PM, nathan binkert wrote: >> The FastAlloc pools and StaticInst cache should clearly be >> duplicated. >> RefCounted will need some work though. I think the only problematic >> case is the EthPacketData and perhaps the statistics at the moment. > > Good call. > > What e

Re: [m5-dev] Parallel M5

2008-06-30 Thread nathan binkert
> The FastAlloc pools and StaticInst cache should clearly be duplicated. > RefCounted will need some work though. I think the only problematic > case is the EthPacketData and perhaps the statistics at the moment. Good call. What exactly is the problem with stats? They're for the most part part o

Re: [m5-dev] Parallel M5

2008-06-30 Thread Ali Saidi
You're going to need to add an item to your plan to lock around or duplicate shared structures. Miles got these all to work, but in many cases they were with locks that killed performance. Places that come to mind are: FastAlloc pools RefCounted (yet another reason to not refcount memory

Re: [m5-dev] Parallel M5

2008-06-30 Thread Steve Reinhardt
If you're interested in the algorithm, here's a good place to start: http://www.eecs.umich.edu/~stever/pubs/sigmetrics93_wwt.pdf The "synchronization events" are equivalent to the "quantum expiration events" of section 4.2. One difference between Nate's plans and WWT is that WWT simulated a sing
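A sequential sketch of quantum-based synchronization in general, for anyone trying to picture it: every queue processes its local events up to the end of the current quantum, then all queues meet at the boundary before the next quantum starts. The SimNode type, runQuanta() function, and the representation of events as bare ticks are all hypothetical; this is not taken from the wiki plan or from WWT's code.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <vector>

using Tick = std::uint64_t;

// Hypothetical per-thread simulation node holding only event times.
struct SimNode {
    std::priority_queue<Tick, std::vector<Tick>, std::greater<Tick>> pending;
    int processed = 0;

    // Process every local event strictly before the quantum boundary.
    void runUntil(Tick quantumEnd) {
        while (!pending.empty() && pending.top() < quantumEnd) {
            pending.pop();
            ++processed;
        }
    }
};

// Advance all nodes one quantum at a time. In a parallel run each
// runUntil() call would execute on its own thread, and the end of the
// inner loop would be a barrier (the "quantum expiration").
inline Tick runQuanta(std::vector<SimNode> &nodes, Tick quantum, Tick endTick) {
    Tick now = 0;
    while (now < endTick) {
        Tick quantumEnd = now + quantum;
        for (auto &n : nodes)
            n.runUntil(quantumEnd);
        now = quantumEnd;  // synchronization point
    }
    return now;
}
```

No node ever runs more than one quantum ahead of another, which is what lets cross-node interactions be deferred to the boundary.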

Re: [m5-dev] Parallel M5

2008-06-29 Thread Gabe Black
This is probably slightly off topic, but could you explain more specifically the synchronization event stuff you mention on the wiki page? It sounds interesting but I can't picture what you're describing. Gabe nathan binkert wrote: >> I vote for (1) until it can be shown that it matters. A sing

Re: [m5-dev] Parallel M5

2008-06-29 Thread nathan binkert
> I vote for (1) until it can be shown that it matters. A single pointer > doesn't seem like a big deal, especially since most of the things we > create and destroy frequently aren't SimObjects but other classes. Showing that it matters is pretty hard unless you actually do it. A profile won't act

Re: [m5-dev] Parallel M5

2008-06-29 Thread Steve Reinhardt
I'm fine with #1 or #3... since you have to subclass Event to override process() anyway, just moving the queue pointer to the subclass for those that need it doesn't seem so bad to me. #2 and #4 seem unnecessarily complex and/or pervasive. Steve On Sun, Jun 29, 2008 at 3:34 PM, Ali Saidi <[EMAIL

Re: [m5-dev] Parallel M5

2008-06-29 Thread Ali Saidi
I vote for (1) until it can be shown that it matters. A single pointer doesn't seem like a big deal, especially since most of the things we create and destroy frequently aren't SimObjects but other classes. Ali On Jun 29, 2008, at 12:23 PM, nathan binkert wrote: > I'm nearly done with the fi

[m5-dev] Parallel M5 Wiki Page

2008-06-29 Thread nathan binkert
I've created a wiki page with my plan for parallel M5 http://m5sim.org/wiki/index.php/Parallel_M5 Nate ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev

[m5-dev] Parallel M5

2008-06-29 Thread nathan binkert
I'm nearly done with the first step of getting parallel M5 working.
-- Add an EventQueue pointer to every SimObject and add schedule()/deschedule()/reschedule() functions to the Base SimObject to use that event queue pointer.
-- Change all calls to event scheduling to use that EventQueue pointer.
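A minimal sketch of what this first step might look like, assuming a simplified Event and EventQueue. Only the SimObject, EventQueue, and schedule()/deschedule()/reschedule() names come from the message above; the class bodies here are hypothetical and much simpler than M5's real ones.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

using Tick = std::uint64_t;

// Hypothetical minimal event: a callback plus its scheduled tick.
struct Event {
    explicit Event(std::function<void()> fn) : process(std::move(fn)) {}
    std::function<void()> process;
    Tick when = 0;
    bool scheduled = false;
};

// Hypothetical queue ordered by tick; not M5's real EventQueue.
class EventQueue {
  public:
    void schedule(Event *ev, Tick when) {
        assert(!ev->scheduled);
        ev->when = when;
        ev->scheduled = true;
        events.emplace(when, ev);
    }
    void deschedule(Event *ev) {
        assert(ev->scheduled);
        auto range = events.equal_range(ev->when);
        for (auto it = range.first; it != range.second; ++it) {
            if (it->second == ev) {
                events.erase(it);
                break;
            }
        }
        ev->scheduled = false;
    }
    void reschedule(Event *ev, Tick when) {
        if (ev->scheduled)
            deschedule(ev);
        schedule(ev, when);
    }
    // Run the earliest event; returns false if the queue is empty.
    bool serviceOne() {
        if (events.empty())
            return false;
        auto it = events.begin();
        Event *ev = it->second;
        events.erase(it);
        ev->scheduled = false;
        ev->process();
        return true;
    }
    bool empty() const { return events.empty(); }

  private:
    std::multimap<Tick, Event *> events;
};

// Each SimObject carries its own queue pointer, so objects assigned to
// different threads can schedule onto different queues.
class SimObject {
  public:
    explicit SimObject(EventQueue *q) : eventq(q) {}
    void schedule(Event *ev, Tick when) { eventq->schedule(ev, when); }
    void deschedule(Event *ev) { eventq->deschedule(ev); }
    void reschedule(Event *ev, Tick when) { eventq->reschedule(ev, when); }

  private:
    EventQueue *eventq;
};
```

With this shape, call sites go through the owning SimObject instead of a global queue, which is the second item in the list.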