Would it make sense for me to ship the EtherSwitch patch first, since it has utility on its own, and then we can decide which of the "multi-gem5" approaches is best, or if it's some combination of both?
The only reason I never shipped it was that Steve raised an issue I didn't have a good alternative for, and I didn't have time to look into one then.

________________________________________
From: gem5-dev [gem5-dev-boun...@gem5.org] on behalf of Mohammad Alian [al...@wisc.edu]
Sent: Wednesday, June 24, 2015 12:43 PM
To: gem5 Developer List
Subject: Re: [gem5-dev] pd-gem5: simulating a parallel/distributed system on multiple physical hosts

Hi Andreas,

Thanks for the comment. I think the checkpointing support in both works is the same. Here is how checkpointing support is implemented in pd-gem5:

Whenever one of the gem5 processes encounters an m5-checkpoint pseudo instruction, it sends a “recv-ckpt” signal to the “barrier” process. The “barrier” process then sends a “take-ckpt” signal to all the simulated nodes (including the node that encountered m5-checkpoint) at the end of the current simulation quantum. On receiving the “take-ckpt” signal, the gem5 processes start dumping checkpoints. This makes each simulated node dump a checkpoint at the same simulated time point while ensuring there are no in-flight packets.

I believe this is the same as the multi-gem5 patch's approach to checkpoint support (based on the commit message of http://reviews.gem5.org/r/2865/). Also, we have tested our mechanism with several benchmarks and it works.

As Steve suggested, I'll look into Curtis's patch and try to review it as well. But as Nilay also mentioned earlier, there is some code missing in Curtis's patch. I prefer to first run multi-gem5 before starting to review it.

Thank you,
Mohammad

On Wed, Jun 24, 2015 at 7:25 AM, Andreas Hansson <andreas.hans...@arm.com> wrote:

> Hi Steve,
>
> Apologies for the confusion. We are on the same page. My point is that we
> cannot simply take a little bit of patch A and a little bit of patch B.
> This change involves a lot of code, and we need to approach this in a
> structured fashion.
> My proposal is to do it bottom up, and start by
> getting the basic support in place. Since http://reviews.gem5.org/r/2826/
> has already been on the review board for a few months, I am merely
> suggesting that it would be a good start to relate the newly posted
> patches to what is already there.
>
> Andreas
>
> On 24/06/2015 13:11, "gem5-dev on behalf of Steve Reinhardt"
> <gem5-dev-boun...@gem5.org on behalf of ste...@gmail.com> wrote:
>
> >Hi Andreas,
> >
> >I'm a little confused by your email---you say you're fundamentally opposed
> >to looking at both patches and picking the best features, then you point
> >out that the patches Curtis posted have the feature of better checkpointing
> >support so we should pick that :).
> >
> >Obviously we can't just pick patch A from Mohammad's set and patch B from
> >Curtis's set and expect them to work together, but I think that having both
> >sets of patches available and comparing and contrasting the two
> >implementations should enable us to get to a single implementation that's
> >the best of both. Someone will have to make the effort of integrating the
> >better ideas from one set into the other set to create a new unified set of
> >patches (or maybe we commit one set and then integrate the best of the
> >other set as patches on top of that), but the first step is to identify
> >what "the best of both" is. Having Mohammad look at Curtis's patches, and
> >Curtis (or someone else from ARM) closely examine Mohammad's patches would
> >be a great start. I intend to review them both, though unfortunately my
> >time has been scarce lately---I'm hoping to squeeze that in later this week.
> >
> >Once we've had a few people look at both, we can discuss the pros and cons
> >of each, then discuss the strategy for getting the best features in. So
> >far I've heard that Mohammad's patches have a better network model but the
> >ARM patches have better checkpointing support; that seems like a good
> >start.
> >
> >Steve
> >
> >On Wed, Jun 24, 2015 at 12:26 AM Andreas Hansson <andreas.hans...@arm.com>
> >wrote:
> >
> >> Hi all,
> >>
> >> Great work. However, I fundamentally do not believe in the approach of
> >> ‘letting reviewers pick the best features’. There is no way we would ever
> >> get something working out of it. We need to get _one_ working solution
> >> here, and figure out how to best get there. I would propose to do it
> >> bottom up, starting with the basic multi-simulator instance support,
> >> checkpointing support, and then move on to the network between the
> >> simulator instances.
> >>
> >> Thus, I propose we go with the low-level plumbing and checkpoint support
> >> from what Curtis has posted. I believe proper checkpointing support to be
> >> the most challenging, and from what I can tell this is far more limited in
> >> what you just posted, Mohammad. Could you perhaps review Curtis's patches
> >> based on your insights, so we can try to get these patches in shape and
> >> committed asap?
> >>
> >> Once we have the baseline functionality in place, then we can start
> >> looking at the more elaborate network models.
> >>
> >> Does this sound reasonable?
> >>
> >> Thanks,
> >>
> >> Andreas
> >>
> >> On 24/06/2015 05:05, "gem5-dev on behalf of Mohammad Alian"
> >> <gem5-dev-boun...@gem5.org on behalf of al...@wisc.edu> wrote:
> >>
> >> >Hello All,
> >> >
> >> >I have submitted a chain of patches which enables gem5 to simulate a
> >> >cluster on multiple physical hosts:
> >> >
> >> >http://reviews.gem5.org/r/2909/
> >> >http://reviews.gem5.org/r/2910/
> >> >http://reviews.gem5.org/r/2912/
> >> >http://reviews.gem5.org/r/2913/
> >> >http://reviews.gem5.org/r/2914/
> >> >
> >> >and a patch that contains run scripts for a simple experiment:
> >> >http://reviews.gem5.org/r/2915/
> >> >
> >> >We have run several benchmarks using this infrastructure, including NAS
> >> >parallel benchmarks (MPI) and DCBench-hadoop
> >> >(http://prof.ict.ac.cn/DCBench/),
> >> >and would be happy to share scripts/diskimages.
> >> >
> >> >We call this *pd-gem5*. *pd-gem5* functionality is more or less the same
> >> >as Curtis's patch for *multi-gem5*. However, I feel the *pd-gem5* network
> >> >model is more thorough; it also enables modeling different network
> >> >topologies. Having both sets of changes together lets reviewers pick the
> >> >best features from both works.
> >> >
> >> >Thank you,
> >> >Mohammad Alian
> >> >_______________________________________________
> >> >gem5-dev mailing list
> >> >gem5-dev@gem5.org
> >> >http://m5sim.org/mailman/listinfo/gem5-dev
> >>
> >>
> >> -- IMPORTANT NOTICE: The contents of this email and any attachments are
> >> confidential and may also be privileged. If you are not the intended
> >> recipient, please notify the sender immediately and do not disclose the
> >> contents to any other person, use it for any purpose, or store or copy the
> >> information in any medium. Thank you.
> >>
> >> ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ,
> >> Registered in England & Wales, Company No: 2557590
> >> ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ,
> >> Registered in England & Wales, Company No: 2548782
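
As an illustration of the pd-gem5 checkpoint handshake Mohammad describes in this thread, here is a minimal Python sketch. It is not code from either patch set: the `Node`, `Barrier`, and `simulate` names, the integer tick loop, and the quantum length are all assumptions made for illustration. It only shows the ordering: a "recv-ckpt" request is remembered mid-quantum, and "take-ckpt" is broadcast to every node at the next quantum boundary, so all nodes checkpoint at the same simulated tick.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for one gem5 process (one simulated node)."""
    name: str
    checkpoints: list = field(default_factory=list)

    def take_ckpt(self, tick):
        # In the real system this would dump a checkpoint; here we only
        # record the simulated tick at which it would happen.
        self.checkpoints.append(tick)


class Barrier:
    """Hypothetical stand-in for the "barrier" process."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.ckpt_pending = False

    def recv_ckpt(self):
        # Some node executed m5-checkpoint; defer until the quantum ends.
        self.ckpt_pending = True

    def end_of_quantum(self, tick):
        # All nodes are synchronized at the quantum boundary. If a request
        # is pending, tell every node (including the requester) to checkpoint.
        if self.ckpt_pending:
            for node in self.nodes:
                node.take_ckpt(tick)
            self.ckpt_pending = False


def simulate(num_nodes, quantum, total_ticks, ckpt_request_ticks):
    """Run a toy tick loop; ckpt_request_ticks is the set of ticks at which
    some node executes the m5-checkpoint pseudo instruction."""
    nodes = [Node(f"node{i}") for i in range(num_nodes)]
    barrier = Barrier(nodes)
    for tick in range(1, total_ticks + 1):
        if tick in ckpt_request_ticks:
            barrier.recv_ckpt()
        if tick % quantum == 0:
            barrier.end_of_quantum(tick)
    return nodes


if __name__ == "__main__":
    for node in simulate(num_nodes=3, quantum=10, total_ticks=30,
                         ckpt_request_ticks={13}):
        print(node.name, node.checkpoints)
```

In this toy model a request at tick 13 with a quantum of 10 is served at tick 20 on every node. The sketch does not model packets; in the description above, the no-in-flight-packets property comes from checkpointing only at the quantum boundary.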