> On Sept. 9, 2016, 2:53 p.m., Gabor Dozsa wrote:
> > I like the idea of starting the sync on a pseudo op very much. I cannot see 
> > the usefulness of stopping the sync after it has started, though. Do you 
> > have a use case in mind?
> > 
> > I have a few comments below (mainly minor nits). 
> > 
> > However, I think there is also a basic issue with the CPU suspend approach: 
> > a CPU can wake up whenever there is an interrupt or even a snoop request. 
> > That should be taken care of somehow, I assume.
> 
> Michael LeBeane wrote:
>     The use case for switching off would be if you have multiple regions of 
> interest in the code and no need to sync in between.  You could also just run 
> the simulation a bunch of times and move the start sync point, but I think 
> it’s much cleaner to run the app once and toggle the sync.
>     
>     I suppose a spurious wakeup is possible; I had not really thought about 
> it.  I haven't seen CPUs wake up spuriously in practice, but I don't really 
> have any devices that would generate an interrupt/snoop when all the cores 
> are asleep.
>     
>     One alternative way to do this would be to not suspend any of the cores 
> and modify the sync to support all the devices being on different ticks when 
> sync starts.  We would need to remember what the start time was for all the 
> participating nodes and use that time to enforce synchronization instead of 
> assuming everyone is on the same tick.  I avoided doing this since suspending 
> the CPUs until we reach the max of all nodes' ticks was easier.
> 
> Michael LeBeane wrote:
>     Before I address your specific review comments, I think it would be good 
> to resolve the high-level concern about suspending CPUs.  Do you have any 
> thoughts about the alternative approach I suggested?  I'm not a huge fan of 
> it since having all the nodes on different ticks makes correlating traces 
> between nodes very difficult.  Did you have a better approach in mind?

I agree that the alternative approach would make it difficult to correlate 
events between nodes. I would suggest going ahead with your current CPU 
suspend/wakeup approach for now. Would it be possible to issue a warning in 
case of a spurious wakeup event? If something goes wrong, the warning could 
give a hint.


- Gabor


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3595/#review8713
-----------------------------------------------------------


On Aug. 4, 2016, 4:51 p.m., Michael LeBeane wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/3595/
> -----------------------------------------------------------
> 
> (Updated Aug. 4, 2016, 4:51 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> -------
> 
> Changeset 11561:1414a40eb1e2
> ---------------------------
> dev: Add m5 op to toggle synchronization for dist-gem5.
> This patch adds the ability for an application to request dist-gem5 to begin/
> end synchronization using an m5 op.  When toggling on sync, all nodes agree on
> the next sync point based on the maximum of all nodes' ticks.  CPUs are
> suspended until the sync point to avoid sending network messages until sync
> has been enabled.  Toggling off sync acts like a global execution barrier, where
> all CPUs are disabled until every node reaches the toggle off point.  This
> avoids tricky situations such as one node hitting a toggle off followed by a
> toggle on before the other nodes hit the first toggle off.
> 
> 
> Diffs
> -----
> 
>   configs/common/Options.py 91f58918a76abf1a1dedcaa70a9b95789da7b88c 
>   src/arch/x86/isa/decoder/two_byte_opcodes.isa 
> 91f58918a76abf1a1dedcaa70a9b95789da7b88c 
>   src/dev/net/Ethernet.py 91f58918a76abf1a1dedcaa70a9b95789da7b88c 
>   src/dev/net/dist_etherlink.cc 91f58918a76abf1a1dedcaa70a9b95789da7b88c 
>   src/dev/net/dist_iface.hh 91f58918a76abf1a1dedcaa70a9b95789da7b88c 
>   src/dev/net/dist_iface.cc 91f58918a76abf1a1dedcaa70a9b95789da7b88c 
>   src/dev/net/dist_packet.hh 91f58918a76abf1a1dedcaa70a9b95789da7b88c 
>   src/dev/net/tcp_iface.hh 91f58918a76abf1a1dedcaa70a9b95789da7b88c 
>   src/dev/net/tcp_iface.cc 91f58918a76abf1a1dedcaa70a9b95789da7b88c 
>   src/sim/pseudo_inst.hh 91f58918a76abf1a1dedcaa70a9b95789da7b88c 
>   src/sim/pseudo_inst.cc 91f58918a76abf1a1dedcaa70a9b95789da7b88c 
>   util/m5/m5op.h 91f58918a76abf1a1dedcaa70a9b95789da7b88c 
>   util/m5/m5op_x86.S 91f58918a76abf1a1dedcaa70a9b95789da7b88c 
>   util/m5/m5ops.h 91f58918a76abf1a1dedcaa70a9b95789da7b88c 
> 
> Diff: http://reviews.gem5.org/r/3595/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Michael LeBeane
> 
>

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev