Hi Michael,

Okay, thanks for the opportunity to push some code to gem5's repo. If it's okay with you, I'll push the MOESI patch now and follow up later with another patch fixing the other protocols. I'm running some experiments right now to reproduce the problem in the remaining ones.
Thanks again,
Cano.

2017-03-10 17:46 GMT+01:00 Lebeane, Michael <[email protected]>:

> Hi Cano,
>
> No, I haven’t started working on it yet. I guess you're further along
> than me, so feel free to take over if you wish :)
>
> Thanks,
> Michael
>
> From: gem5-users [mailto:[email protected]] On Behalf Of Javier Cano Cano
> Sent: Friday, March 10, 2017 10:17 AM
> To: gem5 users mailing list <[email protected]>
> Subject: Re: [gem5-users] Outstanding DMA requests and
> MOESI_CMP_directory protocol
>
> Hi again Michael,
>
> Did you push the patch? I'm asking just to avoid pushing the same patch,
> because I was working on that too.
>
> Yes, I found this problem in other protocols. I can't remember which
> ones right now (I wrote them in my lab notebook, but I forgot it). I
> will tell you on Monday.
>
> Cano.
>
> 2017-03-10 16:47 GMT+01:00 Lebeane, Michael <[email protected]>:
>
> Hi Cano,
>
> No problem, happy to help!
>
> Yeah, cur_state should really be in the TBE to allow multiple in-flight
> requests, but that queue overflow panic does not appear to be an easy
> fix. I think we should just push out this bug fix for now with a comment
> about what we observed when moving cur_state into the TBE.
>
> I can go ahead and make this patch. I also want to check if any of the
> other protocols have similar bugs when booting a full system image
> (unless you already checked this in your experiments).
>
> Thanks,
> Michael
>
> From: gem5-users [mailto:[email protected]] On Behalf Of Javier Cano Cano
> Sent: Friday, March 10, 2017 4:54 AM
> To: gem5 users mailing list <[email protected]>
> Subject: Re: [gem5-users] Outstanding DMA requests and
> MOESI_CMP_directory protocol
>
> Hi Michael,
>
> In the original file the cur_state variable isn't part of the TBE
> structure. I thought it should be, so I moved it into the TBE structure.
> My bad. That's the only additional difference; I didn't mention it in my
> previous email, sorry.
>
> I tested your latest proposal and it seems to work fine for me too. I
> ran some experiments and all of them work, so I think we have solved the
> problem. We should probably try to push these changes to gem5's repo.
> There is at least one more person reporting this problem.
>
> Anyway, thanks a lot for your emails and your time, Michael. You really
> helped me a lot, and I hope this conversation will help other users as
> well.
>
> Best.
> Cano.
>
> 2017-03-09 20:38 GMT+01:00 Lebeane, Michael <[email protected]>:
>
> Hi Cano,
>
> I downloaded the x86 binaries and was able to reproduce the transition
> error. Sorry I forgot the wakeups in my original suggestion, but it
> seems like you figured out what I was trying to do anyway :)
>
> After implementing the stalls and wakeups exactly as you did, it seems
> to work fine for me.
>
> I was able to replicate the overflow problem if I move the cur_state
> variable to the TBEs so that the protocol can support more than one
> outstanding DMA transaction. There is no backpressure being applied when
> you are doing streaming writes from the DMA controller; the directory
> just plops them in the packet queue to memory and allows the DMA
> controller to send more, which eventually triggers the assertion.
> However, if I leave cur_state alone and just fix the transition bug, it
> appears to work fine.
>
> Do you have any other modifications to the protocol besides what you
> showed in the previous email?
>
> -Michael
>
> From: gem5-users [mailto:[email protected]] On Behalf Of Javier Cano Cano
> Sent: Wednesday, March 8, 2017 4:24 AM
> To: gem5 users mailing list <[email protected]>
> Subject: Re: [gem5-users] Outstanding DMA requests and
> MOESI_CMP_directory protocol
>
> Hi Michael,
>
> Thanks a lot for your response, I appreciate it.
>
> The error is reproducible even with the binaries provided by gem5's wiki:
> http://www.m5sim.org/dist/current/x86/x86-system.tar.bz2
>
> I tried your proposal, but I'm getting this error:
>
>     hda: ide_dma_sff_timer_expiry: DMA status (0x21)
>     hda: DMA timeout error
>     hda: dma timeout error: status=0x50 { DriveReady SeekComplete }
>     hda: possibly failed opcode: 0xc8
>     hda: ide_dma_sff_timer_expiry: DMA status (0x21)
>     hda: DMA timeout error
>     hda: dma timeout error: status=0x50 { DriveReady SeekComplete }
>     hda: possibly failed opcode: 0xc8
>
> This is because the stall_and_wait() function pushes the messages into
> dmaRequestQueue_in, but the messages are never pulled out. The
> simulation boots up after a huge amount of time, but it doesn't look
> correct.
>
> I took a look at other protocol files and added these lines:
>
>     action(zz_stallAndWaitRequestQueue, "zz", desc="...") {
>       stall_and_wait(dmaRequestQueue_in, address);
>     }
>
>     action(wkad_wakeUpAllDependents, "wkad", desc="wake-up all dependents") {
>       wakeUpAllBuffers();
>     }
>
>     transition(BUSY_RD, Data, READY) {
>       t_updateTBEData;
>       d_dataCallbackFromTBE;
>       w_deallocateTBE;
>       //u_updateAckCount;
>       //o_checkForCompletion;
>       p_popResponseQueue;
>       wkad_wakeUpAllDependents;
>     }
>
>     transition(BUSY_RD, All_Acks, READY) {
>       d_dataCallbackFromTBE;
>       //u_sendExclusiveUnblockToDir;
>       w_deallocateTBE;
>       p_popTriggerQueue;
>       wkad_wakeUpAllDependents;
>     }
>
>     transition(BUSY_WR, All_Acks, READY) {
>       a_ackCallback;
>       u_sendExclusiveUnblockToDir;
>       w_deallocateTBE;
>       p_popTriggerQueue;
>       wkad_wakeUpAllDependents;
>     }
>
>     transition({BUSY_RD,BUSY_WR}, {ReadRequest,WriteRequest}) {
>       zz_stallAndWaitRequestQueue;
>     }
>
> With these changes, the messages are pulled from the queues at some
> point. However, the queues still overflow and gem5 shows this error:
>
>     panic: Packet queue system.ruby.dir_cntrl0.memory- has grown beyond 100 packets
>     Memory Usage: 1287832 KBytes
>     Program aborted at tick 5224739073500
>
> I'm missing something, maybe a transition where they should be woken up
> as well, but I can't figure it out. Any suggestions?
>
> Thanks.
> Cano.
>
> 2017-03-02 23:08 GMT+01:00 Lebeane, Michael <[email protected]>:
>
> Hi Cano,
>
> I can’t test this as I don’t have your binaries to reproduce the
> problem, but does adding these lines in MOESI_CMP_directory-dma.sm fix
> your problem?
>
>     action(zz_stallAndWaitRequestQueue, "zz", desc="...") {
>       stall_and_wait(dmaRequestQueue_in, address);
>     }
>
>     transition({BUSY_RD,BUSY_WR}, {ReadRequest,WriteRequest}) {
>       zz_stallAndWaitRequestQueue;
>     }
>
> Thanks,
> Michael
>
> From: gem5-users [mailto:[email protected]] On Behalf Of Javier Cano Cano
> Sent: Wednesday, March 1, 2017 10:23 AM
> To: [email protected]
> Subject: [gem5-users] Outstanding DMA requests and MOESI_CMP_directory
> protocol
>
> Hi everybody,
>
> Some days ago, I updated my gem5 version in order to test Garnet2.0, but
> I found that the MOESI_CMP_directory protocol wasn't working. As far as
> I know, the problem comes from the patch
> http://repo.gem5.org/gem5?cmd=changeset;node=0bf388858d1e
> This patch allows outstanding DMA requests. Some protocols have been
> updated to support this new feature, but MOESI_CMP_directory, as well as
> others, has not.
>
> I'm using the following command to build gem5:
>
>     scons ./build/X86/gem5.opt PROTOCOL=MOESI_CMP_directory RUBY=True -j30
>
> To run the simulation, I used the following command:
>
>     ./build/X86/gem5.opt configs/example/fs.py --ruby --kernel=/home/cano/gem5/system/binaries/x86_64-vmlinux-3.4.112.smp --disk-image=/home/cano/curgem5/canolab/system/disks/gentoo.img
>
> The error that gem5 shows when I try to run a simulation is:
>
>     panic: Invalid transition
>     system.ruby.dma_cntrl0 time: 9702290051 addr: 504500288 event: WriteRequest state: BUSY_WR
>     @ tick 4851145025500
>     [doTransitionWorker:build/X86/mem/protocol/DMA_Transitions.cc, line 135]
>
> The problem has something to do with the file
> MOESI_CMP_directory-dma.sm, which doesn't support the state changes
> introduced in patch 0bf388858d1e.
>
> Has anyone else had this problem?
>
> I tried to update the protocol, but with no luck: I get an error saying
> that a queue has more than 100 messages stored in it.
>
> Does anyone have the MOESI_CMP_directory files modified to support this
> new feature?
>
> In order to temporarily fix the problem, I rolled back the changes
> introduced in the mentioned patch.
>
> Thanks for your time.
> Cano.
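
For anyone reading the thread later, the stall-and-wake-up pattern discussed above can be illustrated with a small toy model. This is only a sketch in plain Python, not gem5 or SLICC code: the class ToyDmaController, its methods, and the run() helper are all hypothetical names made up for illustration. It assumes a controller that handles one request at a time, parks any request that arrives while it is busy (the rough analogue of stall_and_wait), and optionally re-queues the parked requests when the in-flight one completes (the rough analogue of wakeUpAllBuffers).

```python
from collections import deque


class ToyDmaController:
    """Toy model (NOT gem5 code) of a controller with one in-flight request."""

    def __init__(self, wake_on_completion):
        self.state = "READY"
        self.in_flight = None
        self.request_queue = deque()  # incoming requests
        self.stalled = deque()        # requests parked while the controller is busy
        self.completed = []
        self.wake_on_completion = wake_on_completion

    def enqueue(self, request):
        self.request_queue.append(request)

    def step(self):
        """Consume the head of the request queue, if there is one."""
        if not self.request_queue:
            return
        request = self.request_queue.popleft()
        if self.state == "READY":
            self.state = "BUSY"
            self.in_flight = request
        else:
            # Rough analogue of stall_and_wait: park the request
            # instead of hitting an invalid transition.
            self.stalled.append(request)

    def complete(self):
        """Finish the in-flight request and return to READY."""
        self.completed.append(self.in_flight)
        self.in_flight = None
        self.state = "READY"
        if self.wake_on_completion:
            # Rough analogue of wakeUpAllBuffers: move parked requests
            # back to the front of the queue, preserving their order.
            self.request_queue.extendleft(reversed(self.stalled))
            self.stalled.clear()


def run(wake_on_completion, requests=("r0", "r1", "r2"), rounds=10):
    """Drive the toy controller; return (completed, still_stalled)."""
    ctrl = ToyDmaController(wake_on_completion)
    for request in requests:
        ctrl.enqueue(request)
    for _ in range(rounds):
        while ctrl.request_queue:
            ctrl.step()
        if ctrl.state == "BUSY":
            ctrl.complete()
    return ctrl.completed, list(ctrl.stalled)


if __name__ == "__main__":
    print("with wake-up:   ", run(True))
    print("without wake-up:", run(False))
```

In this toy model, running without the wake-up leaves every request after the first one parked forever, which mirrors the observation in the thread that stalled messages were never pulled out again. With the wake-up enabled, all requests complete in order. The model says nothing about the separate packet-queue overflow, which the thread attributes to missing backpressure between the directory and memory.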
_______________________________________________ gem5-users mailing list [email protected] http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
