Hi Cano,

I noticed your patch for this issue is ready to go but not yet committed:
https://gem5-review.googlesource.com/#/c/2380/

Just wanted to check up on the status.  I recall there were some users waiting 
for the fix.

Also, you mentioned some follow-up patches for other protocols.  Have you had 
any luck with those?

Thanks!
Michael

From: gem5-users [mailto:[email protected]] On Behalf Of Javier Cano 
Cano
Sent: Friday, March 10, 2017 10:58 AM
To: gem5 users mailing list <[email protected]>
Subject: Re: [gem5-users] Outstanding DMA requests and MOESI_CMP_directory 
protocol

Hi Michael,
Okay, thanks for the opportunity to push some code to gem5's repo. If it's okay 
with you, I'm going to push the MOESI patch now and another one later to fix 
the other protocols. I'm running some experiments right now to reproduce the 
problem in the remaining ones.
Thanks again,
Cano.

2017-03-10 17:46 GMT+01:00 Lebeane, Michael <[email protected]>:
Hi Cano,

No, I haven’t started working on it yet.  I guess you're further along than me, 
so feel free to take over if you wish ☺

Thanks,
Michael

From: gem5-users [mailto:[email protected]] On Behalf Of Javier Cano Cano
Sent: Friday, March 10, 2017 10:17 AM
To: gem5 users mailing list <[email protected]>
Subject: Re: [gem5-users] Outstanding DMA requests and MOESI_CMP_directory 
protocol

Hi again Michael,
Did you push the patch? I'm asking just to avoid pushing the same patch, because 
I was working on that too.
Yes, I found this problem in other protocols. I can't remember which ones right 
now (I wrote them in my lab notebook, but I don't have it with me). I will tell 
you on Monday.
Cano.

2017-03-10 16:47 GMT+01:00 Lebeane, Michael <[email protected]>:
Hi Cano,

No problem, happy to help!

Yeah, cur_state should really be in the TBE to allow multiple in-flight 
requests, but that queue overflow panic does not appear to be an easy fix.  I 
think we should just push out this bug fix for now with a comment about what we 
observed when moving cur_state into the TBE.

I can go ahead and make this patch.  I also want to check if any of the other 
protocols have similar bugs when booting a full system image (unless you 
already checked this in your experiments).

Thanks,
Michael

From: gem5-users [mailto:[email protected]] On Behalf Of Javier Cano Cano
Sent: Friday, March 10, 2017 4:54 AM
To: gem5 users mailing list <[email protected]>
Subject: Re: [gem5-users] Outstanding DMA requests and MOESI_CMP_directory 
protocol

Hi Michael,
In the original file the cur_state variable isn't part of the TBE structure. I 
thought it should be, so I moved it into the TBE structure. My bad. That's the 
only additional difference; I didn't mention it in my previous email, sorry.

I tested your latest proposal and it seems to work fine for me too. I ran some 
experiments and all of them work, so I think that we solved the problem. 
We should probably try to push these changes to gem5's repo. There is at least 
one more user reporting this problem.

Anyway, thanks a lot for your emails and your time, Michael. They really helped 
me a lot, and I hope that this conversation will help other users as well.

Best.
Cano.

2017-03-09 20:38 GMT+01:00 Lebeane, Michael <[email protected]>:
Hi Cano,

I downloaded the x86 binaries and was able to reproduce the transition error.  
Sorry I forgot the wakeups in my original suggestion, but it seems like you 
figured out what I was trying to do anyway ☺.

After implementing the stalls and wakeups exactly as you did, it seems to work 
fine for me.

I was able to replicate the overflow problem if I move the cur_state variable 
to the TBEs so that the protocol can support more than one outstanding DMA 
transaction.  There is no backpressure being applied when you are doing 
streaming writes from the DMA controller; the directory just plops them in the 
packet queue to memory and allows the DMA controller to send more, which 
eventually triggers the assertion.  However, if I leave cur_state alone and 
just fix the transition bug, it appears to work fine.
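
The missing backpressure described above can be illustrated with a toy Python 
model. This is only a sketch and not gem5 code: the function name, the 
one-write-per-tick issue rate, and the drain period are all assumptions made 
for illustration. Only the 100-packet limit comes from the panic message 
reported in this thread.

```python
from collections import deque

QUEUE_LIMIT = 100  # the panic in this thread fires beyond 100 packets


def simulate(max_outstanding, n_writes=1000, drain_period=4):
    """Toy model: return the peak queue depth, or raise on overflow.

    max_outstanding=None means the requester is never throttled.
    All rates here are assumed values, not measurements from gem5.
    """
    queue = deque()      # stands in for the packet queue to memory
    sent = 0
    outstanding = 0
    peak = 0
    tick = 0
    while sent < n_writes or queue:
        # The requester issues one write per tick while under its limit.
        unthrottled = max_outstanding is None
        if sent < n_writes and (unthrottled or outstanding < max_outstanding):
            queue.append(sent)
            sent += 1
            outstanding += 1
        peak = max(peak, len(queue))
        if len(queue) > QUEUE_LIMIT:
            raise OverflowError("queue has grown beyond %d packets" % QUEUE_LIMIT)
        # Memory retires one packet every drain_period ticks.
        if tick % drain_period == drain_period - 1 and queue:
            queue.popleft()
            outstanding -= 1
        tick += 1
    return peak
```

With a single outstanding write the queue never holds more than one packet, 
while an unthrottled stream of writes outruns the slower drain rate and 
overflows.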

Do you have any other modifications to the protocol besides what you showed in 
your previous email?

-Michael


From: gem5-users [mailto:[email protected]] On Behalf Of Javier Cano Cano
Sent: Wednesday, March 8, 2017 4:24 AM
To: gem5 users mailing list <[email protected]>
Subject: Re: [gem5-users] Outstanding DMA requests and MOESI_CMP_directory 
protocol

Hi Michael,

Thanks a lot for your response, I appreciate it.

The error is reproducible even with the binaries provided by gem5's wiki: 
http://www.m5sim.org/dist/current/x86/x86-system.tar.bz2

I tried your proposal, but I am getting this error:

hda: ide_dma_sff_timer_expiry: DMA status (0x21)
hda: DMA timeout error
hda: dma timeout error: status=0x50 { DriveReady SeekComplete }
hda: possibly failed opcode: 0xc8
hda: ide_dma_sff_timer_expiry: DMA status (0x21)
hda: DMA timeout error
hda: dma timeout error: status=0x50 { DriveReady SeekComplete }
hda: possibly failed opcode: 0xc8

This is because the stall_and_wait() function stalls the messages on 
dmaRequestQueue_in, but the messages are never pulled back out. The simulation 
boots up after a huge amount of time, but it doesn't look correct.
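
The stall-without-wakeup behaviour described above can be sketched with a toy 
Python model. This is not gem5 or SLICC code: the class, its fields, and the 
run() helper are hypothetical names used only to show why parked messages are 
never processed unless something wakes them up.

```python
from collections import deque


class ToyController:
    """Toy controller that handles one request at a time (not gem5 code)."""

    def __init__(self, wake_on_completion):
        self.busy = False
        self.current = None
        self.request_queue = deque()  # stands in for dmaRequestQueue_in
        self.stalled = deque()        # messages parked by the stall action
        self.completed = []
        self.wake_on_completion = wake_on_completion

    def step(self):
        """Handle the message at the head of the request queue, if any."""
        if not self.request_queue:
            return
        msg = self.request_queue.popleft()
        if self.busy:
            self.stalled.append(msg)  # park it instead of failing
        else:
            self.busy = True
            self.current = msg

    def complete(self):
        """Finish the in-flight request and optionally wake parked messages."""
        self.completed.append(self.current)
        self.busy = False
        if self.wake_on_completion:
            # Put parked messages back at the front, keeping their order.
            self.request_queue.extendleft(reversed(self.stalled))
            self.stalled.clear()


def run(wake_on_completion, n_requests=3):
    """Feed n_requests into the controller; return (completed, still stalled)."""
    ctrl = ToyController(wake_on_completion)
    ctrl.request_queue.extend(range(n_requests))
    for _ in range(20):
        ctrl.step()
        if ctrl.busy and not ctrl.request_queue:
            ctrl.complete()
    return ctrl.completed, list(ctrl.stalled)
```

Without the wakeup only the first request ever completes and the rest stay 
parked, which is consistent with the DMA timeouts in the log above. With the 
wakeup every request is eventually processed.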

I took a look at other protocol files and added these lines:

  action(zz_stallAndWaitRequestQueue, "zz", desc="...") {
    stall_and_wait(dmaRequestQueue_in, address);
  }

  action(wkad_wakeUpAllDependents, "wkad", desc="wake-up all dependents") {
    wakeUpAllBuffers();
  }

  transition(BUSY_RD, Data, READY) {
    t_updateTBEData;
    d_dataCallbackFromTBE;
    w_deallocateTBE;
    //u_updateAckCount;
    //o_checkForCompletion;
    p_popResponseQueue;
    wkad_wakeUpAllDependents;
  }

  transition(BUSY_RD, All_Acks, READY) {
    d_dataCallbackFromTBE;
    //u_sendExclusiveUnblockToDir;
    w_deallocateTBE;
    p_popTriggerQueue;
    wkad_wakeUpAllDependents;
  }

  transition(BUSY_WR, All_Acks, READY) {
    a_ackCallback;
    u_sendExclusiveUnblockToDir;
    w_deallocateTBE;
    p_popTriggerQueue;
    wkad_wakeUpAllDependents;
  }

  transition({BUSY_RD,BUSY_WR}, {ReadRequest,WriteRequest}) {
    zz_stallAndWaitRequestQueue;
  }
With these changes, the messages are pulled from the queues at some point. 
However, the queues still overflow and gem5 shows this error:

panic: Packet queue system.ruby.dir_cntrl0.memory- has grown beyond 100 packets
Memory Usage: 1287832 KBytes
Program aborted at tick 5224739073500

I'm missing something, maybe a transition where the stalled messages should be 
woken up as well, but I can't figure it out.
Any suggestions?

Thanks.
Cano.


2017-03-02 23:08 GMT+01:00 Lebeane, Michael <[email protected]>:
Hi Cano,

I can’t test this as I don’t have your binaries to reproduce the problem, but 
does adding these lines in MOESI_CMP_directory-dma.sm fix your problem?

action(zz_stallAndWaitRequestQueue, "zz", desc="...") {
    stall_and_wait(dmaRequestQueue_in, address);
}

transition({BUSY_RD,BUSY_WR}, {ReadRequest,WriteRequest}) {
    zz_stallAndWaitRequestQueue;
}

Thanks,
Michael

From: gem5-users [mailto:[email protected]] On Behalf Of Javier Cano Cano
Sent: Wednesday, March 1, 2017 10:23 AM
To: [email protected]
Subject: [gem5-users] Outstanding DMA requests and MOESI_CMP_directory protocol

Hi everybody,
Some days ago, I updated my gem5 version in order to test Garnet2.0, but I 
found that the MOESI_CMP_directory protocol wasn't working. As far as I know, 
the problem comes from this patch, which allows outstanding DMA requests:
http://repo.gem5.org/gem5?cmd=changeset;node=0bf388858d1e
Some protocols have been updated to support this new feature, but 
MOESI_CMP_directory, as well as others, has not.
I'm using the following command to build gem5:
scons ./build/X86/gem5.opt PROTOCOL=MOESI_CMP_directory RUBY=True -j30
To run the simulation, I used the following command:

./build/X86/gem5.opt configs/example/fs.py --ruby \
    --kernel=/home/cano/gem5/system/binaries/x86_64-vmlinux-3.4.112.smp \
    --disk-image=/home/cano/curgem5/canolab/system/disks/gentoo.img


The error that gem5 shows when I try to run a simulation is:

panic: Invalid transition
system.ruby.dma_cntrl0 time: 9702290051 addr: 504500288 event: WriteRequest state: BUSY_WR
 @ tick 4851145025500
[doTransitionWorker:build/X86/mem/protocol/DMA_Transitions.cc, line 135]


The problem has something to do with the file MOESI_CMP_directory-dma.sm, which 
doesn't support the state changes introduced by patch 0bf388858d1e.

Has anyone else had this problem?
I tried to update the protocol, but had no luck: I get an error saying that the 
queues have more than 100 messages stored in them.

Does anyone have the MOESI_CMP_directory files modified to support this new 
feature?

To temporarily work around the problem, I rolled back the changes introduced by 
the mentioned patch.
Thanks for your time.
Cano.

_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users

