Hi Malek,

Hmm... I have never seen that type of error before. As you mentioned, I don't think any of my recent patches changed how DMA is executed for ALPHA_FS.
How long does it take for you to encounter the error? It would be great if you could tell me how to reproduce it. I would like to look at this in more detail and get a protocol trace of what is going on.

Thanks,

Brad

> -----Original Message-----
> From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
> On Behalf Of Malek Musleh
> Sent: Thursday, February 10, 2011 5:05 AM
> To: M5 Developer List
> Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets
>
> Hi Brad,
>
> I tested your latest changeset, and it seems that it 'solves' the
> handleResponse error I was getting when running 3 or more cores, but the
> dma_expiry error is still there.
>
> As a result, the error is now consistent, no matter how many cores I
> run with:
>
> For more information see: http://www.m5sim.org/warn/3e0eccba
> panic: Inconsistent DMA transfer state: dmaState = 2 devState = 1 @ cycle 62411238889001
> [doDmaTransfer:build/ALPHA_FS_MOESI_CMP_directory/dev/ide_disk.cc, line 323]
> Memory Usage: 382600 KBytes
>
> ------------------------- M5 Terminal -------------------
> hda: max request size: 128KiB
> hda: 101808 sectors (52 MB), CHS=101/16/63
> hda:<4>hda: dma_timer_expiry: dma status == 0x65
> hda: DMA interrupt recovery
> hda: lost interrupt
> unknown partition table
> hdb: max request size: 128KiB
> hdb: 4177920 sectors (2139 MB), CHS=4144/16/63
> hdb:<4>hdb: dma_timer_expiry: dma status == 0x65
> hdb: DMA interrupt recovery
> hdb: lost interrupt
>
> The panic error seems to suggest an inconsistent DMA state, so I tried
> reverting to an older changeset (before the DMA changes were pushed out)
> such as 7936, and even 7930, but no such luck.
>
> The changeset that I know works from last week or so is changeset 7842.
> The changeset summaries between 7842 and 7930 seem to indicate a lot of
> changes 'unrelated' to the DMA, such as O3, InOrderCPU, and x86 changes.
> That being said, I did not do a diff on those intermediate changesets
> to verify whether a related file was slightly modified in the process.
>
> I might be able to spend some more time trying changesets until I narrow
> down which one it's coming from, but maybe the new panic message might
> give you some indication of how to fix it?
>
> (I think the panic message appeared now and not before because I let the
> simulation terminate itself when running overnight, as opposed to me
> killing it once I saw the dma_expiry message on the M5 Terminal.)
>
> Malek
>
> On Wed, Feb 9, 2011 at 7:00 PM, Beckmann, Brad <brad.beckm...@amd.com> wrote:
> > Hi Malek,
> >
> > Yes, thanks for letting us know. I'm pretty sure I know what the
> > problem is. Previously, if an SC operation failed, the RubyPort would
> > convert the request packet to a response packet, bypass writing the
> > functional view of memory, and pass it back up to the CPU. In my most
> > recent patches I generalized the mechanism that converts request
> > packets to response packets and avoids writing functional memory.
> > However, I forgot to remove the duplicate request to response
> > conversion for failed SC requests. Therefore, I bet you are
> > encountering that assertion error on that duplicate call. It should be
> > a simple one line change that fixes your problem. I'll push it
> > momentarily and it would be great if you could confirm that my change
> > does indeed fix your problem.
> >
> > Brad
> >
> >> -----Original Message-----
> >> From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
> >> On Behalf Of Gabe Black
> >> Sent: Wednesday, February 09, 2011 3:54 PM
> >> To: M5 Developer List
> >> Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets
> >>
> >> Thanks for letting us know. If it wouldn't be too much trouble, could
> >> you please try some other changesets near the one that isn't working
> >> and try to determine which one specifically broke things? A bunch of
> >> changes went in recently so it would be helpful to narrow things
> >> down. I'm not very involved with Ruby right now personally, but I
> >> assume that would be useful information for the people that are.
> >>
> >> Gabe
> >>
> >> On 02/09/11 14:51, Malek Musleh wrote:
> >> > Hello,
> >> >
> >> > I first started using the Ruby Model in M5 about a week or so ago,
> >> > and was able to boot in FS mode (up to 64 cores once applying the
> >> > BigTsunami patches).
> >> >
> >> > In order to keep up with the changes in the Ruby code, I have
> >> > started fetching recent updates from the devrepo.
> >> >
> >> > However, after fetching the updates to the recent changesets (from
> >> > the last 2 days) Ruby FS does not boot. I tried both
> >> > MESI_CMP_directory and MOESI_CMP_directory.
> >> >
> >> > If running 2 cores or less I get this at the terminal screen after
> >> > letting it run for some time:
> >> >
> >> > hda: M5 IDE Disk, ATA DISK drive
> >> > hdb: M5 IDE Disk, ATA DISK drive
> >> > hda: UDMA/33 mode selected
> >> > hdb: UDMA/33 mode selected
> >> > ide0 at 0x8410-0x8417,0x8422 on irq 31
> >> > ide1 at 0x8418-0x841f,0x8426 on irq 31
> >> > ide_generic: please use "probe_mask=0x3f" module parameter for probing all legacy ISA IDE ports
> >> > ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
> >> > ide3 at 0x170-0x177,0x376 on irq 15
> >> > hda: max request size: 128KiB
> >> > hda: 101808 sectors (52 MB), CHS=101/16/63
> >> > hda:<4>hda: dma_timer_expiry: dma status == 0x65   <--- problem
> >> >
> >> > When running 3 or more cores, I get the following assertion failure:
> >> >
> >> > info: kernel located at: /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux
> >> > Listening for system connection on port 3456
> >> > 0: system.tsunami.io.rtc: Real-time clock set to Thu Jan 1 00:00:00 2009
> >> > 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
> >> > 0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001
> >> > 0: system.remote_gdb.listener: listening for remote gdb #2 on port 7002
> >> > 0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003
> >> > **** REAL SIMULATION ****
> >> > info: Entering event queue @ 0. Starting simulation...
> >> > info: Launching CPU 1 @ 834794000
> >> > info: Launching CPU 2 @ 845489000
> >> > info: Launching CPU 3 @ 856101000
> >> > m5.opt: build/ALPHA_FS_MESI_CMP_directory/mem/packet.hh:590: void Packet::makeResponse(): Assertion `needsResponse()' failed.
> >> > Program aborted at cycle 977160000
> >> > Aborted
> >> >
> >> > The top of the tree is this last changeset:
> >> >
> >> > changeset:   7939:215c8be67063
> >> > tag:         tip
> >> > user:        Brad Beckmann <brad.beckm...@amd.com>
> >> > date:        Tue Feb 08 18:07:54 2011 -0800
> >> > summary:     regess: protocol regression tester updates
> >> >
> >> > I am not sure whether those whom it concerns are aware of it, or
> >> > whether an updated changeset is already in the works for this, but
> >> > I figured I would bring it to your attention.
> >> >
> >> > Malek
> >> > _______________________________________________
> >> > m5-dev mailing list
> >> > m5-dev@m5sim.org
> >> > http://m5sim.org/mailman/listinfo/m5-dev
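
For anyone following the thread: the narrowing-down that Gabe asks for and Malek offers to do is a plain binary search between the last changeset reported to work (7842) and the earliest one reported to fail (7930). Below is a minimal sketch of that search in Python. The function name first_bad_changeset and the is_bad callback are hypothetical, standing in for "build that revision, boot Ruby FS, and look for dma_timer_expiry on the M5 terminal". It also assumes that once the bug appears it stays in every later revision, and that local revision numbers can be treated as a linear order, which may not hold exactly for this repository.

```python
from typing import Callable


def first_bad_changeset(good: int, bad: int,
                        is_bad: Callable[[int], bool]) -> int:
    """Return the lowest revision in (good, bad] for which is_bad() is true.

    Assumptions (not taken from the thread): `good` passes, `bad` fails,
    and every revision after the first failing one also fails.
    """
    if good >= bad:
        raise ValueError("good revision must be older than bad revision")
    lo, hi = good, bad  # invariant: lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid  # failure already present at mid: look earlier
        else:
            lo = mid  # mid still works: look later
    return hi


if __name__ == "__main__":
    # 7900 is an invented culprit used only to exercise the search;
    # the real offending changeset is unknown at this point in the thread.
    print(first_bad_changeset(7842, 7930, lambda rev: rev >= 7900))
```

With 88 revisions between 7842 and 7930, this needs at most seven boot attempts. Mercurial's built-in hg bisect command automates the same search and follows the real changeset graph instead of revision numbers, so it is probably the better tool in practice.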