Hi Malek,

Hmm...I have never seen that type of error before.  As you mentioned, I don't 
think any of my recent patches changed how DMA is executed for ALPHA_FS.

How long does it take for you to encounter the error?  It would be great if you 
could tell me how I can reproduce the error.  I would like to look at this in 
more detail and get a protocol trace of what is going on.

Thanks,

Brad


> -----Original Message-----
> From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
> On Behalf Of Malek Musleh
> Sent: Thursday, February 10, 2011 5:05 AM
> To: M5 Developer List
> Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets
> 
> Hi Brad,
> 
> I tested your latest changeset, and it seems that it 'solves' the
> handleResponse error I was getting when running 3 or more cores, but the
> dma_expiry error is still there.
> 
> As a result, the error is now consistent, no matter how many cores I try
> to run with:
> 
> For more information see: http://www.m5sim.org/warn/3e0eccba
> panic: Inconsistent DMA transfer state: dmaState = 2 devState = 1  @ cycle
> 62411238889001
> [doDmaTransfer:build/ALPHA_FS_MOESI_CMP_directory/dev/ide_disk.cc,
> line 323] Memory Usage: 382600 KBytes
> 
> ------------------------- M5 Terminal -------------------
> hda: max request size: 128KiB
> hda: 101808 sectors (52 MB), CHS=101/16/63
>  hda:<4>hda: dma_timer_expiry: dma status == 0x65
> hda: DMA interrupt recovery
> hda: lost interrupt
>  unknown partition table
> hdb: max request size: 128KiB
> hdb: 4177920 sectors (2139 MB), CHS=4144/16/63
>  hdb:<4>hdb: dma_timer_expiry: dma status == 0x65
> hdb: DMA interrupt recovery
> hdb: lost interrupt
> 
> The panic error seems to suggest an inconsistent DMA state, so I tried
> reverting to an older changeset (before DMA changes were pushed out)
> such as 7936, and even 7930 but no such luck.
> 
> The changeset that I know works from last week or so is changeset 7842.
> The changeset summaries between 7842 and 7930 seem to indicate a lot of
> changes 'unrelated' to the DMA, such as O3, InOrderCPU, and x86 changes.
> That being said, I did not do a diff on those intermediate changesets to
> check whether a related file was slightly modified in the process.
> 
> I might be able to spend some more time trying changesets until I narrow
> down which one it's coming from, but maybe the new panic message might give
> you some indication on how to fix it?
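
The narrowing-down described above can be done as a binary search over the
revision numbers instead of trying changesets one at a time. The following is
only a minimal sketch: `first_bad_changeset` and `is_good` are hypothetical
names, the check itself (build, boot Ruby FS, look for dma_timer_expiry) is
left to the caller, and it assumes local revision numbers are linear with a
single good-to-bad transition between 7842 and 7930.

```python
def first_bad_changeset(good, bad, is_good):
    """Return the earliest revision in (good, bad] that fails the check.

    good    -- revision number known to work (e.g. 7842)
    bad     -- revision number known to fail (e.g. 7930)
    is_good -- hypothetical callback: takes a revision number and returns
               True if that revision boots correctly
    Assumes every revision before the culprit is good and every revision
    from the culprit onward is bad.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_good(mid):
            good = mid   # culprit is later than mid
        else:
            bad = mid    # culprit is mid or earlier
    return bad


if __name__ == "__main__":
    # Stand-in check for illustration only: pretends the regression
    # arrived in revision 7900. That number is made up, not a finding.
    print(first_bad_changeset(7842, 7930, lambda rev: rev < 7900))
```

Each call to `is_good` stands for one full build-and-boot run, so halving the
range each time keeps the number of overnight runs small.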
> 
> (I think the panic message appeared now and not before because I let the
> simulation terminate itself when running overnight as opposed to me killing it
> once I saw the dma_expiry message on the M5 Terminal).
> 
> Malek
> 
> On Wed, Feb 9, 2011 at 7:00 PM, Beckmann, Brad
> <brad.beckm...@amd.com> wrote:
> > Hi Malek,
> >
> > Yes, thanks for letting us know.  I'm pretty sure I know what the problem
> > is.  Previously, if an SC operation failed, the RubyPort would convert the
> > request packet to a response packet, bypass writing the functional view
> > of memory, and pass it back up to the CPU.  In my most recent patches I
> > generalized the mechanism that converts request packets to response
> > packets and avoids writing functional memory.  However, I forgot to
> > remove the duplicate request-to-response conversion for failed SC
> > requests.  Therefore, I bet you are encountering that assertion error on
> > that duplicate call.  It should be a simple one-line change that fixes
> > your problem.  I'll push it momentarily, and it would be great if you
> > could confirm that my change does indeed fix your problem.
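
To illustrate why a duplicate conversion would trip the `needsResponse()`
assertion, here is a toy model. It is not the M5 Packet code: the class, its
fields, and the method names are simplified stand-ins, and it assumes only
that converting a request to a response leaves the packet in a state that no
longer needs a response.

```python
class Packet:
    """Toy stand-in for a memory packet (not the real M5 class)."""

    def __init__(self):
        self.is_response = False  # starts life as a request

    def needs_response(self):
        # Assumption: only a request still needs a response.
        return not self.is_response

    def make_response(self):
        # Mirrors the shape of the failing assertion in the log above.
        assert self.needs_response(), "needsResponse() failed"
        self.is_response = True


if __name__ == "__main__":
    pkt = Packet()
    pkt.make_response()      # first conversion: fine
    try:
        pkt.make_response()  # duplicate conversion on the same packet
    except AssertionError as err:
        print("second conversion rejected:", err)
```

In this model the first call succeeds and the second fails, which matches the
symptom of one code path converting a failed SC request after another path
has already done so.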
> >
> > Brad
> >
> >
> >
> >> -----Original Message-----
> >> From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
> >> On Behalf Of Gabe Black
> >> Sent: Wednesday, February 09, 2011 3:54 PM
> >> To: M5 Developer List
> >> Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets
> >>
> >> Thanks for letting us know. If it wouldn't be too much trouble, could
> >> you please try some other changesets near the one that isn't working
> >> and try to determine which one specifically broke things? A bunch of
> >> changes went in recently so it would be helpful to narrow things
> >> down. I'm not very involved with Ruby right now personally, but I
> >> assume that would be useful information for the people that are.
> >>
> >> Gabe
> >>
> >> On 02/09/11 14:51, Malek Musleh wrote:
> >> > Hello,
> >> >
> >> > I first started using the Ruby Model in M5  about a week or so ago,
> >> > and was able to boot in FS mode (up to 64 cores once applying the
> >> > BigTsunami patches).
> >> >
> >> > In order to keep up with the changes in the Ruby code, I have
> >> > started fetching recent updates from the devrepo.
> >> >
> >> > However, after fetching the most recent changesets (from the last
> >> > 2 days), Ruby FS does not boot. I tried both MESI_CMP_directory
> >> > and MOESI_CMP_directory.
> >> >
> >> > If running 2 cores or less I get this at the terminal screen after
> >> > letting it run for some time:
> >> >
> >> > hda: M5 IDE Disk, ATA DISK drive
> >> > hdb: M5 IDE Disk, ATA DISK drive
> >> > hda: UDMA/33 mode selected
> >> > hdb: UDMA/33 mode selected
> >> > ide0 at 0x8410-0x8417,0x8422 on irq 31
> >> > ide1 at 0x8418-0x841f,0x8426 on irq 31
> >> > ide_generic: please use "probe_mask=0x3f" module parameter for
> >> > probing all legacy ISA IDE ports
> >> > ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
> >> > ide3 at 0x170-0x177,0x376 on irq 15
> >> > hda: max request size: 128KiB
> >> > hda: 101808 sectors (52 MB), CHS=101/16/63
> >> >  hda:<4>hda: dma_timer_expiry: dma status == 0x65
> >> > <------------------------------------------------------- problem
> >> >
> >> >
> >> > When running 3 or more cores, I get the following assertion failure:
> >> >
> >> >
> >> > info: kernel located at:
> >> > /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux
> >> > Listening for system connection on port 3456
> >> >       0: system.tsunami.io.rtc: Real-time clock set to Thu Jan  1
> >> > 00:00:00 2009
> >> > 0: system.remote_gdb.listener: listening for remote gdb #0 on port
> >> > 7000
> >> > 0: system.remote_gdb.listener: listening for remote gdb #1 on port
> >> > 7001
> >> > 0: system.remote_gdb.listener: listening for remote gdb #2 on port
> >> > 7002
> >> > 0: system.remote_gdb.listener: listening for remote gdb #3 on port
> >> > 7003
> >> > **** REAL SIMULATION ****
> >> > info: Entering event queue @ 0.  Starting simulation...
> >> > info: Launching CPU 1 @ 834794000
> >> > info: Launching CPU 2 @ 845489000
> >> > info: Launching CPU 3 @ 856101000
> >> > m5.opt: build/ALPHA_FS_MESI_CMP_directory/mem/packet.hh:590: void
> >> > Packet::makeResponse(): Assertion `needsResponse()' failed.
> >> > Program aborted at cycle 977160000
> >> > Aborted
> >> >
> >> > The top of the tree is this last changeset:
> >> >
> >> > changeset:   7939:215c8be67063
> >> > tag:         tip
> >> > user:        Brad Beckmann <brad.beckm...@amd.com>
> >> > date:        Tue Feb 08 18:07:54 2011 -0800
> >> > summary:     regess: protocol regression tester updates
> >> >
> >> > I am not sure whether those whom it concerns are aware of it, or
> >> > whether an updated changeset is already in the works for this,
> >> > but I figured I would bring it to your attention.
> >> >
> >> > Malek
> >> > _______________________________________________
> >> > m5-dev mailing list
> >> > m5-dev@m5sim.org
> >> > http://m5sim.org/mailman/listinfo/m5-dev

