Re: [m5-dev] Ruby FS Fails with recent Changesets

2011-02-10 Thread Malek Musleh
Hi Brad,

I tested your latest changeset, and it seems that it 'solves' the
handleResponse error I was getting when running 3 or more cores, but
the dma_expiry error is still there.

The error is now consistent, regardless of how many cores I run with:

For more information see: http://www.m5sim.org/warn/3e0eccba
panic: Inconsistent DMA transfer state: dmaState = 2 devState = 1
 @ cycle 62411238889001
[doDmaTransfer:build/ALPHA_FS_MOESI_CMP_directory/dev/ide_disk.cc, line 323]
Memory Usage: 382600 KBytes

- M5 Terminal ---
hda: max request size: 128KiB
hda: 101808 sectors (52 MB), CHS=101/16/63
 hda:4hda: dma_timer_expiry: dma status == 0x65
hda: DMA interrupt recovery
hda: lost interrupt
 unknown partition table
hdb: max request size: 128KiB
hdb: 4177920 sectors (2139 MB), CHS=4144/16/63
 hdb:4hdb: dma_timer_expiry: dma status == 0x65
hdb: DMA interrupt recovery
hdb: lost interrupt
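
For reading the `dma status == 0x65` value in the terminal output above, here is a minimal sketch that decodes the byte. It assumes the guest driver is printing the bus-master IDE status register and that the bits follow the usual PCI IDE bus-master layout; the table and function names are made up for illustration.

```python
# Assumed bit layout of the bus-master IDE status register.
# This is the conventional PCI IDE layout, not something taken from M5.
BM_STATUS_BITS = {
    0: "bus master active",
    1: "error",
    2: "interrupt",
    5: "drive 0 DMA capable",
    6: "drive 1 DMA capable",
    7: "simplex only",
}


def decode_dma_status(status):
    """Return the names of the bits set in a bus-master IDE status byte."""
    return [name for bit, name in sorted(BM_STATUS_BITS.items())
            if status & (1 << bit)]


print(decode_dma_status(0x65))
```

Under that assumption, 0x65 has the active and interrupt bits set with the error bit clear, which would fit a transfer the driver still considers in flight when its timer expires.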

The panic error seems to suggest an inconsistent DMA state, so I tried
reverting to an older changeset (before DMA changes were pushed out)
such as 7936, and even 7930 but no such luck.

The changeset that I know works from last week or so is changeset
7842. The changeset summaries between 7842 and 7930 seem to indicate
a lot of changes 'unrelated' to the DMA, such as O3, InOrderCPU, and
x86 changes. That said, I did not diff those intermediate changesets
to check whether a related file was modified along the way.

I might be able to spend some more time trying changesets until I
narrow down which one it's coming from, but maybe the new panic message
gives you some indication of how to fix it?
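
Narrowing down between a known-good and a known-bad changeset is a binary search, which Mercurial automates with `hg bisect`. A minimal sketch of the idea follows. It assumes every revision after the first bad one also fails; `fake_boot_test` and `HYPOTHETICAL_FIRST_BAD` are stand-ins for "build this revision and try to boot", not real results.

```python
def first_bad(good, bad, is_good):
    """Binary-search the revision range (good, bad] for the first failing one.

    Assumes `good` passes, `bad` fails, and every revision after the
    first failing one also fails.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_good(mid):
            good = mid
        else:
            bad = mid
    return bad


# Made-up threshold, used only to exercise the search.
HYPOTHETICAL_FIRST_BAD = 7890


def fake_boot_test(rev):
    """Stand-in for building a revision and checking that FS mode boots."""
    return rev < HYPOTHETICAL_FIRST_BAD


print(first_bad(7842, 7930, fake_boot_test))
```

With 7842 as the good end and 7930 as the bad end, this needs at most seven build-and-boot runs rather than one per changeset.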

(I think the panic message appeared now and not before because I let
the simulation terminate by itself when running overnight, as opposed
to killing it once I saw the dma_expiry message on the M5 Terminal.)

Malek

On Wed, Feb 9, 2011 at 7:00 PM, Beckmann, Brad brad.beckm...@amd.com wrote:
 Hi Malek,

 Yes, thanks for letting us know.  I'm pretty sure I know what the problem is.
 Previously, if an SC operation failed, the RubyPort would convert the request
 packet to a response packet, bypass writing the functional view of memory,
 and pass it back up to the CPU.  In my most recent patches I generalized the
 mechanism that converts request packets to response packets and avoids
 writing functional memory.  However, I forgot to remove the duplicate
 request-to-response conversion for failed SC requests.  Therefore, I bet you
 are encountering that assertion error on that duplicate call.  It should be a
 simple one-line change that fixes your problem.  I'll push it momentarily, and
 it would be great if you could confirm that my change does indeed fix your
 problem.

 Brad
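
The double conversion described above can be sketched with a toy packet. This is only an illustration of why a second request-to-response conversion would trip a `needsResponse()` style check. `ToyPacket` and its methods are hypothetical and are not M5's actual `Packet` class.

```python
class ToyPacket:
    """Toy stand-in for a memory packet; not M5's Packet class."""

    def __init__(self):
        self.is_response = False

    def needs_response(self):
        # A request still expects a response; a response does not.
        return not self.is_response

    def make_response(self):
        # Converting is only legal while the packet is still a request.
        if not self.needs_response():
            raise AssertionError("needs_response() failed")
        self.is_response = True


def run_double_conversion():
    pkt = ToyPacket()
    pkt.make_response()      # special-case path for a failed SC
    try:
        pkt.make_response()  # generalized path converts the packet again
    except AssertionError:
        return "second conversion asserted"
    return "no assertion"


print(run_double_conversion())
```

Dropping either of the two conversions leaves a single call, which is the shape of the one-line fix mentioned above.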



 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Gabe Black
 Sent: Wednesday, February 09, 2011 3:54 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets

 Thanks for letting us know. If it wouldn't be too much trouble, could you
 please try some other changesets near the one that isn't working and try to
 determine which one specifically broke things? A bunch of changes went in
 recently so it would be helpful to narrow things down. I'm not very involved
 with Ruby right now personally, but I assume that would be useful
 information for the people that are.

 Gabe

 On 02/09/11 14:51, Malek Musleh wrote:
  Hello,
 
  I first started using the Ruby Model in M5  about a week or so ago,
  and was able to boot in FS mode (up to 64 cores once applying the
  BigTsunami patches).
 
  In order to keep up with the changes in the Ruby code, I have started
  fetching recent updates from the devrepo.
 
  However, in fetching the updates to the recent changesets (from the
  last 2 days) Ruby FS does not boot. I tried both MESI_CMP_directory
  and MOESI_CMP_directory.
 
  If running 2 cores or less I get this at the terminal screen after
  letting it run for some time:
 
  hda: M5 IDE Disk, ATA DISK drive
  hdb: M5 IDE Disk, ATA DISK drive
  hda: UDMA/33 mode selected
  hdb: UDMA/33 mode selected
  ide0 at 0x8410-0x8417,0x8422 on irq 31
  ide1 at 0x8418-0x841f,0x8426 on irq 31
  ide_generic: please use probe_mask=0x3f module parameter for probing
  all legacy ISA IDE ports
  ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
  ide3 at 0x170-0x177,0x376 on irq 15
  hda: max request size: 128KiB
  hda: 101808 sectors (52 MB), CHS=101/16/63
   hda:4hda: dma_timer_expiry: dma status == 0x65
  --- problem
 
 
  When running 3 or more cores, I get the following assertion failure:
 
 
  info: kernel located at:
  /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux
  Listening for system connection on port 3456
        0: system.tsunami.io.rtc: Real-time clock set to Thu Jan  1
  00:00:00 2009
  0: system.remote_gdb.listener: listening for remote gdb #0

Re: [m5-dev] Ruby FS Fails with recent Changesets

2011-02-10 Thread Beckmann, Brad
Hi Malek,

Hmm...I have never seen that type of error before.  As you mentioned, I don't 
think any of my recent patches changed how DMA is executed for ALPHA_FS.

How long does it take for you to encounter the error?  It would be great if you 
could tell me how I can reproduce the error.  I would like to look at this in 
more detail and get a protocol trace of what is going on.

Thanks,

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Malek Musleh
 Sent: Thursday, February 10, 2011 5:05 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets
 

Re: [m5-dev] Ruby FS Fails with recent Changesets

2011-02-10 Thread Malek Musleh
:

 and that's where I notice the handleResponse()

 7920:

 M5 compiled Feb 10 2011 14:49:49
 M5 revision 39c86a8306d2+ 7920+ default
 M5 started Feb 10 2011 14:53:38
 M5 executing on sherpa05
 command line: ./build/ALPHA_FS_MOESI_CMP_directory/m5.opt
 ./configs/example/ruby\
 _fs.py -n 4 --topology Crossbar
 Global frequency set at 1 ticks per second
 info: kernel located at: /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux
 Listening for system connection on port 3456
       0: system.tsunami.io.rtc: Real-time clock set to Thu Jan  1 00:00:00 
 2009
 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
 0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001
 0: system.remote_gdb.listener: listening for remote gdb #2 on port 7002
 0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003
  REAL SIMULATION 
 info: Entering event queue @ 0.  Starting simulation...
 info: Launching CPU 1 @ 835461000
 info: Launching CPU 2 @ 846156000
 info: Launching CPU 3 @ 856768000
 warn: Prefetch instrutions is Alpha do not do anything
 For more information see: http://www.m5sim.org/warn/3e0eccba
 1128875500: system.terminal: attach terminal 0
 warn: Prefetch instrutions is Alpha do not do anything
 For more information see: http://www.m5sim.org/warn/3e0eccba
 m5.opt: build/ALPHA_FS_MOESI_CMP_directory/mem/packet.hh:590: void
 Packet::makeResponse(): Assertion `needsResponse()' failed.
 Program aborted at cycle 36235566500
 Aborted

 Note that I have not tested changesets 7911-7918.

 I have tested the MOESI_CMP_directory protocol on all of these with
 m5.opt. I have tested MESI_CMP_directory on some of them and
 got the same messages.

 This is my command line:

 ./build/ALPHA_FS_MOESI_CMP_directory/m5.opt -
 ./configs/example/ruby_fs.py -n 4 --topology Crossbar

 The error appears about 15 minutes into the kernel boot. Note that
 it takes a while for the I/O to be scheduled.

 io scheduler noop registered
 io scheduler anticipatory registered
 io scheduler deadline registered
 io scheduler cfq registered (default)

 In all cases though where the dma_expiry occurs (which does not
 include changesets 7906-7908), the last thing that appears is this:

 ide0 at 0x8410-0x8417,0x8422 on irq 31
 ide1 at 0x8418-0x841f,0x8426 on irq 31
 ide_generic: please use probe_mask=0x3f module parameter for probing
 all legacy ISA IDE ports
 ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
 ide3 at 0x170-0x177,0x376 on irq 15
 hda: max request size: 128KiB
 hda: 101808 sectors (52 MB), CHS=101/16/63
  hda:4hda: dma_timer_expiry: dma status == 0x65
 hda: DMA interrupt recovery
 hda: lost interrupt
  unknown partition table
 hdb: max request size: 128KiB
 hdb: 4177920 sectors (2139 MB), CHS=4144/16/63

 Is it possible to generate a trace for Ruby in M5 the way it is for
 Ruby in GEMS like something of this sort:

 http://www.cs.wisc.edu/gems/doc/gems-wiki/moin.cgi/How_do_I_understand_a_Protocol

 ?

 Let me know if you need any more information.

 Malek

 On Thu, Feb 10, 2011 at 4:43 PM, Beckmann, Brad brad.beckm...@amd.com 
 wrote:

Re: [m5-dev] Ruby FS Fails with recent Changesets

2011-02-10 Thread Beckmann, Brad
Ah, ok this assert problem makes sense to me.  I suspect this is one of those
situations where unexpected normal operation proves the assert is incorrect.
As the comment associated with patch 7906 says, I added that assert because I
wanted to make sure that unaligned x86 cpu accesses were not passed to the Ruby
sequencer.  However, I didn't realize at the time that DMA accesses go through
the same path (i.e. the dma sequencer also inherits from RubyPort).  While the
normal cpu sequencer cannot handle unaligned accesses, the dma sequencer can.
Therefore, that assert is incorrect and should be removed.  Though I've been
running a lot of FS simulations lately, none of them have had any DMA activity.
Thus I haven't encountered the error myself.  I will try to check in a fix as
soon as I can but right now I'm having trouble connecting to m5sim.org.  As
soon as that problem is resolved, I'll push the fix (removing the assert).
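
To illustrate why a check in the shared path also catches DMA, here is a toy sketch. It assumes "unaligned" means an access that crosses a cache-line boundary and uses a 64-byte line; both are assumptions for illustration, and the `Toy*` classes are hypothetical rather than the real RubyPort hierarchy. The fix described above simply removes the assert; the capability flag here only shows that the check is too broad when it sits in the common base.

```python
LINE_SIZE = 64  # assumed cache-line size in bytes, for illustration only


def crosses_line(addr, size, line_size=LINE_SIZE):
    """True if [addr, addr + size) spans more than one cache line."""
    return addr // line_size != (addr + size - 1) // line_size


class ToyPort:
    """Shared entry point, loosely analogous to the role RubyPort plays."""
    can_split = False

    def recv(self, addr, size):
        # A check placed here runs for every subclass, DMA included.
        if crosses_line(addr, size) and not self.can_split:
            raise AssertionError("line-crossing access reached the port")
        return "accepted"


class ToyCpuSequencer(ToyPort):
    can_split = False  # cannot handle line-crossing accesses


class ToyDmaSequencer(ToyPort):
    can_split = True   # may legitimately receive them


print(ToyDmaSequencer().recv(0x1000, 128))
```

With the flag ignored, the 128-byte DMA request in the last line would trip the same check that was meant only for CPU accesses.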

Now that should fix your problem with 7906, but I'm not sure if that will fix 
your dma_expiry error.  So do you encounter that error running .fast and/or 
.debug?  Can you provide me a call stack for the error?  Is there an easy way 
for me to reproduce it?  I doubt the topology makes a bit of difference.  And 
yes you can get a protocol trace by specifying the ProtocolTrace trace-flag.
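
A small sketch of how the ProtocolTrace flag could be added to the command line used earlier in this thread. The option names `--trace-flags` and `--trace-file` are my recollection of the M5 options of this era and should be checked against `m5.opt --help`; `protocol.trace` is a hypothetical file name. Tracing is, as far as I know, compiled out of the .fast binary, so this applies to .opt or .debug.

```python
import shlex


def m5_trace_command(binary, config, config_args, flags, trace_file=None):
    """Build an M5 command line with tracing enabled.

    Simulator options go before the config script; script options after it.
    """
    cmd = [binary, "--trace-flags=" + ",".join(flags)]
    if trace_file is not None:
        cmd.append("--trace-file=" + trace_file)
    cmd.append(config)
    cmd.extend(config_args)
    return cmd


cmd = m5_trace_command(
    "./build/ALPHA_FS_MOESI_CMP_directory/m5.opt",
    "./configs/example/ruby_fs.py",
    ["-n", "4", "--topology", "Crossbar"],
    ["ProtocolTrace"],
    trace_file="protocol.trace",  # hypothetical output file name
)
print(" ".join(shlex.quote(part) for part in cmd))
```

Since the failure shows up roughly 15 minutes into the boot, the trace is likely to be large, so writing it to a file rather than the terminal is probably the practical choice.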

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Malek Musleh
 Sent: Thursday, February 10, 2011 2:51 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets

 Ah yes, sorry about that. I have two directories: one is a clean copy of
 the m5 repo, and the other is a private one into which I fetch changes from
 the clean one. The changesets I referred to all correspond to the strictly
 clean m5 repo, but nevertheless, here is a copy from an hg log of the
 changesets I have referred to.

 changeset:   7922:7532067f818e
 user:        Brad Beckmann brad.beckm...@amd.com
 date:        Sun Feb 06 22:14:19 2011 -0800
 summary:     ruby: support to stallAndWait the mandatory queue

 changeset:   7921:351f1761765f
 user:        Brad Beckmann brad.beckm...@amd.com
 date:        Sun Feb 06 22:14:19 2011 -0800
 summary:     ruby: minor fix to deadlock panic message

 changeset:   7920:39c86a8306d2
 user:        Brad Beckmann brad.beckm...@amd.com
 date:        Sun Feb 06 22:14:19 2011 -0800
 summary:     boot: script that creates a checkpoint after Linux boot up

 changeset:   7919:3a02353d6e43
 user:        Joel Hestness hestn...@cs.utexas.edu
 date:        Sun Feb 06 22:14:19 2011 -0800
 summary:     garnet: Split network power in ruby.stats


 changeset:   7910:8a92b39be50e
 user:        Brad Beckmann brad.beckm...@amd.com
 date:        Sun Feb 06 22:14:18 2011 -0800
 summary:     ruby: Fix RubyPort to properly handle retrys

 changeset:   7909:eee578ed2130
 user:        Joel Hestness hestn...@cs.utexas.edu
 date:        Sun Feb 06 22:14:18 2011 -0800
 summary:     Ruby: Fix to return cache block size to CPU for split data transfers

 changeset:   7908:4e83ebb67794
 user:        Joel Hestness hestn...@cs.utexas.edu
 date:        Sun Feb 06 22:14:18 2011 -0800
 summary:     Ruby: Add support for locked memory accesses in X86_FS

 changeset:   7907:d648b8409d4c
 user:        Joel Hestness hestn...@cs.utexas.edu
 date:        Sun Feb 06 22:14:18 2011 -0800
 summary:     Ruby: Update the Ruby request type names for LL/SC

 changeset:   7906:5ccd97218ca0
 user:        Brad Beckmann brad.beckm...@amd.com
 date:        Sun Feb 06 22:14:18 2011 -0800
 summary:     ruby: Assert for x86 misaligned access

 changeset:   7905:00ad807ed2ca
 user:        Brad Beckmann brad.beckm...@amd.com
 date:        Sun Feb 06 22:14:18 2011 -0800
 summary:     ruby: x86 fs config support

 Malek

 On Thu, Feb 10, 2011 at 5:35 PM, Gabe Black gbl...@eecs.umich.edu
 wrote:
  Numbers like 7905 are only meaningful in a strict sense in your own
  tree since different trees might number things differently. The longer
  hex value is universal. It's possible the trees are similar enough
  that those would match, but there's no guarantee.
 
  Gabe
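
The point about local numbers versus hashes can be made concrete with a small sketch that splits an `hg log` changeset line into its two parts. `parse_changeset` is a hypothetical helper; the sample line is one of the changesets listed in this thread.

```python
def parse_changeset(line):
    """Split an `hg log` changeset line into (local_rev, short_hash)."""
    label, value = line.split(":", 1)
    if label.strip() != "changeset":
        raise ValueError("not a changeset line: %r" % line)
    rev, node = value.strip().split(":")
    return int(rev), node


rev, node = parse_changeset("changeset:   7906:5ccd97218ca0")
print(rev, node)
```

The integer is only an index into the local clone, while the hex part names the same changeset in every clone, so something like `hg update -r 5ccd97218ca0` is the unambiguous way to refer to it.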
 
  On 02/10/11 14:26, Malek Musleh wrote:
  Hi Brad,
 
  I tested the different changesets and have narrowed down to where it
 begins.
 
  The last changeset that works (since 7842) is 7905.
 
  At 7906 this is the error:
 
  command line: ./build/ALPHA_FS_MOESI_CMP_directory/m5.opt
  ./configs/example/ruby\
  _fs.py -n 4 --topology Crossbar
  Global frequency set at 1 ticks per second
  info: kernel located at:
  /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux
  Listening for system connection on port 3456
0: system.tsunami.io.rtc: Real-time clock set to Thu Jan  1
  00:00:00 2009
  0: system.remote_gdb.listener: listening for remote gdb #0 on port
  7000
  0: system.remote_gdb.listener: listening for remote gdb #1 on port
  7001
  0: system.remote_gdb.listener

Re: [m5-dev] Ruby FS Fails with recent Changesets

2011-02-10 Thread Malek Musleh
/libpython2.5.so.1.0
#20 0x7f2c1384fdac in PyRun_StringFlags ()
   from /usr/lib/libpython2.5.so.1.0
#21 0x007d2d41 in m5Main (argc=6, argv=0x7fff1c323a88)
at build/ALPHA_FS_MOESI_CMP_directory/sim/init.cc:248
#22 0x0040a717 in main (argc=6, argv=0x7fff1c323a88)
at build/ALPHA_FS_MOESI_CMP_directory/sim/main.cc:57

I thought of trying to checkpoint at a given point in the process, but
I noticed that checkpointing is not yet supported (comment in
ruby_fs.py as well as another thread on dev about this). Out of
curiosity, is that something currently in the works, and/or would it
require a lot of time to implement?

Let me know if there is anything else that I can help with.

Malek

On Thu, Feb 10, 2011 at 7:45 PM, Beckmann, Brad brad.beckm...@amd.com wrote:

[m5-dev] Ruby FS Fails with recent Changesets

2011-02-09 Thread Malek Musleh
Hello,

I first started using the Ruby Model in M5  about a week or so ago,
and was able to boot in FS mode (up to 64 cores once applying the
BigTsunami patches).

In order to keep up with the changes in the Ruby code, I have started
fetching recent updates from the devrepo.

However, in fetching the updates to the recent changesets (from the
last 2 days) Ruby FS does not boot. I tried both MESI_CMP_directory
and MOESI_CMP_directory.

If running 2 cores or less I get this at the terminal screen after
letting it run for some time:

hda: M5 IDE Disk, ATA DISK drive
hdb: M5 IDE Disk, ATA DISK drive
hda: UDMA/33 mode selected
hdb: UDMA/33 mode selected
ide0 at 0x8410-0x8417,0x8422 on irq 31
ide1 at 0x8418-0x841f,0x8426 on irq 31
ide_generic: please use probe_mask=0x3f module parameter for probing
all legacy ISA IDE ports
ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
ide3 at 0x170-0x177,0x376 on irq 15
hda: max request size: 128KiB
hda: 101808 sectors (52 MB), CHS=101/16/63
 hda:4hda: dma_timer_expiry: dma status == 0x65
--- problem


When running 3 or more cores, I get the following assertion failure:


info: kernel located at: /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux
Listening for system connection on port 3456
  0: system.tsunami.io.rtc: Real-time clock set to Thu Jan  1 00:00:00 2009
0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001
0: system.remote_gdb.listener: listening for remote gdb #2 on port 7002
0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003
 REAL SIMULATION 
info: Entering event queue @ 0.  Starting simulation...
info: Launching CPU 1 @ 834794000
info: Launching CPU 2 @ 845489000
info: Launching CPU 3 @ 856101000
m5.opt: build/ALPHA_FS_MESI_CMP_directory/mem/packet.hh:590: void
Packet::makeResponse(): Assertion `needsResponse()' failed.
Program aborted at cycle 97716
Aborted

The top of the tree is this last changeset:

changeset:   7939:215c8be67063
tag:         tip
user:        Brad Beckmann brad.beckm...@amd.com
date:        Tue Feb 08 18:07:54 2011 -0800
summary:     regess: protocol regression tester updates

I am not sure whether those whom it concerns are already aware of this,
or whether a fix is already in the works, but I figured I would bring
it to your attention.

Malek
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Ruby FS Fails with recent Changesets

2011-02-09 Thread Gabe Black
Thanks for letting us know. If it wouldn't be too much trouble, could
you please try some other changesets near the one that isn't working and
try to determine which one specifically broke things? A bunch of changes
went in recently so it would be helpful to narrow things down. I'm not
very involved with Ruby right now personally, but I assume that would be
useful information for the people that are.
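
As a rough illustration of that narrowing-down, here is a minimal
sketch of a binary search over local revision numbers. It assumes a
linear history in which every revision after the breaking one also
fails. Only 7939 (the reported tip) comes from this thread; the
known-good revision 7800, the breaking revision 7900, and the
pretend_boot helper are placeholders made up for the example. In
practice the check would be "update, rebuild, and try to boot".
Mercurial's built-in `hg bisect` command automates the same idea.

```python
from typing import Callable


def first_bad_changeset(good: int, bad: int,
                        is_good: Callable[[int], bool]) -> int:
    """Return the first failing revision in the range (good, bad].

    Assumes `good` works, `bad` fails, and every revision after the
    breaking one also fails (linear history).
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_good(mid):
            good = mid   # breakage was introduced after mid
        else:
            bad = mid    # breakage was introduced at or before mid
    return bad


# Hypothetical stand-in for "update to rev, rebuild, try to boot Ruby FS".
# The breaking revision 7900 is invented purely to exercise the search.
def pretend_boot(rev: int) -> bool:
    return rev < 7900


# 7800 is a placeholder for a known-good revision; 7939 is the reported tip.
culprit = first_bad_changeset(7800, 7939, pretend_boot)
print(culprit)
```

With about 140 candidate revisions this needs roughly eight
build-and-boot attempts rather than one per changeset.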

Gabe




Re: [m5-dev] Ruby FS Fails with recent Changesets

2011-02-09 Thread Beckmann, Brad
Hi Malek,

Yes, thanks for letting us know.  I'm pretty sure I know what the problem is.
Previously, if an SC operation failed, the RubyPort would convert the request
packet to a response packet, bypass writing the functional view of memory,
and pass it back up to the CPU.  In my most recent patches I generalized the
mechanism that converts request packets to response packets and avoids writing
functional memory.  However, I forgot to remove the duplicate
request-to-response conversion for failed SC requests.  Therefore, I bet you
are encountering that assertion error on that duplicate call.  It should be a
simple one-line change that fixes your problem.  I'll push it momentarily, and
it would be great if you could confirm that my change does indeed fix your
problem.
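
To make the failure mode concrete, here is a toy model of the double
conversion, not the actual M5 code. The Packet class below only tracks
whether it has already been turned into a response, and the function
names generic_completion and failed_sc_path are hypothetical labels
for the generalized path and the leftover failed-SC special case. The
assumption is that a packet no longer "needs a response" once it has
been converted, so a second conversion trips the same kind of check
reported at packet.hh:590.

```python
class Packet:
    """Toy stand-in for a memory packet; only request/response is modeled."""

    def __init__(self) -> None:
        self.is_response = False

    def needs_response(self) -> bool:
        # Assumption: only a packet that is still a request needs a response.
        return not self.is_response

    def make_response(self) -> None:
        # Analogous in spirit to the failing needsResponse() assertion.
        if not self.needs_response():
            raise AssertionError("needsResponse() failed")
        self.is_response = True


def generic_completion(pkt: Packet) -> None:
    # Hypothetical name for the new, generalized conversion path.
    pkt.make_response()


def failed_sc_path(pkt: Packet) -> None:
    # Hypothetical name for the leftover failed-SC special case.
    pkt.make_response()       # first conversion succeeds
    generic_completion(pkt)   # duplicate conversion trips the check


pkt = Packet()
try:
    failed_sc_path(pkt)
    double_conversion_failed = False
except AssertionError:
    double_conversion_failed = True
print(double_conversion_failed)
```

Dropping the first make_response() call in failed_sc_path would leave a
single conversion, which is the shape of the one-line fix described
above.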

Brad




