from:"Joel Hestness"

[m5-dev] Linux Kernel/Boot Time for X86_FS

2010-06-09 Thread Joel Hestness

Hi everyone,
  I am interested in helping develop X86_FS boot up and testing.
  Under X86_FS, I have been able to boot a couple of different versions of the
Linux kernel (v2.6.22.9 and v2.6.28.4), but the bring-up requires more than
12 hours of simulation time.  I am hoping to reduce the boot time to make it
more usable.
  I recall that the M5 patches for alpha-linux play some tricks to speed
bootup, so I tried building an x86 kernel v2.6.27 with the patches.  It
looks like many of the patches are specific to ALPHA, so (maybe
unsurprisingly) I encountered errors quickly in the build.
  I am wondering if anyone is currently working on this, or if I could get
some pointers on where to dig in.
  Thank you,
  Joel


-- 
 Joel Hestness
 PhD Student, Computer Architecture
 Dept. of Computer Science, University of Texas - Austin
 http://www.cs.utexas.edu/~hestness
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

Re: [m5-dev] Linux Kernel/Boot Time for X86_FS

2010-06-10 Thread Joel Hestness

Hi Gabe and Ali,
  Thanks for the leads!  I'd love to get my hands on the x86-specific
patches if you can find them.
I have been booting to shell with m5term so far, and you're right, I need to
disable the init services.  Is it safe to disable all of them, or all under
certain runlevels?
  Thanks,
  Joel


On Wed, Jun 9, 2010 at 10:00 PM, Ali Saidi sa...@umich.edu wrote:

 Hi Joel,

 The patches do two things to improve the simulation speed. First, they
 calculate what loops_per_jiffy would be given the processor frequency and
 write that value into the global variable. You can get around this by just
 passing the lpj=XX boot argument to the kernel, so this change isn't
 particularly needed anymore. The other thing they do is rewrite __delay
 to use a pseudo instruction (a made-up opcode that does simulator-specific
 functionality) that encodes how long the processor should sleep for. Thus
 when udelay() and ndelay() are used in the kernel, the CPU model can just
 jump to the right time (either the end of the delay or an interrupt).

 The various other patches provide additional pseudo instructions, but none
 of them relate to performance.

 Ali
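As a rough illustration of the first point, the sketch below estimates a value
for the lpj= boot argument from a CPU frequency. It assumes the delay loop
counts one iteration per CPU cycle (as with a TSC-based delay), which may not
match a given kernel configuration. The function name and the frequencies in
the usage note are hypothetical and are not taken from the M5 patches.

```cpp
#include <cstdint>

// Hypothetical helper, not part of M5 or the kernel patches.
// Assumption: one delay-loop iteration per CPU cycle, so loops per jiffy
// equals the number of CPU cycles in one timer tick.
//   cpu_hz:   simulated CPU frequency in Hz
//   timer_hz: kernel timer tick rate (the kernel's HZ setting), must be > 0
constexpr std::uint64_t
loopsPerJiffy(std::uint64_t cpu_hz, std::uint64_t timer_hz)
{
    return cpu_hz / timer_hz;
}
```

Under these assumptions, a 2 GHz simulated CPU with a 250 Hz tick would give
loopsPerJiffy(2000000000, 250), which could then be passed as lpj=<value> on
the kernel command line.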


[m5-dev] Configuration file for building Linux x86

2010-06-16 Thread Joel Hestness

Hi,
  This might be a question for Gabe:
  Steve Reinhardt pointed me to a Linux binary that I have been able to boot
with X86_FS.  I have built a couple different binaries from the Linux
source, including the M5 specific patches, but it appears that M5 hangs when
trying to boot them.  I am wondering if there are any critical options that
I need to look for in the .config file, or if anyone has a .config
specifically for building X86_FS kernel binaries.
  Also, any tips for debugging M5 Linux boot?
  Thank you,
  Joel


[m5-dev] M5 X86_FS pseudo instruction: readfile

2010-06-28 Thread Joel Hestness

Hi,
  This is probably a question for Nate, Gabe or Ali:
  I have built the m5 util application for x86 and I have been testing it
under X86_FS simulation.  It looks like /sbin/m5 readfile is failing to
print the script to the console of the simulated system.  I have been able
to verify that the pseudo instruction executes correctly, and the
appropriate function (PseudoInst::readfile) in the simulator is called with
the correct parameters.  There, the file is read into M5's memory, but it
isn't ever printed to the terminal in the simulated system.
  Call graph: PseudoInst::readfile -> VirtualPort::CopyIn
-> VirtualPort::writeBlob -> Port::writeBlob -> Port::blobHelper
  At that point, blobHelper calls sendFunctional to transfer the contents
of the file into the simulated system, but I'm having trouble tracing where
the packets end up.
  Any ideas on what's going on or how I can debug?
  Thanks,
  Joel



[m5-dev] X86_FS vtophys implementation

2010-07-01 Thread Joel Hestness

Hi,
  It turns out that the readfile bug I posted previously (see below) is a
result of an unimplemented vtophys function: CopyIn reads the file in, but
the virtual address where it should be placed is not translated to a
physical address before sendFunctional is called.  This results in a
BadAddressError and the packets are dropped.

So, I've started looking at the vtophys function.  It looks like it will be
trickier to implement than it was for prior architectures because of the
page table hardware organization and walker.  I think vtophys should be
implemented by making a functional access to the page table walker.  The
only problem is that the state machine controlling the walker is updated in
each of the access functions.  I see a couple possible solutions:

1. vtophys uses a separate walker to look up the entry.  The walker could be
dynamically instantiated when needed, or it could be saved as a system
object specifically for functional accesses.  This option seems pretty
hacky.

2. vtophys uses the ITB or DTB walker to look up the entry.  This would
require functional access to the walker so as to not upset its current
state.  Walker::start would need to take the desired memory mode, and in the
case of a functional access, it would need to make sure that it doesn't
perturb the current state.  This looks like a much better solution to me.

I am wondering if anyone has feedback on a choice here, or if there is maybe
a better solution.  I'd be willing to take a stab at the updates.

  Thanks,
  Joel
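
To make the idea of a side-effect-free translation concrete, here is a minimal
sketch of a functional page-table walk. It is not the M5 walker: it assumes
x86-64 long-mode 4-level paging, and the ReadPhys64 callback and the
functionalWalk name are hypothetical stand-ins for a functional read of
simulated physical memory. The walk only reads entries, so accessed/dirty bits
and any walker state machine are left untouched.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical callback: functional read of 8 bytes of simulated physical
// memory at the given physical address.
using ReadPhys64 = std::function<std::uint64_t(std::uint64_t)>;

constexpr std::uint64_t kPresent  = 1ULL << 0;              // entry valid
constexpr std::uint64_t kLargePg  = 1ULL << 7;              // PS bit
constexpr std::uint64_t kAddrMask = 0x000FFFFFFFFFF000ULL;  // bits 51:12

// Translate vaddr using the tables rooted at cr3. Returns false if any
// level is not present; otherwise stores the physical address in paddr.
bool
functionalWalk(std::uint64_t cr3, std::uint64_t vaddr,
               const ReadPhys64 &read, std::uint64_t &paddr)
{
    std::uint64_t table = cr3 & kAddrMask;
    // Index shifts per level: PML4 = 39, PDPT = 30, PD = 21, PT = 12.
    for (int shift = 39; shift >= 12; shift -= 9) {
        const std::uint64_t index = (vaddr >> shift) & 0x1FF;
        const std::uint64_t entry = read(table + index * 8);
        if (!(entry & kPresent))
            return false;
        // The PT level is always a leaf; PDPT/PD entries are leaves when
        // the PS bit marks a 1 GB or 2 MB page.
        const bool leaf =
            shift == 12 || (shift != 39 && (entry & kLargePg));
        if (leaf) {
            const std::uint64_t offset_mask = (1ULL << shift) - 1;
            paddr = (entry & kAddrMask & ~offset_mask) |
                    (vaddr & offset_mask);
            return true;
        }
        table = entry & kAddrMask;
    }
    return false;  // not reached: the PT level always returns above
}
```

A caller would pass a lambda that performs the functional memory read. Whether
the TLB should be consulted first is a separate question that this sketch does
not address.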


On Mon, Jun 28, 2010 at 4:19 PM, nathan binkert n...@binkert.org wrote:

 There is a mechanism for tracing packets through the system.
 Basically, if you attach PrintReqState to the object, the system will
 print out info about the object moving through the memory system.
 (Search old e-mails and perhaps the history to find out how it really
 works.  I've never used it, Steve wrote it.)

 As for checking things.  Did you try firing this up in the debugger
 and then stepping over the CopyIn call to find out if it succeeded?
 I'd try that to find out if there is something else wrong before you
 dig through the memory system.

  Nate

Re: [m5-dev] X86_FS vtophys implementation

2010-07-01 Thread Joel Hestness

 So wouldn't a functional table walker basically be the same as an
 atomic-mode one?

I guess I was envisioning a single walker that can handle each type of
access.  Walker::start handles both timing and atomic accesses currently,
but the way it updates state could be trouble for atomic (and functional)
accesses that are interleaved with timing accesses: For example, the state
assertion at the entrance would fail in the (extremely unlikely) corner case
where the MemoryMode was switched from timing to atomic (or functional)
while a timing access was in flight (i.e. state != Ready).  The ability to
interleave timing and functional accesses is going to be necessary
eventually, so sorting it out here would make sense.

 Also, note that in functional mode you don't want to change visible
 system state, so you don't want to update the access bits.

For the readfile problem, I hacked vtophys to just do a lookup in the TLBs,
and it solved the address translation problem (quick solution so I can
continue with further tests).  The TLB lookup function supports accesses
that don't update LRU bits, so I was thinking the same principle could be
applied to the walker state.  (I haven't dug into a lot of the code, so I'm
not sure if that's a common or agreed-upon convention.)
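
The principle of a lookup that can skip replacement-state updates can be
sketched as follows. This is an illustrative toy, not the M5 TLB: the SimpleTlb
class, its members, and the update_lru parameter name are all hypothetical.
The list order stands in for LRU state, and a functional lookup passes
update_lru = false so that it leaves that state untouched.

```cpp
#include <cstdint>
#include <list>

struct TlbEntry {
    std::uint64_t vpn;  // virtual page number
    std::uint64_t pfn;  // physical frame number
};

// Toy fully associative TLB; the front of the list is most recently used.
class SimpleTlb
{
  public:
    void insert(std::uint64_t vpn, std::uint64_t pfn)
    {
        entries.push_front(TlbEntry{vpn, pfn});
    }

    // Returns the matching entry or nullptr. With update_lru == false the
    // replacement order is not modified (suitable for functional accesses).
    const TlbEntry *lookup(std::uint64_t vpn, bool update_lru = true)
    {
        for (auto it = entries.begin(); it != entries.end(); ++it) {
            if (it->vpn == vpn) {
                if (update_lru)
                    entries.splice(entries.begin(), entries, it);
                return &*it;
            }
        }
        return nullptr;
    }

    // Most recently used page number; assumes the TLB is not empty.
    std::uint64_t mostRecentVpn() const { return entries.front().vpn; }

  private:
    std::list<TlbEntry> entries;
};
```

The same flag-based approach could in principle be applied to walker state,
though whether that fits the existing walker design is exactly the open
question in this thread.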

On Thu, Jul 1, 2010 at 11:43 AM, Steve Reinhardt ste...@gmail.com wrote:

 So wouldn't a functional table walker basically be the same as an
 atomic-mode one?  I'd think it's only the timing-mode version that
 really needs all the explicit state.  That is, if you were going to
 have two versions, I'd think you'd have a functional/atomic one and a
 timing one, not a functional one and an atomic/timing one (which is I
 believe what you're advocating, since the current one seems to already
 handle both atomic and timing modes).

 Also, note that in functional mode you don't want to change visible
 system state, so you don't want to update the access bits.  I believe
 that also means it's OK to bypass the TLB as well, right?  (You still
 might want to check the TLB if you think there's a good chance you'll
 get a hit there, but the question is whether it's necessary for
 correctness.)

 Steve

 On Thu, Jul 1, 2010 at 11:19 AM, Gabe Black gbl...@eecs.umich.edu wrote:
  Yeah, I skipped implementing that so far. The reason the table walker is
  the way it is is that it needs to actually cooperate with the memory
  system and do real loads/stores, honor timing, etc. For functional
  accesses you should be able to write a simpler implementation that just
  uses its own functional accesses to read from the page tables in memory
  (and write to update the access bits, etc.). You should be careful,
  though, since the TLB acts like a cache and you'll need to check there
  first and not just always go straight to the in memory tables. There'll
  be a little duplication (which might be factored out into utility
  functions) but the page table walker sim object isn't really the right
  tool for this job.
 
  Gabe
 

[m5-dev] Booting Linux, X86_FS Timing CPU

2010-07-19 Thread Joel Hestness

Hi,
  I am currently experimenting with the timing CPU in X86_FS, and I have
encountered an assertion failure while booting Linux (using Linux boot as a
test):
m5.debug: build/X86_FS/cpu/simple/timing.cc:900: void
TimingSimpleCPU::completeDataAccess(Packet*): Assertion `_status ==
DcacheWaitResponse || _status == DTBWaitResponse' failed.
  I have attached a stack trace (note that completeDataAccess is called
twice in the trace).  The current macro-instruction is a POP_M, and the
current uop is the Cda.
  In timing mode, since the Cda doesn't access memory (the Request::NO_ACCESS
flag is set by Cda), it doesn't wait on a memory access or TLB, so the
status of the CPU before the assertion is _status = Running.  I've tried
adding || _status == Running to the conditional in the assertion, and the
simulation gets past that point, but crashes later.  I'm not sure if this is
a sound fix, or if there is a better way to handle this.
  While browsing the code, I noticed that further up in the call stack,
TimingSimpleCPU::write is called, and when executing this same test using
the atomic CPU, AtomicSimpleCPU::write is called.  In the
AtomicSimpleCPU::write code, there is a special case test for
when the Request::NO_ACCESS flag is set.  I wonder if the same should occur
in TimingSimpleCPU::write?
  Thanks,
  Joel

Starting program: /home/jhestnes/work/m5/build/X86_FS/m5.debug --outdir=$OUTDIR 
./configs/example/fs.py --timing
[Thread debugging using libthread_db enabled]

Program received signal SIGABRT, Aborted.
0x765f2a75 in *__GI_raise (sig=<value optimized out>) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
64  ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
in ../nptl/sysdeps/unix/sysv/linux/raise.c
#0  0x765f2a75 in *__GI_raise (sig=<value optimized out>) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x765f65c0 in *__GI_abort () at abort.c:92
#2  0x765eb941 in *__GI___assert_fail (assertion=0xd5ef38 "_status ==
DcacheWaitResponse || _status == DTBWaitResponse",
file=<value optimized out>, line=900, function=0xd5fca0 "void
TimingSimpleCPU::completeDataAccess(Packet*)") at assert.c:81
#3  0x00486796 in TimingSimpleCPU::completeDataAccess (this=0x1d13830, 
pkt=0x1d1fa90) at build/X86_FS/cpu/simple/timing.cc:900
#4  0x00483f49 in TimingSimpleCPU::sendData (this=0x1d13830, 
req=0x1d1f5b0, data=0x2882b70 , res=0x0, read=false)
at build/X86_FS/cpu/simple/timing.cc:280
#5  0x00484c57 in TimingSimpleCPU::finishTranslation (this=0x1d13830, 
state=0x2878270) at build/X86_FS/cpu/simple/timing.cc:659
#6  0x0048caf9 in DataTranslationTimingSimpleCPU::finish 
(this=0x2882f00, fault=..., req=0x1d1f5b0, tc=0x1d154a0, mode=BaseTLB::Write)
at build/X86_FS/cpu/translation.hh:233
#7  0x0063be1d in X86ISA::TLB::translateTiming (this=0x1d10080, 
req=0x1d1f5b0, tc=0x1d154a0, translation=0x2882f00, mode=BaseTLB::Write)
at build/X86_FS/arch/x86/tlb.cc:721
#8  0x0048badd in TimingSimpleCPU::write<unsigned long>
(this=0x1d13830, data=0, addr=18446744071571259048, flags=524291, res=0x0)
at build/X86_FS/cpu/simple/timing.cc:580
#9  0x00ab143d in X86ISA::LdStOp::write<TimingSimpleCPU, unsigned long>
(this=0x2882cf0, xc=0x1d13830, m...@0x7fffc198,
EA=18446744071571259048, flags=524291) at
build/X86_FS/arch/x86/insts/microldstop.hh:141
#10 0x00aa3bfc in X86ISAInst::Cda::initiateAcc (this=0x2882cf0, 
xc=0x1d13830, traceData=0x0) at 
build/X86_FS/arch/x86/timing_simple_cpu_exec.cc:9199
#11 0x00485e32 in TimingSimpleCPU::completeIfetch (this=0x1d13830, 
pkt=0x0) at build/X86_FS/cpu/simple/timing.cc:770
#12 0x004853ac in TimingSimpleCPU::fetch (this=0x1d13830) at 
build/X86_FS/cpu/simple/timing.cc:690
#13 0x00485696 in TimingSimpleCPU::advanceInst (this=0x1d13830, 
fault=...) at build/X86_FS/cpu/simple/timing.cc:735
#14 0x0048699e in TimingSimpleCPU::completeDataAccess (this=0x1d13830, 
pkt=0x1d1fa90) at build/X86_FS/cpu/simple/timing.cc:932
#15 0x00487097 in TimingSimpleCPU::DcachePort::recvTiming 
(this=0x1d13b10, pkt=0x1d1fa90) at build/X86_FS/cpu/simple/timing.cc:964
#16 0x00487f5e in Port::sendTiming (this=0x1d16760, pkt=0x1d1fa90) at 
build/X86_FS/mem/port.hh:186
#17 0x00507f28 in Bus::recvTiming (this=0x1a4ce00, pkt=0x1d1fa90) at 
build/X86_FS/mem/bus.cc:243
#18 0x00510777 in Bus::BusPort::recvTiming (this=0x1d11650, 
pkt=0x1d1fa90) at build/X86_FS/mem/bus.hh:89
#19 0x00487f5e in Port::sendTiming (this=0x1d16170, pkt=0x1d1fa90) at 
build/X86_FS/mem/port.hh:186
#20 0x0053f7da in SimpleTimingPort::sendDeferredPacket (this=0x1d16170) 
at build/X86_FS/mem/tport.cc:150
#21 0x00540431 in SimpleTimingPort::processSendEvent (this=0x1d16170) 
at build/X86_FS

Re: [m5-dev] Review Request: util/m5/m5.c: in readfile(), added memset to touch all pages - ensure they are in the page table

2010-07-23 Thread Joel Hestness

Hey Gabe,
  Comments are in-lined below.  If you'd like me to resubmit another review
of all or part, just let me know.
  Thanks,
  Joel



 util/m5/Makefile.x86
 http://reviews.m5sim.org/r/64/#comment248

Why is this necessary? Is this so it runs under SE mode? In that case I
 think we should make it run like before as the default since 99% of the time
 this will run in FS, and provide a way to inject -static for the 1% of the
 time it runs in SE.

Compiling it as static all the time wouldn't be the end of the world,
 but it seems like we'd be making universal changes for a very uncommon case.


Building the m5 binary without -static allows it to dynamically link a few
libraries:
  j...@capillary:~/research/m5-new/util/m5$ ldd m5
linux-vdso.so.1 =>  (0x7fff3f9ff000)
libc.so.6 => /lib/libc.so.6 (0x7fb05131f000)
/lib64/ld-linux-x86-64.so.2 (0x7fb05168f000)
When I was putting together a disk image using busybox, it had issues with
library versions.  In general, since the m5 utility isn't performance
critical and just implements simulator magic, I think it would be easiest if
it was always built statically whether for FS or SE.  On the other hand, I would
imagine that it's built very infrequently and only for initial disk image
creation, so perhaps it's not worth changing.





 util/m5/m5ops.h
 http://reviews.m5sim.org/r/64/#comment249

It looks like Ali commandeered that value on line 61. It might have been
 better to use 0x5A for that, but it also might not be safe to change it now
 since there may be binaries out there that use it (probably not too many).
 It would be a little strange, but you could actually use 0x5A for
 reserved1_func. I don't know what restrictions there are in the various ISAs
 for function numbers, but in x86 it's a 16 bit value.


Ah, I didn't see that originally!
The only real trouble right now is that if you try to build the m5 utility
for x86_64 with the current version in the repo, it will fail with an
undefined reference to reserved1_func:
  gcc -O2  -o m5op_x86.o -c m5op_x86.S
  gcc -o m5 m5.o m5op_x86.o
  m5op_x86.o: In function `m5_reserved1_func':
  (.text+0x5c): undefined reference to `reserved1_func'
  collect2: ld returned 1 exit status
  make: *** [m5] Error 1
It looks like neither m5op_alpha.S nor m5op_sparc.S uses reserved1_func, so
another solution would be to remove it from m5op_x86.S (eliminate it
completely from the m5 utility codebase).



 - Gabe





[m5-dev] Review Request: SIMPLE TIMING: when a request is NO_ACCESS (x86 CDA microinstruction), TimingSimpleCPU::completeDataAccess must still complete

2010-07-27 Thread Joel Hestness


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/65/
---

Review request for Default.


Summary
---

SIMPLE TIMING: when a request is NO_ACCESS (x86 CDA microinstruction), 
TimingSimpleCPU::completeDataAccess must still complete
./cpu/simple/timing.cc: fix for x86 CDA microop
 - since CDA doesn't read or update memory, completeDataAccess needs to handle 
the case where the current status of the CPU is _status = Running caused by a 
request NO_ACCESS

This change is RE: Booting Linux, X86_FS Timing CPU 
(http://www.mail-archive.com/m5-dev@m5sim.org/msg07290.html)
Gabe Black:
The assert is, as you said, from NO_ACCESS skipping the call out to the
memory system and going right to the code that finishes off execution of
that instruction, surprising that code by never having left the Running
state. Under any other circumstance, though, the CPU shouldn't be in the
Running state, and if we just added that to the assert we wouldn't catch
those bugs. What I think would be a better fix is to move the assert
(but not the assignment to _status) up above the code that aggregates
the components of a split packet and add
pkt->req->getFlags().isSet(Request::NO_ACCESS) or something similar to
the or. This isn't perfect because it asserts every time the function is
called and not just once all the fragments (should be only two) are
gathered, but it's safer and the overhead should be minimal.

This change seems to have fixed the problem for X86_FS.  Since no other 
architectures use the request NO_ACCESS flag, it is unlikely they will be 
impacted, though they still need to be tested.
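
The shape of the suggested check can be sketched in isolation. This is a
simplified stand-in, not the code in timing.cc: the Status enum, the NO_ACCESS
bit value, and both function names are hypothetical, and the real function
takes a packet rather than raw flags. The point is only that Running is
accepted when the request carries NO_ACCESS, and is still rejected otherwise.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the CPU status and the request flag.
enum class Status { Running, DcacheWaitResponse, DTBWaitResponse };
constexpr std::uint32_t NO_ACCESS = 0x1;  // assumed bit value, illustration only

// True when it is legitimate to be completing a data access: either the CPU
// was waiting on the dcache or DTB, or the request never went to memory.
bool
completionStateIsValid(Status status, std::uint32_t req_flags)
{
    return status == Status::DcacheWaitResponse ||
           status == Status::DTBWaitResponse ||
           (req_flags & NO_ACCESS) != 0;
}

// Simplified completion step: check the state first, then return to Running.
void
completeDataAccess(Status &status, std::uint32_t req_flags)
{
    assert(completionStateIsValid(status, req_flags));
    status = Status::Running;
}
```

Keeping the NO_ACCESS test inside the assertion, rather than adding Running
unconditionally, preserves the check for every request that does go to memory.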


Diffs
-

  src/cpu/simple/timing.cc a75564db03c3 

Diff: http://reviews.m5sim.org/r/65/diff


Testing
---


Thanks,

Joel


[m5-dev] Review Request: TimingCPU: REPOST: Request::NO_ACCESS bypass in completeDataAccess

2010-07-28 Thread Joel Hestness


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/66/
---

Review request for Default.


Summary
---

TimingCPU: REPOST: Request::NO_ACCESS bypass in completeDataAccess
./cpu/simple/timing.cc: fix for x86 CDA microop
 - since CDA doesn't read or update memory, completeDataAccess needs to
   handle the case where the current status of the CPU is _status = Running
   caused by a request NO_ACCESS

Discarded previous review request (SIMPLE TIMING: when a request is NO_ACCESS 
(x86 CDA microinstruction), TimingSimpleCPU::completeDataAccess must still 
complete)


Diffs
-

  src/cpu/simple/timing.cc a75564db03c3 

Diff: http://reviews.m5sim.org/r/66/diff


Testing
---


Thanks,

Joel


Re: [m5-dev] Review Request: util/m5/m5.c: in readfile(), added memset to touch all pages - ensure they are in the page table

2010-07-29 Thread Joel Hestness

So, it appears that the only change that we agree on for now is the change
to m5.c.  Should I submit that change as its own patch and withdraw this
one?
  Thanks,
  Joel

On Fri, Jul 23, 2010 at 3:45 PM, Gabriel Michael Black 
gbl...@eecs.umich.edu wrote:

 Quoting Ali Saidi sa...@umich.edu:


 On Fri, 23 Jul 2010 16:59:08 -0400, Gabriel Michael Black
 gbl...@eecs.umich.edu wrote:

  Hmm, maybe we should be building these regularly too... What do you
 think, Ali? Would it be possible to return reserved1_func and use a
 different code?

 It was reserved for me while I was doing the bottleneck analysis work and
 didn't want anyone to grab that ID. Once I pushed all of the bottleneck
 analysis changes, I changed reserved into the actual cp_annotate
 operations. So, everything worked as intended.

 reserved1_func shouldn't be used anywhere and shouldn't be added back to
 the file.

 Ali




 I don't understand how that made it reserved. Wouldn't anyone else be able
 to do the same thing you did but with some conflicting use? The comment next
 to those says "Reserved for user", but it's not if it ends up being assigned
 an official use. Why would we want to have reserved2_func but not
 reserved1_func?

 Gabe





[m5-dev] Checkpointing x86

2010-08-03 Thread Joel Hestness

Hi,
  This question is probably for Gabe:
  I'm currently implementing checkpointing for x86, and I have run into a
question about inheritance with a couple x86-specific devices.
 src/dev/x86/i8042.hh defines a PS2Device, which doesn't inherit from
anything, but it looks like the PS2Keyboard and PS2Mouse have state that
might need to be checkpointed (e.g. mouse status in the case that Linux
enables/disables it).
  Should PS2Device descend from SimObject?  (if so, through a particular
subclass of SimObject?)
  Thanks,
  Joel



Re: [m5-dev] Review Request: TimingCPU: REPOST: Request::NO_ACCESS bypass in completeDataAccess

2010-08-09 Thread Joel Hestness

Is there a way for me to ship this, or does someone else need to push it
to the repo?
  Thanks,
  Joel


On Thu, Jul 29, 2010 at 8:21 AM, Steve Reinhardt ste...@gmail.com wrote:


 ---
 This is an automatically generated e-mail. To reply, visit:
 http://reviews.m5sim.org/r/66/#review111
 ---

 Ship it!


 - Steve



[m5-dev] Review Request: M5 utility: remove reserve1_func to build for x86

2010-08-09 Thread Joel Hestness


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/120/
---

Review request for Default.


Summary
---

./util/m5/m5op_x86.S: To get the m5 utility to build for x86, remove the 
reserved1_func link.


Diffs
-

  util/m5/m5op_x86.S a75564db03c3 

Diff: http://reviews.m5sim.org/r/120/diff


Testing
---

The M5 utility currently does not build for x86 because the reserved1_func was 
previously removed from m5ops.h by Ali Saidi.  This patch fixes the build 
problem by removing the reference to reserved1_func in m5op_x86.S.


Thanks,

Joel


Re: [m5-dev] Review Request: M5 utility: Touch all pages in readfile buffer

2010-08-09 Thread Joel Hestness

I have also tested using this loop.  For a reasonably large file being read,
using the loop executes 300k fewer simulated instructions (~6% of the
readfile execution) than using memset.  On the other hand, the simulation
time of the readfile call using memset was actually 0.4s quicker (~5%) than
using the loop.  The simulated system still needs to do a pagetable walk and
mapping for each page regardless of the implementation.  So, my conclusion
was that it doesn't make a perceptible difference in performance, and the
use of memset abstracts away from possible problems with strided accesses
using a static page size.

  Joel
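
For reference, the two page-touching strategies compared above can be sketched
side by side. The function names are hypothetical and this is not the code in
util/m5/m5.c. The strided version is only correct if the stride is no larger
than the smallest page size in use, which is the ISA dependence that memset
avoids.

```cpp
#include <cstddef>
#include <cstring>

// Touch every byte of the buffer. Independent of page size, at the cost of
// writing the whole buffer.
void
touchWithMemset(char *buf, std::size_t len)
{
    std::memset(buf, 0, len);
}

// Touch one byte every `stride` bytes and return the number of writes.
// Assumes stride > 0 and stride <= the smallest page size, so that every
// page in the buffer receives at least one write.
std::size_t
touchWithStride(char *buf, std::size_t len, std::size_t stride)
{
    std::size_t writes = 0;
    for (std::size_t i = 0; i < len; i += stride) {
        buf[i] = 0;
        ++writes;
    }
    return writes;
}
```

Either variant forces the guest kernel to fault in and map each page before
the simulator copies file data into the buffer.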


On Mon, Aug 9, 2010 at 12:50 PM, Ali Saidi sa...@umich.edu wrote:



  On 2010-08-09 12:13:59, Nathan Binkert wrote:
  

 How is this not a possible issue for every isa? We're talking about
 touching 256kB of data. That shouldn't take very long. We've beaten this to
 death, it's time to just call it done and move on. If we're that concerned
 just do:
 for (int x = 0; x < sizeof(buf); x += 512)
     buf[x] = 0;

 It improves the speed by three orders of magnitude, doesn't require an
 ifdef and will work on everything that doesn't have an unbelievably small
 page size.


 - Ali


 ---
 This is an automatically generated e-mail. To reply, visit:
 http://reviews.m5sim.org/r/121/#review154
 ---


 On 2010-08-09 10:35:49, Joel Hestness wrote:
 
  ---
  This is an automatically generated e-mail. To reply, visit:
  http://reviews.m5sim.org/r/121/
  ---
 
  (Updated 2010-08-09 10:35:49)
 
 
  Review request for Default.
 
 
  Summary
  ---
 
  util/m5/m5.c: in readfile(), added memset to touch all pages - ensure
 they are in the page table
 
  This problem is caused by Linux demand paging.  If the pages are not yet
 mapped in the page table, the M5 utility does not know the physical memory
 address in the simulated system to which it is sending the file read from
 the host machine.
 
 
  Diffs
  -
 
util/m5/m5.c a75564db03c3
 
  Diff: http://reviews.m5sim.org/r/121/diff
 
 
  Testing
  ---
 
  This fixes the functionality for x86, where the problem was first
 encountered.  I have also tested the utility for Alpha.  The simulated
 system executes approximately 10% more instructions during the readfile
 operation due to the memset, but the simulation time required for this is
 still marginal.  Using memset provides an ISA independent solution compared
 to buffer accesses that use a page-sized stride.
 
 
  Thanks,
 
  Joel
 
 




-- 
  Joel Hestness
  PhD Student, Computer Architecture
  Dept. of Computer Science, University of Texas - Austin
  http://www.cs.utexas.edu/~hestness
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

[m5-dev] IntDev and intmessage question

2010-08-10 Thread Joel Hestness

Hi,
  I'm looking at the interrupt device interface (dev/x86/IntDev.hh) and
intmessage (arch/x86/intmessage.hh) code, and I have a question about
scoping.
  Currently, the methods defined in intmessage.hh are only used by methods
in the IntDev interface class.  I also notice that the only places where
MemCmd::MessageReq is referenced elsewhere in the code are in MemPort, from
which the IntDev::IntDevPort descends, and in arch/x86/interrupts.hh, which
descends from IntDev.  Is there any reason why these intmessage methods are
scoped to X86ISA, or could they be moved under the IntDev class?
  Thanks,
  Joel


-- 
  Joel Hestness
  PhD Student, Computer Architecture
  Dept. of Computer Science, University of Texas - Austin
  http://www.cs.utexas.edu/~hestness
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

Re: [m5-dev] IntDev and intmessage question

2010-08-10 Thread Joel Hestness

*MessagePort (mem/mport.hh) not MemPort... sorry for any confusion
  Joel

On Tue, Aug 10, 2010 at 3:46 PM, Joel Hestness hestn...@cs.utexas.edu wrote:

 Hi,
   I'm looking at the interrupt device interface (dev/x86/IntDev.hh) and
 intmessage (arch/x86/intmessage.hh) code, and I have a question about
 scoping.
   Currently, the methods defined in intmessage.hh are only used by methods
 in the IntDev interface class.  I also notice that the only places where
 MemCmd::MessageReq is referenced elsewhere in the code are in MemPort, from
 which the IntDev::IntDevPort descends, and in arch/x86/interrupts.hh, which
 descends from IntDev.  Is there any reason why these intmessage methods are
 scoped to X86ISA, or could they be moved under the IntDev class?
   Thanks,
   Joel


 --
   Joel Hestness
   PhD Student, Computer Architecture
   Dept. of Computer Science, University of Texas - Austin
   http://www.cs.utexas.edu/~hestness


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

Re: [m5-dev] Regression tests for X86

2010-08-11 Thread Joel Hestness

Hi Dibakar,
  I recently built a Linux kernel for M5 X86_FS that was based on v2.6.28.4,
but modified and compiled specifically for M5.  I have patches for my
changes, but we're still in the debugging phases of bringing up X86_FS with
multicore support, so I'm not confident enough to send them around just yet.
 Aside from building a Linux kernel, you will need to build and configure a
disk image as well, which is also a fair amount of work.  I've found that,
unfortunately due to the long simulation time of Linux boot up, the
iteration time to debug the X86_FS bootup is quite long.
  I think it would be useful if you could describe what you would like to do
with X86_FS, and I can maybe give you some direction on how to get there or
whether it makes sense to wait for updates to M5.
  Thanks,
  Joel


On Sat, Aug 7, 2010 at 12:30 PM, dibakar gope dibakar...@gmail.com wrote:

 Hi All,

 I have a few queries regarding the regression tests for X86.

 (1) I could build the x86 in FS mode for AtomicSimpleCPU, O3CPU and
 SimpleTimingCPU mode (I am using a bunch of x86-specific patches from
 http://www.csl.cornell.edu/~vince/projects/m5/m5_x86_64_se_status.html).
 I guess that the pre-compiled linux kernel (that can be downloaded
 from M5 site) was compiled for alpha arch only. So I actually
 downloaded the linux-dist tarball from M5 site for x86 build. This
 tarball has a .config.M5 that can be used for compiling the kernel,
 but that .config.M5 is ALPHA-specific.

 So in order to compile the linux for x86, I used the config of my
 native linux machine kernel as a basis for our x86 config kernel and
 got the vmlinux for x86. Following commands are used for that:-

 cp /boot/config ./.config
 make menuconfig
 make-kpkg clean
 fakeroot make-kpkg --initrd --append-to-version=-v2.6.27 kernel_image
 kernel_headers

 I used that vmlinux for X86_FS build and did not get any error during
 the build process.

 So my query is, are there any x86-specific patches (configurations)
 that I should have considered for compiling the linux kernel for x86.


 (2)Then I tried to test that X86_FS m5.opt using regression tests. All
 the several test programs present for the regression tests have
 config.ini files only for alpha in m5-dev tarball, but they don't have
 the same for X86. But using the following command, I can generate
 those x86-specific config-ini for the test-programs used in FS mode
 regression.

 build/X86_FS/m5.opt -re configs/example/fs.py
 --cmd=tests/test-progs/<test program name>/bin/x86/<test program binary>

 But the problem is that the m5-dev tarball (m5/tests/test-progs/*)
 does not have the test program binaries  (m5/tests/quick/*) (for
 example, 10.linux-boot,80.netperf-stream,50.memtest etc) except hello
 (which is not used for FS mode regression). So I could not generate
 the config.ini for x86 in order to run the regression tests.

 So my query is, has anyone worked on the X86 regression tests / faced
 the same problem? Before I use the x86_FS.opt for SPEC2000/2006
 benchmarks, I want that to pass the regression tests first.


 Thanks and Regards,

 Dibakar Gope
 Texas A&M University
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev




-- 
  Joel Hestness
  PhD Student, Computer Architecture
  Dept. of Computer Science, University of Texas - Austin
  http://www.cs.utexas.edu/~hestness
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

[m5-dev] changeset in m5: TimingSimpleCPU: fix NO_ACCESS memory op handling

2010-08-12 Thread Joel Hestness

changeset cfbbc9178e7a in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=cfbbc9178e7a
description:
TimingSimpleCPU: fix NO_ACCESS memory op handling

When a request is NO_ACCESS (x86 CDA microinstruction), the memory op
doesn't go to the cache, so TimingSimpleCPU::completeDataAccess needs
to handle the case where the current status of the CPU is Running
and not DcacheWaitResponse or DTBWaitResponse

diffstat:

 src/cpu/simple/timing.cc |  3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diffs (20 lines):

diff -r 82453f1b46c5 -r cfbbc9178e7a src/cpu/simple/timing.cc
--- a/src/cpu/simple/timing.cc  Sun Aug 08 22:57:16 2010 -0700
+++ b/src/cpu/simple/timing.cc  Thu Aug 12 17:16:02 2010 -0700
@@ -868,6 +868,8 @@
 // received a response from the dcache: complete the load or store
 // instruction
 assert(!pkt->isError());
+assert(_status == DcacheWaitResponse || _status == DTBWaitResponse ||
+   pkt->req->getFlags().isSet(Request::NO_ACCESS));
 
 numCycles += tickToCycles(curTick - previousTick);
 previousTick = curTick;
@@ -897,7 +899,6 @@
 }
 }
 
-assert(_status == DcacheWaitResponse || _status == DTBWaitResponse);
 _status = Running;
 
 Fault fault = curStaticInst->completeAcc(pkt, this, traceData);
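
For readers less familiar with the C++ above, the relaxed check can be restated as a small predicate. This is only an illustrative Python model of the condition in the diff; the enum and function names are made up for the example and are not M5 identifiers:

```python
from enum import Enum, Flag


class Status(Enum):
    RUNNING = 1
    DCACHE_WAIT_RESPONSE = 2
    DTB_WAIT_RESPONSE = 3


class ReqFlags(Flag):
    NONE = 0
    NO_ACCESS = 1


def status_ok_for_completion(status, flags):
    """Waiting states are always acceptable; any other state is acceptable
    only when the request was marked NO_ACCESS and never went to the cache."""
    waiting = status in (Status.DCACHE_WAIT_RESPONSE, Status.DTB_WAIT_RESPONSE)
    return waiting or bool(flags & ReqFlags.NO_ACCESS)
```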
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

[m5-dev] changeset in m5: util/m5/m5.c: ensure readfile() buffer pages ar...

2010-08-12 Thread Joel Hestness

changeset b69cc0fd934d in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=b69cc0fd934d
description:
util/m5/m5.c: ensure readfile() buffer pages are in page table
(and marked dirty, in case that matters) by touching them beforehand

diffstat:

 util/m5/m5.c |  5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diffs (15 lines):

diff -r cfbbc9178e7a -r b69cc0fd934d util/m5/m5.c
--- a/util/m5/m5.c  Thu Aug 12 17:16:02 2010 -0700
+++ b/util/m5/m5.c  Thu Aug 12 17:16:04 2010 -0700
@@ -65,6 +65,11 @@
 int offset = 0;
 int len;
 
+// Touch all buffer pages to ensure they are mapped in the
+// page table. This is required in the case of X86_FS, where
+// Linux does demand paging.
+memset(buf, 0, sizeof(buf));
+
 while ((len = m5_readfile(buf, sizeof(buf), offset)) > 0) {
 write(dest_fid, buf, len);
 offset += len;
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

[m5-dev] TimingSimpleCPU, x86: sendSplitData packet sender states

2010-08-17 Thread Joel Hestness

Hi,
  I am currently looking at the sendSplitData function in TimingSimpleCPU
(cpu/simple/timing.cc:~307), and I'm encountering a problem with the packet
sender states when running with Ruby.  After the call to buildSplitPacket,
pkt1 and pkt2 have senderState type SplitFragmentSenderState.  However, with
Ruby enabled, the call to handleReadPacket sends the packet to a RubyPort,
and in RubyPort::M5Port::recvTiming (mem/ruby/system/RubyPort.cc:~173), a
new senderState is pushed into the packet that has type SenderState (note
that the old senderState is saved in the new senderState. After the packet
transfer, Ruby restores the old senderState).  When the stack unwinds back
to sendSplitData, the dynamic_cast after handleReadPacket fails because of
the type difference.
  It looks like the senderState variable is used elsewhere as a stack to
store data while the packet traverses from source to destination and on the
way back as a response, which makes sense.  I'm wondering why the
clearFromParent call needs to happen in sendSplitData, since it seems like
it should happen in completeDataAccess when cleaning up the packets.
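
To make the stack behaviour concrete, here is a toy Python model of a packet whose sender state is pushed and popped along the way. It is a sketch only: the class names (in particular RubySenderState) and the push/pop helpers are assumptions for illustration, not the M5 or Ruby API.

```python
class SenderState:
    """Base sender state; `saved` links to the state that was on top before."""
    def __init__(self):
        self.saved = None


class SplitFragmentSenderState(SenderState):
    pass


class RubySenderState(SenderState):  # hypothetical name for the state the port pushes
    pass


class Packet:
    def __init__(self):
        self.sender_state = None

    def push(self, state):
        state.saved = self.sender_state
        self.sender_state = state

    def pop(self):
        top = self.sender_state
        self.sender_state = top.saved
        return top


pkt = Packet()
pkt.push(SplitFragmentSenderState())  # done by the CPU before sending
pkt.push(RubySenderState())           # done by the receiving port

# While the packet is in flight the top of the stack is not the CPU's type,
# so a type check here (the analogue of the failing dynamic_cast) fails.
in_flight_ok = isinstance(pkt.sender_state, SplitFragmentSenderState)

pkt.pop()                             # the port restores the old state later
restored_ok = isinstance(pkt.sender_state, SplitFragmentSenderState)
```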
  Thanks,
  Joel

PS.  In sendSplitData after handleReadPacket(pkt2), it looks like there is a
bug with the dynamic_cast and clearFromParent since the cast is called on
pkt1->senderState.  This doesn't affect correctness, but it does leave
references that affect deletion of the packets.  Is that correct?

-- 
  Joel Hestness
  PhD Student, Computer Architecture
  Dept. of Computer Science, University of Texas - Austin
  http://www.cs.utexas.edu/~hestness
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

Re: [m5-dev] TimingSimpleCPU, x86: sendSplitData packet sender states

2010-08-17 Thread Joel Hestness

I just realized that the clearFromParent call is used for tracking which of
the packets have successfully sent, so that if the send port is busy, it can
retry them when a recvRetry is received later.  It appears that maybe a
better solution to this is to hold a pointer on the stack in sendSplitData
to the senderState that may eventually call clearFromParent rather than
trying to get the senderState back out after the call to handleReadPacket.
  Does that sound reasonable?
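
A rough sketch of that idea, written as a Python toy rather than the actual C++ (FragmentState, send and send_fragment are hypothetical names, and the receiver replacing the state is only a stand-in for what a port might do):

```python
class FragmentState:
    """Stand-in for the split-fragment sender state."""
    def __init__(self):
        self.cleared = False

    def clear_from_parent(self):
        self.cleared = True


class Packet:
    def __init__(self, state):
        self.sender_state = state


def send(pkt, port_busy):
    """Pretend to send; on success the receiver may replace pkt.sender_state."""
    if port_busy:
        return False
    pkt.sender_state = object()  # receiver pushed its own state on top
    return True


def send_fragment(pkt, port_busy=False):
    state = pkt.sender_state       # keep our own reference before sending
    if send(pkt, port_busy):
        state.clear_from_parent()  # no need to read pkt.sender_state back
    return state
```

If the port is busy, the fragment stays attached to its parent so it can be retried later.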
  Thanks,
  Joel

On Tue, Aug 17, 2010 at 3:11 PM, Joel Hestness hestn...@cs.utexas.edu wrote:

 Hi,
   I am currently looking at the sendSplitData function in TimingSimpleCPU
 (cpu/simple/timing.cc:~307), and I'm encountering a problem with the packet
 sender states when running with Ruby.  After the call to buildSplitPacket,
 pkt1 and pkt2 have senderState type SplitFragmentSenderState.  However, with
 Ruby enabled, the call to handleReadPacket sends the packet to a RubyPort,
 and in RubyPort::M5Port::recvTiming (mem/ruby/system/RubyPort.cc:~173), a
 new senderState is pushed into the packet that has type SenderState (note
 that the old senderState is saved in the new senderState. After the packet
 transfer, Ruby restores the old senderState).  When the stack unwinds back
 to sendSplitData, the dynamic_cast after handleReadPacket fails because of
 the type difference.
   It looks like the senderState variable is used elsewhere as a stack to
 store data while the packet traverses from source to destination and on the
 way back as a response, which makes sense.  I'm wondering why the
 clearFromParent call needs to happen in sendSplitData, since it seems like
 it should happen in completeDataAccess when cleaning up the packets.
   Thanks,
   Joel

 PS.  In sendSplitData after handleReadPacket(pkt2), it looks like there is
 a bug with the dynamic_cast and clearFromParent since the cast is called on
 pkt1->senderState.  This doesn't affect correctness, but it does leave
 references that affect deletion of the packets.  Is that correct?

 --
   Joel Hestness
   PhD Student, Computer Architecture
   Dept. of Computer Science, University of Texas - Austin
   http://www.cs.utexas.edu/~hestness




-- 
  Joel Hestness
  PhD Student, Computer Architecture
  Dept. of Computer Science, University of Texas - Austin
  http://www.cs.utexas.edu/~hestness
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

[m5-dev] TimingSimpleCPU, x86: sendSplitData + TLB miss

2010-08-18 Thread Joel Hestness

Hi,
  I am currently running a benchmark in X86_FS timing mode (single or
multicore) that crashes due to the page table walker.  On a data write (or
read) instruction that causes TimingSimpleCPU::write to split the TLB access
into two accesses (cpu/simple/timing.cc:~560), if the first TLB access
misses, it causes the page table walker to start a walk and its state =
Waiting.  Since the second access happens immediately
in TimingSimpleCPU::write, if the second request also misses, it causes
another walk that fails the (state == Ready) assertion in
X86ISA::Walker::start (arch/x86/pagetable_walker.cc:~316).
  Seems this is a corner case of a corner case, namely, an unaligned (split)
data access, whose split TLB accesses both miss.  It doesn't look like there
is any code to handle the situation yet, and I'm hoping to get some guidance
on how to address it.
  It seems to me that since this only happens on a TLB miss, that the TLB or
walker should be able to handle the multiple requests.  I see that in the
ARM code, the page table walker has a queue of walks that are currently in
flight (I'm having trouble convincing myself that the queues can't conflict
when multiple walks are in flight :\).  Would it make sense to have similar
state queuing in the x86 page table walker?
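
As a sketch of what such queuing could look like (a Python toy, not the ARM or x86 walker code; the class and method names are assumptions), a walker can accept a second request while busy and service requests in arrival order:

```python
from collections import deque


class QueuedWalker:
    """Toy walker that queues requests instead of asserting that it is idle."""

    def __init__(self):
        self.pending = deque()
        self.current = None   # the walk in progress, if any
        self.completed = []

    def start(self, vaddr):
        self.pending.append(vaddr)
        if self.current is None:
            self._begin_next()

    def _begin_next(self):
        self.current = self.pending.popleft() if self.pending else None

    def finish_current(self):
        """Called once the memory responses for the current walk have arrived."""
        self.completed.append(self.current)
        self._begin_next()
```

With this structure, two back-to-back misses from a split access simply leave one entry waiting in the queue.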
  Thanks,
  Joel

-- 
  Joel Hestness
  PhD Student, Computer Architecture
  Dept. of Computer Science, University of Texas - Austin
  http://www.cs.utexas.edu/~hestness
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

[m5-dev] Unable to checkpoint restore into detailed/timing CPU

2010-09-14 Thread Joel Hestness

Hi,
  I'm working on something else right now, and I might not have a chance to
dig into this for a while, so I figured I would post to the list:

  I just updated to the most recent repo, and ALPHA checkpoint restore into
timing-enabled CPUs doesn't appear to be working:

   1) Checkpoint by running this and use m5 utility to checkpoint from
command line:
   % ./build/ALPHA_FS/m5.debug configs/example/fs.py --num-cpus=4
   2) Try to restore from checkpoint:
   % ./build/ALPHA_FS/m5.debug --outdir=$OUTDIR configs/example/fs.py
--timing --caches --l2cache --num-cpus=4 -r 1
 OR:
   % ./build/ALPHA_FS/m5.debug --outdir=$OUTDIR configs/example/fs.py
--detailed --caches --l2cache --num-cpus=4 -r 1

---OUTPUT
M5 Simulator System

Copyright (c) 2001-2008
The Regents of The University of Michigan
All Rights Reserved


M5 compiled Sep 14 2010 16:11:04
M5 revision 37c56be05af0 7682 default tip
M5 started Sep 14 2010 17:12:12
M5 executing on RADLAB-0002
command line: ./build/ALPHA_FS/m5.debug
--outdir=/home/jhestnes/work/m5/m5out/2010-09-14_mcpat_o3_test-0
configs/example/fs.py --timing --caches --l2cache --num-cpus=4 -r 1
Script to execute:
Global frequency set at 1 ticks per second
info: kernel located at:
/home/jhestnes/work/disk-images/binaries/vmlinux_2.6.27-gcc_4.3.4_test64
Listening for system connection on port 3456
  0: system.tsunami.io.rtc: Real-time clock set to Thu Jan  1 00:00:00
2009
0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001
0: system.remote_gdb.listener: listening for remote gdb #2 on port 7002
0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003
Switch at curTick count:1
info: Entering event queue @ 5473006271000.  Starting simulation...
Switched CPUS @ cycle = 5473006281000
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/jhestnes/work/public-m5/src/python/m5/main.py", line 359, in main
    exec filecode in scope
  File "configs/example/fs.py", line 192, in <module>
    Simulation.run(options, root, test_sys, FutureClass)
  File "/home/jhestnes/work/public-m5/configs/common/Simulation.py", line 257, in run
    m5.changeToTiming(testsys)
  File "/home/jhestnes/work/public-m5/src/python/m5/simulate.py", line 188, in changeToTiming
    if system.getMemoryMode() != objects.params.timing:
AttributeError: 'module' object has no attribute 'timing'
---OUTPUT

  Thanks,
  Joel

-- 
  Joel Hestness
  PhD Student, Computer Architecture
  Dept. of Computer Science, University of Texas - Austin
  http://www.cs.utexas.edu/~hestness
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

Re: [m5-dev] Unable to checkpoint restore into detailed/timing CPU

2010-09-16 Thread Joel Hestness

Hi guys,
  I am using swig 1.3.40, so per Nate's note in the last changeset, I don't
think that should be an issue.

@Steve: Thanks for the pointer to 'bisect'.  Also, in debugging this
problem, I ran into the same MC146818 checkpointing/drain bug from before.
 Previously, I had written a patch to fix the problem.  However, I just
talked to Brad about it, and he mentioned that you were thinking about
backing out that changeset (7559).  Based on what I see in the code, it's
unclear whether my patch does the right thing and I think the previous code
is certainly more correct.  I backed out that changeset locally, and it
fixed the MC146818 problem, so my recommendation would be to back out that
changeset in the repo.  If we're in agreement, should I submit that patch
for review?

So, back to the bug at hand: After a fair amount of testing, it looks like
the checkpoint restore problem was introduced somewhere between changeset
7674 and 7678.  (it would probably be another 30-40 minutes of testing for
me to identify exactly)
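
The remaining search is a plain bisection over revision numbers, which is the idea behind the 'bisect' pointer. A minimal sketch, assuming revisions are consecutive integers, the lower bound is known good, the upper bound is known bad, and every revision after the first bad one also fails (the is_bad predicate is hypothetical and would be a build plus a restore test in practice):

```python
def first_bad(good, bad, is_bad):
    """Return the first revision in (good, bad] for which is_bad(rev) is true."""
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid
        else:
            good = mid
    return bad
```

For a range of four candidate revisions this needs at most two test runs.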

@Nate: I tried changing the module to 'internal', but it gave the same
error.  Is it just about getting the correct imports?

  Thanks,
  Joel


On Wed, Sep 15, 2010 at 5:55 PM, nathan binkert n...@binkert.org wrote:

  if system.getMemoryMode() != objects.params.timing:
  AttributeError: 'module' object has no attribute 'timing'

 I'm pretty sure that this is my fault.  Can you change it to
 internal.params.timing and let me know if it works?

 Thanks,

  Nate
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev

 --
   Joel Hestness
   PhD Student, Computer Architecture
   Dept. of Computer Science, University of Texas - Austin
   http://www.cs.utexas.edu/~hestness

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

Re: [m5-dev] Unable to checkpoint restore into detailed/timing CPU

2010-09-17 Thread Joel Hestness

 are related?

  Thanks,
  Joel

-- 
  Joel Hestness
  PhD Student, Computer Architecture
  Dept. of Computer Science, University of Texas - Austin
  http://www.cs.utexas.edu/~hestness
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

Re: [m5-dev] Unable to checkpoint restore into detailed/timing CPU

2010-09-21 Thread Joel Hestness


 I would guess that the issues aren't related, but it's always difficult to
 be certain. Have you tried running it with valgrind?

 Valgrind shows an initialization error and what appears to be an
interesting memcpy bug in packet handling in the cache tags, but it doesn't
look like either is the cause of the seg fault (see attached).

  Joel

-- 
  Joel Hestness
  PhD Student, Computer Architecture
  Dept. of Computer Science, University of Texas - Austin
  http://www.cs.utexas.edu/~hestness


valgrind.out
Description: Binary data
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

[m5-dev] Statistics Output Conventions

2010-11-04 Thread Joel Hestness

Hi,
  I'm currently trying to leverage a Python script and McPAT to consume M5
statistics (stats.txt) and calculate power estimates for a simulated system.
 Stats variables in the SimObjects are lowerCamelCase according to the
coding style, but it looks like the names output to the stats.txt file are
mixed, either lowerCamelCase or lower_case_with_underscores.  I'm wondering
if there is a convention for statistic names that are output to the
stats.txt that I (we) can be aiming for.
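
In the meantime, a consuming script can normalize names so that both styles compare equal. A minimal sketch (the function name is made up, and the choice of lower_case_with_underscores as the canonical form is arbitrary, not an M5 convention):

```python
import re


def to_snake_case(name):
    """Convert each dot-separated component from lowerCamelCase to snake_case."""
    def convert(part):
        # Insert an underscore before a capital that follows a lowercase letter or digit.
        return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", part).lower()
    return ".".join(convert(part) for part in name.split("."))
```

Names that already use underscores pass through unchanged, so the script can look up either form with one key.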
  Thanks,
  Joel

-- 
  Joel Hestness
  PhD Student, Computer Architecture
  Dept. of Computer Science, University of Texas - Austin
  http://www.cs.utexas.edu/~hestness
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

Re: [m5-dev] Review Request: IntDev: latency fix

2011-01-07 Thread Joel Hestness



 On 2011-01-07 04:34:28, Gabe Black wrote:
  See review of the earlier IntDev patch. Basically this is displacing the 
  latency value from the base class that uses it into the subclass that gets 
  it from the config. I don't think it's necessary as described previously, 
  but also that decentralizes a value that's always used in the same place 
  for the same purpose.

**Note that this patch removes the latency member from IntPort.**  This patch 
doesn't indicate where the latency member should end up (I'll comment on that 
in the other review request).  Regardless of where the latency is handled, the 
rest of the codebase indicates that a port should not be responsible for 
assessing latency (see mem/port.*, mem/tport.* and mem/mport.*), so this is why 
I removed latency from the IntPort definition.


- Joel


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/384/#review641
---


On 2011-01-06 15:57:01, Brad Beckmann wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 http://reviews.m5sim.org/r/384/
 ---
 
 (Updated 2011-01-06 15:57:01)
 
 
 Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and 
 Nathan Binkert.
 
 
 Summary
 ---
 
 IntDev: latency fix
 
 Since the device should be responsible for latency of packets, remove the
 latency field of the IntPort completely.
 
 
 Diffs
 -
 
   src/dev/x86/intdev.hh 9f9e10967912 
 
 Diff: http://reviews.m5sim.org/r/384/diff
 
 
 Testing
 ---
 
 
 Thanks,
 
 Brad
 


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

Re: [m5-dev] Review Request: MessagePort: implemented virtual recvTiming avoiding double delete

2011-01-07 Thread Joel Hestness



 On 2011-01-07 04:21:05, Gabe Black wrote:
  I think there are two problems with this patch. First, if at all possible 
  we should avoid the code duplication we'd now have for the recvTiming 
  function. Second, while this probably does fix the legitimate problem of 
  deleting packets twice, I think it creates a memory leak in the process. I 
  suspect if you leave your other changes in place but get rid of your custom 
  recvTiming function, things will still work. The packet won't be deleted by 
  the device, won't be deleted after being received as a request in either 
  atomic or timing mode, but will be deleted in both modes after being 
  received as a response. The virtual you added in tport.hh could almost 
  certainly go away then too.
 
 Brad Beckmann wrote:
 Joel is the one who actually wrote this patch, so hopefully he can 
 elaborate on the possible memory leak.  I'll hold off on this patch until 
 he can respond.

Actually, the double delete problem still exists if we removed the (almost) 
replicated recvTiming code.  This is because pkt->needsResponse() returns false 
when the message type is MemCmd::MessageResp, which causes execution of the 
needsResponse else clause in SimpleTimingPort::recvTiming.  It would be freed 
there, as well as in recvAtomic.
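
A toy model of that failure mode, for illustration only (this Python is not the M5 control flow; the function names and the point at which the device frees the packet are simplifying assumptions based on the description above):

```python
class Packet:
    def __init__(self, needs_response):
        self.needs_response = needs_response
        self.deleted = False

    def delete(self):
        if self.deleted:
            raise RuntimeError("double delete")
        self.deleted = True


def device_recv_atomic(pkt):
    """Device handler that frees the message once it has consumed it."""
    pkt.delete()


def port_recv_timing(pkt):
    """Port wrapper that also frees packets that need no response."""
    device_recv_atomic(pkt)
    if not pkt.needs_response:
        pkt.delete()
```

With needs_response false, both paths free the same packet, which is the double delete being discussed.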

I think when I tested this with Valgrind, I didn't see the memory leak (doesn't 
mean it doesn't exist).  However, I don't think I was able to justify to myself 
why it didn't occur.

I remember that I spent a while trying to figure out how to make this work 
nicely, but the inheritance SimpleTimingPort -> MessagePort -> IntPort, and the 
overriding that that implies makes this quite difficult to analyze.  For 
instance, I'm still not clear why the new MemCmd, MessageReq/Resp, needed to be 
defined for this.


 On 2011-01-07 04:21:05, Gabe Black wrote:
  src/mem/tport.hh, line 145
  http://reviews.m5sim.org/r/382/diff/1/?file=9048#file9048line145
 
  Marking this as explicitly virtual shouldn't really be necessary. Is 
  there a reason you want to?

I think I had trouble compiling since MessagePort overrides recvTiming.  In 
this patch, MessagePort would become the first (only) descendant class of 
SimpleTimingPort that overrides recvTiming.


- Joel


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/382/#review639
---


On 2011-01-06 15:56:19, Brad Beckmann wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 http://reviews.m5sim.org/r/382/
 ---
 
 (Updated 2011-01-06 15:56:19)
 
 
 Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and 
 Nathan Binkert.
 
 
 Summary
 ---
 
 MessagePort: implemented virtual recvTiming avoiding double delete
 
 Double packet delete problem is due to an interrupt device deleting a packet
 that the SimpleTimingPort also deletes. Since MessagePort descends from
 SimpleTimingPort, simply reimplement the failing code from SimpleTimingPort:
 recvTiming.
 
 
 Diffs
 -
 
   src/arch/x86/interrupts.cc 9f9e10967912 
   src/dev/x86/intdev.hh 9f9e10967912 
   src/mem/mport.hh 9f9e10967912 
   src/mem/mport.cc 9f9e10967912 
   src/mem/tport.hh 9f9e10967912 
 
 Diff: http://reviews.m5sim.org/r/382/diff
 
 
 Testing
 ---
 
 
 Thanks,
 
 Brad
 


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

Re: [m5-dev] Review Request: x86: page table walker functional support

2011-01-07 Thread Joel Hestness



 On 2011-01-07 04:45:16, Gabe Black wrote:
  src/arch/x86/vtophys.cc, line 58
  http://reviews.m5sim.org/r/385/diff/1/?file=9054#file9054line58
 
  Better wording might be "Need access to page tables."

I like that change


 On 2011-01-07 04:45:16, Gabe Black wrote:
  src/arch/x86/vtophys.cc, line 70
  http://reviews.m5sim.org/r/385/diff/1/?file=9054#file9054line70
 
  Having a temporary variable here seems unnecessary unless it's to 
  prevent having to wrap the next line. It's not a big deal, though.

As far as I can tell, convention in ALL other code is to store the fault as a 
temporary variable, even if it could simply be pushed into the if-clause.


 On 2011-01-07 04:45:16, Gabe Black wrote:
  src/arch/x86/vtophys.cc, line 73
  http://reviews.m5sim.org/r/385/diff/1/?file=9054#file9054line73
 
  This is very suspicious. The request size was set to 0 when you 
  constructed the request object, so this is anding the original address with 
  -1. That doesn't do anything, so you're really just oring the addresses 
  together. The TLB will already have taken care of any page offset/page 
  number munging that you need. Actually, this whole function is suspect (not 
  because of your code) since there's no guarantee code/data and/or different 
  forms of data will be translated the same, or that flags aren't important.
 
 Brad Beckmann wrote:
 I agree, something seems off here.  However, I'll let Joel respond before 
 changing it.  At least there needs to be a comment explaining why this 
 calculation is necessary.

The size field of the request is set in the functional portion of 
Walker::WalkerState::startWalk in my other patch for review.  The physical 
address that is returned from vtophys needs to include the offset into the 
page, which in x86 can have multiple different sizes.  The page table contains 
the information about the page size, so it needs to be passed in the request 
object through startFunctional().
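
A small worked example of the address arithmetic being discussed, as a Python sketch (phys_addr is a hypothetical helper, not the vtophys code, and addresses are treated as 64-bit values):

```python
MASK64 = (1 << 64) - 1


def phys_addr(page_base, vaddr, page_size):
    """Combine a physical page base with the offset of vaddr within its page."""
    offset_mask = (page_size - 1) & MASK64
    return page_base | (vaddr & offset_mask)
```

With a 4kB or 2MB page size, only the in-page offset bits of the virtual address survive the mask. With a size of 0, the mask is all ones, so the result is just the two addresses ORed together, which is the case the review comment warns about.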


- Joel


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/385/#review642
---


On 2011-01-06 15:59:24, Brad Beckmann wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 http://reviews.m5sim.org/r/385/
 ---
 
 (Updated 2011-01-06 15:59:24)
 
 
 Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and 
 Nathan Binkert.
 
 
 Summary
 ---
 
 x86: page table walker functional support
 
 src/arch/x86/pagetable_walker.hh: Added method to functionally walk page table
 src/arch/x86/pagetable_walker.cc: Added method to functionally walk page table
 src/arch/x86/tlb.cc: Added method to return pointer to walker
 src/arch/x86/tlb.hh: Added method to return pointer to walker
 src/arch/x86/vtophys.cc: Calls walker to look up virt. to phys. page mapping
 
 
 Diffs
 -
 
   src/arch/x86/vtophys.cc 9f9e10967912 
 
 Diff: http://reviews.m5sim.org/r/385/diff
 
 
 Testing
 ---
 
 
 Thanks,
 
 Brad
 


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

Re: [m5-dev] Review Request: x86: Timing support for pagetable walker

2011-01-07 Thread Joel Hestness



 On 2011-01-07 05:51:30, Gabe Black wrote:
  The code seems ok, but why do we need to have multiple outstanding page 
  walks in timing mode again?
 
 Gabe Black wrote:
 Actually, I wrote the above before I'd read it carefully. My question 
 still stands, but there are some areas that need to be fixed up. Also, since 
 translation is very much on the critical path, make sure you measure how much 
 this change affects performance. I expect with the additional indirection at 
 least there will be some slow down, and we should know what that is before we 
 commit anything.

In timing mode x86, if a memory address translation misses in the TLB AND 
happens to be an unaligned access (one that straddles a page boundary), the TLB 
promptly fires both of the requests to the page table walker.  The old 
implementation of the walker doesn't support multiple outstanding requests, so 
it immediately crashes simulation with a state assertion failure (I asked a few 
questions about this in June and July, back when I made the changes to the 
walker).  The implementation in this patch can queue the requests and service 
them sequentially.  It should be a simple future extension to service them 
concurrently.
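
For reference, the condition that produces two requests can be sketched as follows. This is an illustrative Python helper, not the TimingSimpleCPU code; it assumes a 4kB page size and an access no larger than one page:

```python
def split_at_page_boundary(vaddr, size, page_size=4096):
    """Return the (address, size) pieces of an access, split where it crosses a page."""
    first_page_end = (vaddr // page_size + 1) * page_size
    if vaddr + size <= first_page_end:
        return [(vaddr, size)]
    first_size = first_page_end - vaddr
    return [(vaddr, first_size), (first_page_end, size - first_size)]
```

Each returned piece needs its own translation, so an access that returns two pieces can produce two TLB misses in immediate succession.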

I modeled this implementation after the ARM implementation in 
arch/arm/table_walker.*.

Concerning the slowdown, unaligned accesses that miss in the 
TLB are extremely rare (10 in seconds of simulated system time).  Since timing 
mode doesn't work without this fix, there isn't a way to compare performance 
against a baseline.


 On 2011-01-07 05:51:30, Gabe Black wrote:
  src/arch/x86/pagetable_walker.hh, line 187
  http://reviews.m5sim.org/r/396/diff/1/?file=9102#file9102line187
 
  Why call this reqType instead of leaving it as mode? requests have 
  types which are orthogonal to this, and it's called mode everywhere else.

Good point.  I'm not sure why I named it that.  mode would be better.


 On 2011-01-07 05:51:30, Gabe Black wrote:
  src/arch/x86/pagetable_walker.cc, line 77
  http://reviews.m5sim.org/r/396/diff/1/?file=9103#file9103line77
 
  These should use FastAlloc if at all possible since they're on a 
  critical path and the heap is slow.

Is this as simple as having WalkerState inherit from FastAlloc?


 On 2011-01-07 05:51:30, Gabe Black wrote:
  src/arch/x86/pagetable_walker.cc, line 89
  http://reviews.m5sim.org/r/396/diff/1/?file=9103#file9103line89
 
  Memory leak.

Well, that's embarrassing :P


 On 2011-01-07 05:51:30, Gabe Black wrote:
  src/arch/x86/pagetable_walker.cc, line 179
  http://reviews.m5sim.org/r/396/diff/1/?file=9103#file9103line179
 
  Is letting translations pass each other realistic? I worry we're making 
  our walker artificially powerful. These loops will also slow things down 
  potentially.

This is an abstract implementation just to get the walker to work.  It can be 
easily molded to order the requests appropriately.

On the topic of slowdown, having more than one request in the queue is 
extremely rare, so any slowdown should be trivial.


 On 2011-01-07 05:51:30, Gabe Black wrote:
  src/arch/x86/pagetable_walker.cc, line 541
  http://reviews.m5sim.org/r/396/diff/1/?file=9103#file9103line541
 
  Declare this where it's used.

I think I had other plans for this variable, but it doesn't look like I 
followed through.  Moving it to line 566 should be a simple fix.


 On 2011-01-07 05:51:30, Gabe Black wrote:
  src/arch/x86/pagetable_walker.cc, line 574
  http://reviews.m5sim.org/r/396/diff/1/?file=9103#file9103line574
 
  Why is this pulled out into its own switch statement? That will slow 
  down the code and makes things more complicated.

As I recall, this was part of my intermediate solution to the unaligned access 
problem.  These lines can be moved back to the previous locations.


- Joel


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/396/#review649
---


On 2011-01-06 16:12:34, Brad Beckmann wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 http://reviews.m5sim.org/r/396/
 ---
 
 (Updated 2011-01-06 16:12:34)
 
 
 Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and 
 Nathan Binkert.
 
 
 Summary
 ---
 
 x86: Timing support for pagetable walker
 
 Move page table walker state to its own object type, and make the
 walker instantiate state for each outstanding walk. By storing the
 states in a queue, the walker is able to handle multiple outstanding
 timing requests. Note that functional walks use separate state
 elements.
 
 
 Diffs
 -
 
   src/arch/x86/pagetable_walker.hh 9f9e10967912 
   src/arch/x86/pagetable_walker.cc 9f9e10967912 
   src/arch/x86/tlb.hh 9f9e10967912 
   src/arch/x86/tlb.cc 9f9e10967912

Re: [m5-dev] Error in Simulating Mesh Network

2011-01-20 Thread Joel Hestness

Hi Nilay,
  I believe that this error is fixed in one of the patches that I worked on
while at AMD.  Brad has pushed it up for review:
http://reviews.m5sim.org/r/381/.  It's a one line fix.
  Hope this helps,
  Joel


On Thu, Jan 20, 2011 at 8:15 AM, Nilay Vaish ni...@cs.wisc.edu wrote:

 Brad, I tried simulating a mesh network with four processors.

 ./build/ALPHA_FS_MOESI_hammer/m5.prof ./configs/example/ruby_fs.py
 --maxtick 2000 -n 4 --topology Mesh --mesh-rows 2 --num-l2cache 4
 --num-dir 4

 I receive the following error:

 panic: FIFO ordering violated: [MessageBuffer:  consumer-yes [ [71227521,
 870, 1; ] ]] [Version 1, L1Cache, triggerQueue_in]
  name: [Version 1, L1Cache, triggerQueue_in] current time: 71227512 delta:
 1 arrival_time: 71227513 last arrival_time: 71227521
  @ cycle 35613756000
 [enqueue:build/ALPHA_FS_MOESI_hammer/mem/ruby/buffers/MessageBuffer.cc,
 line 198]

 Do you think that the options I have specified should work correctly?

 Thanks
 Nilay




-- 
  Joel Hestness
  PhD Student, Computer Architecture
  Dept. of Computer Science, University of Texas - Austin
  http://www.cs.utexas.edu/~hestness

Re: [m5-dev] changeset in m5: checkpointing: fix bug from curTick accessor co...

2011-01-21 Thread Joel Hestness

I like this idea a lot.  Not only would it solve the SERIALIZE_* v. paramOut
usage problem, but it would also decouple the code variable name from the
name written to the checkpoint.  If used intelligently, this could alleviate
some of the pain of fixing old checkpoints when code changes.
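
To make the idea concrete, here is a minimal sketch.  It assumes a toy
paramOut that writes "name=value" lines; SERIALIZE_SCALAR_AS is the
hypothetical macro being proposed in this thread, not an existing one, and
the curTick() below is a stand-in accessor.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Toy stand-in for paramOut: writes "name=value\n" to the stream.
template <typename T>
void paramOut(std::ostream &os, const std::string &name, const T &value)
{
    os << name << "=" << value << "\n";
}

// Stringifies the expression itself, so the checkpoint key tracks the code.
#define SERIALIZE_SCALAR(scalar) paramOut(os, #scalar, scalar)

// Hypothetical variant: the checkpoint key is given explicitly.
#define SERIALIZE_SCALAR_AS(name, scalar) paramOut(os, name, scalar)

// Toy accessor standing in for curTick().
inline unsigned long long curTick() { return 1000; }

// Returns the text produced by each macro for the same value.
inline std::string demoKeys()
{
    std::ostringstream os;
    SERIALIZE_SCALAR(curTick());               // key becomes "curTick()"
    SERIALIZE_SCALAR_AS("curTick", curTick()); // key stays "curTick"
    return os.str();
}
```

The first macro picks up the parentheses from the accessor call, while the
second keeps the key stable no matter how the value is spelled in the code.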

  Joel


On Fri, Jan 21, 2011 at 12:57 AM, Gabe Black gbl...@eecs.umich.edu wrote:

  From time to time it seems to me that we need to serialize something
 but call it something other than its variable name. Would it make sense
 to add SERIALIZE_*_AS macros that take a name argument as well? It's not
 that hard to create a temporary variable or use those param functions
 directly, but it would at least make things look more consistent to
 always (or almost always) use SERIALIZE_FOO.

 Gabe

 On 01/20/11 22:11, Steve Reinhardt wrote:
  changeset 494b5426e70d in /z/repo/m5
  details: http://repo.m5sim.org/m5?cmd=changeset;node=494b5426e70d
  description:
checkpointing: fix bug from curTick accessor conversion.
 
Regex replacement of curTick with curTick() accidentally
changed checkpoint key string for serialization but not
for unserialization.
 
  diffstat:
 
   src/sim/serialize.cc |  2 +-
   1 files changed, 1 insertions(+), 1 deletions(-)
 
  diffs (12 lines):
 
  diff -r f84bfd45d607 -r 494b5426e70d src/sim/serialize.cc
  --- a/src/sim/serialize.cc   Wed Jan 19 16:22:23 2011 -0800
  +++ b/src/sim/serialize.cc   Thu Jan 20 22:13:33 2011 -0800
  @@ -400,7 +400,7 @@
   Globals::serialize(ostream &os)
   {
       nameOut(os);
  -    SERIALIZE_SCALAR(curTick());
  +    paramOut(os, "curTick", curTick());
 
       nameOut(os, "MainEventQueue");
       mainEventQueue.serialize(os);




-- 
  Joel Hestness
  PhD Student, Computer Architecture
  Dept. of Computer Science, University of Texas - Austin
  http://www.cs.utexas.edu/~hestness

[m5-dev] changeset in m5: IntDev: packet latency fix

2011-02-06 Thread Joel Hestness

changeset 8b05ff5ef958 in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=8b05ff5ef958
description:
IntDev: packet latency fix

The x86 local apic now includes a separate latency parameter for 
interrupts.

diffstat:

 src/arch/x86/X86LocalApic.py |  2 ++
 src/arch/x86/interrupts.cc   |  3 ++-
 2 files changed, 4 insertions(+), 1 deletions(-)

diffs (22 lines):

diff -r 38eca2df1124 -r 8b05ff5ef958 src/arch/x86/X86LocalApic.py
--- a/src/arch/x86/X86LocalApic.py  Sun Feb 06 22:14:17 2011 -0800
+++ b/src/arch/x86/X86LocalApic.py  Sun Feb 06 22:14:17 2011 -0800
@@ -34,3 +34,5 @@
     cxx_class = 'X86ISA::Interrupts'
     pio_latency = Param.Latency('1ns', 'Programmed IO latency in simticks')
     int_port = Port("Port for sending and receiving interrupt messages")
+    int_latency = Param.Latency('1ns', \
+            "Latency for an interrupt to propagate through this device.")
diff -r 38eca2df1124 -r 8b05ff5ef958 src/arch/x86/interrupts.cc
--- a/src/arch/x86/interrupts.cc   Sun Feb 06 22:14:17 2011 -0800
+++ b/src/arch/x86/interrupts.cc   Sun Feb 06 22:14:17 2011 -0800
@@ -595,7 +595,8 @@
 
 
 X86ISA::Interrupts::Interrupts(Params * p) :
-    BasicPioDevice(p), IntDev(this), latency(p->pio_latency), clock(0),
+    BasicPioDevice(p), IntDev(this, p->int_latency), latency(p->pio_latency), 
+    clock(0),
     apicTimerEvent(this),
     pendingSmi(false), smiVector(0),
     pendingNmi(false), nmiVector(0),

[m5-dev] changeset in m5: x86: implements vtophys

2011-02-06 Thread Joel Hestness

changeset f9b675da608a in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=f9b675da608a
description:
x86: implements vtophys

Calls walker to look up virt. to phys. page mapping
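
As a rough illustration of the last step of the lookup (combining the page
base returned by the walk with the offset inside the page), consider the
sketch below.  composePaddr is a hypothetical helper written for this note,
and it assumes the page size is a power of two.

```cpp
#include <cstdint>

using Addr = uint64_t;

// Combine the physical base of a page with the offset of vaddr inside it.
// pageSize must be a power of two (for example 4 KB or 2 MB).
inline Addr composePaddr(Addr pageBase, Addr vaddr, Addr pageSize)
{
    Addr offset = vaddr & (pageSize - 1);  // low bits select the byte in the page
    return pageBase | offset;              // pageBase is page aligned, so OR is safe
}
```

The same helper covers both 4 KB and 2 MB mappings, since only the mask
width changes with the page size.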

diffstat:

 src/arch/x86/pagetable_walker.hh |   1 +
 src/arch/x86/system.cc   |   1 +
 src/arch/x86/vtophys.cc  |  29 ++---
 src/arch/x86/vtophys.hh  |   3 ---
 4 files changed, 28 insertions(+), 6 deletions(-)

diffs (87 lines):

diff -r 8b05ff5ef958 -r f9b675da608a src/arch/x86/pagetable_walker.hh
--- a/src/arch/x86/pagetable_walker.hh  Sun Feb 06 22:14:17 2011 -0800
+++ b/src/arch/x86/pagetable_walker.hh  Sun Feb 06 22:14:17 2011 -0800
@@ -48,6 +48,7 @@
 #include "mem/mem_object.hh"
 #include "mem/packet.hh"
 #include "params/X86PagetableWalker.hh"
+#include "sim/faults.hh"
 
 class ThreadContext;
 
diff -r 8b05ff5ef958 -r f9b675da608a src/arch/x86/system.cc
--- a/src/arch/x86/system.ccSun Feb 06 22:14:17 2011 -0800
+++ b/src/arch/x86/system.ccSun Feb 06 22:14:17 2011 -0800
@@ -39,6 +39,7 @@
 
 #include "arch/x86/bios/smbios.hh"
 #include "arch/x86/bios/intelmp.hh"
+#include "arch/x86/isa_traits.hh"
 #include "arch/x86/regs/misc.hh"
 #include "arch/x86/system.hh"
 #include "arch/vtophys.hh"
diff -r 8b05ff5ef958 -r f9b675da608a src/arch/x86/vtophys.cc
--- a/src/arch/x86/vtophys.cc   Sun Feb 06 22:14:17 2011 -0800
+++ b/src/arch/x86/vtophys.cc   Sun Feb 06 22:14:17 2011 -0800
@@ -39,19 +39,42 @@
 
 #include <string>
 
+#include "arch/x86/pagetable_walker.hh"
+#include "arch/x86/tlb.hh"
 #include "arch/x86/vtophys.hh"
+#include "base/trace.hh"
+#include "config/full_system.hh"
+#include "cpu/thread_context.hh"
+#include "sim/fault.hh"
 
 using namespace std;
 
 namespace X86ISA
 {
-    Addr vtophys(Addr vaddr)
+    Addr
+    vtophys(Addr vaddr)
     {
+#if FULL_SYSTEM
+        panic("Need access to page tables\n");
+#endif
         return vaddr;
     }
 
-    Addr vtophys(ThreadContext *tc, Addr addr)
+    Addr
+    vtophys(ThreadContext *tc, Addr vaddr)
     {
-        return addr;
+#if FULL_SYSTEM
+        Walker *walker = tc->getDTBPtr()->getWalker();
+        Addr size;
+        Addr addr = vaddr;
+        Fault fault = walker->startFunctional(tc, addr, size, BaseTLB::Read);
+        if (fault != NoFault)
+            panic("vtophys page walk returned fault\n");
+        Addr masked_addr = vaddr & (size - 1);
+        Addr paddr = addr | masked_addr;
+        DPRINTF(VtoPhys, "vtophys(%#x) -> %#x\n", vaddr, paddr);
+        return paddr;
+#endif
+        return vaddr;
     }
 }
diff -r 8b05ff5ef958 -r f9b675da608a src/arch/x86/vtophys.hh
--- a/src/arch/x86/vtophys.hh   Sun Feb 06 22:14:17 2011 -0800
+++ b/src/arch/x86/vtophys.hh   Sun Feb 06 22:14:17 2011 -0800
@@ -40,12 +40,9 @@
 #ifndef __ARCH_X86_VTOPHYS_HH__
 #define __ARCH_X86_VTOPHYS_HH__
 
-#include "arch/x86/isa_traits.hh"
-#include "arch/x86/pagetable.hh"
 #include "base/types.hh"
 
 class ThreadContext;
-class FunctionalPort;
 
 namespace X86ISA
 {

[m5-dev] changeset in m5: Ruby: Add support for locked memory accesses in...

2011-02-06 Thread Joel Hestness

changeset 4e83ebb67794 in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=4e83ebb67794
description:
Ruby: Add support for locked memory accesses in X86_FS
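
The request classification that the RubyPort.cc hunk below introduces can be
summarised with a small standalone sketch.  ToyPacket and classify() are
hypothetical names; the LL/SC case that precedes the locked check in the real
code is left out, and the sketch treats anything that is not a read as a
store instead of panicking.

```cpp
// Request types relevant to this sketch (a subset, named after the diff).
enum class ReqType {
    IFETCH, LD, ST, RMW_Read, Locked_RMW_Read, Locked_RMW_Write
};

// Hypothetical flattened view of the packet/request properties consulted.
struct ToyPacket {
    bool isRead;
    bool isWrite;
    bool isLocked;     // request carries the locked flag
    bool isInstFetch;
    bool storeCheck;   // x86 load that checks for store permission
};

// Locked accesses are classified first; only then plain reads and writes.
inline ReqType classify(const ToyPacket &pkt)
{
    if (pkt.isLocked)
        return pkt.isWrite ? ReqType::Locked_RMW_Write
                           : ReqType::Locked_RMW_Read;
    if (pkt.isRead) {
        if (pkt.isInstFetch)
            return ReqType::IFETCH;
        return pkt.storeCheck ? ReqType::RMW_Read : ReqType::LD;
    }
    // Simplification: every remaining packet is treated as a plain store.
    return ReqType::ST;
}
```

The ordering matters: a locked read would otherwise be classified as an
ordinary load before the locked flag is ever examined.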

diffstat:

 src/mem/ruby/libruby.cc |   8 ++
 src/mem/ruby/libruby.hh |   2 +
 src/mem/ruby/system/DMASequencer.cc |   2 +
 src/mem/ruby/system/RubyPort.cc |  36 --
 src/mem/ruby/system/Sequencer.cc|  43 +---
 5 files changed, 70 insertions(+), 21 deletions(-)

diffs (213 lines):

diff -r d648b8409d4c -r 4e83ebb67794 src/mem/ruby/libruby.cc
--- a/src/mem/ruby/libruby.cc   Sun Feb 06 22:14:18 2011 -0800
+++ b/src/mem/ruby/libruby.cc   Sun Feb 06 22:14:18 2011 -0800
@@ -58,6 +58,10 @@
         return "RMW_Read";
       case RubyRequestType_RMW_Write:
         return "RMW_Write";
+      case RubyRequestType_Locked_RMW_Read:
+        return "Locked_RMW_Read";
+      case RubyRequestType_Locked_RMW_Write:
+        return "Locked_RMW_Write";
       case RubyRequestType_NULL:
       default:
         assert(0);
@@ -82,6 +86,10 @@
         return RubyRequestType_RMW_Read;
     else if (str == "RMW_Write")
         return RubyRequestType_RMW_Write;
+    else if (str == "Locked_RMW_Read")
+        return RubyRequestType_Locked_RMW_Read;
+    else if (str == "Locked_RMW_Write")
+        return RubyRequestType_Locked_RMW_Write;
     else
         assert(0);
     return RubyRequestType_NULL;
diff -r d648b8409d4c -r 4e83ebb67794 src/mem/ruby/libruby.hh
--- a/src/mem/ruby/libruby.hh   Sun Feb 06 22:14:18 2011 -0800
+++ b/src/mem/ruby/libruby.hh   Sun Feb 06 22:14:18 2011 -0800
@@ -44,6 +44,8 @@
   RubyRequestType_Store_Conditional,
   RubyRequestType_RMW_Read,
   RubyRequestType_RMW_Write,
+  RubyRequestType_Locked_RMW_Read,
+  RubyRequestType_Locked_RMW_Write,
   RubyRequestType_NUM
 };
 
diff -r d648b8409d4c -r 4e83ebb67794 src/mem/ruby/system/DMASequencer.cc
--- a/src/mem/ruby/system/DMASequencer.cc   Sun Feb 06 22:14:18 2011 -0800
+++ b/src/mem/ruby/system/DMASequencer.cc   Sun Feb 06 22:14:18 2011 -0800
@@ -70,6 +70,8 @@
       case RubyRequestType_Store_Conditional:
       case RubyRequestType_RMW_Read:
       case RubyRequestType_RMW_Write:
+      case RubyRequestType_Locked_RMW_Read:
+      case RubyRequestType_Locked_RMW_Write:
       case RubyRequestType_NUM:
         panic("DMASequencer::makeRequest does not support RubyRequestType");
         return RequestStatus_NULL;
diff -r d648b8409d4c -r 4e83ebb67794 src/mem/ruby/system/RubyPort.cc
--- a/src/mem/ruby/system/RubyPort.cc   Sun Feb 06 22:14:18 2011 -0800
+++ b/src/mem/ruby/system/RubyPort.cc   Sun Feb 06 22:14:18 2011 -0800
@@ -26,6 +26,10 @@
  * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
+#include "config/the_isa.hh"
+#if THE_ISA == X86_ISA
+#include "arch/x86/insts/microldstop.hh"
+#endif // X86_ISA
 #include "cpu/testers/rubytest/RubyTester.hh"
 #include "mem/physical.hh"
 #include "mem/ruby/slicc_interface/AbstractController.hh"
@@ -201,22 +205,38 @@
             assert(pkt->isRead());
             type = RubyRequestType_Load_Linked;
         }
+    } else if (pkt->req->isLocked()) {
+        if (pkt->isWrite()) {
+            DPRINTF(MemoryAccess, "Issuing Locked RMW Write\n");
+            type = RubyRequestType_Locked_RMW_Write;
+        } else {
+            DPRINTF(MemoryAccess, "Issuing Locked RMW Read\n");
+            assert(pkt->isRead());
+            type = RubyRequestType_Locked_RMW_Read;
+        }
     } else {
         if (pkt->isRead()) {
             if (pkt->req->isInstFetch()) {
                 type = RubyRequestType_IFETCH;
             } else {
-                type = RubyRequestType_LD;
+#if THE_ISA == X86_ISA
+                uint32_t flags = pkt->req->getFlags();
+                bool storeCheck = flags &
+                        (TheISA::StoreCheck << TheISA::FlagShift);
+#else
+                bool storeCheck = false;
+#endif // X86_ISA
+                if (storeCheck) {
+                    type = RubyRequestType_RMW_Read;
+                } else {
+                    type = RubyRequestType_LD;
+                }
             }
         } else if (pkt->isWrite()) {
+            //
+            // Note: M5 packets do not differentiate ST from RMW_Write
+            //
             type = RubyRequestType_ST;
-        } else if (pkt->isReadWrite()) {
-            // Fix me.  This conditional will never be executed
-            // because isReadWrite() is just an OR of isRead() and
-            // isWrite().  Furthermore, just because the packet is a
-            // read/write request does not necessary mean it is a
-            // read-modify-write atomic operation.
-            type = RubyRequestType_RMW_Write;
         } else {
             panic("Unsupported ruby packet type\n");
         }
diff -r d648b8409d4c -r 4e83ebb67794 src/mem/ruby/system/Sequencer.cc
--- a/src/mem/ruby/system/Sequencer.cc  Sun Feb 06 22:14:18 2011 -0800
+++

[m5-dev] changeset in m5: Ruby: Fix to return cache block size to CPU for...

2011-02-06 Thread Joel Hestness

changeset eee578ed2130 in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=eee578ed2130
description:
Ruby: Fix to return cache block size to CPU for split data transfers
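
To show why the CPU needs the block size here, the sketch below works out
how an access is divided when it crosses a block boundary.  roundDown and
splitSizes are hypothetical helpers written for this note (not the CPU's
actual code), and the block size is assumed to be a power of two.

```cpp
#include <cstdint>
#include <utility>

using Addr = uint64_t;

// Round addr down to a multiple of align (align must be a power of two).
inline Addr roundDown(Addr addr, Addr align)
{
    return addr & ~(align - 1);
}

// Returns the sizes of the two fragments of an access of `size` bytes at
// `addr`. The second size is 0 when the access fits inside one block.
inline std::pair<unsigned, unsigned>
splitSizes(Addr addr, unsigned size, unsigned blockSize)
{
    // Start of the block that holds the last byte of the access.
    Addr splitAddr = roundDown(addr + size - 1, blockSize);
    if (splitAddr <= addr)
        return std::make_pair(size, 0u);
    unsigned first = static_cast<unsigned>(splitAddr - addr);
    return std::make_pair(first, size - first);
}
```

An 8-byte access at 0x3c is split 4/4 with 64-byte blocks but is left whole
with 128-byte blocks, so reporting the wrong block size changes which
accesses get split.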

diffstat:

 src/mem/ruby/system/RubyPort.cc |  6 ++
 src/mem/ruby/system/RubyPort.hh |  2 ++
 2 files changed, 8 insertions(+), 0 deletions(-)

diffs (32 lines):

diff -r 4e83ebb67794 -r eee578ed2130 src/mem/ruby/system/RubyPort.cc
--- a/src/mem/ruby/system/RubyPort.cc   Sun Feb 06 22:14:18 2011 -0800
+++ b/src/mem/ruby/system/RubyPort.cc   Sun Feb 06 22:14:18 2011 -0800
@@ -370,3 +370,9 @@
     }
     return false;
 }
+
+unsigned
+RubyPort::M5Port::deviceBlockSize() const
+{
+    return (unsigned) RubySystem::getBlockSizeBytes();
+}
diff -r 4e83ebb67794 -r eee578ed2130 src/mem/ruby/system/RubyPort.hh
--- a/src/mem/ruby/system/RubyPort.hh   Sun Feb 06 22:14:18 2011 -0800
+++ b/src/mem/ruby/system/RubyPort.hh   Sun Feb 06 22:14:18 2011 -0800
@@ -36,6 +36,7 @@
 #include "mem/physical.hh"
 #include "mem/protocol/RequestStatus.hh"
 #include "mem/ruby/libruby.hh"
+#include "mem/ruby/system/System.hh"
 #include "mem/tport.hh"
 #include "params/RubyPort.hh"
 
@@ -54,6 +55,7 @@
         M5Port(const std::string &_name, RubyPort *_port);
         bool sendTiming(PacketPtr pkt);
         void hitCallback(PacketPtr pkt);
+        unsigned deviceBlockSize() const;
 
       protected:
         virtual bool recvTiming(PacketPtr pkt);

[m5-dev] changeset in m5: x86: Add checkpointing capability to devices

2011-02-06 Thread Joel Hestness

changeset 7fcfb515d7bf in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=7fcfb515d7bf
description:
x86: Add checkpointing capability to devices

Add checkpointing capability to the Intel 8254 timer, CMOS, I8042,
PS2 Keyboard and Mouse, I82094AA, I8237, I8254, I8259, and speaker
devices
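
Several of these devices hold byte queues that have to be written out as
flat arrays.  The sketch below shows one possible way to do that round trip;
flattenQueue and rebuildQueue are hypothetical helpers and not the code in
this changeset.  The sketch drains a copy, so the live queue is unchanged
after a checkpoint is taken.

```cpp
#include <cstdint>
#include <queue>
#include <vector>

// Flatten a byte queue into a vector without modifying the queue itself.
inline std::vector<uint8_t> flattenQueue(const std::queue<uint8_t> &q)
{
    std::queue<uint8_t> copy = q;  // drain a copy, not the live queue
    std::vector<uint8_t> out;
    while (!copy.empty()) {
        out.push_back(copy.front());
        copy.pop();
    }
    return out;
}

// Rebuild a queue from the flattened bytes, preserving their order.
inline std::queue<uint8_t> rebuildQueue(const std::vector<uint8_t> &bytes)
{
    std::queue<uint8_t> q;
    for (uint8_t b : bytes)
        q.push(b);
    return q;
}
```

The element count can be taken from the vector's size(), which matches the
separate size and element entries written by the keyboard code below.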

diffstat:

 src/dev/intel_8254_timer.cc |4 +-
 src/dev/x86/cmos.cc |   20 
 src/dev/x86/cmos.hh |4 +
 src/dev/x86/i8042.cc|  109 
 src/dev/x86/i8042.hh|   11 
 src/dev/x86/i82094aa.cc |   29 +++
 src/dev/x86/i82094aa.hh |3 +
 src/dev/x86/i8237.cc|   14 +-
 src/dev/x86/i8237.hh|3 +
 src/dev/x86/i8254.cc|   12 
 src/dev/x86/i8254.hh|4 +
 src/dev/x86/i8259.cc|   36 ++
 src/dev/x86/i8259.hh|3 +
 src/dev/x86/speaker.cc  |   15 ++
 src/dev/x86/speaker.hh  |4 +
 15 files changed, 269 insertions(+), 2 deletions(-)

diffs (truncated from 440 to 300 lines):

diff -r aafb4a7384d4 -r 7fcfb515d7bf src/dev/intel_8254_timer.cc
--- a/src/dev/intel_8254_timer.cc   Sun Feb 06 22:14:17 2011 -0800
+++ b/src/dev/intel_8254_timer.cc   Sun Feb 06 22:14:18 2011 -0800
@@ -247,7 +247,9 @@
     paramIn(cp, section, base + ".read_byte", read_byte);
     paramIn(cp, section, base + ".write_byte", write_byte);
 
-    Tick event_tick;
+    Tick event_tick = 0;
+    if (event.scheduled())
+        parent->deschedule(event);
     paramIn(cp, section, base + ".event_tick", event_tick);
     if (event_tick)
         parent->schedule(event, event_tick);
diff -r aafb4a7384d4 -r 7fcfb515d7bf src/dev/x86/cmos.cc
--- a/src/dev/x86/cmos.cc   Sun Feb 06 22:14:17 2011 -0800
+++ b/src/dev/x86/cmos.cc   Sun Feb 06 22:14:18 2011 -0800
@@ -111,6 +111,26 @@
     }
 }
 
+void
+X86ISA::Cmos::serialize(std::ostream &os)
+{
+    SERIALIZE_SCALAR(address);
+    SERIALIZE_ARRAY(regs, numRegs);
+
+    // Serialize the timer
+    rtc.serialize("rtc", os);
+}
+
+void
+X86ISA::Cmos::unserialize(Checkpoint *cp, const std::string &section)
+{
+    UNSERIALIZE_SCALAR(address);
+    UNSERIALIZE_ARRAY(regs, numRegs);
+
+    // Serialize the timer
+    rtc.unserialize("rtc", cp, section);
+}
+
 X86ISA::Cmos *
 CmosParams::create()
 {
diff -r aafb4a7384d4 -r 7fcfb515d7bf src/dev/x86/cmos.hh
--- a/src/dev/x86/cmos.hh   Sun Feb 06 22:14:17 2011 -0800
+++ b/src/dev/x86/cmos.hh   Sun Feb 06 22:14:18 2011 -0800
@@ -82,6 +82,10 @@
     Tick read(PacketPtr pkt);
 
     Tick write(PacketPtr pkt);
+
+    virtual void serialize(std::ostream &os);
+    virtual void unserialize(Checkpoint *cp, const std::string &section);
+
 };
 
 } // namespace X86ISA
diff -r aafb4a7384d4 -r 7fcfb515d7bf src/dev/x86/i8042.cc
--- a/src/dev/x86/i8042.cc  Sun Feb 06 22:14:17 2011 -0800
+++ b/src/dev/x86/i8042.cc  Sun Feb 06 22:14:18 2011 -0800
@@ -439,6 +439,115 @@
     return latency;
 }
 
+void
+X86ISA::I8042::serialize(std::ostream &os)
+{
+    uint8_t statusRegData = statusReg.__data;
+    uint8_t commandByteData = commandByte.__data;
+
+    SERIALIZE_SCALAR(dataPort);
+    SERIALIZE_SCALAR(commandPort);
+    SERIALIZE_SCALAR(statusRegData);
+    SERIALIZE_SCALAR(commandByteData);
+    SERIALIZE_SCALAR(dataReg);
+    SERIALIZE_SCALAR(lastCommand);
+    mouse.serialize("mouse", os);
+    keyboard.serialize("keyboard", os);
+}
+
+void
+X86ISA::I8042::unserialize(Checkpoint *cp, const std::string &section)
+{
+    uint8_t statusRegData;
+    uint8_t commandByteData;
+
+    UNSERIALIZE_SCALAR(dataPort);
+    UNSERIALIZE_SCALAR(commandPort);
+    UNSERIALIZE_SCALAR(statusRegData);
+    UNSERIALIZE_SCALAR(commandByteData);
+    UNSERIALIZE_SCALAR(dataReg);
+    UNSERIALIZE_SCALAR(lastCommand);
+    mouse.unserialize("mouse", cp, section);
+    keyboard.unserialize("keyboard", cp, section);
+
+    statusReg.__data = statusRegData;
+    commandByte.__data = commandByteData;
+}
+
+void
+X86ISA::PS2Keyboard::serialize(const std::string &base, std::ostream &os)
+{
+    paramOut(os, base + ".lastCommand", lastCommand);
+    int bufferSize = outBuffer.size();
+    paramOut(os, base + ".outBuffer.size", bufferSize);
+    uint8_t *buffer = new uint8_t[bufferSize];
+    for (int i = 0; i < bufferSize; ++i) {
+        buffer[i] = outBuffer.front();
+        outBuffer.pop();
+    }
+    arrayParamOut(os, base + ".outBuffer.elts", buffer,
+            bufferSize*sizeof(uint8_t));
+    delete buffer;
+}
+
+void
+X86ISA::PS2Keyboard::unserialize(const std::string &base, Checkpoint *cp,
+        const std::string &section)
+{
+    paramIn(cp, section, base + ".lastCommand", lastCommand);
+    int bufferSize;
+    paramIn(cp, section, base + ".outBuffer.size", bufferSize);
+    uint8_t *buffer = new uint8_t[bufferSize];
+    arrayParamIn(cp, section, base + ".outBuffer.elts", buffer,
+            bufferSize*sizeof(uint8_t));
+    for (int i = 0; i < bufferSize; ++i) {
+

[m5-dev] changeset in m5: x86: Timing support for pagetable walker

2011-02-06 Thread Joel Hestness

changeset a9f05ab40763 in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=a9f05ab40763
description:
x86: Timing support for pagetable walker

Move page table walker state to its own object type, and make the
walker instantiate state for each outstanding walk. By storing the
states in a queue, the walker is able to handle multiple outstanding
timing requests. Note that functional walks use separate state
elements.

diffstat:

 src/arch/x86/pagetable_walker.cc |  922 ++
 src/arch/x86/pagetable_walker.hh |  181 ---
 src/arch/x86/tlb.cc  |6 +
 src/arch/x86/tlb.hh  |2 +
 4 files changed, 647 insertions(+), 464 deletions(-)

diffs (truncated from 1247 to 300 lines):

diff -r 267e1e16e51b -r a9f05ab40763 src/arch/x86/pagetable_walker.cc
--- a/src/arch/x86/pagetable_walker.cc  Sun Feb 06 22:14:18 2011 -0800
+++ b/src/arch/x86/pagetable_walker.cc  Sun Feb 06 22:14:18 2011 -0800
@@ -40,6 +40,7 @@
 #include "arch/x86/pagetable.hh"
 #include "arch/x86/pagetable_walker.hh"
 #include "arch/x86/tlb.hh"
+#include "arch/x86/vtophys.hh"
 #include "base/bitfield.hh"
 #include "cpu/thread_context.hh"
 #include "cpu/base.hh"
@@ -67,328 +68,36 @@
 EndBitUnion(PageTableEntry)
 
 Fault
-Walker::doNext(PacketPtr &write)
+Walker::start(ThreadContext * _tc, BaseTLB::Translation *_translation,
+              RequestPtr _req, BaseTLB::Mode _mode)
 {
-    assert(state != Ready && state != Waiting);
-    write = NULL;
-    PageTableEntry pte;
-    if (size == 8)
-        pte = read->get<uint64_t>();
-    else
-        pte = read->get<uint32_t>();
-    VAddr vaddr = entry.vaddr;
-    bool uncacheable = pte.pcd;
-    Addr nextRead = 0;
-    bool doWrite = false;
-    bool badNX = pte.nx && mode == BaseTLB::Execute && enableNX;
-    switch(state) {
-      case LongPML4:
-        DPRINTF(PageTableWalker,
-                "Got long mode PML4 entry %#016x.\n", (uint64_t)pte);
-        nextRead = ((uint64_t)pte & (mask(40) << 12)) + vaddr.longl3 * size;
-        doWrite = !pte.a;
-        pte.a = 1;
-        entry.writable = pte.w;
-        entry.user = pte.u;
-        if (badNX || !pte.p) {
-            stop();
-            return pageFault(pte.p);
+    // TODO: in timing mode, instead of blocking when there are other
+    // outstanding requests, see if this request can be coalesced with
+    // another one (i.e. either coalesce or start walk)
+    WalkerState * newState = new WalkerState(this, _translation, _req);
+    newState->initState(_tc, _mode, sys->getMemoryMode() == Enums::timing);
+    if (currStates.size()) {
+        assert(newState->isTiming());
+        DPRINTF(PageTableWalker, "Walks in progress: %d\n", currStates.size());
+        currStates.push_back(newState);
+        return NoFault;
+    } else {
+        currStates.push_back(newState);
+        Fault fault = newState->startWalk();
+        if (!newState->isTiming()) {
+            currStates.pop_front();
+            delete newState;
         }
-        entry.noExec = pte.nx;
-        nextState = LongPDP;
-        break;
-      case LongPDP:
-        DPRINTF(PageTableWalker,
-                "Got long mode PDP entry %#016x.\n", (uint64_t)pte);
-        nextRead = ((uint64_t)pte & (mask(40) << 12)) + vaddr.longl2 * size;
-        doWrite = !pte.a;
-        pte.a = 1;
-        entry.writable = entry.writable && pte.w;
-        entry.user = entry.user && pte.u;
-        if (badNX || !pte.p) {
-            stop();
-            return pageFault(pte.p);
-        }
-        nextState = LongPD;
-        break;
-      case LongPD:
-        DPRINTF(PageTableWalker,
-                "Got long mode PD entry %#016x.\n", (uint64_t)pte);
-        doWrite = !pte.a;
-        pte.a = 1;
-        entry.writable = entry.writable && pte.w;
-        entry.user = entry.user && pte.u;
-        if (badNX || !pte.p) {
-            stop();
-            return pageFault(pte.p);
-        }
-        if (!pte.ps) {
-            // 4 KB page
-            entry.size = 4 * (1 << 10);
-            nextRead =
-                ((uint64_t)pte & (mask(40) << 12)) + vaddr.longl1 * size;
-            nextState = LongPTE;
-            break;
-        } else {
-            // 2 MB page
-            entry.size = 2 * (1 << 20);
-            entry.paddr = (uint64_t)pte & (mask(31) << 21);
-            entry.uncacheable = uncacheable;
-            entry.global = pte.g;
-            entry.patBit = bits(pte, 12);
-            entry.vaddr = entry.vaddr & ~((2 * (1 << 20)) - 1);
-            tlb->insert(entry.vaddr, entry);
-            stop();
-            return NoFault;
-        }
-      case LongPTE:
-        DPRINTF(PageTableWalker,
-                "Got long mode PTE entry %#016x.\n", (uint64_t)pte);
-        doWrite = !pte.a;
-        pte.a = 1;
-        entry.writable = entry.writable && pte.w;
-        entry.user = entry.user && pte.u;
-        if (badNX || !pte.p) {
-            stop();
-            return pageFault(pte.p);
-        }
-

[m5-dev] changeset in m5: TimingSimpleCPU: split data sender state fix

2011-02-06 Thread Joel Hestness

changeset 267e1e16e51b in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=267e1e16e51b
description:
TimingSimpleCPU: split data sender state fix

In sendSplitData, keep a pointer to the senderState that may be updated 
after
the call to handle*Packet. This way, if the receiver updates the packet
senderState, it can still be accessed in sendSplitData.
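
A stripped-down sketch of the pattern follows.  ToyState, ToyPacket, toySend
and sendAndClear are hypothetical names used only for illustration; the toy
receiver unconditionally swaps in its own sender state, which is the case
the fix guards against.

```cpp
// Hypothetical sender state with the one operation the caller needs.
struct ToyState {
    bool cleared = false;
    void clearFromParent() { cleared = true; }
};

// Hypothetical packet that just carries a sender-state pointer.
struct ToyPacket {
    ToyState *senderState;
};

// Toy receiver: replaces the packet's senderState with its own object,
// as a downstream component might, and reports that the send succeeded.
inline bool toySend(ToyPacket &pkt, ToyState &replacement)
{
    pkt.senderState = &replacement;
    return true;
}

// Grab the pointer before the call so the caller's own state is still
// reachable even if the receiver has overwritten pkt.senderState.
inline void sendAndClear(ToyPacket &pkt, ToyState &replacement)
{
    ToyState *saved = pkt.senderState;
    if (toySend(pkt, replacement))
        saved->clearFromParent();
}
```

Reading pkt.senderState after the call would reach the receiver's state
object instead of the caller's.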

diffstat:

 src/cpu/simple/timing.cc |  16 
 1 files changed, 8 insertions(+), 8 deletions(-)

diffs (38 lines):

diff -r 8a92b39be50e -r 267e1e16e51b src/cpu/simple/timing.cc
--- a/src/cpu/simple/timing.cc  Sun Feb 06 22:14:18 2011 -0800
+++ b/src/cpu/simple/timing.cc  Sun Feb 06 22:14:18 2011 -0800
@@ -325,26 +325,26 @@
         pkt1->makeResponse();
         completeDataAccess(pkt1);
     } else if (read) {
+        SplitFragmentSenderState * send_state =
+            dynamic_cast<SplitFragmentSenderState *>(pkt1->senderState);
         if (handleReadPacket(pkt1)) {
-            SplitFragmentSenderState * send_state =
-                dynamic_cast<SplitFragmentSenderState *>(pkt1->senderState);
             send_state->clearFromParent();
+            send_state = dynamic_cast<SplitFragmentSenderState *>(
+                    pkt2->senderState);
             if (handleReadPacket(pkt2)) {
-                send_state = dynamic_cast<SplitFragmentSenderState *>(
-                        pkt1->senderState);
                 send_state->clearFromParent();
             }
         }
     } else {
         dcache_pkt = pkt1;
+        SplitFragmentSenderState * send_state =
+            dynamic_cast<SplitFragmentSenderState *>(pkt1->senderState);
         if (handleWritePacket()) {
-            SplitFragmentSenderState * send_state =
-                dynamic_cast<SplitFragmentSenderState *>(pkt1->senderState);
             send_state->clearFromParent();
             dcache_pkt = pkt2;
+            send_state = dynamic_cast<SplitFragmentSenderState *>(
+                    pkt2->senderState);
             if (handleWritePacket()) {
-                send_state = dynamic_cast<SplitFragmentSenderState *>(
-                        pkt1->senderState);
                 send_state->clearFromParent();
             }
         }

[m5-dev] changeset in m5: garnet: Split network power in ruby.stats

2011-02-06 Thread Joel Hestness

changeset 3a02353d6e43 in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=3a02353d6e43
description:
garnet: Split network power in ruby.stats

Split out dynamic and static power numbers for printing to ruby.stats
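
The accumulation pattern in the diff below can be sketched on its own as
follows.  ToyLink, PowerSums and sumLinkPower are hypothetical names, and
the toy calculatePower() simply copies fixed inputs where the real code
calls into Orion.

```cpp
#include <vector>

// Toy link: calculatePower() returns the total and caches both components.
class ToyLink {
  public:
    ToyLink(double dyn, double sta) : rawDyn(dyn), rawSta(sta) {}

    double calculatePower() {
        powerDyn = rawDyn;
        powerSta = rawSta;
        return powerDyn + powerSta;
    }
    double dynamicPower() const { return powerDyn; }
    double staticPower() const { return powerSta; }

  private:
    double rawDyn, rawSta;                  // stand-ins for the model's results
    double powerDyn = 0.0, powerSta = 0.0;  // cached by calculatePower()
};

struct PowerSums {
    double total = 0.0;
    double dynamic = 0.0;
    double stat = 0.0;
};

// Sum total, dynamic and static power over all links.
inline PowerSums sumLinkPower(std::vector<ToyLink> &links)
{
    PowerSums s;
    for (ToyLink &l : links) {
        s.total += l.calculatePower();  // runs first: fills the cached parts
        s.dynamic += l.dynamicPower();
        s.stat += l.staticPower();
    }
    return s;
}
```

In this sketch the getters only return meaningful values after
calculatePower() has run, which is why the loop calls it first.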

diffstat:

 src/mem/ruby/network/garnet/fixed-pipeline/GarnetNetwork_d.cc |  12 +++
 src/mem/ruby/network/garnet/fixed-pipeline/NetworkLink_d.hh   |   5 
 src/mem/ruby/network/garnet/fixed-pipeline/Router_d.hh|   6 +
 src/mem/ruby/network/orion/NetworkPower.cc|   7 ++
 4 files changed, 30 insertions(+), 0 deletions(-)

diffs (111 lines):

diff -r 409a2692b8e6 -r 3a02353d6e43 
src/mem/ruby/network/garnet/fixed-pipeline/GarnetNetwork_d.cc
--- a/src/mem/ruby/network/garnet/fixed-pipeline/GarnetNetwork_d.cc Sun Feb 
06 22:14:19 2011 -0800
+++ b/src/mem/ruby/network/garnet/fixed-pipeline/GarnetNetwork_d.cc Sun Feb 
06 22:14:19 2011 -0800
@@ -319,16 +319,28 @@
 out  -  endl;
 
     double m_total_link_power = 0.0;
+    double m_dynamic_link_power = 0.0;
+    double m_static_link_power = 0.0;
     double m_total_router_power = 0.0;
+    double m_dynamic_router_power = 0.0;
+    double m_static_router_power = 0.0;
 
     for (int i = 0; i < m_link_ptr_vector.size(); i++) {
         m_total_link_power += m_link_ptr_vector[i]->calculate_power();
+        m_dynamic_link_power += m_link_ptr_vector[i]->get_dynamic_power();
+        m_static_link_power += m_link_ptr_vector[i]->get_static_power();
     }
 
     for (int i = 0; i < m_router_ptr_vector.size(); i++) {
         m_total_router_power += m_router_ptr_vector[i]->calculate_power();
+        m_dynamic_router_power += m_router_ptr_vector[i]->get_dynamic_power();
+        m_static_router_power += m_router_ptr_vector[i]->get_static_power();
     }
+    out << "Link Dynamic Power = " << m_dynamic_link_power << " W" << endl;
+    out << "Link Static Power = " << m_static_link_power << " W" << endl;
     out << "Total Link Power = " << m_total_link_power << " W " << endl;
+    out << "Router Dynamic Power = " << m_dynamic_router_power << " W" << endl;
+    out << "Router Static Power = " << m_static_router_power << " W" << endl;
     out << "Total Router Power = " << m_total_router_power << " W" << endl;
 out  -  endl;
 m_topology_ptr-printStats(out);
diff -r 409a2692b8e6 -r 3a02353d6e43 
src/mem/ruby/network/garnet/fixed-pipeline/NetworkLink_d.hh
--- a/src/mem/ruby/network/garnet/fixed-pipeline/NetworkLink_d.hh   Sun Feb 
06 22:14:19 2011 -0800
+++ b/src/mem/ruby/network/garnet/fixed-pipeline/NetworkLink_d.hh   Sun Feb 
06 22:14:19 2011 -0800
@@ -54,6 +54,8 @@
     int getLinkUtilization();
     std::vector<int> getVcLoad();
     int get_id(){return m_id;}
+    double get_dynamic_power(){return m_power_dyn;}
+    double get_static_power(){return m_power_sta;}
     void wakeup();
 
     double calculate_power();
@@ -73,6 +75,9 @@
     int m_link_utilized;
     std::vector<int> m_vc_load;
     int m_flit_width;
+
+    double m_power_dyn;
+    double m_power_sta;
 };
 
 #endif // __MEM_RUBY_NETWORK_GARNET_FIXED_PIPELINE_NETWORK_LINK_D_HH__
diff -r 409a2692b8e6 -r 3a02353d6e43 
src/mem/ruby/network/garnet/fixed-pipeline/Router_d.hh
--- a/src/mem/ruby/network/garnet/fixed-pipeline/Router_d.hhSun Feb 06 
22:14:19 2011 -0800
+++ b/src/mem/ruby/network/garnet/fixed-pipeline/Router_d.hhSun Feb 06 
22:14:19 2011 -0800
@@ -81,6 +81,9 @@
     double calculate_power();
     void calculate_performance_numbers();
 
+    double get_dynamic_power(){return m_power_dyn;}
+    double get_static_power(){return m_power_sta;}
+
   private:
     int m_id;
     int m_virtual_networks, m_num_vcs, m_vc_per_vnet;
@@ -100,6 +103,9 @@
     VCallocator_d *m_vc_alloc;
     SWallocator_d *m_sw_alloc;
     Switch_d *m_switch;
+
+    double m_power_dyn;
+    double m_power_sta;
 };
 
 #endif // __MEM_RUBY_NETWORK_GARNET_FIXED_PIPELINE_ROUTER_D_HH__
diff -r 409a2692b8e6 -r 3a02353d6e43 src/mem/ruby/network/orion/NetworkPower.cc
--- a/src/mem/ruby/network/orion/NetworkPower.ccSun Feb 06 22:14:19 
2011 -0800
+++ b/src/mem/ruby/network/orion/NetworkPower.ccSun Feb 06 22:14:19 
2011 -0800
@@ -206,6 +206,7 @@
  Pxbar_dyn +
  Pclk_dyn;
 
+    m_power_dyn = Ptotal_dyn;
 
     // Static Power
     Pbuf_sta = orion_rtr_ptr->get_static_power_buf();
@@ -215,6 +216,8 @@
 
     Ptotal_sta += Pbuf_sta + Pvc_arb_sta + Psw_arb_sta + Pxbar_sta;
 
+    m_power_sta = Ptotal_sta;
+
     Ptotal = Ptotal_dyn + Ptotal_sta;
 
     return Ptotal;
@@ -250,9 +253,13 @@
     double Plink_dyn = orion_link_ptr->calc_dynamic_energy(channel_width/2)*
         (m_link_utilized/ sim_cycles)*freq_Hz;
 
+    m_power_dyn = Plink_dyn;
+
     // Static Power
     double Plink_sta = orion_link_ptr->get_static_power();
 
+    m_power_sta = Plink_sta;
+
     double Ptotal = Plink_dyn + Plink_sta;
 
     return Ptotal;

Re: [m5-dev] Testing Functional Access

2011-03-01 Thread Joel Hestness

Hi Nilay,
  I don't know if there is a regression for it, but the M5 utility
(./util/m5/) sets up functional accesses to memory.  For instance, in FS, if
you specify an rcS script to fs.py and call
  % /sbin/m5 readfile
from the command line of the simulated system, it will read the specified
rcS file off the host machine's disk and send it to the memory of the
simulated system using functional accesses.  I think there are other
functional access examples in the magic that the M5 utility provides.
  Hope this helps,
  Joel
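
One way to picture the kind of check being asked about is a minimal, self-contained sketch: copy a host-side buffer into a toy memory with functional (untimed) writes, then read it back and compare. Everything here is an assumption made for illustration. ToyMemory, functionalWrite, functionalRead, copyHostBuffer, and readBackMatches are hypothetical names, not M5 classes or the code behind /sbin/m5 readfile.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for simulated physical memory (not an M5 class).
class ToyMemory
{
  public:
    explicit ToyMemory(std::size_t size) : data_(size, 0) {}

    // Functional write: state is updated immediately, no timing is modeled.
    bool functionalWrite(std::size_t addr, const std::uint8_t *src,
                         std::size_t len)
    {
        if (addr > data_.size() || len > data_.size() - addr)
            return false;  // reject out-of-range accesses
        std::copy(src, src + len, data_.begin() + addr);
        return true;
    }

    // Functional read: returns the current contents, again with no timing.
    bool functionalRead(std::size_t addr, std::uint8_t *dst,
                        std::size_t len) const
    {
        if (addr > data_.size() || len > data_.size() - addr)
            return false;
        std::copy(data_.begin() + addr, data_.begin() + addr + len, dst);
        return true;
    }

  private:
    std::vector<std::uint8_t> data_;
};

// Copy a host-side buffer (e.g. the text of an rcS script) into the toy
// memory in fixed-size chunks. chunk must be non-zero.
bool copyHostBuffer(ToyMemory &mem, std::size_t base,
                    const std::string &host, std::size_t chunk)
{
    const std::uint8_t *bytes =
        reinterpret_cast<const std::uint8_t *>(host.data());
    for (std::size_t off = 0; off < host.size(); off += chunk) {
        std::size_t len = std::min(chunk, host.size() - off);
        if (!mem.functionalWrite(base + off, bytes + off, len))
            return false;
    }
    return true;
}

// Read the region back functionally and compare it with the host copy.
bool readBackMatches(const ToyMemory &mem, std::size_t base,
                     const std::string &host)
{
    std::vector<std::uint8_t> buf(host.size());
    if (!mem.functionalRead(base, buf.data(), buf.size()))
        return false;
    return std::equal(buf.begin(), buf.end(), host.begin(),
                      [](std::uint8_t a, char b) {
                          return a == static_cast<std::uint8_t>(b);
                      });
}
```

A regression test along these lines would call copyHostBuffer and then require readBackMatches to return true. In the real simulator the interesting part is that the read-back has to see data wherever it currently lives, which this toy does not model.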



On Tue, Mar 1, 2011 at 8:51 AM, Nilay ni...@cs.wisc.edu wrote:

 How can I test whether or not functional accesses to the memory are
 working correctly? Do we have some regression test for this?

 Thanks
 Nilay


Re: [m5-dev] changeset in m5: garnet: removed flit_width from Routers

2011-04-29 Thread Joel Hestness

 OrionRouter(
 num_in_port,

Re: [m5-dev] [m5-users] Tracing does not work

2011-05-06 Thread Joel Hestness

Hey Nilay,
  It looks like the tracing (debug) functionality is working again,
but the M5 help message is still incorrect (and extremely misleading).  For
instance, trace-flags and trace-file are still accepted, but they no
longer do anything, so they should be removed from the message.  We're also
missing the equivalent of trace-start and trace-file.  Do you mind
cleaning that up?
  Thanks,
  Joel

PS.  I haven't followed the tracing/debugging thread closely enough, but it
seems like trace and debug should be different things (though they are
currently implemented as the same thing).  Is there a reason why we moved
over to debug?


On Fri, Apr 29, 2011 at 8:28 AM, Nilay Vaish ni...@cs.wisc.edu wrote:

 On Fri, 29 Apr 2011, Korey Sewell wrote:

  Is it not now debug-help and debug-flags instead of trace-help and
 trace-flags???

 On Fri, Apr 29, 2011 at 9:18 AM, Nilay Vaish ni...@cs.wisc.edu wrote:

  On Thu, 28 Apr 2011, Nilay wrote:

  On Thu, April 28, 2011 7:55 pm, Andrea Pellegrini wrote:



 Hi all,
 I just downloaded the latest repo from: http://repo.m5sim.org/m5
 When I activate the trace functionalities through the flags nothing
 shows
 up in the output. The same command for older versions of m5 (few weeks
 ago)
 worked flawlessly.
 Can anybody help?
 Thanks
 -Andrea


 Andrea, we are aware of the problem. The solution is almost ready, and
 hopefully by tomorrow trace would start functioning again.

 --
 Nilay


  Andrea, trace facility is working now. In fact it was fixed yesterday
 itself.


 --
 Nilay




 --
 - Korey


 That's right, the option names have been changed. But there was some error
 in the trace facility itself that Nate corrected yesterday.


 Nilay

Re: [m5-dev] [m5-users] Tracing does not work

2011-05-07 Thread Joel Hestness

Hey guys,
  I wasn't sure what the intended outcome of tracing vs. debugging was
going to be.  It sounds like the move is toward "debug" as the more general
term, with all of the trace functionality still supported.  In that case, my
confusion arose from the naming of the flags.  Since trace-file and
trace-start now go along with the other debug flags (i.e. you wouldn't use
them unless you're using the debug flags), it probably makes sense to rename
them to reflect the connection.  For instance, debug-trace-file and
debug-trace-start would be clearer and still reflect that you'll be
collecting a trace.
  Joel
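
To make the renaming idea concrete, here is a small sketch of a lookup that maps old option spellings to new ones and reports whether a deprecated spelling was used. It is only an illustration under stated assumptions: renamedOptions and canonicalOption are hypothetical helpers, not part of M5's option handling, and the debug-trace-file / debug-trace-start entries are just the names proposed above, not existing options.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical rename table. The first two renames are the ones discussed
// in this thread; the last two are only the proposal made above.
const std::map<std::string, std::string> &renamedOptions()
{
    static const std::map<std::string, std::string> table = {
        {"--trace-help", "--debug-help"},
        {"--trace-flags", "--debug-flags"},
        {"--trace-file", "--debug-trace-file"},    // proposed name only
        {"--trace-start", "--debug-trace-start"},  // proposed name only
    };
    return table;
}

// Returns the canonical option name, plus true if the caller used a
// deprecated spelling (so the front end can warn or reject it instead of
// silently accepting it).
std::pair<std::string, bool> canonicalOption(const std::string &name)
{
    const auto &table = renamedOptions();
    auto it = table.find(name);
    if (it == table.end())
        return {name, false};
    return {it->second, true};
}
```

With something like this, an old spelling such as --trace-flags would resolve to --debug-flags and set the deprecation flag, which addresses the complaint that the old names are accepted but silently do nothing.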


 I was thinking of doing it since Nate is not around. I'll do it soon.


  instance, trace-flags, and trace-file are still accepted, but they
  don't
  do anything now.  They should be eliminated from the message.  We're
 also
  missing the equivalent of trace-start and trace-file.  Do you mind
  cleaning that up?

 Are you sure that trace-file doesn't work?  I've basically renamed
 --trace-help to --debug-help, so the former can be removed.  Also I've
 renamed --trace-flags to --debug-flags, so that one can be removed
 too.  (I intended to, I just screwed up.)  The purpose of renaming
 trace flags to debug flags is that the flags themselves can be used
 for a lot more than tracing (I'm starting to use them to insert
 debugging breakpoints, they're used for exec trace which is really a
 different tracing facility, they can be used for whatever) and it
 seemed odd to have two different classes of flags (though we could do
 that if we wanted to).

 The only error that I know of right now is that --trace-help and
 --trace-flags still exist and silently act when they shouldn't.  I'm
 compiling right now, but things are slow on my laptop.  I'll test out
 --trace-file, but I'm not sure why that would have changed at all.

  Nate
