[m5-users] ALPHA_FS Kernel panic - not syncing: Aiee, killing interrupt handler!
Hi, I have built a couple of the PARSEC benchmarks to run under alpha-linux and have a disk image set up with these binaries and input files to run in M5 ALPHA_FS. I have run the complete blackscholes benchmark using 4 cores/4 threads with no problems. Currently, I am trying to scale up the number of processors to 64. Here is my command line:

build/ALPHA_FS/m5.opt --outdir=$OUTDIR --trace-flags=Bus --trace-start=17000 --trace-file=$OUTDIR/tracelog.txt configs/joel/fs.py --detailed --caches --l2cache --num-cpus=64 --fast-forward=8000 --script=./configs/joel/parsec/blackscholes.rcS

The first shot at running this resulted in an assertion failure that is documented in the mailing list thread: http://www.mail-archive.com/m5-users@m5sim.org/msg00107.html. Based on the recommendation from this thread, I commented out the assertion, rebuilt M5 and reran the benchmark (I am not sure if this is related to the next problem that I am having). Now, the benchmark fails with the following output:

Switched CPUS @ cycle = 28289857500
warn: Entering event queue @ 28289857500. Starting simulation...
Changing memory mode to timing
switching cpus
REAL SIMULATION
warn: Entering event queue @ 28289858000. Starting simulation...
panic: Halt not implemented! @ cycle 28289931750
[halt:build/ALPHA_FS/cpu/o3/alpha/cpu.hh, line 108]
Program aborted at cycle 28289931750
Abort

This seems to be the result of a kernel panic. Here are the last few lines of console.system.sim_console (the whole file can be found @ http://www.cs.utexas.edu/~hestness/links/console.system.sim_console):

[fc311c60] ret_from_fork+0x0/0x10
[fc33be34] fork_idle+0x54/0xb0
[fc345948] cpu_callback+0xa8/0x1b0
[fc33be08] fork_idle+0x28/0xb0
[fc31d858] __cpu_up+0x38/0x360
[fc310020] __smp_callin+0x0/0x28
[fc310020] __smp_callin+0x0/0x28
Code: a4850018 20650018 408305a1 f428 a4430008 a43e0050 b5a40008 b49e0050
Kernel panic - not syncing: Aiee, killing interrupt handler!
I have encountered this problem with both the kernel distributed with M5, and the kernel that I have built (linux-2.6.16.53 using gcc-4.0.2). If anyone has encountered this or something similar, I would really appreciate any help you can offer. Thanks, Joel ___ m5-users mailing list m5-users@m5sim.org http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
Re: [m5-users] ALPHA_FS Kernel panic - not syncing: Aiee, killing interrupt handler!
Thanks for the quick response. I checked out the version from the repository and rebuilt, and both errors went away. I am still working through some bugs to get a full run completed, but we'll save them for another thread. :P

Thanks again, Joel

On Wed, Jul 9, 2008 at 9:11 AM, Ali Saidi [EMAIL PROTECTED] wrote:

Some quick guesses:
* What version of M5 are you running? There were several cache bug fixes to b5, not just the one you pointed out. You really should be running the version in the repository.
* Are you using the tsb_osfpal from the website (or one you compiled yourself) as opposed to ts_osfpal?
* Is there a reason you're switching to a detailed CPU in the middle of booting? If I run your command line I get a lot further in the booting process before switching CPUs. With an atomic cpu (with or without caches) I can boot 64 processors to the prompt.

I would try M5 in the repository and see if that solves the problem. If not, comparing the atomic trace to the detailed timing trace would probably give you an idea about where it's going wrong.

Ali
[m5-users] Running M5 in Condor
Hi, Has anyone tried to run M5 under Condor? I am getting very basic errors like "ImportError: No module named m5.main" when running a test job. I am wondering if there are conflicts with the Python interpreter embedded in the simulator, since Condor doesn't support running Python scripts directly. Any help is greatly appreciated. Thanks, Joel
[m5-users] Instruction counts
Hi, I am looking at instruction counts that are output in m5stats.txt for a couple of simulations that I have run. I am using ALPHA_FS with the detailed core, and I am confused about the values that are output. In m5stats.txt, the value 'sim_insts' claims to be the number of instructions simulated. On further inspection, each of the cores also has commit statistics that include 'commit.COM:count', 'commit.commitCommittedInsts', 'committedInsts' and 'committedInsts_total'. I tried tracing through the code where these counters are updated, and some of them seem to be redundant. The problem that I am having is that when I sum any of these commit statistics over the set of cores, none of the sums equals 'sim_insts'. In fact, the difference is usually 5-10x. I am hoping someone could shed some light on the discrepancy, and let me know the purpose of all these seemingly redundant counters. Thanks, Joel
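The comparison described above (summing a per-core commit counter and checking it against 'sim_insts') can be scripted. The following is a minimal sketch, not part of M5: it assumes each statistic in m5stats.txt sits on its own line as `name value # description`, and that per-core counters are named like `system.cpu0.commit.COM:count`. The function names and the sample numbers are made up for illustration.

```python
import re

# Assumed line layout: "<stat name> <numeric value> [# description]".
STAT_LINE = re.compile(r"^(\S+)\s+(-?\d+(?:\.\d+)?)")


def parse_stats(text):
    """Return {stat name: value} for every line that looks like a statistic."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            stats[match.group(1)] = float(match.group(2))
    return stats


def sum_per_core(stats, suffix):
    """Sum every statistic whose name ends with '.' + suffix (one per core)."""
    return sum(v for name, v in stats.items() if name.endswith("." + suffix))


def compare_to_sim_insts(stats, suffixes):
    """Map each counter suffix to (sum over cores, ratio of that sum to sim_insts)."""
    total = stats["sim_insts"]
    report = {}
    for suffix in suffixes:
        core_sum = sum_per_core(stats, suffix)
        report[suffix] = (core_sum, core_sum / total if total else float("nan"))
    return report


# Invented sample in the assumed format, only to exercise the functions.
SAMPLE = """
sim_insts                       1000   # Number of instructions simulated
system.cpu0.commit.COM:count     300   # Number of instructions committed
system.cpu1.commit.COM:count     200   # Number of instructions committed
system.cpu0.committedInsts       300   # Number of Instructions Simulated
system.cpu1.committedInsts       200   # Number of Instructions Simulated
"""

REPORT = compare_to_sim_insts(
    parse_stats(SAMPLE), ["commit.COM:count", "committedInsts"]
)
```

A ratio far from 1.0 for every counter would reproduce the discrepancy reported in this message; the sketch only measures it and does not explain which counter 'sim_insts' is derived from.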
Re: [m5-users] Resend: Possible M5 Contributions
Hi Rick, Mark Gebhart, another grad student at UT, has been working on PARSEC v2.1 as a follow-on to what he and I were able to complete last fall. We are just about finished with modifications to the build scripts included in the package, and we can forward everything along when we polish it up. At this point, I think we have all benchmarks compiled, and 10-11 of the 13 working. More updates soon, Joel On Mon, Oct 5, 2009 at 12:02 PM, Rick Strong rstr...@cs.ucsd.edu wrote: Based on all the feedback, it seems that the parsec scripts are in highest demand. I will work on getting a patch out to the community in the next few days for everyone's perusal. Best, -Rick Ali Saidi wrote: I think the power model and parsec are probably the most interesting at first. Additionally, it might be nice to have the directory coherence, mesh, crossbar, and memory checker somewhere even if they aren't integrated into the repository and the code doesn't meet the style guidelines. Is the memory checkpointing a big change? It seems like that should make it in soon too. Thanks, Ali On Oct 2, 2009, at 2:10 PM, Rick Strong wrote: I made an indentation error and have resent my list. Note the correct indentation of Memory Debugger. At the encouragement of Nate, I have formed a list of possible contributions that I could make to the M5 community. Like many of you, time is against me so if any of these features sounds interesting to the community, please let me know and give me a relative order of interest. If none of the additions seem interesting, I will not be hurt as that means more development time ... ^_^. 
*Possible M5 Contributions*

* Parsec Simulation
  o shell script for running various workloads
  o tips in compiling parsec for alpha while using the general parsec build manager
  o Observed difficulties with the different input sets and solutions
  o Performance numbers to compare against
* Directory Coherence Model -- based on an upgraded version of Jiayuan Meng's work
  o MSI directory coherence model
  o Supports packet piggy backing for invalidate, fetch, writeback requests
  o Banked model exists -- Based on Jiayuan Meng's work
  o Verified to work in both the memory tester and Parsec benchmarks for simsmall and medium
* Mesh Model -- based on Jiayuan Meng's work
* Crossbar model
  o Fully connected network plugin that is a minor rewrite of the current bus model.
* Memory Debugger
  o Timeout for each memory access
  o Each memory access read is verified against a value stored in atomic memory blog.
  o Each memory write is recorded in the atomic memory blog
  o Inserts at the cpu_side of each L1 cache.
  o A separate memory debugger is plugged into IObus and each cpu core.
  o All memory debuggers connect to the same memory blog:
  o Memory blog uses the default PhysicalMemory object already in M5
* New alpha console
  o I have not had a large block of time to figure out how to remake all the regression values.
* Memory check pointing works for 2GB and greater
* McPAT power model support
  o New power model replacement for Wattch that offers greater fidelity and completeness to actual systems.
  o You can use output statistics to generate power numbers
  o I have a script that generates a structured xml file with the combined information of config.ini and m5stats.txt that contains performance counter information and system settings.

I am probably forgetting a few things here, but I think this includes most of the interesting changes that I could release. Let me know your thoughts.
Thanks in advance, Richard

-- Joel Hestness PhD Student, Computer Architecture Dept. of Computer Science, University of Texas - Austin http://www.cs.utexas.edu/~hestness
[m5-users] Creating/Modifying disk images without sudo or root access
Hi, I seem to recall reading in the mailing list about how to create and modify disk images without having sudo or root access on a machine. I have searched through the archives, and I can't find anything about it. Is there a way to do this? Thanks, Joel
Re: [m5-users] Creating/Modifying disk images without sudo or root access
Excellent. Thanks guys. I'll give these a shot. Joel

On Thu, Nov 5, 2009 at 12:27 AM, Will Beazley wgbeaz...@my.lamar.edu wrote:

Consider the following: I believe you can install VirtualBox with its additions with ubuntu/solaris/opensolaris without being root or on a laptop. Copy the image in through your VirtualBox shared directory, loop mount the image, make your changes, and copy it back through the shared directory. Might work.

Steve Reinhardt wrote:

On Wed, Nov 4, 2009 at 8:24 PM, nathan binkert n...@binkert.org wrote:

Since nobody has responded, I will. You can create a disk image without root access using dd to create a file and run fdisk on that file. As for editing it, you can't mount it if you're not root. Maybe some people suggested using the simulator itself or using a virtual machine.

If you can get a one-time assist from someone who is root, you could get them to put a line in /etc/fstab with the 'user' option set, maybe pointing to a symlink that you can point at the image you actually want to mount... I'm not positive this would work, but it's a thought. Steve
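The dd step mentioned above needs no special privileges, since it only writes an ordinary file. A minimal Python sketch of the same idea follows. The file name and size are arbitrary placeholders, and the function names are made up; partitioning the file (for example with fdisk, as suggested above) and putting a filesystem and files into it are not covered here.

```python
import os
import tempfile


def create_blank_image(path, size_bytes):
    """Create a raw image file of exactly size_bytes bytes and return its size.

    This plays the role of a dd from /dev/zero. truncate() extends the file,
    and on most systems the new area reads back as zero bytes (often stored
    as a sparse file), so nothing here needs root access.
    """
    with open(path, "wb") as image:
        image.truncate(size_bytes)
    return os.path.getsize(path)


def demo():
    """Build an 8 MiB image in a throwaway directory and report its size."""
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical file name, chosen only for illustration.
        image_path = os.path.join(workdir, "blank-disk.img")
        return create_blank_image(image_path, 8 * 1024 * 1024)
```

Calling `create_blank_image("blank-disk.img", 64 * 1024 * 1024)` in a directory you own would leave a 64 MiB file that can then be handed to fdisk. Mounting it still runs into the root restriction discussed in the thread.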
Re: [m5-users] What do I need to run my own benchmark on full system mode?
Hi, You need to tell the simulator to exit somewhere. We usually add the following line to the end of our .rcS files:

/sbin/m5 exit

Hope this helps, Joel

2009/11/10 junli gu guju...@gmail.com

Hi everybody, I have been stuck on running my own benchmark. I guess I am lost in the script files. My situation is:

1. I have compiled the full system. It can run the default benchmarks. I also installed M5term to check the running process.
2. I made my own benchmark and cross-compiled it into an alpha executable with the name 'hello'.
3. I copied the binary 'hello' into my disk image, in the folder 'benchmark'.
4. I made a run script called 'hello.rcS' and put it into $Home/m5/configs/example/fs.py. The content is the following:

#! /bin/sh
echo executing hello world
eval /benchmark/hello

5. I ran in full system mode:

build/ALPHA_FS/m5.opt -d /home/junligu/output configs/example/fs.py --script configs/boot/hello.rcS

I get the output of the benchmark through M5term, and then the simulator seems to be stuck, as shown here.

Output from M5term:

loading script...
executing hello world
hello world ! i= 9

Then the simulator has been stuck here for two days:

command line: build/ALPHA_FS/m5.opt -d /home/junligu/output configs/example/fs.py --script configs/boot/hello.rcS
Global frequency set at 1 ticks per second
info: kernel located at: /dist/m5/system/binaries/vmlinux
Listening for system connection on port 3456
0: system.tsunami.io.rtc: Real-time clock set to Thu Jan 1 00:00:00 2009
0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
REAL SIMULATION
info: Entering event queue @ 0. Starting simulation...

To my knowledge, fs.py describes the configuration of the simulated system, and the .rcS file is like a boot script describing how to run the benchmark. So I need to create my own configuration and boot scripts. But I believe the easiest way to start is to use the existing fs.py? Can anyone help me out? Thank you very much!

-- Gu Junli--谷俊丽 PHD Candidate of Tsinghua University Beijing 100084,China Tel: 86-10-62795139
[m5-users] no contention bus in detailed simulation
Hi, I recently pulled down the current m5-stable repository to do some testing, and I patched it with my changes from our group's older repository (versions below). When I try to run some of our tests, I get a segmentation fault. I have traced it back to our modification to the bus in ./src/mem/bus.cc. We had modified calcPacketTiming so that we could simulate a zero-latency/no-contention bus in detailed simulation (recvTiming). I am not sure where to start debugging this problem. I have attached the gdb output. Is there, perhaps, a better way to simulate a bus with no contention in detailed simulation? Thanks, Joel

Revisions:
Old: changeset: 5589:733318abb7b1
Current: changeset: 6283:94c016415053

gdb --args ./build/ALPHA_FS/m5.debug --trace-flags=Bus --trace-start=182876000 --trace-file=tracelog.txt ./configs/example/fs.py --script=../m5-parsec/bash-m5/test.rcS --detailed --caches --l2cache --fast-forward=100

j...@capillary:~/research/m5-stable$ gdb --args ./build/ALPHA_FS/m5.debug --trace-flags=Bus --trace-start=182876000 --trace-file=tracelog.txt ./configs/example/fs.py --script=../m5-parsec/bash-m5/test.rcS --detailed --caches --l2cache --fast-forward=1
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as i486-linux-gnu...
(gdb) run
Starting program: /home/joel/research/m5-stable/build/ALPHA_FS/m5.debug --trace-flags=Bus --trace-start=182876000 --trace-file=tracelog.txt ./configs/example/fs.py --script=../m5-parsec/bash-m5/test.rcS --detailed --caches --l2cache --fast-forward=1
[Thread debugging using libthread_db enabled]
[New Thread 0xb7a9d8e0 (LWP 3241)]
M5 Simulator System

Copyright (c) 2001-2008 The Regents of The University of Michigan
All Rights Reserved

M5 compiled Nov 27 2009 01:36:44
M5 revision 94c016415053+ 6283+ default tip
M5 started Nov 27 2009 01:40:21
M5 executing on capillary
command line: /home/joel/research/m5-stable/build/ALPHA_FS/m5.debug --trace-flags=Bus --trace-start=182876000 --trace-file=tracelog.txt ./configs/example/fs.py --script=../m5-parsec/bash-m5/test.rcS --detailed --caches --l2cache --fast-forward=1
Global frequency set at 1 ticks per second
info: kernel located at: /home/joel/research/m5-parsec/binaries/vmlinux
Listening for system connection on port 3456
0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
Switch at instruction count:1
info: Entering event queue @ 0. Starting simulation...
hack: be nice to actually delete the event here
Switched CPUS @ cycle = 1828747071000
Changing memory mode to timing
switching cpus
REAL SIMULATION
info: Entering event queue @ 1828747071000. Starting simulation...

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb7a9d8e0 (LWP 3241)]
0x080befe0 in std::list<Packet*, std::allocator<Packet*> >::empty (this=0x4) at /usr/include/c++/4.3/bits/stl_list.h:759
759 { return this->_M_impl._M_node._M_next == &this->_M_impl._M_node; }
(gdb) info stack
#0 0x080befe0 in std::list<Packet*, std::allocator<Packet*> >::empty (this=0x4) at /usr/include/c++/4.3/bits/stl_list.h:759
#1 0x080b8f03 in BasePrefetcher::getPacket (this=0x0) at build/ALPHA_FS/mem/cache/prefetch/base.cc:134
#2 0x08112493 in Cache<LRU>::getNextMSHR (this=0x9d655a0) at build/ALPHA_FS/mem/cache/cache_impl.hh:1307
#3 0x08112532 in Cache<LRU>::getTimingPacket (this=0x9d655a0) at build/ALPHA_FS/mem/cache/cache_impl.hh:1325
#4 0x08112744 in Cache<LRU>::MemSidePort::sendPacket (this=0x9d93790) at build/ALPHA_FS/mem/cache/cache_impl.hh:1518
#5 0x0811364e in Cache<LRU>::MemSidePort::processSendEvent (this=0x9d93790) at build/ALPHA_FS/mem/cache/cache_impl.hh:1575
#6 0x0810a41c in EventWrapper<Cache<LRU>::MemSidePort, &(Cache<LRU>::MemSidePort::processSendEvent())>::process (this=0x9456df0) at build/ALPHA_FS/sim/eventq.hh:468
#7 0x082233de in EventQueue::serviceOne (this=0x884eeb8) at build/ALPHA_FS/sim/eventq.cc:202
#8 0x08587dcc in simulate (num_cycles=9223372036854775807) at build/ALPHA_FS/sim/simulate.cc:73
#9 0x085fa28d in _wrap_simulate__SWIG_0 (args=0x9435acc) at build/ALPHA_FS/python/swig/event_wrap.cc:4156
#10 0x085fa395 in _wrap_simulate (self=0x0, args=0x9435acc) at build/ALPHA_FS/python/swig/event_wrap.cc:4206
#11 0xb7dadaed in PyCFunction_Call () from /usr/lib/libpython2.6.so.1.0
#12 0xb7d6798c in PyObject_Call () from /usr/lib/libpython2.6.so.1.0
#13 0xb7e0d0b5 in PyEval_EvalFrameEx () from /usr/lib/libpython2.6.so.1.0
#14 0xb7e11910
Re: [m5-users] no contention bus in detailed simulation
and the attachment...

On Fri, Nov 27, 2009 at 6:31 PM, Joel Hestness hestn...@cs.utexas.edu wrote:

Hi Steve, Thanks for the quick response. I tried the m5 repository and ran into the same issue. I have attached the updated src/mem/bus.cc that we had in our old source. The changes that cause the problem are in calcPacketTiming, updating tickNextIdle and the packet firstWordTime and finishTime. Thanks, Joel

On Fri, Nov 27, 2009 at 4:14 PM, Steve Reinhardt ste...@gmail.com wrote:

This bug may or may not be related to your changes; there were a number of prefetcher bugs that I worked through a while ago and I'm not sure if they're in m5-stable or not. In general, m5-stable is pretty stale and we haven't been very good about pushing bug fixes there, so if you're trying to bring your code up to date it's much better to go with the main m5 repository than m5-stable. If you still have trouble at that point, let us know. Steve

Attachment: bus.cc (binary data)
[m5-users] ALPHA crosscompiler info
Hi, On UTCS systems, we have always had trouble using the precompiled crosscompilers on the M5 site, as we have run into issues with directory structure or host system architecture. However, we have been able to build our own crosscompilers using crosstool to get by. We noticed that the M5 site has gcc-4.3.2 available in the downloads section, but we were hoping to get more information about the build, so that we can replicate it. Can someone let us know which versions of glibc, binutils, linux headers and patches were used to build each of the crosscompilers? Thanks, Joel
[m5-users] Infrastructure for running PARSEC 2.1 on M5
Hi everyone, Mark Gebhart led a small group, including myself, to document the process of building PARSEC 2.1 to run under M5 ALPHA. In the process, we put together a technical report for building PARSEC 2.1, a disk image and other infrastructure, which are all available on our site: http://www.cs.utexas.edu/~cart/parsec_m5/ . Let us know if you have additions or questions. Thank you, Joel
[m5-users] Linux v2.6.27 limits the number of cores to 32
Hi, I am running into an issue with running applications in ALPHA_FS on more than 32 cores. The cores of index greater than or equal to 32 sit and spin, as indicated by their uniformly small number of committed instructions, during parallel sections of benchmarks with more than 32 threads. The problem exists in kernel v2.6.27, which we built using the method described on the M5 site. I have done some testing, and I have identified that our kernels, v2.6.13 and v2.6.16, both schedule threads on all 64 cores. Further, the file `.config.m5' contains a line to configure the maximum number of cores, which is different for different linux versions:

2.6.13, 2.6.16: CONFIG_NR_CPUS=64
2.6.22, 2.6.27: CONFIG_NR_CPUS=32

I've tried building a few different versions, and each asks "Use M5 64 Processor Tsumani Modification (BIG_TSUNAMI) [Y/n/?]", but setting this doesn't seem to have an effect on kernels 2.6.22 or 2.6.27. The config script restricts the CONFIG_NR_CPUS flag to 2-32 cores for kernel versions 2.6.22 and 2.6.27. Can someone give some insight into how to fix this so the v2.6.27 scheduler works with 64 cores? Thank you, Joel
Re: [m5-users] Linux v2.6.27 limits the number of cores to 32
Hi, It appears that I have fixed this issue by simply changing the range restriction on CONFIG_NR_CPUS from 2-32 to 2-64 cores in the file `./arch/alpha/Kconfig', and then setting CONFIG_NR_CPUS=64 in the file `.config.m5' before building. I would really appreciate it if someone more familiar with M5 kernel modifications could verify that this is a valid correction and fix the M5 linux-patches repo to include this change. The affected versions of linux-2.6 are v2.6.22 and v2.6.27. Thanks, Joel
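The two edits described in this message can be sketched as plain text substitutions. This is only an illustration under stated assumptions: it assumes the NR_CPUS entry in arch/alpha/Kconfig contains a line reading `range 2 32`, and that `.config.m5' contains a line `CONFIG_NR_CPUS=32`. The exact contents of those files are not shown in this thread, and the function names are made up.

```python
import re


def widen_nr_cpus_range(kconfig_text, new_max=64):
    """Raise the upper bound of the 'range 2 N' line under 'config NR_CPUS'."""
    out = []
    in_nr_cpus = False
    for line in kconfig_text.splitlines():
        stripped = line.strip()
        if re.match(r"^(menu)?config\s+\S+", stripped):
            # A new config entry starts here; note whether it is NR_CPUS.
            in_nr_cpus = stripped.split()[1] == "NR_CPUS"
        elif in_nr_cpus and re.fullmatch(r"range\s+2\s+\d+", stripped):
            # Keep the original indentation, change only the upper bound.
            indent = line[: len(line) - len(line.lstrip())]
            line = f"{indent}range 2 {new_max}"
        out.append(line)
    return "\n".join(out) + "\n"


def set_nr_cpus(config_text, value=64):
    """Rewrite the CONFIG_NR_CPUS=<n> line of a kernel .config-style file."""
    return re.sub(
        r"^CONFIG_NR_CPUS=\d+$",
        f"CONFIG_NR_CPUS={value}",
        config_text,
        flags=re.MULTILINE,
    )


# Invented fragments in the assumed layout, only to exercise the functions.
SAMPLE_KCONFIG = (
    "config SMP\n"
    '\tbool "Symmetric multi-processing support"\n'
    "\n"
    "config NR_CPUS\n"
    '\tint "Maximum number of CPUs (2-32)"\n'
    "\trange 2 32\n"
    "\tdepends on SMP\n"
)
SAMPLE_CONFIG = "CONFIG_SMP=y\nCONFIG_NR_CPUS=32\n"

PATCHED_KCONFIG = widen_nr_cpus_range(SAMPLE_KCONFIG)
PATCHED_CONFIG = set_nr_cpus(SAMPLE_CONFIG)
```

The sketch leaves the prompt string untouched and changes only the `range` line, since the range is what rejects values above 32. Whether a 64-core limit is otherwise safe for these kernel versions is exactly the question the message asks the list to verify.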
Re: [m5-users] PARSEC 2.1 with m5 - segfault!
Excellent, I'm glad that worked. Also keep in mind that your crosscompiler might need to be updated to reflect which version of the kernel you are using. More information on crosscompilation is in the tech report. Good luck, Joel

On Wed, Mar 24, 2010 at 6:04 PM, Bhushan mo...@cs.virginia.edu wrote:

Ah! The problem was that I was using the kernel from the m5 website instead of the kernel packaged along with the tech report. That was the reason for the crash. Using the correct kernel did the trick. Thanks!

On Wed, Mar 24, 2010 at 5:10 PM, ef snorla...@gmail.com wrote:

What disk image are you using as well? You need to make sure the libraries in the disk image you are running are compatible with the benchmark's libc (your compiler) as well, as libc is not entirely statically linked.

On Wed, Mar 24, 2010 at 2:44 PM, Joel Hestness hestn...@cs.utexas.edu wrote:

Hi Bhushan, Can you attach the output from the simulation and let us know which version of the linux kernel you are using? Thanks, Joel

On Wed, Mar 24, 2010 at 2:02 PM, Bhushan mo...@cs.virginia.edu wrote:

Hi, I'm a newbie to m5 and trying to run PARSEC 2.1 on m5 under alpha in full system mode. First, I would like to thank the people at UT Austin for publishing a tech report on this; it was very helpful. Now coming to the issue: even though I was able to boot the kernel in m5, when I try to run any application in the parsec suite, the app segfaults. Here are some more details:

# ./build/ALPHA_FS/m5.opt ./configs/example/fs.py -n 1 --detailed --caches --l2cache

and once the kernel boots, I do (as was mentioned in the tech report):

/parsec/install/bin/blackscholes 64 /parsec/install/inputs/blackscholes/in_64K.txt /parsec/install/inputs/blackscholes/prices.txt

This segfaults. The stack trace after the crash looks like this:

#0 0x000120016d8c in __libc_message ()
#1 0x00012001714c in __libc_fatal ()
#2 0x00012000c9cc in __libc_start_main ()
#3 0x00012218 in _start ()

This happens with all other applications in the parsec suite. However, a simple hello world program does run correctly. My guess is that there are some library incompatibilities which cause this crash. I used the kernel available on the m5 website (http://www.m5sim.org/wiki/index.php/Download) under the Full System files section. I used the disk image from the UT Austin website (http://www.cs.utexas.edu/~parsec_m5/linux-parsec-2-1-m5-with-test-inputs.img.bz2) for the parsec suite. So how do I go about fixing this and making PARSEC 2.1 work correctly with m5?

-- Regards, Bhushan
[m5-users] Fwd: Checkpointing PARSEC benchmark in m5.
Hi Bhushan, We haven't run across this issue in the context of checkpointing using our infrastructure. However, I did have this problem when I was trying to modify M5, and I believe it's caused by scheduling issues in the event queue. I have a couple of questions: 1) Have you made any modifications to M5? If so, are they timing related? 2) Are you using the same simulated system and kernel image? Thanks, Joel -- Forwarded message -- From: Mark Gebhart mgebh...@cs.utexas.edu Date: Sun, Apr 4, 2010 at 8:24 PM Subject: Re: [m5-users] Checkpointing PARSEC benchmark in m5. To: Joel Hestness jthestn...@gmail.com Hey Joel, I haven't ever seen this error. From his commands, it looks like he is doing everything right, so I am not sure why it is crashing. Mark On Sun, Apr 4, 2010 at 8:16 PM, Joel Hestness jthestn...@gmail.com wrote: Hey Mark, Do you have any insights on this? Thanks, Joel -- Forwarded message -- From: Bhushan mo...@cs.virginia.edu Date: Sun, Apr 4, 2010 at 7:22 PM Subject: Re: [m5-users] Checkpointing PARSEC benchmark in m5. To: M5 users mailing list m5-users@m5sim.org Any thoughts on the crash that I have mentioned? On Fri, Apr 2, 2010 at 6:10 PM, ef snorla...@gmail.com wrote: A good explanation of ROI and checkpoints can be found here (also check out the PARSEC benchmark code): http://www.cs.utexas.edu/~parsec_m5/TR-09-32.pdf Basically, the ROI is the region where all of the parallel portion of execution is done, excluding startup and exit code. I've also heard (take this with a grain of salt) that running 64 cores and trying to restore a checkpoint is a bit buggy. I think other people have had similar problems. On Fri, Apr 2, 2010 at 4:03 PM, Bhushan mo...@cs.virginia.edu wrote: Hi, I'm trying to use the checkpoint feature in m5 for the benchmarks in the PARSEC suite. In the first run, the checkpoint gets created, and in the second run, when I try to run in detailed mode using the restore checkpoint option, I get some errors. first run - creating checkpoint - successful.
# ./build/ALPHA_FS/m5.opt ./configs/example/fs.py -n 1 --script=./scripts/blackscholes_64c_simdev_ckpts.rcS second run - running in detailed mode: # ./build/ALPHA_FS/m5.opt ./configs/example/fs.py --detailed --caches --l2cache --checkpoint-restore=1 -n 1 .. .. Switch at curTick count:1 info: Entering event queue @ 2254485270500. Starting simulation... m5.opt: build/ALPHA_FS/sim/simulate.cc:68: SimLoopExitEvent* simulate(Tick): Assertion `curTick <= mainEventQueue.nextTick() && "event scheduled in the past"' failed. Program aborted at cycle 2254485270500 The benchmarks in the PARSEC suite run fine if I do not use the checkpointing feature. Also, I have been trying to understand how exactly checkpointing is invoked. How does m5 know where the ROI starts? Where (in the scripts) does m5 create a checkpoint? If these questions sound repetitive, could anyone point me to the mailing list discussions that explain checkpointing (references to checkpointing in the mail archive seem to explain specific cases rather than how it works in general)? -- Regards, Bhushan
Re: [m5-users] PARSEC multi core
Hi Sheng, The script generator that we distribute for running these puts the switchcpu command in each of the run scripts. If you haven't specified a set of CPUs for M5 to switch to when the command is called, it will exit. Our normal usage of M5 with the PARSEC run scripts uses atomic CPU simulation to fast-forward through Linux boot, and then it switches to the detailed CPUs just before running the benchmark (hence the switchcpu command). If you want to do this, you will need to specify that you want to fast-forward using the command line parameter to the fs.py script. If you would rather not fast-forward, but still run the PARSEC benchmark with this run script, you will need to comment out the switchcpu command. Hope this helps, Joel On Mon, Apr 26, 2010 at 11:54 AM, sheng qiu herbert1984...@gmail.com wrote: Hi all, I have a question about running PARSEC with 8 cores using M5. My script is: #!/bin/sh # File to run the blackscholes benchmark cd /parsec/install/bin /sbin/m5 switchcpu /sbin/m5 dumpstats /sbin/m5 resetstats ./blackscholes 8 /parsec/install/inputs/blackscholes/in_4K.txt /parsec/install/inputs/blackscholes/prices.txt echo Done :D /sbin/m5 exit /sbin/m5 exit My command line is: build/ALPHA_FS/m5.opt configs/example/fs.py --script=/../../blackscholes.rcS --caches --l2cache While running, it constantly shows this message: warn: clear IPI for CPU=#num, but NO IPI. Is this normal? If I change to 4 cores, this message appears at the end of the run: Exiting @ cycle 2267922695500 because switchcpu. Is this a normal ending? Thanks, Sheng
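The two options Joel describes (keep switchcpu and fast-forward, or comment switchcpu out) can be illustrated with a small Python sketch that writes the run script either way. The helper name make_run_script and its arguments are hypothetical and are not part of the distributed script generator; the /sbin/m5 commands are copied from the script quoted in this thread.

```python
def make_run_script(benchmark_cmd, fast_forward=True):
    """Return the text of a run script for one benchmark command.

    Hypothetical helper for illustration only; the m5 commands mirror
    the blackscholes script posted in the thread.
    """
    lines = [
        "#!/bin/sh",
        "cd /parsec/install/bin",
    ]
    if fast_forward:
        # The simulator was started with fast-forwarding, so it has CPUs
        # to switch to when this command is reached.
        lines.append("/sbin/m5 switchcpu")
    else:
        # No fast-forwarding: keep the line only as a comment so the
        # simulation does not exit here "because switchcpu".
        lines.append("# /sbin/m5 switchcpu")
    lines += [
        "/sbin/m5 dumpstats",
        "/sbin/m5 resetstats",
        benchmark_cmd,
        "echo Done :D",
        "/sbin/m5 exit",
        "/sbin/m5 exit",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    cmd = ("./blackscholes 8 /parsec/install/inputs/blackscholes/in_4K.txt "
           "/parsec/install/inputs/blackscholes/prices.txt")
    print(make_run_script(cmd, fast_forward=False))
```

With fast_forward=False the script runs the benchmark on whatever CPU model the simulation started with, which matches the "comment out the switchcpu command" suggestion.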
Re: [m5-users] PARSEC multi core
Hi Sheng, I'm not exactly sure what the problem is here, but it could be the case that the Linux kernel that you are using will only support 4 cores. There is a Linux kernel supporting up to 64 cores on our website that you can try to see if that fixes the issue: http://userweb.cs.utexas.edu/~cart/parsec_m5/. Good luck, Joel On Mon, Apr 26, 2010 at 3:27 PM, sheng qiu herbert1984...@gmail.com wrote: Hi Joel, It now works when the core count is no more than 4. But when I set more than 4 cores, it constantly shows the message: clear IPI for CPU #num, but NO IPI. The system.terminal output shows that booting stopped at this point: Slave CPU 4 console command START SlaveCmd: restart FC310020 FC310020 vptb FFFE my_rpb FC018B80 my_rpb_phys 18B80 Is there anything wrong? I downloaded the PARSEC disk image from the website: http://userweb.cs.utexas.edu/~cart/parsec_m5/ Thanks, Sheng
Re: [m5-users] DPRINT information of PARSEC's ROI
If you're just interested in the ROI rather than the whole benchmark, you could also use checkpointing functionality with the PARSEC hooks. Run the application with tracing off and checkpoint at the beginning of the ROI, and then rerun the application starting from the checkpoint with tracing enabled. It's a 2-step process, but it might be easier than writing code to do it. Hope this helps, Joel On Thu, May 20, 2010 at 12:04 PM, Gabriel Michael Black gbl...@eecs.umich.edu wrote: I'm pretty sure there isn't any way to do that with the code as is. After Googling a little, it looks like you could use the PARSEC hooks to run a new M5 pseudo instruction that turns tracing on or off. There are probably other ways to do that, although I expect you'll have to write some code no matter what. Gabe Quoting sheng qiu herbert1984...@gmail.com: Hi all, I am using the PARSEC benchmarks on M5, and I want to DPRINT some information for the ROI of PARSEC applications. How can I make M5 DPRINT information only for the ROI? Thanks, Sheng
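As a rough sketch of the 2-step process Joel outlines, the Python helper below builds the two command lines: one untraced run whose run script takes a checkpoint at the start of the ROI, and one traced run restored from that checkpoint. The helper name roi_trace_commands is hypothetical. The options used (--script, --trace-flags, --trace-file, --checkpoint-restore) are the ones that appear in command lines elsewhere in this archive; whether your M5 version accepts them is an assumption to check, since a later thread notes the trace options were renamed to debug options.

```python
def roi_trace_commands(binary, config, ckpt_script, trace_flags, trace_file):
    """Build the two command lines for tracing only the ROI.

    Hypothetical helper for illustration. It assumes ckpt_script is a
    run script that checkpoints at the beginning of the ROI, and that
    the resulting checkpoint is the first one in the output directory.
    """
    # Step 1: run with tracing off; the run script creates the checkpoint.
    checkpoint_run = [binary, config, "--script=" + ckpt_script]

    # Step 2: restore from checkpoint 1 with tracing enabled. Simulator
    # options go before the config script, config options go after it.
    trace_run = [
        binary,
        "--trace-flags=" + trace_flags,
        "--trace-file=" + trace_file,
        config,
        "--checkpoint-restore=1",
    ]
    return checkpoint_run, trace_run


if __name__ == "__main__":
    first, second = roi_trace_commands(
        "build/ALPHA_FS/m5.opt",
        "configs/example/fs.py",
        "./scripts/blackscholes_64c_simdev_ckpts.rcS",
        "Bus",
        "tracelog.txt",
    )
    print(" ".join(first))
    print(" ".join(second))
```

The commands are returned as argument lists so they can be passed to subprocess.run directly or joined for display.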
Re: [m5-users] [m5-dev] Regression tests for X86
Yep. By eliminating the Linux init, boot time of a single CPU dropped from ~12 hours down to ~2. I tried applying the patches from the M5 Linux patch queue, and a couple for v2.6.27 worked out of the box. From there, I ported a delay loop patch (maybe two?) from the queue for x86_64, and that cut boot time down to about 70 minutes (for perspective, ALPHA_FS boot time on this machine takes about 11 minutes). I think there are some more patches in the queue that might also help the boot time, but they probably also need to be ported. Once I got down to a tolerable iteration time, I moved on to other things (like checkpointing, which will hopefully be ready soon :] ). I can dig out my build configs and give some more gory detail on the patches if anyone's interested. I'm not sure when I will have time to circle back on the patches. Joel On Sun, Aug 15, 2010 at 11:49 PM, Gabe Black gbl...@eecs.umich.edu wrote: nathan binkert wrote: Aside from building a Linux kernel, you will need to build and configure a disk image as well, which is also a fair amount of work. I've found that, unfortunately due to the long simulation time of Linux boot up, the iteration time to debug the X86_FS bootup is quite long. Really, bootup is quite long? It's pretty fast on Alpha and I wouldn't expect that much more to be going on in x86. There are a lot of delay loops during bootup that slow things down. Have you elided all of those? Nate ___ m5-dev mailing list m5-...@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev Just to clarify, multicore bootup of an unmodified kernel -does- work, at least in the limited circumstances I've tried it (atomic CPU, a particular version of Linux, etc.). Please correct me if I'm wrong, Joel, but I'm guessing you mean multicore support with your modified kernel, right? There are patches out there which cut out the delay loops, but I don't know if they've actually made it anywhere people can get at them. 
I wasn't confident they were correct at the time since I didn't know how to really test them thoroughly, so I never pushed them upstream. They do make a noticeable difference, but it's not night and day. If you boot Linux to the end of kernel initialization where it starts the first user process, the boot time isn't too bad. If you leave in all the init scripts that start up the various services turned on by default in a stage 3 Gentoo image, the time to a login is quite substantial. If that's what you're doing, you could save *a lot* of time by getting rid of the unnecessary scripts. I've never attempted this myself, but I'm guessing it isn't too bad. Gabe
Re: [m5-users] Some of PARSEC benchmarks never end.
Hi Aleksei, I've used these benchmarks fairly extensively, and I've only experienced problems with hangs in very old versions of M5 when using the O3 CPU model. Keep in mind that some of these benchmarks can run for a *long* time, so rather than hanging, you might just be experiencing (perceptibly) very slow simulation. Can you tell us more about your configuration (cache hierarchy, core type)? Unfortunately, I don't have much experience with Ruby, but I know that certain cache configurations and coherence protocols work. Brad Beckmann might be able to give a better update there. Joel On Sun, Sep 5, 2010 at 12:27 AM, Lesha Jolondz aleksei.jolo...@gmail.com wrote: Hi, I am having trouble running the PARSEC benchmarks in FS mode as described at PARSEC v2.1 for M5 (http://userweb.cs.utexas.edu/~parsec_m5/). When I run the benchmarks in a 4-core configuration, some of them just hang, depending on timing. It is probably because of a problem in the current memory system: a memory request gets lost. Has anybody encountered the same problem? Any solutions? Patches to the memory system? As far as I know, you are currently working on a new memory system called Ruby. When do you expect it to be stable? Thanks in advance, Aleksei
Re: [m5-users] What is a reasonable size for memory under ALPHA_FS?
Additionally, in FS mode, if you're running benchmarks with very large memory footprints, you can mount a swap space disk to make the system more realistic (as opposed to using exorbitant amounts of simulated memory). Before running the benchmark, just run: % /sbin/swapon /dev/hdc Joel On Wed, Oct 13, 2010 at 3:20 PM, Steve Reinhardt ste...@gmail.com wrote: There are bugs in the O3 model such that restoring directly from a checkpoint into O3 doesn't work. That's why the standard-switch model exists. I don't think it has anything to do with the memory size. Steve On Wed, Oct 13, 2010 at 11:54 AM, Lide Duan leaderd...@gmail.com wrote: Hi, I noticed that the default memory size set in Benchmarks.py is 128MB. Isn't that too small for reasonable simulations? Previously, when I was using ALPHA_SE, physmem was set to 2GB, and the simulation ran well. In FS mode, however, if 2GB is used, booting up Linux (with the atomic CPU) becomes extremely slow; if 1GB or 512MB is used, I can boot up the OS, start the program and make a checkpoint successfully. However, restoring from the checkpoint directly with the detailed CPU (--detailed) gives me a segmentation fault. The interesting thing is that if I restore the checkpoint with the atomic CPU and then switch to the timing and detailed ones (--standard-switch), the simulation runs well. With the default value of 128MB, both --detailed and --standard-switch run. I am confused by this observation. Am I missing anything here? What is a reasonable memory size in FS mode (say, for PARSEC programs)? Thanks, Lide
Re: [m5-users] Cross Compile Linux Kernel for M5 simulator
Hi wj, Double check that alphaev67-unknown-linux-gnu-gcc is actually found in your path by using: % which alphaev67-unknown-linux-gnu-gcc It should print the directory where it is; otherwise, it hasn't been included in your path correctly. Joel On Tue, Dec 14, 2010 at 5:26 AM, Ong Wen Jian wen.jian.o...@gmail.com wrote: Hi everyone, I'm trying to cross-compile a Linux kernel for the M5 simulator and add a GPU driver to the kernel, as I'm modelling a GPU device plugin for the M5 simulator. When I try to cross-compile, I receive the error shown below. I set my PATH to this: /home/ongwenjian/x-tools/alphaev67-unknown-linux-gnu/bin ongwenj...@ongwenjian-vpccw25fg:~/Desktop/crosstool-ng-1.8.2/targets/src/linux-2.6.27.48$ sudo make ARCH=alpha CROSS_COMPILE=alphaev67-unknown-linux-gnu- vmlinux make: alphaev67-unknown-linux-gnu-gcc: Command not found CHK include/linux/version.h CHK include/linux/utsrelease.h CC kernel/bounds.s /bin/sh: alphaev67-unknown-linux-gnu-gcc: not found make[1]: *** [kernel/bounds.s] Error 127 make: *** [prepare0] Error 2 Is there any way to correct this error? Regards, wj ONG WEN JIAN Student Department of Computer and Communication Systems Engineering, Faculty of Engineering, Universiti Putra Malaysia 43400 UPM Serdang, Selangor Darul Ehsan Tel : 014 - 930 2150 / 017 - 613 6231
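The same check that Joel suggests with which can be done from Python before starting a long kernel build. This is only a sketch: the helper name find_cross_compiler is hypothetical, and the default prefix is simply the CROSS_COMPILE value from the thread. It uses shutil.which from the Python standard library, which searches the PATH of the current process.

```python
import shutil


def find_cross_compiler(prefix="alphaev67-unknown-linux-gnu-", path=None):
    """Return the full path of <prefix>gcc, or None if it is not found.

    Hypothetical helper for illustration. When path is None, the PATH of
    the current process is searched, like the shell's `which`.
    """
    return shutil.which(prefix + "gcc", path=path)


if __name__ == "__main__":
    location = find_cross_compiler()
    if location is None:
        print("cross compiler not found in PATH; check your PATH setting")
    else:
        print("cross compiler found at", location)
```

Note that the check only reflects the environment it runs in. The build in the thread was started with sudo, and sudo may run commands with a different PATH than your login shell; that possibility is not confirmed in the thread, but running the check under the same conditions as the build would show it.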
Re: [m5-users] Cannot resume checkpoint
Hi Sheng, Did you collect the checkpoints from a simulated system with 512MB of memory? The checkpoints encode the current state of memory in the simulated system including the capacity, so you'll need to make sure that the simulated system in both runs (to collect the checkpoint and to restore from it) use the same amount of simulated memory. More generally, an M5 checkpoint is specific to the ISA/architecture, number of cores, and the capacity of memory in the simulated system that you collect the checkpoint from. Hope this helps, Joel On Wed, Feb 9, 2011 at 12:41 PM, Sheng Li sheng@gmail.com wrote: After spending several hours to guess what was wrong, here are my findings: It seems that if I set PhysicalMemory as 512MB, checkpointing can work. However, if I set it as 4096MB (I did this because SPECCPU2006 requires at least 2GB free memory), checkpoint will not work. The place I changed this is in common/example/se.py system = System(cpu = [CPUClass(cpu_id=i) for i in xrange(np)], physmem = PhysicalMemory(range=AddrRange(4096MB)), membus = Bus(), mem_mode = test_mem_mode) Could anyone give some suggestions? Thanks! -Sheng On Wed, Feb 9, 2011 at 12:05 AM, Sheng Li sheng@gmail.com wrote: Hi Guys, I tried to use checkpoints in M5 but could not have it work. I used ALPHA_SE. The commands I use to create/resume checkpoints are M5 outputs are: Creating checkpoint: __ [sli2@newcell ~/m5-work-stable]$ ./build/ALPHA_SE/m5.opt configs/example/se.py --bench bzip2 --take-checkpoint=2200 --at-instruction ... command line: ./build/ALPHA_SE/m5.opt configs/example/se.py --bench bzip2 --take-checkpoint=2200 --at-instruction 22 Global frequency set at 1 ticks per second 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000 Creating checkpoint at inst:2200 info: Entering event queue @ 0. Starting simulation... info: Increasing stack size by one page. 
hack: be nice to actually delete the event here exit cause = a thread reached the max instruction count Writing checkpoint Checkpoint written. Exiting @ cycle 000 because a thread reached the max instruction count Resume checkpoint: _ command line: ./build/ALPHA_SE/m5.opt configs/example/se.py --bench bzip2 --checkpoint-restore=2200 --at-instruction 22 Global frequency set at 1 ticks per second 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000 warn: optional parameter system.cpu.workload:M5_pid not present For more information see: http://www.m5sim.org/warn/aa78cda1 REAL SIMULATION info: Entering event queue @ 000. Starting simulation... hack: be nice to actually delete the event here Exiting @ cycle 500 because halt instruction encountered --Here is the problem. Any help would be highly appreciated! Thanks -Sheng ___ m5-users mailing list m5-users@m5sim.org http://m5sim.org/cgi-bin/mailman/listinfo/m5-users -- Joel Hestness PhD Student, Computer Architecture Dept. of Computer Science, University of Texas - Austin http://www.cs.utexas.edu/~hestness ___ m5-users mailing list m5-users@m5sim.org http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
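Joel's point that a checkpoint is specific to the ISA, the number of cores, and the memory capacity can be turned into a small pre-flight check. The sketch below is not an M5 API: the function name and the dictionary keys (isa, num_cpus, mem_size) are hypothetical labels that you would record yourself when collecting the checkpoint and compare again before restoring.

```python
def checkpoint_mismatches(collected, restoring,
                          keys=("isa", "num_cpus", "mem_size")):
    """Return the settings that differ between the two simulated systems.

    Hypothetical helper for illustration. Both arguments are plain dicts
    describing the simulated system; the key names are made up here and
    are not fields of an M5 checkpoint file.
    """
    return [k for k in keys if collected.get(k) != restoring.get(k)]


if __name__ == "__main__":
    collect_cfg = {"isa": "ALPHA", "num_cpus": 1, "mem_size": "512MB"}
    restore_cfg = {"isa": "ALPHA", "num_cpus": 1, "mem_size": "4096MB"}
    problems = checkpoint_mismatches(collect_cfg, restore_cfg)
    if problems:
        print("do not restore; settings differ:", ", ".join(problems))
    else:
        print("configurations match")
```

An empty result only says the recorded settings agree; it does not guarantee that the restore itself will succeed.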
Re: [m5-users] Cannot resume checkpoint
Hi Sheng, I've dug back through some of my simulations, and I haven't been able to find a case where I used 4GB of simulated memory, so I don't know if I have a baseline to show that the checkpoint restore works with that much memory. On the other hand, I have simulated with 512MB and 1GB of simulated memory, and it has worked fine. For full-system simulations, we often mount a swap disk in the simulated system in order to avoid the small virtual memory constraints imposed by the operating system. I'd have to defer to others on the list for knowledge about whether that would work with SE mode. I can attempt to address your other questions as well: 1) The way that you described the O3 parameters is how I have set them in the past, so that should work. 2) I've seen this problem before. It has had to do with the way that certain SimObjects are instantiated as children of other SimObjects at the beginning of the simulation, and with checkpoint restore, this isn't the cleanest process. When I ran into this problem, I was working on getting x86 timing mode working with Ruby, and Brad Beckmann was able to help me debug. He might be able to suggest first steps for figuring out what's wrong here. Hope this helps, Joel On Wed, Feb 9, 2011 at 3:14 PM, Sheng Li sheng@gmail.com wrote: And two other questions: 1. What should I do to change the O3 parameters such as issueWidth, commitWidth, etc.? I added a few lines in se.py as below. It runs fine if I just run the benchmarks, but if I resume a checkpoint (created without the -d option), it complains that the CPU class has no such parameters. I think these parameters can only be set after M5 performs the CPU mode switch, so how can I set these parameters so that M5 will use them after switching CPU mode?
if options.detailed: CPUClass.commitWidth= 4 CPUClass.decodeWidth= 4 CPUClass.dispatchWidth = 4 CPUClass.fetchWidth = 4 CPUClass.issueWidth = 4 CPUClass.commitWidth= 4 CPUClass.renameWidth= 4 CPUClass.squashWidth= 4 CPUClass.wbWidth= 4 CPUClass.numROBEntries = 128 CPUClass.numIQEntries = 36 CPUClass.LQEntries = 48 2. When I resume a checkpoint with -d --caches options, I got RuntimeError: Attempt to instantiate orphan node. I am trying to figure out what the orphan node is. What should I do to find the orphan node? I tried print self.name in File /afs/ crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py, line 822, in getCCObject, but got nothing. command line: ./build/ALPHA_SE/m5.opt configs/example/se.py --bench bzip2 --checkpoint-restore=0 --simpoint -d --caches --l2cache 2200 m5out/cpt.bzip2.2200 Global frequency set at 1 ticks per second Traceback (most recent call last): File string, line 1, in ? File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/main.py, line 359, in main exec filecode in scope File configs/example/se.py, line 179, in ? 
Simulation.run(options, root, system, FutureClass) File /afs/ crc.nd.edu/user/s/sli2/m5-work-stable/configs/common/Simulation.py, line 236, in run m5.instantiate(checkpoint_dir) File /afs/ crc.nd.edu/user/s/sli2/m5-work-stable/src/python/m5/simulate.py, line 77, in instantiate for obj in root.descendants(): obj.createCCObject() File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py, line 841, in createCCObject def createCCObject(self): File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py, line 796, in getCCParams value = value.getValue() File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py, line 845, in getValue def getValue(self): File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py, line 826, in getCCObject self._ccObject = -1 File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py, line 796, in getCCParams value = value.getValue() File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/params.py, line 183, in getValue return [ v.getValue() for v in self ] File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py, line 845, in getValue def getValue(self): File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py, line 822, in getCCObject #print self.name RuntimeError: Attempt to instantiate orphan node Thanks a lot! -Sheng On Wed, Feb 9, 2011 at 4:03 PM, Sheng Li sheng@gmail.com wrote: Thanks Joel! Yes, I did. The checkpoint created with 4096MB has problem as lots of information is missing. Is it possible that checkpoint does not support larger memory (i.e 4096MB) in M5? Thanks -Sheng On Wed, Feb 9, 2011 at 3:31 PM, Joel Hestness hestn...@cs.utexas.eduwrote: Hi Sheng, Did you collect the checkpoints from a simulated system with 512MB of memory? The checkpoints encode the current state of memory
Re: [m5-users] Tracing does not work
Hey Nilay, It looks like the tracing (debug) functionality is now working again, but the M5 help message is still incorrect (and extremely misleading). For instance, trace-flags and trace-file are still accepted, but they don't do anything now. They should be eliminated from the message. We're also missing the equivalent of trace-start and trace-file. Do you mind cleaning that up? Thanks, Joel PS. I haven't followed the tracing/debugging thread closely enough, but it seems like trace and debug should be different things (though they are currently implemented as the same thing). Is there a reason why we moved over to debug? On Fri, Apr 29, 2011 at 8:28 AM, Nilay Vaish ni...@cs.wisc.edu wrote: On Fri, 29 Apr 2011, Korey Sewell wrote: Is it not now debug-help and debug-flags instead of trace-help and trace-flags? On Fri, Apr 29, 2011 at 9:18 AM, Nilay Vaish ni...@cs.wisc.edu wrote: On Thu, 28 Apr 2011, Nilay wrote: On Thu, April 28, 2011 7:55 pm, Andrea Pellegrini wrote: Hi all, I just downloaded the latest repo from: http://repo.m5sim.org/m5 When I activate the trace functionality through the flags, nothing shows up in the output. The same command worked flawlessly with older versions of m5 (a few weeks ago). Can anybody help? Thanks -Andrea Andrea, we are aware of the problem. The solution is almost ready, and hopefully by tomorrow trace will start functioning again. -- Nilay Andrea, the trace facility is working now. In fact, it was fixed yesterday. -- Nilay -- - Korey That's right, the option names have been changed. But there was an error in the trace facility itself that Nate corrected yesterday. Nilay
Re: [gem5-users] McPAT and M5
Hi Tony, We're currently working on M5 - McPAT coordination in conjunction with a refresh of the McPAT code to coordinate more closely with new gem5 (Ruby and GARNET) components. We're hoping to finish the code for release soon. For the time being, Rick Strong has a decent script that can help get you started at: http://cseweb.ucsd.edu/~rstrong/ Hope this helps, Joel On Mon, May 16, 2011 at 11:06 AM, Tony Nowatzki t...@cs.wisc.edu wrote: Hi, I was just curious: does anyone know of a parser for McPAT that reads M5 stats files (or any other elegant way to use McPAT with M5)? Thanks much, Tony
Re: [gem5-users] Question about checkpoint
Hi Atieh, You can take checkpoints within the benchmark by instrumenting the code with the M5 magic instructions. You will need to grab a few files from ./util/m5/ and move them to the source tree of your benchmark: m5op.h, m5ops.h and m5op_arch.S (where arch is the ISA that you're building the benchmark for). You will need to include the m5op.h header in the source code that you're instrumenting, and you will need to build and link against the m5op_arch.S assembly file. As an example of how the instrumentation works, you can check out the m5 control application, ./util/m5/m5.c, and the appropriate Makefiles in ./util/m5. Hope this helps, Joel On Tue, May 31, 2011 at 5:30 AM, Atieh Lotfi ati.lo...@gmail.com wrote: Hi, I have some questions about the checkpointing mechanism in gem5. Is it possible to control the checkpoint from the benchmark itself rather than from the command line? I don’t want to take periodic checkpoints. I want to choose specific points in the benchmark at which to take checkpoints. I also have another question. Besides serialize.c, which source files are involved in taking checkpoints? Thanks in advance, Regards, Atieh
Re: [gem5-users] Question about checkpoint
Hi Atieh, This is a tricky question to answer, because it depends on a lot of other configuration parameters. If you are using the ALPHA or x86 architectures, I know that checkpointing works for both SE and FS. The other architectures might also work, but I'd have to defer to others on the list about which parts of the checkpointing functionality work for each ISA. If you're using the classic memory model in gem5 (the default), it's worth noting that cache state is not checkpointed, so when you restore, the caches are cold. On the other hand, the Ruby memory model has functionality to checkpoint the caches. Hope this helps, Joel On Thu, Jun 2, 2011 at 12:55 AM, Atieh Lotfi ati.lo...@gmail.com wrote: Hi Joel, Thank you so much for your help. Could you let me know whether it is possible to restore from checkpoints in SE mode, or does it only work in FS mode? Regards, Atieh On Wed, Jun 1, 2011 at 8:41 PM, Joel Hestness hestn...@cs.utexas.edu wrote: Hi Atieh, You can take checkpoints within the benchmark by instrumenting the code with the M5 magic instructions. You will need to grab a few files from ./util/m5/ and move them to the source tree of your benchmark: m5op.h, m5ops.h and m5op_arch.S (where arch is the ISA that you're building the benchmark for). You will need to include the m5op.h header in the source code that you're instrumenting, and you will need to build and link against the m5op_arch.S assembly file. As an example of how the instrumentation works, you can check out the m5 control application, ./util/m5/m5.c, and the appropriate Makefiles in ./util/m5. Hope this helps, Joel On Tue, May 31, 2011 at 5:30 AM, Atieh Lotfi ati.lo...@gmail.com wrote: Hi, I have some questions about the checkpointing mechanism in gem5. Is it possible to control the checkpoint from the benchmark itself rather than from the command line? I don’t want to take periodic checkpoints. I want to choose specific points in the benchmark at which to take checkpoints.
I also have another question. Besides serialize.c, which source files are involved in taking checkpoints? Thanks in advance, Regards, Atieh