Re: [m5-dev] Review Request: Util: Replace mkblankimage.sh with the new gem5img.py.
On 2011-04-18 21:20:49, Nathan Binkert wrote:
util/gem5img.py, line 25
http://reviews.m5sim.org/r/644/diff/1/?file=11664#file11664line25

This makes me feel uneasy for a script that you're likely to call using sudo. I know it's overly paranoid, but why not simply give the user a tip if the program is not found (which you have to deal with anyway)?

I'm not necessarily advocating doing things this way, but this is what the original script was doing. This script should not be called with sudo since it calls sudo internally, although I suppose it could be. I don't know why, but I think if you use sudo the way the script does, you have to make sure you use an absolute path to the target binary. There were some posts online to back that up, and that's also what the original script was doing. To get that path, the script calls which, and for which to find things that are normally only usable by root it also needs to search in /sbin and /usr/sbin. If a program isn't found, the script will complain and die.

On 2011-04-18 21:20:49, Nathan Binkert wrote:
util/gem5img.py, line 51
http://reviews.m5sim.org/r/644/diff/1/?file=11664#file11664line51

I'm not going to make you change it or anything, but this whole class seems to me to be a bit overkill, no?

    import os

    __notRoot = None

    def needSudo():
        # Check the effective UID once and remember the answer.
        global __notRoot
        if __notRoot is None:
            __notRoot = os.geteuid() != 0
        return __notRoot

BTW: we also have m5.util.Singleton

Yes and no. This came about because the original script had a warning about using sudo which I wanted to preserve. I made it possible to run only part of the script, and if you run a part that doesn't need sudo (like just creating an appropriately sized file) then the warning might be confusing. I also didn't want the warning to show up multiple times if sudo was used more than once. It's a little clumsy, but I didn't see any obvious alternative that would get the behavior I wanted. Feel free to convince me to drop the warning.
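As an aside on the absolute-path lookup described in the first reply above, here is a minimal sketch of the idea in modern Python 3. It is not the code from gem5img.py: the function name find_program and the extra_dirs parameter are made up for illustration, and shutil.which stands in for calling the which command.

```python
import os
import shutil


def find_program(name, extra_dirs=("/sbin", "/usr/sbin")):
    """Return an absolute path to `name`, or raise if it cannot be found.

    The regular PATH is searched first, followed by directories that
    usually only appear in root's PATH (hypothetical default list).
    """
    search_path = os.pathsep.join(
        [os.environ.get("PATH", os.defpath), *extra_dirs])
    found = shutil.which(name, path=search_path)
    if found is None:
        # Complain and let the caller die, as the reply describes.
        raise FileNotFoundError(
            "could not find %r in %s" % (name, search_path))
    return os.path.abspath(found)
```

The result could then be passed to sudo as the target binary, which is the case where the reply says an absolute path seemed to be required.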
On 2011-04-18 21:20:49, Nathan Binkert wrote: util/gem5img.py, line 64 http://reviews.m5sim.org/r/644/diff/1/?file=11664#file11664line64 This is pretty similar to m5.util.readCommand which made me think that it might be nice if we put your utility functions here in m5.util Can I use m5.util from an arbitrary python script? If I can that's good to know. Also, how does readCommand work? Does it pass through stdout/stderr or capture it? Depending on the answer it might be an appropriate replacement for this or the subsequent getOutput, but changing only one obscures the similarities between the two functions. If you're advocating adding a new version of readCommand that has the other behavior then that makes sense. I also funnel text into stdin for input, and I think sudo happens to still work because it goes around any redirection I set up. On 2011-04-18 21:20:49, Nathan Binkert wrote: util/gem5img.py, line 72 http://reviews.m5sim.org/r/644/diff/1/?file=11664#file11664line72 This is basically readCommand() See above. On 2011-04-18 21:20:49, Nathan Binkert wrote: util/gem5img.py, line 106 http://reviews.m5sim.org/r/644/diff/1/?file=11664#file11664line106 Here is where you could suggest that it is in /sbin or /usr/sbin I'm not sure what telling them where it might be would accomplish since the script still wouldn't be able to find/use it. I may just not understand what you're getting at. - Gabe --- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/644/#review1133 --- On 2011-04-18 02:37:48, Gabe Black wrote: --- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/644/ --- (Updated 2011-04-18 02:37:48) Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and Nathan Binkert. Summary --- Util: Replace mkblankimage.sh with the new gem5img.py. This change replaces the mkblankimage.sh script, used for creating new disk images, with a new gem5img.py script. 
The new version is written in Python instead of bash, takes its parameters from command line arguments instead of prompting for them, and finds a free loopback device dynamically instead of hardcoding /dev/loop1. The file system used is now optionally configurable, and the blank image is created as a hole left by lseek and write instead of being literally filled with zeroes. The functionality of the new script is broken into the subcommands init, mount, umount, new, partition, and format. init creates a new file of the appropriate size, partitions it, and then formats the first (and only) new partition. mount attaches a new loopback device to the
[m5-dev] Cron m5test@zizzer /z/m5/regression/do-regression quick
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/inorder-timing FAILED!
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/o3-timing FAILED!
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-atomic FAILED!
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/o3-timing FAILED!
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-timing FAILED!
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby FAILED!
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple-atomic FAILED!
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple-timing FAILED!
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby FAILED!
* build/ALPHA_SE/tests/fast/quick/01.hello-2T-smt/alpha/linux/o3-timing FAILED!
* build/ALPHA_SE/tests/fast/quick/20.eio-short/alpha/eio/simple-atomic FAILED!
* build/ALPHA_SE/tests/fast/quick/20.eio-short/alpha/eio/simple-timing FAILED!
* build/ALPHA_SE/tests/fast/quick/30.eio-mp/alpha/eio/simple-atomic-mp FAILED!
* build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest FAILED!
* build/ALPHA_SE/tests/fast/quick/30.eio-mp/alpha/eio/simple-timing-mp FAILED!
* build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby FAILED!
* build/ALPHA_SE/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby FAILED!
* build/ALPHA_SE_MOESI_hammer/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby-MOESI_hammer FAILED!
* build/ALPHA_SE_MOESI_hammer/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby-MOESI_hammer FAILED!
* build/ALPHA_SE_MOESI_hammer/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby-MOESI_hammer FAILED!
* build/ALPHA_SE_MOESI_hammer/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby-MOESI_hammer FAILED!
* build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby-MESI_CMP_directory FAILED!
* build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby-MESI_CMP_directory FAILED!
* build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby-MESI_CMP_directory FAILED!
* build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby-MESI_CMP_directory FAILED!
* build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby-MOESI_CMP_directory FAILED!
* build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby-MOESI_CMP_directory FAILED!
* build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby-MOESI_CMP_directory FAILED!
* build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby-MOESI_CMP_directory FAILED!
* build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby-MOESI_CMP_token FAILED!
* build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby-MOESI_CMP_token FAILED!
* build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby-MOESI_CMP_token FAILED!
* build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby-MOESI_CMP_token FAILED!
* build/ALPHA_FS/tests/fast/quick/10.linux-boot/alpha/linux/tsunami-simple-atomic FAILED!
* build/ALPHA_FS/tests/fast/quick/10.linux-boot/alpha/linux/tsunami-simple-atomic-dual FAILED!
* build/ALPHA_FS/tests/fast/quick/10.linux-boot/alpha/linux/tsunami-simple-timing FAILED!
* build/ALPHA_FS/tests/fast/quick/10.linux-boot/alpha/linux/tsunami-simple-timing-dual FAILED!
* build/ALPHA_FS/tests/fast/quick/80.netperf-stream/alpha/linux/twosys-tsunami-simple-atomic FAILED!
* build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/inorder-timing FAILED!
* build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/o3-timing FAILED!
* build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-atomic FAILED!
* build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-timing FAILED!
* build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-timing-ruby FAILED!
* build/POWER_SE/tests/fast/quick/00.hello/power/linux/o3-timing FAILED!
* build/POWER_SE/tests/fast/quick/00.hello/power/linux/simple-atomic FAILED!
* build/SPARC_SE/tests/fast/quick/00.hello/sparc/linux/simple-atomic FAILED!
* build/SPARC_SE/tests/fast/quick/00.hello/sparc/linux/simple-timing FAILED!
* build/SPARC_SE/tests/fast/quick/00.hello/sparc/linux/simple-timing-ruby FAILED!
* build/SPARC_SE/tests/fast/quick/02.insttest/sparc/linux/o3-timing FAILED!
* build/SPARC_SE/tests/fast/quick/02.insttest/sparc/linux/simple-atomic FAILED!
* build/SPARC_SE/tests/fast/quick/02.insttest/sparc/linux/simple-timing FAILED!
* build/SPARC_SE/tests/fast/quick/40.m5threads-test-atomic/sparc/linux/o3-timing-mp FAILED!
*
[m5-dev] what scons can do
I was looking at some of the stuff in util, and it occurred to me that the m5 utility program is cross compiled using different Makefiles for different architectures. Statetrace used to be like that (sort of), but recently I converted it over to scons and set up some configuration variables that made it easier to work with. It would be nice to be able to share some of that with the m5 utility program, although I don't remember it being all that complicated.

Anyway, it seems like it would be useful to be able to have multiple binaries that can be built by scons, specifically the utility stuff and unit tests. That way we could avoid having a hodgepodge of small build systems which are either isolated or not, in not quite the right ways. I know some of Nate's recent changes suggested this was going to get easier. Could you quickly summarize what that's all about, Nate?

Speaking of which, the regressions are still broken. Since that's taking a little while, would you mind please backing out the problem change?

Also, I was thinking about how to handle the dependencies/generated files/custom language issue a little while back, and what I kept coming back to were schemes where scons would use a cache of dependency information which it would regenerate if any of the input files which determined outputs and/or dependencies changed. The problem is that scons would need to run once and possibly regenerate its cache, and then run again to actually run. Is this sort of multi-pass setup possible somehow without major hacks?

To explain more what I'm getting at, let's say you have input file foo.isa which, when processed, generates the files foo_exec.cc and bar_exec.cc. What would happen is that you'd have a file like foo.isa.dep which would describe what gets generated, and you'd make that depend on foo.isa. When you run for the first time, scons would see that foo.isa.dep doesn't exist. During its build phase, it would run foo.isa through the system and see that it generated foo_exec.cc and bar_exec.cc and put that into foo.isa.dep (as actual SConscript type code, or flat data, or...). When scons ran the second time, it would read in foo.isa.dep and extract the dependencies from it and build that into the graph. It wouldn't construct foo.isa.dep again since all its inputs were the same, and it would still capture all those dependencies. This time around, the larger binary would see that it depended on foo_exec.cc and bar_exec.cc and that those depend on foo.isa.dep (as a convenient aggregation point of all *.isa files involved). If foo.isa changed later, foo.isa.dep would be out of date and have to be regenerated, and then foo_exec.cc and bar_exec.cc, and then the main binary.

The net effect of this is that the thing that processed the .isa would only be run when necessary. In our current setup, that would mean SLICC wouldn't have to be run for every build, only ones where the SLICC input files changed.

The problem here is that scons would need to basically call a nested instance of itself on foo.isa.dep, let that build a dep tree and run the build phase, then process foo.isa.dep in the parent dep phase, and then run the parent build phase. It could literally just call scons from scons (though that seems like a major hack) or it could, if scons has a facility for it, do some sort of fancy multi-pass thing. This is sort of related to the first thing (additional targets) because the dependency cache files are sort of like independent targets with their own invocations of scons.

Also related to scons are those .pyc files that end up scattered around the source tree. I know I asked about those a long, long time ago, but why are they there? Why don't they end up in the build directories?

Gabe
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev
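A rough sketch of the foo.isa.dep idea from the message above, written as plain Python rather than SCons code. It only illustrates the caching rule (rerun the generator when the input changed, otherwise reuse the recorded output list) and does not address the two-pass problem the message raises. The function names, the JSON layout of the .dep file, and the use of a content hash are all assumptions made for the example.

```python
import hashlib
import json
import os


def file_digest(path):
    """Hash a file's contents so changes can be detected."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def load_outputs(src, dep_path, generate):
    """Return (outputs, regenerated) for the input file `src`.

    `generate` is a callable that processes `src` and returns the list
    of files it produced. It is only called when `dep_path` is missing
    or was recorded for different contents of `src`.
    """
    digest = file_digest(src)
    if os.path.exists(dep_path):
        with open(dep_path) as f:
            cached = json.load(f)
        if cached.get("digest") == digest:
            # Input unchanged: reuse the recorded list of outputs.
            return cached["outputs"], False
    outputs = generate(src)
    with open(dep_path, "w") as f:
        json.dump({"digest": digest, "outputs": outputs}, f)
    return outputs, True
```

With a generator that reports foo_exec.cc and bar_exec.cc for foo.isa, the first call would write foo.isa.dep, a second call with an unchanged foo.isa would skip the generator, and editing foo.isa would trigger it again.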
Re: [m5-dev] what scons can do
On Tue, Apr 19, 2011 at 3:00 AM, Gabe Black gbl...@eecs.umich.edu wrote:

I was looking at some of the stuff in util, and it occurred to me that the m5 utility program is cross compiled using different Makefiles for different architectures. Statetrace used to be like that (sort of), but recently I converted it over to scons and set up some configuration variables that made it easier to work with. It would be nice to be able to share some of that with the m5 utility program, although I don't remember it being all that complicated. Anyway, it seems like it would be useful to be able to have multiple binaries that can be built by scons, specifically the utility stuff and unit tests. That way we could avoid having a hodgepodge of small build systems which are either isolated or not, in not quite the right ways.

If you're suggesting that the stuff in util get built by scons and put the binaries somewhere like build/util, I think that's a great idea. I don't know the pros and cons of making it an independent invocation of scons vs integrating it into our global scons configuration, but whichever way we do it should avoid redundant scons code for things like detecting your compiler version, which makes me think the integrated approach might be better. Either way might require some refactoring between the SConstruct file and src/SConscript, since it's not under src, but we should be able to work that out.

Also, I was thinking about how to handle the dependencies/generated files/custom language issue a little while back, and what I kept coming back to were schemes where scons would use a cache of dependency information which it would regenerate if any of the input files which determined outputs and/or dependencies changed. The problem is that scons would need to run once and possibly regenerate its cache, and then run again to actually run. Is this sort of multi-pass setup possible somehow without major hacks?
To explain more what I'm getting, lets say you have input file foo.isa which, when processed, generates the files foo_exec.cc and bar_exec.cc. What would happen is that you'd have a file like foo.isa.dep which would describe what would happen and make that depend on foo.isa. When you run for the first time, scons would see that foo.isa.dep doesn't exist. During it's build phase, it would run foo.isa through the system and see that it generated foo_exec.cc and bar_exec.cc and put that into foo.isa.dep (as actual SConscript type code, or flat data, or...). When scons ran the second time, it would read in foo.isa.dep and extract the dependencies from it and build that into the graph. It wouldn't construct foo.isa.dep again since all its inputs were the same, and it would still capture all those dependencies. This time around, the larger binary would see that it depended on foo_exec.cc and bar_exec.cc and that those depend on foo.isa.dep (as a convenient aggregation point of all *.isa files involved). If foo.isa changed later, foo.isa.dep would be out of date and have to be regenerated, and then foo_exec.cc and bar_exec.cc, and then the main binary. The net effect of this is that the thing that processed the .isa would only be run when necessary. In our current setup, that would mean SLICC wouldn't have to be run for every build, only ones where the SLICC input files changed. The problem here is that scons would need to basically call a nested instance of itself on foo.isa.dep, let that build a dep tree and run the build phase, then process foo.isa.dep in the parent dep phase, and then run the parent build phase. It could literally just call scons from scons (though that seems like a major hack) or it could, if scons has a facility for it, do some sort of fancy multi-pass thing. This is sort of related to the first thing (additional targets) because the dependency cache files are sort of like independent targets with their own invocations of scons. 
Caching the list of generated SLICC files sounds like a good idea to me. I'm not sure this would require recursive scons invocations, since we manage to build the list dynamically already without that. I wouldn't call it a cache of dependency information though, since scons already has one of those; this is really just a cache of generated filenames, right? Also related to scons are those .pyc files that end up scattered around the source tree. I know I asked about those a long, long time ago, but why are they there? Why don't they end up in the build directories? That's an artifact of including them in scons, where they're run in place and not by m5. Steve ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Stats Bug
Looks like it's Lisa's fault ;-) http://repo.m5sim.org/m5/diff/ab05e20dc4a7/src/mem/cache/base.cc I think Nate's point is that all the stats vector lengths should be changed to _numCpus or _numCpus+1 instead of maxThreadsPerCpu to be consistent. We should also either (1) always do _numCpus+1 even though the extra device slot is unnecessary for SE mode or (2) have a single #ifdef to set a local var to one or the other and use that consistently rather than having #ifdefs all over the place. I'd lean toward #2 just to keep the output a little cleaner in SE mode. Does that make sense, Lisa? Steve On Mon, Apr 18, 2011 at 3:58 PM, nathan binkert n...@binkert.org wrote: Yes, but all arithmetic between vectors is elementwise, so they need to be the same length if used in a formula. Total miss latency needs to have the same vector length as total misses. Nate On Mon, Apr 18, 2011 at 2:09 PM, Lisa Hsu h...@eecs.umich.edu wrote: I'm not sure I understand what the problem is either. Can different VectorStats not have different lengths? Lisa On Mon, Apr 18, 2011 at 11:43 AM, Gabriel Michael Black gbl...@eecs.umich.edu wrote: My first reaction is let's fix it, but I don't really understand the problem or the impact of changing things. Anything serious? Gabe Quoting nathan binkert n...@binkert.org: I'm trying to get my python stats stuff committed and I found a bug in the classic cache stats. Look in src/mem/cache/base.cc. The VectorStats have several different lengths _numCpus + 1, _numCpus, or maxThreadsPerCPU. The fact that this works in the current stats package is lucky. I can be bug compatible, but I think we should fix this instead. 
Nate
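To illustrate Nate's point in the quoted thread about elementwise arithmetic, here is a small sketch using plain Python lists. It is not the M5 stats package: vector_ratio and the sample numbers are made up, and the explicit length check is an assumption about how one might surface the mismatch.

```python
def vector_ratio(numerators, denominators):
    """Divide two per-CPU vectors element by element."""
    if len(numerators) != len(denominators):
        raise ValueError(
            "vector lengths differ: %d vs %d"
            % (len(numerators), len(denominators)))
    # Report 0.0 for slots that saw no events, to avoid dividing by zero.
    return [n / d if d else 0.0 for n, d in zip(numerators, denominators)]


num_cpus = 2
total_misses = [4, 3, 0]            # sized num_cpus + 1 (extra device slot)
total_miss_latency = [100.0, 60.0]  # sized num_cpus

try:
    vector_ratio(total_miss_latency, total_misses)
except ValueError as err:
    print(err)
```

Without the length check, zip would quietly drop the extra slot, which is one way a mismatch like this could go unnoticed.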
Re: [m5-dev] Stats Bug
Hey Steve,

I don't disagree with your solution (at least in the interim), but wouldn't the right solution be to have the actual CPUs pass how many hardware threads they have to the caches? The actual regStats happens after all the CPUs are instantiated and ports are connected (right?), so it would seem that since different CPUs can have a different number of threads per CPU, the caches should just use the port interface to ask how many threads there are. Then, the caches down the hierarchy would just sum those thread counts up for each of their Stat vectors. I imagine a method similar to how M5 classic figured out address ranges and snoop ports would work fine. Do people think that's fine or am I missing the point here?

On Tue, Apr 19, 2011 at 10:42 AM, Steve Reinhardt ste...@gmail.com wrote:

Looks like it's Lisa's fault ;-) http://repo.m5sim.org/m5/diff/ab05e20dc4a7/src/mem/cache/base.cc

I think Nate's point is that all the stats vector lengths should be changed to _numCpus or _numCpus+1 instead of maxThreadsPerCpu to be consistent. We should also either (1) always do _numCpus+1 even though the extra device slot is unnecessary for SE mode or (2) have a single #ifdef to set a local var to one or the other and use that consistently rather than having #ifdefs all over the place. I'd lean toward #2 just to keep the output a little cleaner in SE mode. Does that make sense, Lisa?

Steve

On Mon, Apr 18, 2011 at 3:58 PM, nathan binkert n...@binkert.org wrote:

Yes, but all arithmetic between vectors is elementwise, so they need to be the same length if used in a formula. Total miss latency needs to have the same vector length as total misses.

Nate

On Mon, Apr 18, 2011 at 2:09 PM, Lisa Hsu h...@eecs.umich.edu wrote:

I'm not sure I understand what the problem is either. Can different VectorStats not have different lengths?

Lisa

On Mon, Apr 18, 2011 at 11:43 AM, Gabriel Michael Black gbl...@eecs.umich.edu wrote:

My first reaction is let's fix it, but I don't really understand the problem or the impact of changing things. Anything serious?

Gabe

Quoting nathan binkert n...@binkert.org:

I'm trying to get my python stats stuff committed and I found a bug in the classic cache stats. Look in src/mem/cache/base.cc. The VectorStats have several different lengths _numCpus + 1, _numCpus, or maxThreadsPerCPU. The fact that this works in the current stats package is lucky. I can be bug compatible, but I think we should fix this instead.

Nate

-- - Korey
Re: [m5-dev] Stats Bug
Lisa should verify, but I think that's what _numCpus is. On Tue, Apr 19, 2011 at 8:01 AM, Korey Sewell ksew...@umich.edu wrote: Hey Steve, I dont disagree with your solution (at least in the interim), but wouldn't the right solution be to have the actual CPUs pass how many hardware threads is has to the caches? The actual regStats happens after all the CPUs are instantiated and ports are connected (right?), so it would seem that since different CPUs can have a different amount of threads per CPU, that the caches should just use the port interface to ask how many threads? . Then, the caches down the hierarchy would just sum those thread counts up for each of their Stat vectors. I imagine a similar method to how the M5 classic figured out address ranges and snoop ports would work fine. Do people think that's fine or am I missing the point here? On Tue, Apr 19, 2011 at 10:42 AM, Steve Reinhardt ste...@gmail.com wrote: Looks like it's Lisa's fault ;-) http://repo.m5sim.org/m5/diff/ab05e20dc4a7/src/mem/cache/base.cc I think Nate's point is that all the stats vector lengths should be changed to _numCpus or _numCpus+1 instead of maxThreadsPerCpu to be consistent. We should also either (1) always do _numCpus+1 even though the extra device slot is unnecessary for SE mode or (2) have a single #ifdef to set a local var to one or the other and use that consistently rather than having #ifdefs all over the place. I'd lean toward #2 just to keep the output a little cleaner in SE mode. Does that make sense, Lisa? Steve On Mon, Apr 18, 2011 at 3:58 PM, nathan binkert n...@binkert.org wrote: Yes, but all arithmetic between vectors is elementwise, so they need to be the same length if used in a formula. Total miss latency needs to have the same vector length as total misses. Nate On Mon, Apr 18, 2011 at 2:09 PM, Lisa Hsu h...@eecs.umich.edu wrote: I'm not sure I understand what the problem is either. Can different VectorStats not have different lengths? 
Lisa On Mon, Apr 18, 2011 at 11:43 AM, Gabriel Michael Black gbl...@eecs.umich.edu wrote: My first reaction is let's fix it, but I don't really understand the problem or the impact of changing things. Anything serious? Gabe Quoting nathan binkert n...@binkert.org: I'm trying to get my python stats stuff committed and I found a bug in the classic cache stats. Look in src/mem/cache/base.cc. The VectorStats have several different lengths _numCpus + 1, _numCpus, or maxThreadsPerCPU. The fact that this works in the current stats package is lucky. I can be bug compatible, but I think we should fix this instead. Nate -- - Korey
Re: [m5-dev] Stats Bug
It'd be good for Lisa to chime in here to comment on the exact meaning of _numCpus and how that value is intended to be set. If you don't care about threads per CPU (and just the raw # of CPUs value), then I'd defer to Lisa for figuring out what the right vector lengths are so that they are consistent :)

However, since _numCpus seems to be a parameter given to the BaseCache, the burden is on the script writer to figure out how many CPUs are sharing a cache, rather than automatically going through the cache hierarchy and figuring it out. Figuring it out automatically seems like the ultimately right thing to do.

Lastly, I'm not sure how maxThreadsPerCPU plays into this, but if you needed to get that threads value on a per-CPU basis, then the suggested CPU search method would also work well.

On Tue, Apr 19, 2011 at 11:35 AM, Steve Reinhardt ste...@gmail.com wrote:

Lisa should verify, but I think that's what _numCpus is.

On Tue, Apr 19, 2011 at 8:01 AM, Korey Sewell ksew...@umich.edu wrote:

Hey Steve,

I don't disagree with your solution (at least in the interim), but wouldn't the right solution be to have the actual CPUs pass how many hardware threads they have to the caches? The actual regStats happens after all the CPUs are instantiated and ports are connected (right?), so it would seem that since different CPUs can have a different number of threads per CPU, the caches should just use the port interface to ask how many threads there are. Then, the caches down the hierarchy would just sum those thread counts up for each of their Stat vectors. I imagine a method similar to how M5 classic figured out address ranges and snoop ports would work fine. Do people think that's fine or am I missing the point here?
On Tue, Apr 19, 2011 at 10:42 AM, Steve Reinhardt ste...@gmail.com wrote: Looks like it's Lisa's fault ;-) http://repo.m5sim.org/m5/diff/ab05e20dc4a7/src/mem/cache/base.cc I think Nate's point is that all the stats vector lengths should be changed to _numCpus or _numCpus+1 instead of maxThreadsPerCpu to be consistent. We should also either (1) always do _numCpus+1 even though the extra device slot is unnecessary for SE mode or (2) have a single #ifdef to set a local var to one or the other and use that consistently rather than having #ifdefs all over the place. I'd lean toward #2 just to keep the output a little cleaner in SE mode. Does that make sense, Lisa? Steve On Mon, Apr 18, 2011 at 3:58 PM, nathan binkert n...@binkert.org wrote: Yes, but all arithmetic between vectors is elementwise, so they need to be the same length if used in a formula. Total miss latency needs to have the same vector length as total misses. Nate On Mon, Apr 18, 2011 at 2:09 PM, Lisa Hsu h...@eecs.umich.edu wrote: I'm not sure I understand what the problem is either. Can different VectorStats not have different lengths? Lisa On Mon, Apr 18, 2011 at 11:43 AM, Gabriel Michael Black gbl...@eecs.umich.edu wrote: My first reaction is let's fix it, but I don't really understand the problem or the impact of changing things. Anything serious? Gabe Quoting nathan binkert n...@binkert.org: I'm trying to get my python stats stuff committed and I found a bug in the classic cache stats. Look in src/mem/cache/base.cc. The VectorStats have several different lengths _numCpus + 1, _numCpus, or maxThreadsPerCPU. The fact that this works in the current stats package is lucky. I can be bug compatible, but I think we should fix this instead. 
Nate -- - Korey -- - Korey
Re: [m5-dev] Stats Bug
Hi guys, Sorry for the delay in responding. I spent last night on the couch in 3 sweatshirts and under 3 fleece blankets generally feeling like a shit sandwich and not looking at my email. Fortunately, today I feel a bit more like a normal human being, so let me page this in. The point of the _numCpus parameter is to provide the cache some way of knowing how many sharers it has, since in a lot of shared cache scenarios knowing things like hits/misses per thread is important. The +1 does indeed have to do with devices. IIRC, always doing a +1 really made things annoying because you'd suddenly go from a singleton stat to a vector stat and the output would look ugly and blow up in SE mode because you'd always have a blank slot and vector stats have a much bigger space overhead in the stats file, and yet to have things be accurate in FS you needed an extra slot to put stats from devices. As for the maxThreadsPerCPU init variable, it has something to do with SMT and it seems pretty broken in concept (assumes single CPU and always instantiates a vector of max SMT width), but I didn't change any stats that didn't seem to be relevant in differentiating between sharing threads. WRT Korey's suggestion, it is possible to explore the hierarchy to look for how many CPUs there are but I don't think that's exactly the right thing to do. I've done something like that before for some other cache-sharing study, and M5 is so modular that you'd have to do a lot of configuration in the python scripts anyway to indicate who is sharing what with whom, and register that with some common object and make connections to that object. For example, the only way caches right now connect to anything else is through their ports. 
Not only would you need to add in a facility to have the cache be able to call out to, say, System and ask for how many sharers there are, but you'd need facilities to register with System how many sharers there are, and you'd need to be able to differentiate between different levels, e.g. if you had 4 private L1s, 4 private L2s, and 1 shared L3, walking through and discovering how many CPUs exist in the system will not tell you anything about how they are hooked up together and you'd need a way in configuration scripts to disambiguate from, say, 4 private L1s, 2 shared L2s, and 1 shared L3. I think the right thing to do is set _numCpus appropriately from the configuration scripts. But instead of doing it the way it is done now, which is L3cache.num_cpus = options.np, do something like:

    num_cpus = options.np
    if SMT:
        num_cpus = options.np*width
    if buildEnv['FULL_SYSTEM']:
        num_cpus = num_cpus+1
    L3cache.num_cpus = num_cpus

I can make this change if people agree. Lisa

On Tue, Apr 19, 2011 at 8:53 AM, Korey Sewell ksew...@umich.edu wrote: It'd be good for Lisa to chime in here to comment on the exact meaning of _numCpus and how that value is intended to be set. If you don't care about threads per CPU (and just the raw # of CPUs value), then I'd defer to Lisa for figuring out what the right vector lengths are so that they are consistent :) However, since _numCpus seems to be a parameter given to the BaseCache, the burden is on the script writer to figure out how many CPUs a cache is sharing rather than automatically going through the cache hierarchy and figuring it out. That seems like the ultimately right thing to do. Lastly, I'm not sure how maxThreadsPerCPU plays into this but if you needed to get that threads value on a per-CPU basis, then the suggested CPU search method would also work well.

On Tue, Apr 19, 2011 at 11:35 AM, Steve Reinhardt ste...@gmail.com wrote: Lisa should verify, but I think that's what _numCpus is.

On Tue, Apr 19, 2011 at 8:01 AM, Korey Sewell ksew...@umich.edu wrote: Hey Steve, I don't disagree with your solution (at least in the interim), but wouldn't the right solution be to have the actual CPUs pass how many hardware threads they have to the caches? The actual regStats happens after all the CPUs are instantiated and ports are connected (right?), so it would seem that since different CPUs can have a different number of threads per CPU, the caches should just use the port interface to ask how many threads there are. Then, the caches down the hierarchy would just sum those thread counts up for each of their Stat vectors. I imagine a method similar to how the M5 classic figured out address ranges and snoop ports would work fine. Do people think that's fine or am I missing the point here?

On Tue, Apr 19, 2011 at 10:42 AM, Steve Reinhardt ste...@gmail.com wrote: Looks like it's Lisa's fault ;-) http://repo.m5sim.org/m5/diff/ab05e20dc4a7/src/mem/cache/base.cc I think Nate's point is that all the stats vector lengths should be changed to _numCpus or _numCpus+1 instead of maxThreadsPerCpu to be consistent. We should also either (1) always do _numCpus+1 even though the extra device slot is unnecessary for SE mode or
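For illustration only, the configuration-side calculation Lisa describes in this thread could be sketched as below. The function name and its parameters (shared_cache_contexts, smt_width, full_system) are hypothetical and not part of M5; only the arithmetic (CPUs times SMT width, plus one device slot in full-system mode) comes from the message.

```python
def shared_cache_contexts(num_cpus, smt_width=1, full_system=False):
    """Return the number of stat slots a shared cache would need.

    Hypothetical helper: num_cpus plays the role of options.np,
    smt_width the per-CPU SMT width, and full_system the FULL_SYSTEM
    build flag mentioned in the thread.
    """
    contexts = num_cpus * smt_width
    if full_system:
        # One extra slot so that device accesses have somewhere to go.
        contexts += 1
    return contexts


# Example: 4 single-threaded CPUs in SE mode, then 4 two-way SMT CPUs in FS mode.
se_slots = shared_cache_contexts(4)
fs_slots = shared_cache_contexts(4, smt_width=2, full_system=True)
```

The result would then be assigned to the cache parameter from the script (L3cache.num_cpus in the message), so the C++ side only ever sees one length.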
Re: [m5-dev] Bad Stats Names
I'd vote for (1) and just do the replacement commands you suggest. The capitalisation won't work for the inorder model stats since they all call the name() operator as prefix for their stats, but those names are easily updated in src/cpu/inorder/resource_pool.cc. Also, the name for the stages can be fixed in the name() function in inorder/pipeline_stage.cc. (oh yea, I should be using csprintf() for these name functions... I know I know)

On Tue, Apr 19, 2011 at 1:53 PM, nathan binkert n...@binkert.org wrote: The python stats stuff exposes all stats as python variables. (Making scripting far easier). While I can support stat names that are not valid python expressions, stuff works far better if stats names are valid python expressions. We have a few options, listed in my (rapidly) decreasing order of preference.

1) Require that all stats names are valid python expressions and panic if one is not.
2) Replace invalid characters on the python side (mostly colon and dash) with something to kludge it (this is not so hard, but seems less than ideal).
3) Give up. :)

I've pasted the various stats names below. (Many of them hang off a system or a cpu, but I removed the prefix to reduce duplicates). Here are all of the renames that would have to happen (they are generally trivial).

1) s/COM://
2) s/ISSUE://
3) s/RENAME://
4) s/PROG://
5) s/iew.EXEC:/iew.exec_/
6) s/iew.WB:/iew.wb_/
7) My guess is that occ_% is just a bug and it was supposed to be a %s or something.
8) All of the ones that look like AGEN-Unit or Branch-Predictor should just be renamed to agen_unit and branch_predictor.
9) stage-0, stage-1, etc. becomes stage0, stage1, ...

What do you guys think? They are mostly stats that I would guess are rarely used.
Nate AGEN-Unit.agens Branch-Predictor.BTBHitPct Branch-Predictor.BTBHits Branch-Predictor.BTBLookups Branch-Predictor.RASInCorrect Branch-Predictor.condIncorrect Branch-Predictor.condPredicted Branch-Predictor.lookups Branch-Predictor.predictedNotTaken Branch-Predictor.predictedTaken Branch-Predictor.usedRAS Execution-Unit.executions Execution-Unit.mispredictPct Execution-Unit.mispredicted Execution-Unit.predicted Execution-Unit.predictedNotTakenIncorrect Execution-Unit.predictedTakenIncorrect Fetch-Buffer-T0.instsBypassed Fetch-Buffer-T1.instsBypassed Mult-Div-Unit.divides Mult-Div-Unit.multiplies RegFile-Manager.regFileAccesses RegFile-Manager.regFileReads RegFile-Manager.regFileWrites RegFile-Manager.regForwards RegFile-Manager.uniqueRegsPerSwitch commit.COM:branches commit.COM:bw_lim_events commit.COM:bw_limited commit.COM:committed_per_cycle commit.COM:count commit.COM:fp_insts commit.COM:function_calls commit.COM:int_insts commit.COM:loads commit.COM:membars commit.COM:refs commit.COM:swp_count dcache.occ_% decode.DECODE:BlockedCycles decode.DECODE:BranchMispred decode.DECODE:BranchResolved decode.DECODE:ControlMispred decode.DECODE:DecodedInsts decode.DECODE:IdleCycles decode.DECODE:RunCycles decode.DECODE:SquashCycles decode.DECODE:SquashedInsts decode.DECODE:UnblockCycles dtb_walker_cache.occ_% icache.occ_% iew.EXEC:branches iew.EXEC:nop iew.EXEC:rate iew.EXEC:refs iew.EXEC:stores iew.EXEC:swp iew.WB:consumers iew.WB:count iew.WB:fanout iew.WB:penalized iew.WB:penalized_rate iew.WB:producers iew.WB:rate iew.WB:sent iocache.occ_% iq.ISSUE:FU_type iq.ISSUE:fu_busy_cnt iq.ISSUE:fu_busy_rate iq.ISSUE:fu_full iq.ISSUE:issued_per_cycle iq.ISSUE:rate itb_walker_cache.occ_% l1c.occ_% l2c.occ_% l2cache.occ_% rename.RENAME:BlockCycles rename.RENAME:CommittedMaps rename.RENAME:FullRegisterEvents rename.RENAME:IQFullEvents rename.RENAME:IdleCycles rename.RENAME:LSQFullEvents rename.RENAME:ROBFullEvents rename.RENAME:RenameLookups rename.RENAME:RenamedInsts 
rename.RENAME:RenamedOperands rename.RENAME:RunCycles rename.RENAME:SquashCycles rename.RENAME:SquashedInsts rename.RENAME:UnblockCycles rename.RENAME:UndoneMaps rename.RENAME:fp_rename_lookups rename.RENAME:int_rename_lookups rename.RENAME:serializeStallCycles rename.RENAME:serializingInsts rename.RENAME:skidInsts rename.RENAME:tempSerializingInsts stage-0.idleCycles stage-0.runCycles stage-0.utilization stage-1.idleCycles stage-1.runCycles stage-1.utilization stage-2.idleCycles stage-2.runCycles stage-2.utilization stage-3.idleCycles stage-3.runCycles stage-3.utilization stage-4.idleCycles stage-4.runCycles stage-4.utilization workload.PROG:num_syscalls workload0.PROG:num_syscalls workload1.PROG:num_syscalls ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev -- - Korey ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
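As a rough sketch of the renames Nate lists in this thread (not the change that was actually committed, which edits the C++ stat names directly), the substitutions can be written as Python 3 regular expressions and checked against a few of the pasted names. The helper names rename_stat and is_python_expr are made up for this example, and only rules 1-6, 8 and 9 are covered; occ_% (rule 7) and the DECODE: prefix are left alone and simply show up as still invalid.

```python
import keyword
import re

# Rules 1-6 and 9 from the list, as (pattern, replacement) pairs.
RENAMES = [
    (r"\b(COM|ISSUE|RENAME|PROG):", ""),
    (r"\biew\.EXEC:", "iew.exec_"),
    (r"\biew\.WB:", "iew.wb_"),
    (r"\bstage-(\d+)", r"stage\1"),
]


def rename_stat(name):
    """Apply the listed renames to one dotted stat name."""
    for pattern, repl in RENAMES:
        name = re.sub(pattern, repl, name)
    # Rule 8: dashed unit names such as AGEN-Unit become agen_unit.
    parts = [p.replace("-", "_").lower() if "-" in p else p
             for p in name.split(".")]
    return ".".join(parts)


def is_python_expr(name):
    """True if every dotted component is a usable Python identifier."""
    return all(p.isidentifier() and not keyword.iskeyword(p)
               for p in name.split("."))


for old in ("commit.COM:branches", "iew.WB:rate", "AGEN-Unit.agens",
            "stage-0.idleCycles", "dcache.occ_%"):
    new = rename_stat(old)
    print(old, "->", new, "(valid)" if is_python_expr(new) else "(still invalid)")
```

Running a check like is_python_expr over every registered name is one way option (1), panicking on an invalid name, could be prototyped on the Python side before touching the simulator.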
[m5-dev] Review Request: stats: rename stats so they can be used as python expressions
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/645/
-----------------------------------------------------------

Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and Nathan Binkert.

Summary
-------
stats: rename stats so they can be used as python expressions

Diffs
-----
  src/cpu/inorder/pipeline_stage.cc d8ec0a7b3f0c
  src/cpu/inorder/resource_pool.9stage.cc d8ec0a7b3f0c
  src/cpu/inorder/resource_pool.cc d8ec0a7b3f0c
  src/cpu/o3/commit_impl.hh d8ec0a7b3f0c
  src/cpu/o3/decode_impl.hh d8ec0a7b3f0c
  src/cpu/o3/iew_impl.hh d8ec0a7b3f0c
  src/cpu/o3/inst_queue_impl.hh d8ec0a7b3f0c
  src/cpu/o3/rename_impl.hh d8ec0a7b3f0c
  src/mem/cache/tags/base.cc d8ec0a7b3f0c
  src/sim/process.cc d8ec0a7b3f0c

Diff: http://reviews.m5sim.org/r/645/diff

Testing
-------
I'm rerunning all tests right now and will commit the results so all the name changes are updated.

Thanks,
Nathan
Re: [m5-dev] Bad Stats Names
I've already done the replacement and I'm currently regenerating all stats :) I've just posted a diff to reviewboard. Please check it out. Nate
Re: [m5-dev] Stats Bug
I love that you guys want to fix this. Can we agree on the immediate fix so it's no longer broken and then improve it? :) Thanks, Nate

> I suppose you could do that kind of walking, though I think it would be overly complicated. Let's say again you have 4 private L1s, 2 shared L2s, and a shared L3. If the L3 poked its port appropriately, I guess it could know that there are two things hanging off of it on the other side. But if you want to know things on a per-CPU basis, then you'd have to keep track of depth as well so that when you get to the L2s and poke THEIR ports, you could back-calculate at the L3 that there are 4 cores sharing the L3. Seems messy to me. So, I guess my feeling is, if you want to be the one to code that up, that's cool, but I'm definitely not going to :). Lisa
>
> On Tue, Apr 19, 2011 at 10:21 AM, Korey Sewell ksew...@umich.edu wrote:
>> Hey Lisa, Is this (below) really something you have to do though?
>>> you'd have to do a lot of configuration in the python scripts anyway to indicate who is sharing what with whom, and register that with some common object and make connections to that object.
>> I mean, as far as my understanding goes, to figure out which ports to snoop, M5 already goes through this type of exploration process (recvStatusChange?).
>>> different levels, e.g. if you had 4 private L1s, 4 private L2s, and 1 shared L3, walking through and discovering how many CPUs exist in the system will not tell you anything about how they are hooked up together and you'd need a way in configuration scripts to disambiguate from, say, 4 private L1s, 2 shared L2s, and 1 shared L3.
>> After everything has been hooked up through the port interface, I think you have enough information. For example, if you have 4 private L1s and 2 shared L2s, then each L2 would ask each of the L1 ports that it's connected to how many sharers there are, and then each L1 would ask its CPU how many sharers it has. Eventually, you just sum that information up and pass it back. I understand that might be overkill (over just explicitly setting the sharers), but I don't see how that wouldn't work quite yet (although, I could just be missing something). - Korey
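To make the port-walking idea Korey describes in this thread concrete, here is a toy model, not M5 code. The Node class, its upstream list and count_sharers are invented for this sketch; real M5 caches only see their ports, and this ignores the extra device slot discussed elsewhere in the thread. Each node asks whatever is connected on its CPU side for a thread count and sums the answers.

```python
class Node:
    """Hypothetical stand-in for a CPU or a cache in the hierarchy."""

    def __init__(self, name, threads=0):
        self.name = name
        self.threads = threads   # hardware threads; nonzero only for CPUs
        self.upstream = []       # nodes connected on the CPU-facing side

    def connect(self, *nodes):
        self.upstream.extend(nodes)
        return self


def count_sharers(node):
    """Sum hardware threads over everything reachable upstream of node."""
    if not node.upstream:
        return node.threads
    return sum(count_sharers(n) for n in node.upstream)


# 4 CPUs, 4 private L1s, 2 shared L2s, 1 shared L3 (the example in the thread).
cpus = [Node("cpu%d" % i, threads=1) for i in range(4)]
l1s = [Node("l1_%d" % i).connect(cpus[i]) for i in range(4)]
l2s = [Node("l2_0").connect(l1s[0], l1s[1]),
       Node("l2_1").connect(l1s[2], l1s[3])]
l3 = Node("l3").connect(*l2s)

print(count_sharers(l3), count_sharers(l2s[0]), count_sharers(l1s[0]))
```

In this toy form the recursion does not need explicit depth tracking, because each level only forwards the sum it received; whether that carries over to the real port interface is exactly the open question in the thread.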
Re: [m5-dev] Stats Bug
I think the immediate fix is the pythonic fix from a few msgs ago. I believe that's quick and easy. I think :).
Re: [m5-dev] Stats Bug
> I think the immediate fix is the pythonic fix from a few msgs ago. I believe that's quick and easy. I think :).

The fix has to be more than python. I'm talking about the original question. The stats use three different lengths: _numCpus + 1, _numCpus, or maxThreadsPerCPU. Which one is correct? Are you healthy enough to fix this? Thanks, Nate
Re: [m5-dev] Stats Bug
I think _numCpus+1 is the safe choice... Lisa?

- Korey
Re: [m5-dev] what scons can do
On Tue, Apr 19, 2011 at 3:13 PM, Gabriel Michael Black gbl...@eecs.umich.edu wrote:
>> Caching the list of generated SLICC files sounds like a good idea to me. I'm not sure this would require recursive scons invocations, since we manage to build the list dynamically already without that. I wouldn't call it a cache of dependency information though, since scons already has one of those; this is really just a cache of generated filenames, right?
> How would you be able to do it all in one shot? The tricky part is that you actually have to build the .dep in the build phase, but you need it in the dependency tree generating phase. Since you can't go backwards like that in one invocation (as far as I know or can imagine) then you'd need two rounds.

OK, I see, maybe it is inherently another level beyond what we currently do. I'm not the scons expert...

> As far as what it would be caching I think it's largely a semantic difference. You could consider it a cache of generated files which are used to set up the dependencies.

It's just terminology... not that it's wildly inaccurate to call it dependency info, since it is info that is eventually used to determine dependencies, just that there's already a dependency caching feature in scons that's totally different (http://www.scons.org/doc/2.0.1/HTML/scons-user.html#AEN1148), and in general when the scons docs talk about dependencies it's "what are the files that this file depends on", not "what are the files that depend on this file". Thus it would be less confusing if you avoided referring to the info that you're discussing here as "dependency info" and used a different term like "generated file info". That's all. Steve
Re: [m5-dev] Stats Bug
I think Lisa's pythonic fix is the one where we use the _numCpus value that's calculated entirely in python (including the +1 offset in FS mode) and don't attempt the automatic C++ technique that Korey proposed. Yes, that still involves C++ changes, specifically making everything in the C++ code just use the value passed in from python. That value is currently called _numCpus. IMO, _numCpus is not a good name, and while we're at it we should change it to be more descriptive, like _numSharers or _numSharingContexts. Steve
Re: [m5-dev] Stats Bug
Yes, Steve's got it right. In the C++ you replace the instances of the 3 lengths Nate mentions with a single var (that can be more aptly named as Steve wants) that is pythonically calculated and passed in from the configuration. I'm making up words here :). I think I can do this tonight or tomorrow, I don't think it's that complicated, so yes, I'm healthy enough :). Lisa
Re: [m5-dev] Review Request: stats: rename stats so they can be used as python expressions
--- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/645/#review1137 --- Ship it! - Ali On 2011-04-19 13:20:04, Nathan Binkert wrote: --- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/645/ --- (Updated 2011-04-19 13:20:04) Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and Nathan Binkert. Summary --- stats: rename stats so they can be used as python expressions Diffs - src/cpu/inorder/pipeline_stage.cc d8ec0a7b3f0c src/cpu/inorder/resource_pool.9stage.cc d8ec0a7b3f0c src/cpu/inorder/resource_pool.cc d8ec0a7b3f0c src/cpu/o3/commit_impl.hh d8ec0a7b3f0c src/cpu/o3/decode_impl.hh d8ec0a7b3f0c src/cpu/o3/iew_impl.hh d8ec0a7b3f0c src/cpu/o3/inst_queue_impl.hh d8ec0a7b3f0c src/cpu/o3/rename_impl.hh d8ec0a7b3f0c src/mem/cache/tags/base.cc d8ec0a7b3f0c src/sim/process.cc d8ec0a7b3f0c Diff: http://reviews.m5sim.org/r/645/diff Testing --- I'm rerunning all tests right now and will commit the results so all the name changes are updated. Thanks, Nathan ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
[m5-dev] changeset in m5: python: different import for dealing with deman...
changeset 24406820a7e0 in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=24406820a7e0
description:
        python: different import for dealing with demandimport

diffstat:

 src/python/m5/__init__.py |  4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diffs (16 lines):

diff -r d8ec0a7b3f0c -r 24406820a7e0 src/python/m5/__init__.py
--- a/src/python/m5/__init__.py Sun Apr 17 14:21:04 2011 -0700
+++ b/src/python/m5/__init__.py Tue Apr 19 11:13:01 2011 -0700
@@ -32,10 +32,10 @@
 
 try:
     # Try to import something that's generated by swig
-    import internal.core
+    import internal
 
     # Try to grab something from it in case demandimport is being used
-    internal.core.__package__
+    internal.core.curTick
 except ImportError:
     # The import failed
     internal = None
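The comment in this changeset ("grab something from it in case demandimport is being used") relies on a lazy importer deferring the real import until an attribute is touched. The sketch below imitates that behaviour with a hand-written LazyModule class; it is an assumption-laden stand-in for illustration, not Mercurial's demandimport and not the m5 code, and load_or_none is a made-up helper name.

```python
import importlib


class LazyModule:
    """Toy lazy module: the real import runs on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Only reached for attributes not set in __init__.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


def load_or_none(name, probe):
    """Return the module, or None if importing it fails.

    Touching the probe attribute inside the try block forces the deferred
    import to happen here, so an ImportError cannot surface later.
    """
    try:
        module = LazyModule(name)
        getattr(module, probe)
    except ImportError:
        return None
    return module


present = load_or_none("json", "dumps")
missing = load_or_none("no_such_module_for_this_sketch", "anything")
print(present is not None, missing is None)
```

Note that a probe attribute that does not exist on a module that does import would raise AttributeError rather than ImportError, so the probe has to be something the module is known to define.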
[m5-dev] changeset in m5: stats: rename stats so they can be used as pyth...
changeset 38befb82b2c9 in /z/repo/m5
details: http://repo.m5sim.org/m5?cmd=changeset;node=38befb82b2c9
description:
        stats: rename stats so they can be used as python expressions

diffstat:

 src/cpu/inorder/pipeline_stage.cc       |   2 +-
 src/cpu/inorder/resource_pool.9stage.cc |  39 +++---
 src/cpu/inorder/resource_pool.cc        |  20 +++---
 src/cpu/o3/commit_impl.hh               |  24 +-
 src/cpu/o3/decode_impl.hh               |  20 +++---
 src/cpu/o3/iew_impl.hh                  |  28 +++---
 src/cpu/o3/inst_queue_impl.hh           |  16 ++--
 src/cpu/o3/rename_impl.hh               |  42
 src/mem/cache/tags/base.cc              |   2 +-
 src/sim/process.cc                      |   2 +-
 10 files changed, 103 insertions(+), 92 deletions(-)

diffs (truncated from 611 to 300 lines):

diff -r 24406820a7e0 -r 38befb82b2c9 src/cpu/inorder/pipeline_stage.cc
--- a/src/cpu/inorder/pipeline_stage.cc Tue Apr 19 11:13:01 2011 -0700
+++ b/src/cpu/inorder/pipeline_stage.cc Tue Apr 19 18:45:21 2011 -0700
@@ -88,7 +88,7 @@
 std::string
 PipelineStage::name() const
 {
-    return cpu->name() + ".stage-" + to_string(stageNum);
+    return cpu->name() + ".stage" + to_string(stageNum);
 }
diff -r 24406820a7e0 -r 38befb82b2c9 src/cpu/inorder/resource_pool.9stage.cc
--- a/src/cpu/inorder/resource_pool.9stage.cc Tue Apr 19 11:13:01 2011 -0700
+++ b/src/cpu/inorder/resource_pool.9stage.cc Tue Apr 19 18:45:21 2011 -0700
@@ -48,37 +48,48 @@
     // Declare Resource Objects
     // name - id - bandwidth - latency - CPU - Parameters
     // --
-    resources.push_back(new FetchSeqUnit("Fetch-Seq-Unit", FetchSeq, StageWidth * 2, 0, _cpu, params));
+    resources.push_back(new FetchSeqUnit("fetch_seq_unit", FetchSeq,
+            StageWidth * 2, 0, _cpu, params));
 
-    resources.push_back(new TLBUnit("I-TLB", ITLB, StageWidth, 0, _cpu, params));
+    resources.push_back(new TLBUnit("itlb", ITLB, StageWidth, 0, _cpu, params));
 
     memObjects.push_back(ICache);
-    resources.push_back(new CacheUnit("icache_port", ICache, StageWidth * MaxThreads, 0, _cpu, params));
+    resources.push_back(new CacheUnit("icache_port", ICache,
+            StageWidth * MaxThreads, 0, _cpu, params));
 
-    resources.push_back(new DecodeUnit("Decode-Unit", Decode, StageWidth, 0, _cpu, params));
+    resources.push_back(new DecodeUnit("decode_unit", Decode, StageWidth, 0,
+            _cpu, params));
 
-    resources.push_back(new BranchPredictor("Branch-Predictor", BPred, StageWidth, 0, _cpu, params));
+    resources.push_back(new BranchPredictor("branch_predictor", BPred,
+            StageWidth, 0, _cpu, params));
 
     for (int i = 0; i < params->numberOfThreads; i++) {
         char fbuff_name[20];
-        sprintf(fbuff_name, "Fetch-Buffer-T%i", i);
-        resources.push_back(new InstBuffer(fbuff_name, FetchBuff + i, 4, 0, _cpu, params));
+        sprintf(fbuff_name, "fetch_buffer_t%i", i);
+        resources.push_back(new InstBuffer(fbuff_name, FetchBuff + i, 4, 0,
+                _cpu, params));
     }
 
-    resources.push_back(new UseDefUnit("RegFile-Manager", RegManager, StageWidth * MaxThreads, 0, _cpu, params));
+    resources.push_back(new UseDefUnit("regfile_manager", RegManager,
+            StageWidth * MaxThreads, 0, _cpu, params));
 
-    resources.push_back(new AGENUnit("AGEN-Unit", AGEN, StageWidth, 0, _cpu, params));
+    resources.push_back(new AGENUnit("agen_unit", AGEN, StageWidth, 0, _cpu,
+            params));
 
-    resources.push_back(new ExecutionUnit("Execution-Unit", ExecUnit, StageWidth, 0, _cpu, params));
+    resources.push_back(new ExecutionUnit("execution_unit", ExecUnit,
+            StageWidth, 0, _cpu, params));
 
-    resources.push_back(new MultDivUnit("Mult-Div-Unit", MDU, 5, 0, _cpu, params));
+    resources.push_back(new MultDivUnit("mult_div_unit", MDU, 5, 0, _cpu,
+            params));
 
-    resources.push_back(new TLBUnit("D-TLB", DTLB, StageWidth, 0, _cpu, params));
+    resources.push_back(new TLBUnit("dtlb", DTLB, StageWidth, 0, _cpu, params));
 
     memObjects.push_back(DCache);
-    resources.push_back(new CacheUnit("dcache_port", DCache, StageWidth * MaxThreads, 0, _cpu, params));
+    resources.push_back(new CacheUnit("dcache_port", DCache,
+            StageWidth * MaxThreads, 0, _cpu, params));
 
-    resources.push_back(new GraduationUnit("Graduation-Unit", Grad, StageWidth * MaxThreads, 0, _cpu, params));
+    resources.push_back(new GraduationUnit("graduation_unit", Grad,
+            StageWidth * MaxThreads, 0, _cpu, params));
 }
 
 void
diff -r 24406820a7e0 -r 38befb82b2c9 src/cpu/inorder/resource_pool.cc
--- a/src/cpu/inorder/resource_pool.cc Tue Apr 19 11:13:01 2011 -0700
+++ b/src/cpu/inorder/resource_pool.cc Tue Apr 19 18:45:21 2011 -0700
@@ -51,7 +51,7 @@
     // Declare Resource Objects
     // name - id - bandwidth - latency - CPU -
[m5-dev] Review Request: cache: properly initialize vector stats that are per context
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/646/
-----------------------------------------------------------

Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and Nathan Binkert.

Summary
-------
cache: properly initialize vector stats that are per context

Diffs
-----
  src/mem/cache/base.cc ee4e795343bf

Diff: http://reviews.m5sim.org/r/646/diff

Testing
-------

Thanks,
Nathan
Re: [m5-dev] Stats Bug
> Yes, Steve's got it right. In the C++ you replace the instances of the 3 lengths Nate mentions with a single var (that can be more aptly named as Steve wants) that is pythonically calculated and passed in from the configuration. I'm making up words here :). I think I can do this tonight or tomorrow, I don't think it's that complicated, so yes, I'm healthy enough :).

I just posted a review. I'm running tests right now. Let me know if this is what you all had in mind please. http://reviews.m5sim.org/r/646/ Nate
Re: [m5-dev] Review Request: stats: rename stats so they can be used as python expressions
This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/645/#review1138

Ship it!

I'm sure this will cause some people headaches if they have stat names hardcoded in some scripts, but it's unavoidable.

- Steve

On 2011-04-19 13:20:04, Nathan Binkert wrote:

This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/645/

(Updated 2011-04-19 13:20:04)

Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and Nathan Binkert.

Summary
-------
stats: rename stats so they can be used as python expressions

Diffs
-----
  src/cpu/inorder/pipeline_stage.cc d8ec0a7b3f0c
  src/cpu/inorder/resource_pool.9stage.cc d8ec0a7b3f0c
  src/cpu/inorder/resource_pool.cc d8ec0a7b3f0c
  src/cpu/o3/commit_impl.hh d8ec0a7b3f0c
  src/cpu/o3/decode_impl.hh d8ec0a7b3f0c
  src/cpu/o3/iew_impl.hh d8ec0a7b3f0c
  src/cpu/o3/inst_queue_impl.hh d8ec0a7b3f0c
  src/cpu/o3/rename_impl.hh d8ec0a7b3f0c
  src/mem/cache/tags/base.cc d8ec0a7b3f0c
  src/sim/process.cc d8ec0a7b3f0c

Diff: http://reviews.m5sim.org/r/645/diff

Testing
-------
I'm rerunning all tests right now and will commit the results so all the name changes are updated.

Thanks,
Nathan
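A short sketch of why the rename matters: Python parses a name such as Decode-Unit as a subtraction of two names, while decode_unit is a single identifier. The helper `to_python_name` below is hypothetical and not part of m5. It only mimics the lowercase-and-underscore pattern seen in most of the renames in the diff, and it would not reproduce every one of them (the diff maps D-TLB to dtlb, not d_tlb).

```python
import ast


def is_python_name(name):
    """Return True if every dot-separated component is a Python identifier."""
    return all(part.isidentifier() for part in name.split("."))


def to_python_name(name):
    """Hypothetical helper: lowercase a stat name and replace '-' with '_'."""
    return name.lower().replace("-", "_")


old = "Decode-Unit"
new = to_python_name(old)

# The old name still parses, but as "Decode minus Unit", not as one name.
old_tree = ast.parse(old, mode="eval").body
new_tree = ast.parse(new, mode="eval").body
print(type(old_tree).__name__)  # BinOp
print(type(new_tree).__name__)  # Name
```

Scripts with hardcoded stat names, the headache Steve mentions, could use a mapping like this as a starting point, but would still need to be checked against the committed names.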
Re: [m5-dev] what scons can do
Quoting Steve Reinhardt ste...@gmail.com:

On Tue, Apr 19, 2011 at 3:13 PM, Gabriel Michael Black gbl...@eecs.umich.edu wrote:

Caching the list of generated SLICC files sounds like a good idea to me. I'm not sure this would require recursive scons invocations, since we manage to build the list dynamically already without that. I wouldn't call it a cache of dependency information though, since scons already has one of those; this is really just a cache of generated filenames, right?

How would you be able to do it all in one shot? The tricky part is that you actually have to build the .dep in the build phase, but you need it in the dependency tree generating phase. Since you can't go backwards like that in one invocation (as far as I know or can imagine), you'd need two rounds.

OK, I see, maybe it is inherently another level beyond what we currently do. I'm not the scons expert...

Me neither... Anybody else?

As far as what it would be caching, I think it's largely a semantic difference. You could consider it a cache of generated files which are used to set up the dependencies.

It's just terminology... not that it's wildly inaccurate to call it dependency info, since it is info that is eventually used to determine dependencies, just that there's already a dependency caching feature in scons that's totally different (http://www.scons.org/doc/2.0.1/HTML/scons-user.html#AEN1148), and in general when the scons docs talk about dependencies it's "what are the files that this file depends on", not "what are the files that depend on this file". Thus it would be less confusing if you avoided referring to the info that you're discussing here as dependency info and used a different term like generated file info. That's all.

Yes, I see why that could be confusing. I won't call it that any more (unless I forget).

Gabe
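The two-round idea discussed in this thread can be sketched in plain Python. This is not SCons code and uses no SCons API. The cache file name, the assumed mapping from each source to a .cc and a .hh file, and all function names are hypothetical. The sketch only shows the shape of the argument: without a cached list of generated filenames you need a first round to discover them, and with one you can plan the build in a single round.

```python
import json
import os
import tempfile


def generate_outputs(sources):
    """Stand-in for the build-phase step that discovers generated files.

    Assumption: each source yields a .cc and a .hh with the same stem.
    """
    outputs = []
    for src in sources:
        stem, _ = os.path.splitext(src)
        outputs += [stem + ".cc", stem + ".hh"]
    return outputs


def load_cached_outputs(cache_path):
    """Return the cached list of generated filenames, or None if absent."""
    if not os.path.exists(cache_path):
        return None
    with open(cache_path) as f:
        return json.load(f)


def plan_build(sources, cache_path):
    """Return (generated_files, rounds_needed)."""
    cached = load_cached_outputs(cache_path)
    if cached is not None:
        # The filenames are known up front, so one round is enough.
        return cached, 1
    # Round one discovers the filenames and records them for next time.
    outputs = generate_outputs(sources)
    with open(cache_path, "w") as f:
        json.dump(outputs, f)
    return outputs, 2


with tempfile.TemporaryDirectory() as tmp:
    cache = os.path.join(tmp, "generated_files.json")
    first = plan_build(["protocol.sm"], cache)
    second = plan_build(["protocol.sm"], cache)
```

The sketch never invalidates the cache when the sources change, which a real implementation would have to handle.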