Dear Djordje,
Thank you for your answer! You have given me a better perspective on
flexpoint reusability.
As for flexpoint creation, I had figured out that both the number of
flexpoints taken (during the trace simulation) and the range of
flexpoints used for the timing simulation are a matter of experience.
I was mostly referring to the workloads that are used by the Flexus
development team, though. Isn't there a proposed number of flexpoints
for each one? For example, in the run_job.rc.tcl file one can see:
apache_4cpu_40cl2 baseline 0-7:5-24 $flexus_commands_timing(common)
If I am not mistaken, this means that when using this workload in a
timing simulation, flexpoints 5-24 from phases 0-7 will be used, right?
So, would it be sufficient to ask Flexus to create 24 flexpoints for
each phase during the trace simulation (preceding the timing
simulation, of course)?
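To make that reading concrete, here is a minimal Python sketch that expands a spec such as 0-7:5-24 into (phase, flexpoint) pairs. It assumes the "phases:flexpoints" interpretation guessed above, which is not confirmed Flexus behavior; parse_range and expand_spec are hypothetical helper names and not part of Flexus.

```python
def parse_range(text):
    """Parse 'A-B' (inclusive) or a single number 'A' into a list of ints."""
    if "-" in text:
        start, end = text.split("-", 1)
        return list(range(int(start), int(end) + 1))
    return [int(text)]


def expand_spec(spec):
    """Expand 'PHASES:FLEXPOINTS' into a list of (phase, flexpoint) pairs.

    Assumption: the part before ':' selects phases and the part after it
    selects flexpoints, both as inclusive ranges.
    """
    phases_text, flexpoints_text = spec.split(":", 1)
    phases = parse_range(phases_text)
    flexpoints = parse_range(flexpoints_text)
    return [(p, f) for p in phases for f in flexpoints]


pairs = expand_spec("0-7:5-24")
print(len(pairs))            # 8 phases x 20 flexpoints = 160 timing runs
print(pairs[0], pairs[-1])   # (0, 5) (7, 24)
```

Under this reading, the highest flexpoint index that appears in the spec gives a lower bound on how many flexpoints each phase must contain. A placeholder spec such as x:x is deliberately not handled here and raises ValueError.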
And one last remark, on the same issue. Do lines such as this
spec2k_art baseline x:x $flexus_commands_timing(common)
in the .tcl file mean that there is no proposed range of flexpoints
that should be taken into account during the timing simulation? In that
case, will I have to follow the trial-and-error method you proposed
until I get reliable results while still being practical in terms of
simulation duration?
Cheers,
Alexandros
On Tue, 31 Jan 2012 17:22:49 +0000, Djordje Jevdjic wrote:
Dear Alexandro,
1. Flexpoints are used only for timing simulations, so you don’t use
them for any trace simulation. If you look again at the .tcl file, in
the section named “rungen trace” you will find the pointers to the
definitions containing the run parameters. These parameters tell you
the number of instructions each core in your system should run.
However, when you generate flexpoints you do need to specify how many
to create, and the generation is done with a trace simulator. You
decide on the number of phases and flexpoints while you are creating
your workload. The number of flexpoints affects the reliability of your
results: the more flexpoints (measurements) you have, and the longer
the measurements are, the more reliable your results will be. The right
number is not always obvious and depends on the workload behavior, so
you sometimes need to repeat this process a couple of times until you
get results that are reliable enough while still being practical to
simulate.
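One common way to judge "reliable enough" is to look at the confidence interval of the mean over the per-flexpoint measurements. The following is a minimal Python sketch, assuming the measurements are independent samples and that a normal approximation is acceptable; the IPC values are made up for illustration, and mean_with_ci is a hypothetical helper, not a Flexus tool.

```python
import math
import statistics


def mean_with_ci(samples, z=1.96):
    """Return (mean, half_width) of an approximate 95% confidence interval.

    Uses the normal approximation: half_width = z * s / sqrt(n), where s is
    the sample standard deviation and n the number of measurements.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean = statistics.mean(samples)
    half_width = z * statistics.stdev(samples) / math.sqrt(n)
    return mean, half_width


# Illustrative, made-up per-flexpoint IPC values (not real measurements).
ipc = [0.82, 0.91, 0.78, 0.88, 0.85, 0.93, 0.80, 0.87]
mean, hw = mean_with_ci(ipc)
print(f"IPC = {mean:.3f} +/- {hw:.3f} ({100 * hw / mean:.1f}% relative error)")
```

Because the half-width shrinks with the square root of the number of measurements, halving the error takes roughly four times as many flexpoints, which is where the trade-off against simulation time comes from.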
2. Flexpoints are highly reusable. You do not need to regenerate them
as long as the parameters you plan to change affect only the timing
behavior of your components. If your changes affect the functionality
of the components (for example, you need to vary cache sizes), you
cannot reuse the flexpoints. A flexpoint should keep the state that is
independent of your timing parameters.
Ideally, your flexpoint should have the state of the system equivalent
to the state that you would have if you were running the workload in
the timing mode from the beginning until that point. This is not always
possible, but it is also not necessary. That’s why, when running a
flexpoint, we always have a detailed warm-up period of 100-200K cycles
to warm up the microarchitectural state that is not simulated by the
trace simulator.
Whether you need to recreate the flexpoints or not depends on what
functionality you want to introduce. For example, if you plan to study
the effect of small optimizations in the replacement policy of 2-way
associative 32KB L1 caches, you can reuse the flexpoints created with
the LRU policy. When executing each flexpoint, before taking the actual
measurement, you will warm up your system for, let’s say, 200K cycles,
so the small difference coming from the policy change will vanish.
However, you might need to reconsider this if you are studying changes
in a 64MB L3 cache, for example. It’s up to you to do the math and see
whether flexpoint regeneration is needed and what the required warm-up
time is (100K cycles is typically OK if you are not changing the
functionality recorded by the trace simulator). If you are changing the
data placement policy in the last-level cache, you WILL need to
recreate your flexpoints, because the cache contents recorded in the
flexpoints are no longer valid.
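As a rough illustration of "doing the math", the Python sketch below compares the number of lines in a cache with an optimistic upper bound on how many lines a warm-up period could touch. The 64-byte line size and the rate of one access per cycle reaching the cache are assumptions chosen for illustration, not Flexus defaults, and the function names are hypothetical.

```python
def cache_lines(size_bytes, line_bytes=64):
    """Number of lines in a cache of the given size (line size is an assumption)."""
    return size_bytes // line_bytes


def max_lines_touched(warmup_cycles, accesses_per_cycle=1.0):
    """Optimistic upper bound on distinct lines touched during warm-up.

    Assumes at most `accesses_per_cycle` accesses reach this cache per cycle
    and that every access touches a different line.
    """
    return int(warmup_cycles * accesses_per_cycle)


def warmup_coverage(size_bytes, warmup_cycles, line_bytes=64, accesses_per_cycle=1.0):
    """Fraction of the cache that warm-up could refresh, capped at 1.0."""
    lines = cache_lines(size_bytes, line_bytes)
    return min(1.0, max_lines_touched(warmup_cycles, accesses_per_cycle) / lines)


KB, MB = 1024, 1024 * 1024
print(warmup_coverage(32 * KB, 200_000))   # 1.0: only 512 lines to refresh
print(warmup_coverage(64 * MB, 200_000))   # about 0.19: most of the L3 is untouched
```

Even under these generous assumptions, 200K cycles cannot rewrite most of a 64MB cache, so most of its contents still come from the flexpoint rather than from the warm-up.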
Hope this helps,
Regards,
Djordje
________________________________________
From: aledaglis [[email protected]]
Sent: Saturday, January 28, 2012 12:04 PM
To: Simflex
Subject: Flexpoint Creation & Reusability
Hello everybody,
I have two questions about flexpoint generation and reusability.
1. How can I decide how many flexpoints I should create for each
different workload? I have noticed that in run_job.rc.tcl there are
definitions for each workload, like the number of cycles before each
breakpoint and flexpoint generation. What I am not sure about is the
number of flexpoints I should request when running the trace
simulation. Would it be fine if I just checked which flexpoints are
used for timing and created only that many? E.g., as seen in
run_job.rc.tcl:
"db2v8_tpcc_nort_16cpu_64cl baseline 0-1:1-24 $flexus_commands_timing(common)",
would it be enough to create 24 flexpoints?
2. Another matter that concerns me is the range of a flexpoint's
reusability. I guess that if I change my user-postload.simics file used
for the trace simulation, I should probably create new flexpoints. What
if I change a component's behaviour? For instance, if I make a change
in CMPCache, like changing the mapping function or implementing some
simple migration mechanism, would it be necessary to make new
flexpoints? It would seem logical to me to do so. In that case, under
what conditions can a flexpoint set be reused? I understand that there
is no short answer to this question, but any explanation would be
really helpful.
Thank you in advance
Alexandros Daglis