Hello,

This week I tried to implement a new heuristic that relies on hot caches:
schedule the threads belonging to a process on the same core so that they
can make use of the L1 cache they share.
For this I made a test scenario: one process that runs two pthreads. Each
thread does computations on the same region of memory (one thread on
the odd indexes of a matrix and the other on the even ones).
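
A minimal sketch of what such a test scenario could look like, for
illustration only. The matrix size, the arithmetic done per element and the
names (N, worker, run_test) are assumptions, not the actual test code.
Pinning the threads to a given core and timing the run are left out, since
they depend on the platform's affinity interface.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define N 512                     /* hypothetical matrix dimension */

static double matrix[N * N];      /* memory region shared by both threads */

struct worker_arg {
    size_t first;                 /* 0 = even indexes, 1 = odd indexes */
    int passes;                   /* number of sweeps over the matrix */
};

/* Each worker touches every second element, so the two threads
 * interleave on the same cache lines. */
static void *worker(void *p)
{
    const struct worker_arg *arg = p;

    for (int pass = 0; pass < arg->passes; pass++)
        for (size_t i = arg->first; i < (size_t)N * N; i += 2)
            matrix[i] = matrix[i] * 0.5 + 1.0;
    return NULL;
}

/* Starts both workers and waits for them; returns 0 on success. */
static int run_test(int passes)
{
    pthread_t threads[2];
    struct worker_arg args[2] = { { 0, passes }, { 1, passes } };
    int created = 0, rc = 0;

    assert(passes > 0);
    for (; created < 2; created++) {
        if (pthread_create(&threads[created], NULL, worker,
                           &args[created]) != 0) {
            rc = -1;
            break;
        }
    }
    /* Join only the threads that were actually created. */
    for (int i = 0; i < created; i++)
        pthread_join(threads[i], NULL);
    return rc;
}
```

Calling run_test() from main() and timing the process, once with both
threads on one core and once with the threads on different cores, would
give the kind of comparison described here.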

The hardware was a Core i3 (dual-core with HT): when running the two threads
on different cores, the computations take 9 sec. When running them on the
same core, the time is almost double (17 sec). Another test case was to
occupy all 4 logical CPUs. For this I ran two processes, each with two
threads. The time was ~17 sec, regardless of whether the two threads of a
process were running on the same core or not. So apparently there is no
benefit from this heuristic. The Linux SMT paper doesn't discuss this case
(where to schedule the threads belonging to the same process)... they
probably knew the results.

That being said, I will continue with scheduling heuristics regarding
the cache. The next one I will try: schedule the process on the CPU closest
to the cache it ran on before (first try the same core, no matter which
thread - L1 cache; then try the same chip, no matter which core - L2/L3
cache, etc.).
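
The selection step of this heuristic could be sketched roughly as below.
Everything here is hypothetical: the topology table (one package, two cores,
two hardware threads per core), the idle array and the names cache_distance
and choose_cpu are made up for illustration and are not the scheduler's
actual data structures. Preferring the exact same logical CPU before its
sibling is also an assumption.

```c
#include <assert.h>

#define NUM_CPUS 4

struct cpu_info {
    int core;       /* core ID within the package */
    int package;    /* package (chip) ID */
};

/* Hypothetical topology: 1 package, 2 cores, 2 HT threads per core. */
static const struct cpu_info topology[NUM_CPUS] = {
    { 0, 0 }, { 0, 0 }, { 1, 0 }, { 1, 0 }
};

/* 0 = same logical CPU, 1 = same core (shared L1),
 * 2 = same package (shared L2/L3), 3 = different package. */
static int cache_distance(int a, int b)
{
    if (a == b)
        return 0;
    if (topology[a].package != topology[b].package)
        return 3;
    if (topology[a].core != topology[b].core)
        return 2;
    return 1;
}

/* Returns the idle CPU closest to last_cpu in cache terms,
 * or -1 if no CPU is idle. */
static int choose_cpu(int last_cpu, const int idle[NUM_CPUS])
{
    int best = -1;
    int best_distance = 4;    /* larger than any real distance */

    assert(last_cpu >= 0 && last_cpu < NUM_CPUS);
    for (int cpu = 0; cpu < NUM_CPUS; cpu++) {
        if (!idle[cpu])
            continue;
        int distance = cache_distance(last_cpu, cpu);
        if (distance < best_distance) {
            best = cpu;
            best_distance = distance;
        }
    }
    return best;
}
```

With this table, a thread that last ran on CPU 1 goes back to CPU 1 if it is
idle, otherwise to its sibling CPU 0, and only then to CPU 2 or 3 on the
other core.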

Also, the heuristic that showed no benefit on HT could still be worth
applying to packages/cores (always schedule on the same package, no matter
which core, to use the hot caches).

Another thing to be done is fixing the bug regarding the CPU topology on the
monster. There seems to be a problem there.

Mihai
