Hello,

Here I describe something that works very badly on every OS I have
tested it on. I tested it on NT and Linux. I can't take w95/w98
seriously enough to even consider testing on them.

Consider that I have 4 processes. The first process sets a variable in
the shared-memory struct Tree:

tree->waitforsearch = true;

Now the other 3 processes get started.
They run until they reach the following loop:

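/* Gate 1: spin until the first process clears waitforsearch. */
do {
  ;
} while( tree->waitforsearch );

InitializeSettings();
while( tree->job ) {
  /* Tell the first process that this processor is ready for a move. */
  tree->waitingformove[ProcessorNumber] = true;
  /* Gate 2: spin until the first process clears waitformove. */
  do {
    ;
  } while( tree->waitformove );
  DoYourJob();
}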

In the meantime the first processor
waits until all other processors are in the tree->waitformove loop,
and only then continues, so that the other processors can get a job.

So in fact there are 2 places where n-1 out of n processes (all except
the first) must wait until they have all reached the same point.
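
The first processor's side of this is not shown above. A minimal sketch
of what it might look like follows. Everything except the field names is
an assumption made for illustration: the struct layout, the use of
volatile, MAX_PROCS, the helper names all_workers_waiting() and
release_workers(), the choice of index 0 for the first processor, and
the clearing of the waitingformove flags on release. Note that volatile
only stops the compiler from caching the flags; it is not a real
synchronization primitive.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PROCS 4  /* assumption: the 4 processes mentioned above */

/* Hypothetical layout; only the field names come from the post. */
struct Tree {
    volatile bool waitforsearch;
    volatile bool waitformove;
    volatile bool job;
    volatile bool waitingformove[MAX_PROCS];
};

/* True once processors 1..nproc-1 have all set their waiting flag.
   Index 0 is assumed to be the first processor, which never waits. */
static bool all_workers_waiting(const struct Tree *tree, int nproc)
{
    assert(nproc >= 1 && nproc <= MAX_PROCS);
    for (int i = 1; i < nproc; i++) {
        if (!tree->waitingformove[i])
            return false;
    }
    return true;
}

/* First processor: busy-wait until every other processor is parked in
   the waitformove loop, then open the gate so they can take a job. */
static void release_workers(struct Tree *tree, int nproc)
{
    while (!all_workers_waiting(tree, nproc))
        ;  /* same style of busy-wait as the loops above */
    for (int i = 1; i < nproc; i++)
        tree->waitingformove[i] = false;  /* assumed reset for next round */
    tree->waitformove = false;
}
```

In this sketch the first processor would call release_workers(tree, 4)
once per move, after it has prepared the job.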

When starting a job, this synchronization must happen a few hundred
times, up to a thousand times, and all of that must complete within
a second.
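
A thousand synchronizations within a second leaves a budget of about
1 ms each. One way to check that budget could be a small timing helper
like the sketch below. It assumes a POSIX system with clock_gettime()
(so it would not build as-is on NT), and the names elapsed_seconds(),
time_sync_rounds() and sync_once are hypothetical.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime() */
#include <assert.h>
#include <time.h>

/* Seconds elapsed between two timestamps. */
static double elapsed_seconds(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Runs sync_once() `rounds` times and returns the average wall-clock
   cost of one round, in seconds. sync_once stands in for one complete
   synchronization of all processes. */
static double time_sync_rounds(void (*sync_once)(void), int rounds)
{
    struct timespec start, end;

    assert(rounds > 0);
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < rounds; i++)
        sync_once();
    clock_gettime(CLOCK_MONOTONIC, &end);
    return elapsed_seconds(start, end) / rounds;
}
```

An average above 0.001 from time_sync_rounds() with 1000 rounds would
mean the one-second budget is being missed.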

In the past the wait loops were more complicated. I have made them simpler now.

I deeply regret that.

Why do it the easy way when it can be done the hard way?

You are probably laughing now. You shouldn't.

In the past I could test my parallel program on a single processor.
Of course it was hard to detect real faults that way, but I could
at least test it. That is no longer possible.

If I start my program on a single processor now,
synchronization takes very, very long: many minutes pass before
both processes each get 50% of the system time.

However, when I start 2 processes on a dual or quad machine,
those 1000 synchronisations STILL eat 10 seconds or so, which is
also laughable compared to what it used to be.

Both NT and Linux completely f... up my program. When I first got the
results I doubted whether it was the OS or my program.

I have now figured out for sure that it is the OSes that cause this
problem. With this simple implementation they 'seem' to decide somehow
that a process is idle, and when that process then badly needs system
time, it doesn't get it.

I do not know what in the OS causes this problem. If I knew, I might
be able to work around it for the time being.

Anyway, I feel quite strongly that this basically isn't my problem but
a serious OS problem. The OSes are smart, but not smart enough
to understand that I'm running a parallel program in which it is very
important that seemingly idle processes get a good deal of system time
right away, instead of waiting for a second before getting it.

Greetings,
Vincent

-
Linux SMP list: FIRST see FAQ at http://www.irisa.fr/prive/mentre/smp-faq/
To Unsubscribe: send "unsubscribe linux-smp" to [EMAIL PROTECTED]