Re: Was ->Re: [ilug-cal] Share LI-99 - Now - Misconceptions on MIMD systems.

Shourya Sarcar Wed, 26 Jan 2000 11:13:41 -0800
On Mon, 24 Jan 2000, Shanker R Swaminathan wrote:
>*>AND You have again missed the point- I was using them to find out whether
>*>there was any percentage in running any app on a beowolf- NOT to reduce the
>*>speed of my machine - which incidently is an overclocked 333.

I think you are overlooking something very simple. If you are working at
the algorithmic level (not the application level), there is NO
justification for coding something on a uniprocessor VM. If yours is a
deterministic algorithm, you can easily work out the space-time complexity
and the surface-to-volume ratios analytically: worst case, best case and,
with a bit of luck, the average case.
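
To make the surface-to-volume point concrete, here is a minimal sketch in
Python. It assumes (my assumption, not something stated in this thread) a
square n x n grid split into p equal square tiles, where each tile exchanges
its four edges with its neighbours. The function name is hypothetical.

```python
import math

def surface_to_volume(n: int, p: int) -> float:
    """Communication-to-computation ratio for one tile of an n x n grid
    split evenly over p processors (assumes p is a perfect square)."""
    q = math.isqrt(p)
    if q * q != p or n % q != 0:
        raise ValueError("sketch assumes p is a perfect square and sqrt(p) divides n")
    side = n // q
    surface = 4 * side      # edge values exchanged with the four neighbours
    volume = side * side    # cells updated locally, with no communication
    return surface / volume

if __name__ == "__main__":
    # The ratio grows as the tiles shrink, i.e. as more processors are added.
    for p in (1, 4, 16, 64):
        print(p, surface_to_volume(1024, p))
```

Under these assumptions the ratio works out to 4*sqrt(p)/n, so it can be
read off on paper without running anything on any machine.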

Something as simple as this does not give me any reason to code the
thing on two different OSes. Why do I need multiple OSes for writing
parallel algorithms?

If you are talking about real benchmarking to see whether you get some
kind of a lift from distributing the algorithm, you might as well find it
analytically. As for an analytical explanation not being confirmed by
experimental results, the problem probably lies in the analysis.

All this while I have been talking at the algorithmic level. It is a
well-known fact that algorithms do not always scale efficiently in
practice, and things at the application level can be rather different.

However, it still beats me how you can even think of benchmarking on a
uniprocessor VM. You will definitely get incorrect indications. It might
even turn out that your parallel implementation is slower than your
sequential implementation. Sorry, not "might even"; it "surely will".
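
A rough sketch of why, in Python. The model and every name in it are my own
simplification: one physical CPU time-slices all the virtual nodes, so the
compute work is never divided, and each message adds a fixed overhead.
Context-switch cost is ignored, which only flatters the VM setup.

```python
def sequential_time(work_ops: float, op_time: float) -> float:
    """Time for one process to do all the work on one CPU."""
    return work_ops * op_time

def uniproc_vm_parallel_time(work_ops: float, op_time: float,
                             messages: int, msg_overhead: float) -> float:
    """Time for the 'parallel' version when every virtual node shares
    the same physical CPU: the work is serialised, then message-passing
    overhead is added on top."""
    compute = work_ops * op_time          # not divided by the node count
    communicate = messages * msg_overhead
    return compute + communicate

if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    seq = sequential_time(1e9, 1e-9)
    par = uniproc_vm_parallel_time(1e9, 1e-9, messages=1000, msg_overhead=1e-4)
    print(seq, par, par >= seq)
```

With any non-negative overhead the second figure can never be smaller than
the first, whatever the number of virtual nodes.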

>*> Repeating
>*>myself- if and only If under an Ideal situation - if there is a significant
>*>performance difference between a MIMD implementation of any algorithm and
>*>the SIMD one , ( IT MAY NOT be always faster- Taking the underlying h/w in
>*>consideration) 

Restating my assertion: you can do this analytically at the algorithmic
level, so why even bother to write #include <mpi.h>? Also, please state
what you mean by an Ideal situation.


>*>  Then one goes about trying
>*>to use a Parellel architecture to get a speedup( I am talking only of the
>*>need for speed - NOT fault tolerance , which is an entirely different
>*>story). This is where The price performance ratio and user needs come in. If
>*>you have enough cash - go in for a Close-coupled machine with the attendant
>*>*High* cost. If you can get an acceptable level of performance ( by user
>*>definition) using a loosely coupled architecture  like the NOW( Network of
>*>workstations ) then go for it..

You started this by justifying the use of a uniprocessor VMware; now you
have moved on to hardware granularity. Voila!!


>*>  A Beowolf is a software implementation of a
>*>closely coupled system on loosely coupled hardware. The End result is
>*>transparency - NOT a difference in speed with a NOW/POP ( The reason I told
>*>you to look up the
>*>defination is because typical examples using named pipes do NOT require a
>*>Beowolf to run).


Tch, tch. Couldn't be further from the truth. Which "definition" of
Beowulf are you looking at anyway? Try running a Beowulf cluster on a
general-purpose network and be ready to be booed down and flamed by the
Beowulfers. You can definitely run NOWs on general-purpose networks. NOWs
are NOT optimised for speed, but Beowulfs (or should it be Beowulves??) are.
It is amazing how you got confused on this point. Basically, NOWs are
about heterogeneity/COTS in parallel environments; Beowulves are for
speed.


>*>
>*>> Hi Shanker,
>*>> You seem to have missed my point. Beowulfs are clusters fine -
>*>> tuned for SPEED. SPeED is the word, nothing else. Nothing else.. we are
>*>> not looking at accomodating heterogeneity in the avavilable machines to
>*>
>*>Missed the point- I am unaware how you could get an Ideal case situation
>*>otherwise? You have to justify the investment in resources to build a
>*>beowolf.
>*>
>*>> build up a Beowulf cluster. If SPEED is not optimised we can call it a
>*>> "cluster", or a NOW or a POP.. but not a Beowulf.
>*>Totally Incorrect- Speed has nothing to do with terminology w.r.t the above
>*>statement( It has to an Extent otherwise)

Urrgh! Be very precise about this. You are incorrect :( Beowulf
clusters ARE built for speed and performance. SPEED has everything to do
with the Beowulf phenomenon. They are not about fault tolerance or running
X as a process. You can build fault tolerance into your software,
but probably not at the cost of speed.

>*>
>*>> Then again, who cares
>*>> about definitions. ?
>*>EVERYTHING depends on definations-Otherwise I could call a simple network
>*>using diskless workstations using NFS a beowolf . 

<OT>
In fact, replace that "simple network" with a quiet 100Mbps backplane and
you GET a Beowulf. Definitions can be very misleading. For example, could
you refer me to a necessary and sufficient description of system software
that distinguishes it from application software?
Do you STILL believe in what the ICSE books taught us about them, huh?
It looks easy, but as you know, you can easily contradict yourself after
having provided a definition. Non-theoretical computer science is so much
based on definitions which tread the fringes. E.g., is middleware actually
in the geometric middle? (Bad humor)
</OT>

>*> Thet is what
>*>differentiates a scientist from a dreamer- The lack of clarity in vision
>*>which a lack of precise defination brings.

<CLICHE>
you may say I'm a dreamer
but i'm not the only one :-)
</CLICHE>

>*>
>*>> I do not get your idea of running a VM on multiple machines to
>*>> build a cluster.
>*>> (A) Implementations of MPICH/PVM on Windows are flakey or broken.
>*>>      Do not even try to use them ;)
>*>
>*>I Wasn't . However-  This does defeat your need for heterogenety -:-)
>*>
>*>> (B) Running a VM on multiple machines _will_ slow down your
>*>>     performance.
>*>
>*>Who is ( Just by itself)?These exist to make higher level tools /resources
>*>like monitors ( the semaphore one) available. The trick is  to find an
>*>algorithm for say Matrix Multiplication in parellel where the subalgorithms
>*>have very little interaction( They exist _ have to give an exam on them in
>*>my sem exams!). With sufficient VM's , a significant speedup could be
>*>managed - Sufficient being the keyword.  ( Space - time complexity)


Who bothers about a __significant__ speedup?? You can get __actual__
speedups by creating an analytical model and substituting parameters like
processor periodicity and inter-processor latency, or cyclic process
allocation. Don't even talk about VMs while trying to benchmark. If you
have multiple processors, get similar types of OS running to create a
Beowulf. If you are stuck with a single processor, don't even think of
benchmarking. Stick to heterogeneity tests (which is NOT Beowulf) or at
the most efficient parallelisation of code (although I haven't the
faintest idea of how you are going to prove this on a uniproc :) All you
can prove is the optimality of the surface-to-volume ratio.
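
As a minimal sketch of such an analytical model (Python, with a deliberately
crude cost formula that is my assumption and not a standard one): compute
time divides evenly over p real processors, and each processor pays a fixed
cost per message it sends. All names and numbers are illustrative.

```python
def parallel_time(work_ops: float, op_time: float, p: int,
                  msgs_per_proc: int, msg_cost: float) -> float:
    """Estimated run time on p real processors."""
    compute = work_ops * op_time / p
    # A single processor has nobody to talk to, so no message cost.
    communicate = msgs_per_proc * msg_cost if p > 1 else 0.0
    return compute + communicate

def speedup(work_ops: float, op_time: float, p: int,
            msgs_per_proc: int, msg_cost: float) -> float:
    """Ratio of one-processor time to p-processor time."""
    t1 = parallel_time(work_ops, op_time, 1, msgs_per_proc, msg_cost)
    tp = parallel_time(work_ops, op_time, p, msgs_per_proc, msg_cost)
    return t1 / tp

if __name__ == "__main__":
    # Same algorithm and processors, two hypothetical interconnects.
    print(speedup(1e9, 1e-9, p=8, msgs_per_proc=1000, msg_cost=1e-4))
    print(speedup(1e9, 1e-9, p=8, msgs_per_proc=1000, msg_cost=1e-3))
```

Substituting a cheaper or dearer per-message cost shows where the lift
disappears: with these made-up figures the slower interconnect pushes the
speedup below 1, which is the kind of answer you want before buying nodes.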



>*>
>*>> (C) Running 450 MHz chips with a 10Mbps backplane is NOT
>*>>     Beowulfing... throwing to the winds all the fundae
>*>>             surface to volume ratios of orthogonal tile graphs.
>*>
>*>Finally you are getting somewhere-

Thanx for the assurance :-|

>*>1. That is the precise reason You need to do some benchmarks to JUSTIFY
>*>making the Beowolf.

Definitely, you do not do the benchmarking on a uniproc VM. :)

>*>  The entire point of this discussion from my side has
>*>been directed towards that. Processors will keep getting faster and This
>*>means newer MIMD algorithims which need fewer interactions/Messages to pass
>*>are needed.

I am not sure I get you. Are you talking about time complexity as a
function of processor speed? In that case, I need a revision of my
algorithm analysis and design theory... I am not sure I encountered that
anywhere. Did anybody?

>*> Once you have got such a situation - even multiple  733 Mhz m/c
>*>on such a network will be useful.

I don't get this one. Did you miss a negation somewhere? And why the
"even"?

>*>
>*>BTW the 10 MBPS limit is due to a design of a 2.5 km LAN- Reduce distance
>*>and your speed increases- But you know it anyway!
>*>

Strangely, I didn't know this one!! ILUG-cal has been one great
learning curve, thanks fellow luggers. Thanks Shanker for this one. Could
you please send me a personal mail explaining it, or an OT post to the
list?


>*>Wouldn't it be better to optimise the VM rather then throw it away and
>*>reinvent the wheel? Remember , You will have to do a lot of work to get
>*>something useful up and then you would have thrown away the advantages of
>*>the collabrative work that has already gone into the VM and Message passage
>*>system.( and not be able to take the advantage of apps written for the VM
>*>based architecture)

Why would you need a VM anyway in a situation which calls for critical
speed? Even optimising a VM will not get you close to the NOS. The
proven best way is to have the same OS running on all your nodes, if you
want to smell like a Beowulf.

Finally (hopefully),
         let me make my position with respect to this thread very clear.

        a) Beowulfs are fine-tuned for speed. Any cluster with h/w or s/w
           which is not consistent with this philosophy is NOT a Beowulf.

        b) Uniprocessor VMs are a great way to test distributed apps or
           source code porting on the same machine, either because you do
           not have access to multiple machines or because you are too
           lazy to raise your posterior over to the next m/c. HOWEVER,
           they are ruled out for benchmarking. You simply do not get a
           percentage lift.

        c) Multiprocessor VMs..? Why on earth? Why not use different
           OSes directly or, better still, why not use the same OS on the
           nodes?

        d) Time-space complexity analysis is settled on analytical
           grounds, not by running code on machines. You can get a
           confirmation of your analytical work (experimental proof) by
           running the code on parallel machines, but that's about it. A
           reworking is not usually allowed unless you are doing some kind
           of empirical work by curve fitting and inter/extrapolations. I
           am not aware of major experimental breakthroughs in such cases.

        e) Algorithmic time/space complexity does not depend upon
           processor speed.
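
A small Python illustration of point (e), using naive matrix multiplication
as the example and assuming (purely for illustration) one multiply-add per
clock cycle. The function names are hypothetical.

```python
def matmul_ops(n: int) -> int:
    """Count the multiply-adds in naive n x n matrix multiplication."""
    count = 0
    for _i in range(n):
        for _j in range(n):
            for _k in range(n):
                count += 1   # one multiply-add for C[i][j] += A[i][k] * B[k][j]
    return count

def estimated_time(n: int, clock_hz: float) -> float:
    """Wall time under the one-operation-per-cycle assumption."""
    return matmul_ops(n) / clock_hz

if __name__ == "__main__":
    # Doubling n multiplies the time by the same factor on either clock.
    for clock_hz in (333e6, 733e6):
        print(clock_hz, estimated_time(32, clock_hz) / estimated_time(16, clock_hz))
```

A faster clock shrinks every run time by a constant factor, but the growth
rate with n, which is what complexity measures, stays the same.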


Regards
Shourya

_______________________________________________________________
Shourya Sarcar         <[EMAIL PROTECTED]>  <Tel:91-033-4710477> 
Department of Computer Science and Engineering Jadavpur University   
Calcutta, India 700 032

All the world's a stage..
And I am acting tonight
C - the difference : http://www.eskimo.com/~scs/C-faq/top.html

