Re: [OMPI devel] Need help on the distribution process

Jeff Squyres (jsquyres) Sun, 17 Dec 2017 06:50:05 -0800

On Dec 16, 2017, at 3:15 AM, saisilpa b via devel <[email protected]> 
wrote:
> 
> I am using the Open MPI library for my project; it is a very old version 
> that uses commands like orterun and orted.


Can you be a little more specific: what version are you running?  Just about 
all versions of Open MPI have orterun and orted.

> I wrote a script that takes a text file of 22 lakh (2.2 million) lines as 
> input. The script has to read the lines one by one, generate output, and 
> write it to a file. The process takes quite a long time. 

> If I try to add multiple hosts to distribute the execution of this program, 
> every host reads each input line and generates the same output. I get 
> duplicate output, and it is expected to take additional time. I don't want 
> that. Can you please let us know whether there is any way to split the work 
> between the hosts? 

I can't quite tell from your short description: are you using the MPI API, or 
not?  You specifically mention "script", which implies that you are not writing 
C code / not using the MPI API.

SIDENOTE: if you *are* actually using the MPI API, this sounds like a 
user-level issue, not a development-of-Open-MPI issue.  Your question is likely 
better directed to the users list, not the devel list.  That being said, this 
is the 2nd time you have asked this question on this list, so we might as well 
leave the thread here.  For future user-level MPI API questions, however, you 
might want to direct them to the users list.

I am parsing your description to mean that you have X amount of work that is 
taking Y amount of time.  You then replicate that X amount of work on Z number 
of hosts, and it's taking more than Y amount of time.  That's probably to be 
expected.

If you want it to take less time, then you should have each of your Z hosts do 
X/Z amount of work (not X amount of work). For example, if your text file has N 
number of lines that need to be processed, then the first host should process 
lines 1 through N/Z, the second host should process lines N/Z+1 through 2*N/Z, 
...etc.
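
As a rough sketch of that split (not tied to your actual script, which I have not seen): the Python below computes which block of lines a given host should handle. The names line_range, my_lines, and rank_and_size are made up for illustration. It assumes the processes are launched by mpirun/orterun and that your Open MPI version exports the OMPI_COMM_WORLD_RANK and OMPI_COMM_WORLD_SIZE environment variables; please check that on an old release. When N is not an exact multiple of Z, the first N mod Z hosts each take one extra line.

```python
import os


def line_range(rank, num_hosts, num_lines):
    """Return the 1-based inclusive (first, last) line numbers for one host.

    Lines are split into contiguous blocks. When num_lines is not a multiple
    of num_hosts, the first (num_lines % num_hosts) hosts get one extra line.
    An empty share is returned as a range with last < first.
    """
    base, extra = divmod(num_lines, num_hosts)
    first = rank * base + min(rank, extra) + 1
    count = base + (1 if rank < extra else 0)
    return first, first + count - 1


def my_lines(lines, rank, num_hosts, num_lines):
    """Yield only the lines that belong to this host's block.

    `lines` is any iterable of lines, e.g. an open file object.
    """
    first, last = line_range(rank, num_hosts, num_lines)
    for lineno, line in enumerate(lines, start=1):
        if lineno > last:
            break  # past our block; no need to read further
        if lineno >= first:
            yield line


def rank_and_size():
    """Read this process's rank and the total process count.

    Assumption: the launcher sets OMPI_COMM_WORLD_RANK and
    OMPI_COMM_WORLD_SIZE. Falls back to a single process (rank 0 of 1).
    """
    rank = int(os.environ.get("OMPI_COMM_WORLD_RANK", "0"))
    size = int(os.environ.get("OMPI_COMM_WORLD_SIZE", "1"))
    return rank, size
```

Each process would call rank_and_size(), open the input file, and loop over my_lines(f, rank, size, N), writing its results to a per-rank output file so that hosts do not overwrite each other. The per-rank files can then be concatenated in rank order at the end.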

Scaling is rarely perfect (e.g., there are overheads in initially distributing 
the input and gathering the output at the end), but depending on how much work 
you have and how much can actually be performed independently by each of the 
hosts, you can expect some level of speedup.
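
To put rough numbers on that, here is a deliberately simple model, not a measurement: it assumes all of the work can be split evenly across the hosts and that distribution plus gathering costs a fixed amount of time. The function name and the figures in the usage note are hypothetical.

```python
def estimated_speedup(serial_time, num_hosts, overhead_time):
    """Estimate speedup under a simple fixed-overhead model.

    Assumes the work divides evenly across num_hosts and that
    distributing input and gathering output costs overhead_time in total.
    All times must be in the same unit.
    """
    parallel_time = overhead_time + serial_time / num_hosts
    return serial_time / parallel_time
```

For example, with a made-up 100-minute serial run on 4 hosts, zero overhead would give a speedup of 4, while 25 minutes of overhead would cut that to 2. Real runs will differ, especially if some of the work cannot be done independently.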

-- 
Jeff Squyres
[email protected]



_______________________________________________
devel mailing list
[email protected]
https://lists.open-mpi.org/mailman/listinfo/devel
