Hi Ralph,

you are right, the monitor hurts performance if it has to send the monitoring 
results to the data warehouse during the execution of the I/O operation. But we 
don't actually need that parallel execution for the monitor. The monitor only 
needs to gather information such as the duration of the read/write, the 
bandwidth, the number of processes used, the algorithm ID and a few other small 
items. This information can be stored in a text file and sent to the data 
warehouse when the file system is idle; in other words, the monitor sends its 
information to the data warehouse "offline".
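To make the offline idea concrete, here is a minimal sketch (illustration only, not Open MPI code; the record fields, file name and `flush_to_warehouse` helper are my own assumptions):

```python
import json
import time

LOG_FILE = "io_monitor_log.txt"  # hypothetical local buffer file

def record_io(operation, nbytes, duration_s, nprocs, algorithm_id):
    """Append one monitoring record to a local text file.

    This runs right after the I/O call returns, so nothing is sent
    over the network while the file system is busy.
    """
    record = {
        "time": time.time(),
        "op": operation,
        "bytes": nbytes,
        "duration_s": duration_s,
        "bandwidth_MBps": nbytes / duration_s / 1e6,
        "nprocs": nprocs,
        "algorithm_id": algorithm_id,
    }
    with open(LOG_FILE, "a") as f:
        f.write(json.dumps(record) + "\n")

def flush_to_warehouse(send):
    """Later, when the file system is idle, ship the buffered records."""
    with open(LOG_FILE) as f:
        for line in f:
            send(json.loads(line))
```

The key property is that `record_io` touches only a local file; the warehouse upload happens entirely outside the I/O critical path.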

I will try to find out the impact of comm_spawn on the overall performance. 
Besides starting another MPI process to monitor the performance, is there any 
way to integrate a monitoring function within the MPI process, or even within 
the MPI I/O operation itself? That is, we start one MPI process with multiple 
threads for I/O operations, one of which is in charge of monitoring. Would 
that hurt performance much? If necessary, the Open MPI source code could be 
modified for this purpose.
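The monitoring-thread variant could look conceptually like this (a Python sketch of the idea only; a real implementation would live inside Open MPI's C code, and the `timed_io` wrapper stands in for the real MPI I/O call):

```python
import queue
import threading
import time

events = queue.Queue()   # I/O threads push timing events here
collected = []           # what the monitor thread has gathered

def monitor():
    """Runs alongside the I/O threads and drains events cheaply."""
    while True:
        event = events.get()
        if event is None:        # shutdown sentinel
            break
        collected.append(event)

def timed_io(op, nbytes, do_io):
    """Wrap an I/O call; the only overhead on the I/O path is
    two clock reads and one queue put."""
    start = time.perf_counter()
    do_io()
    events.put({"op": op, "bytes": nbytes,
                "duration_s": time.perf_counter() - start})

t = threading.Thread(target=monitor)
t.start()
timed_io("write", 1024, lambda: None)   # stand-in for the real I/O
events.put(None)
t.join()
```

In this arrangement the I/O threads never block on the monitor: they hand off a small event and continue, which is why the overhead should stay small.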

The database is actually independent of the I/O operation. The only 
disadvantage I can see is that the MPI process has to wait while accessing the 
database, especially if the table is very large.
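One way to bound that waiting time would be to query the database with a timeout and fall back to a default algorithm when the lookup is slow (a hypothetical sketch; the timeout value, function names and fallback constant are my own):

```python
import threading

DEFAULT_ALGORITHM = 0  # assumed fallback when the database is slow

def choose_algorithm(db_lookup, timeout_s=0.05):
    """Run db_lookup in a helper thread; give up after timeout_s seconds
    and fall back to the default, so the I/O itself is never delayed
    for long by the database."""
    result = {}

    def worker():
        result["algorithm"] = db_lookup()

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout_s)
    return result.get("algorithm", DEFAULT_ALGORITHM)
```

The trade-off is that a slow lookup costs only a bounded delay plus a possibly suboptimal (but still correct) algorithm choice.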

The goal of this concept is to adjust the I/O operation parameters 
automatically according to the historical I/O results, so that each I/O 
operation performs at least no worse than previous similar I/O operations. 
The system therefore includes a learning phase: a better combination of 
parameters and algorithm replaces an older, worse one.
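At its simplest, the learning phase could keep, per workload class, the best parameter combination seen so far (a toy sketch; the key fields and record shape are assumptions):

```python
best = {}  # (nprocs, size_class) -> (params, bandwidth_MBps)

def learn(nprocs, size_class, params, bandwidth):
    """Replace the stored combination only if the new run was faster."""
    key = (nprocs, size_class)
    if key not in best or bandwidth > best[key][1]:
        best[key] = (params, bandwidth)

def lookup(nprocs, size_class):
    """Return the best known parameters for a similar I/O operation,
    or None if nothing similar has been seen yet."""
    entry = best.get((nprocs, size_class))
    return entry[0] if entry else None
```

This captures the "at least not worse" guarantee directly: an entry is only ever replaced by a strictly better one.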

By the way, I don't understand exactly what you mean by "OMPI's design is 
intended to embed such "hints" in its selection logic. So if there are algos 
for determining which params are best given number of procs etc, then the idea 
is to embed those algos, and then let the sys admin or users "tune" them by 
setting default params in their default MCA param files." In my concept, the 
selection of the parameters and algorithm is based on the historical 
executions, which are stored in the database.
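For reference, the MCA param files mentioned are files such as $HOME/.openmpi/mca-params.conf, where defaults can be set per user or system-wide. A fragment might look like this (the tuning comment is illustrative; `ompi_info --param io all` lists the parameters a given build actually exposes):

```
# $HOME/.openmpi/mca-params.conf -- per-user MCA defaults
io = ompio          # prefer the OMPIO I/O component
# further io_ompio_* tuning parameters would go here
```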

Best Regards!
Xuan

----- Original Message -----
From: "Ralph Castain" <r...@open-mpi.org>
To: "Open MPI Users" <us...@open-mpi.org>
Sent: Wednesday, June 13, 2012 3:35:56 PM
Subject: Re: [OMPI users] An idea about a semi-automatic optimized parallel 
I/O with Open MPI

So...you want the remote MPI process to comm_spawn another MPI process that 
will monitor the MPI I/O operation? That sounds expensive - comm_spawn is a 
rather slow operation in itself.

<shrug> it would work, but I can't imagine it would be very performant. Note 
that the spawned "monitor" would only live for the lifetime of the job, so I'm 
not sure what value it adds. You might as well just monitor performance in the 
original app and dump that data into the database.

FWIW: OMPI's design is intended to embed such "hints" in its selection logic. 
So if there are algos for determining which params are best given number of 
procs etc, then the idea is to embed those algos, and then let the sys admin or 
users "tune" them by setting default params in their default MCA param files.

Not saying that a database is a bad idea - just saying it is a design 
departure. We haven't done it because monitoring performance automatically 
hurts the performance of the app being monitored.


On Jun 13, 2012, at 7:13 AM, Xuan Wang wrote:

> Hi Ralph,
> 
> thank you for the advice.
> 
> You are right, the daemons are NOT MPI processes. I would like to use the 
> Open MPI I/O module for the implementation.
> 
> In my opinion, the commands sent by the client will start an MPI I/O 
> operation; therefore, the client can start an MPI process. In addition, I 
> have found a similar "selection logic" module in OMPIO, which is a new MPI 
> I/O architecture in Open MPI (besides ROMIO). Therefore, the whole process 
> from "client call" to "returning result" is an MPI process, if I am not 
> mistaken.
> 
> Best Regards!
> Xuan
> 
> ----- Original Message -----
> From: "Ralph Castain" <r...@open-mpi.org>
> To: "Open MPI Users" <us...@open-mpi.org>
> Sent: Wednesday, June 13, 2012 2:44:31 PM
> Subject: Re: [OMPI users] An idea about a semi-automatic optimized parallel 
> I/O with Open MPI
> 
> One flaw in the idea: the daemons are not MPI processes, and therefore have 
> no way to run an MPI I/O operation.
> 
> 
> On Jun 13, 2012, at 5:40 AM, Xuan Wang wrote:
> 
>> Hi,
>> 
>> I have an idea about using a database to support a kind of semi-automatic 
>> optimized parallel I/O and want to know whether it is realizable or not. I 
>> hope you can give me more advice and point out the shortcomings of the 
>> idea. Thank you all.
>> 
>> As the performance of parallel I/O depends on the parallel I/O algorithm, 
>> the file layout in the file system, the number of processes used for I/O 
>> and so on, we can use MPI hints to control the parameters manually. But 
>> sometimes the client, or whoever calls the I/O operation, doesn't know 
>> which parameters are best.
>> 
>> Therefore, we think about using the data warehouse and an I/O monitor to 
>> realize the optimization phase. Please take a look at the attached picture 
>> first.
>> 
>> Process explanations:
>> 1. The client sends the I/O commands with hints (optional) to the daemon. 
>> The selection module decides whether it is necessary to query the I/O 
>> database in order to get an optimized I/O operation strategy.
>> 2. If yes, the selection module sends the I/O commands, together with the 
>> parameters that can be used to choose the optimized I/O algorithm, to the 
>> knowledge base or database.
>> 3 & 4. The selection module gets the optimized algorithm and runs the I/O 
>> operation.
>> 5 & 6. During the I/O operation, the monitor gathers the 
>> performance-related information and sends it to the data warehouse, where 
>> it is used to analyze the performance of the optimized algorithm and to 
>> support the semi-automatic optimization.
>> 
>> These are the basic thoughts behind the whole process. Please feel free to 
>> ask about any details of this system/concept. I will try my best to 
>> explain it.
>> 
>> I am happy if someone can take part in the discussion.
>> 
>> Thanks!
>> 
>> Best Regards!
>> Xuan Wang
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 


