Re: Can tomcat serve MPI (parallel) applications?
2015-11-30 0:56 GMT+03:00 Martijn Slouter:
> Hello John,
> as far as I know, all processes are equivalent, although the one with
> rank 0 is usually used for logging unless each process has to
> contribute its own logging messages.
>
> I am using the openmpi MPI software as basis with business logic
> written in C and a JNI interface to make it available in java. Each
> java process owns part of the data, does some sort of presentation
> logic on this part of data, and is then supposed to pass the final
> presentation to the web through tomcat. As you mentioned, the MPI
> hosts all run in one single server rack with Gigabit ethernet
> interconnection.
>
> My impression is that my question can only be answered by somebody who
> is familiar with both MPI and tomcat. If there is no solution, I will
> probably write my own, very limited servlet container, which is
> compatible with MPI but not as powerful as tomcat.

You may want to look at how an embedded Tomcat is started. The test/
directory in the source tree has many examples. Look for
"Tomcat.addServlet()" calls.

Best regards,
Konstantin Kolinko

To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
Re: Can tomcat serve MPI (parallel) applications?
John and Martijn,

On 11/30/15 7:57 PM, john Matlock wrote:
> As I said earlier, it has been a long time since I worked with MPI (1991 to
> be exact) so I am not quite understanding your application or the software
> topology. Let me ask a few more questions. It seems you say that a
> request for processing comes in, the task is split among many hosts using
> MPI, probably with MPI_Put and MPI_Get to handle the communications between
> hosts. In this kind of system each host only works on one part of the
> problem, yet you say that the subsidiary host is to send the "final
> presentation" over the web, back to presumably whoever asked for the
> processing to be done, wherever in the world it is needed. If the
> subsidiary host is only working on part of the problem, how does it get the
> data results from however many other hosts are doing their part of the
> processing? If a single host can produce the results, what is MPI being
> used to do? Why not just send the request to the server that is going to
> process the data?
>
> Again going back over the years, the system I worked on had a front end
> machine that was sent the problem/application to be processed. It broke up
> the processing task to get more CPU power onto the problem. Then the
> results from the individual tasks were sent back to the front end which
> assembled the independent or intermediate results into the final
> presentation. The whole system was not connected to the web (1991
> remember), but I see no reason that the front end machine couldn't take in
> requests from the web and then use the web to distribute these results for
> which Tomcat would work fine. But I don't understand why you would want to
> use Tomcat on each individual processing node. MPI communicators would
> seem to handle this internal communication better than trying to fit these
> communications onto some kind of intranet.
>
> I don't think I am helping you very much.

I'm not sure why it's important to start Tomcat itself using "mpirun". Isn't Tomcat just the interface through which clients submit jobs to the MPI cluster?

Just have your Tomcat start up normally, accept an HTTP request, and then send that request off to the MPI server(s) using whatever mechanism is typically used (socket, shared memory, etc.). Tomcat itself doesn't have to participate in the whole MPI party, does it?

-chris
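
The hand-off described in this reply (Tomcat accepts the HTTP request, then passes it to the MPI side over a socket) might look roughly like the sketch below. It uses only the JDK, so it is not tied to any servlet API. The class name MpiJobClient, the one-line request/one-line reply protocol, and the stub backend standing in for the MPI front-end process are all assumptions made for illustration; nothing in the thread specifies them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical client that a servlet could call to hand a job to the MPI front end. */
class MpiJobClient {
    private final String host;
    private final int port;

    MpiJobClient(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /** Sends one request line and returns the single reply line (assumed protocol). */
    String submit(String requestLine) throws IOException {
        try (Socket socket = new Socket(host, port);
             PrintWriter out = writer(socket);
             BufferedReader in = reader(socket)) {
            out.println(requestLine);
            return in.readLine();
        }
    }

    /** Stand-in for the MPI front-end process: answers exactly one request. */
    static Thread startStubBackend(ServerSocket server) {
        Thread thread = new Thread(() -> {
            try (Socket socket = server.accept();
                 BufferedReader in = reader(socket);
                 PrintWriter out = writer(socket)) {
                out.println("result for " + in.readLine());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        thread.start();
        return thread;
    }

    private static BufferedReader reader(Socket socket) throws IOException {
        return new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
    }

    private static PrintWriter writer(Socket socket) throws IOException {
        // autoFlush=true so println() pushes the line out immediately
        return new PrintWriter(
                new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) { // port 0: let the OS pick a free port
            Thread backend = startStubBackend(server);
            MpiJobClient client = new MpiJobClient("127.0.0.1", server.getLocalPort());
            System.out.println(client.submit("job-1"));
            backend.join();
        }
    }
}
```

In a real deployment the servlet's doGet() would call something like submit() and write the reply to the HTTP response, while the listening side would live in the rank 0 MPI process.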
Re: Can tomcat serve MPI (parallel) applications?
As I said earlier, it has been a long time since I worked with MPI (1991 to be exact) so I am not quite understanding your application or the software topology. Let me ask a few more questions. It seems you say that a request for processing comes in, the task is split among many hosts using MPI, probably with MPI_Put and MPI_Get to handle the communications between hosts. In this kind of system each host only works on one part of the problem, yet you say that the subsidiary host is to send the "final presentation" over the web, back to presumably whoever asked for the processing to be done, wherever in the world it is needed. If the subsidiary host is only working on part of the problem, how does it get the data results from however many other hosts are doing their part of the processing? If a single host can produce the results, what is MPI being used to do? Why not just send the request to the server that is going to process the data?

Again going back over the years, the system I worked on had a front end machine that was sent the problem/application to be processed. It broke up the processing task to get more CPU power onto the problem. Then the results from the individual tasks were sent back to the front end which assembled the independent or intermediate results into the final presentation. The whole system was not connected to the web (1991 remember), but I see no reason that the front end machine couldn't take in requests from the web and then use the web to distribute these results for which Tomcat would work fine. But I don't understand why you would want to use Tomcat on each individual processing node. MPI communicators would seem to handle this internal communication better than trying to fit these communications onto some kind of intranet.

I don't think I am helping you very much.

John Matlock

On Mon, Nov 30, 2015 at 2:25 AM, Stefan Mayr wrote:
> On 29.11.2015 at 19:24, Martijn Slouter wrote:
>
>> Thanks for your reply, comments below:
>> ...
>> Any suggestion how I can accomplish the configuration, if I start
>> tomcat with the MPI web application using "mpirun -n 2 java ..." so
>> that only the first MPI process opens the tomcat communication ports,
>> while all other MPI processes disable their communicators?
>>
>> As an alternative I can run the MPI application as a separate server
>> (tested across 16 hosts already), and use tomcat as a (serial) client
>> to this parallel server. The disadvantage is that huge amounts of data
>> need to be processed another time instead of being served directly
>> from the MPI application.
>>
> How does mpirun communicate to the started java process that it is the
> first process? Maybe it is easier to write a wrapper that can decide which
> tomcat configuration to use depending on whether this is your master
> process or not. If you write this wrapper in java you could use an
> embedded tomcat or jetty to start up a servlet container where needed.
>
> Regards,
>
> Stefan Mayr
Re: Can tomcat serve MPI (parallel) applications?
On 29.11.2015 at 19:24, Martijn Slouter wrote:
> Thanks for your reply, comments below:
> ...
> Any suggestion how I can accomplish the configuration, if I start
> tomcat with the MPI web application using "mpirun -n 2 java ..." so
> that only the first MPI process opens the tomcat communication ports,
> while all other MPI processes disable their communicators?
>
> As an alternative I can run the MPI application as a separate server
> (tested across 16 hosts already), and use tomcat as a (serial) client
> to this parallel server. The disadvantage is that huge amounts of data
> need to be processed another time instead of being served directly
> from the MPI application.

How does mpirun communicate to the started java process that it is the first process? Maybe it is easier to write a wrapper that can decide which tomcat configuration to use depending on whether this is your master process or not. If you write this wrapper in java you could use an embedded tomcat or jetty to start up a servlet container where needed.

Regards,

Stefan Mayr
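
A wrapper of the kind suggested here could be sketched as follows. It assumes Open MPI, whose mpirun exports the rank of each launched process in the OMPI_COMM_WORLD_RANK environment variable; other MPI implementations use different variable names, and calling MPI_Comm_rank through the existing JNI layer would be an alternative. The class name RankAwareLauncher is hypothetical, and the actual container start-up is left as a placeholder comment because it depends on which embedded container is chosen.

```java
import java.util.Map;

/** Hypothetical launcher: only the MPI process with rank 0 starts a servlet container. */
class RankAwareLauncher {
    /** Environment variable that Open MPI's mpirun sets for every process it launches. */
    static final String RANK_VAR = "OMPI_COMM_WORLD_RANK";

    /** Returns the rank found in the environment, or 0 when not started under mpirun. */
    static int rank(Map<String, String> env) {
        String value = env.get(RANK_VAR);
        return value == null ? 0 : Integer.parseInt(value.trim());
    }

    /** True for the single process that should open the HTTP port. */
    static boolean isFrontEnd(Map<String, String> env) {
        return rank(env) == 0;
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        if (isFrontEnd(env)) {
            // Placeholder: start the embedded Tomcat or Jetty here.
            System.out.println("rank 0: starting servlet container");
        } else {
            // All other ranks skip the container and only take part in the MPI computation.
            System.out.println("rank " + rank(env) + ": compute only, no HTTP connector");
        }
    }
}
```

Started as "mpirun -n 2 java RankAwareLauncher", only one of the two processes would take the first branch, so only one process would try to bind the HTTP port.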
Re: Can tomcat serve MPI (parallel) applications?
Hello John,

as far as I know, all processes are equivalent, although the one with rank 0 is usually used for logging unless each process has to contribute its own logging messages.

I am using the openmpi MPI software as a basis, with business logic written in C and a JNI interface to make it available in java. Each java process owns part of the data, does some sort of presentation logic on this part of the data, and is then supposed to pass the final presentation to the web through tomcat. As you mentioned, the MPI hosts all run in one single server rack with Gigabit ethernet interconnection.

My impression is that my question can only be answered by somebody who is familiar with both MPI and tomcat. If there is no solution, I will probably write my own, very limited servlet container, which is compatible with MPI but not as powerful as tomcat.

Thank you
Martijn

On Sun, Nov 29, 2015 at 9:32 PM, john Matlock wrote:
> It has been a lot of years since I worked with MPI, but IIRC one "host" has
> to be the master (usually called mpirun or mpiexec) that distributes the
> tasks to the "dependent hosts" and then collects the processed results. If
> this is true, then using one machine as a dedicated front end makes sense
> to me. Are the dependent hosts connected to the master via a LAN (perhaps
> a Beowulf cluster) or are all of them distributed and receiving/returning
> data over the web as well? If co-located, then a gigabit LAN can handle
> the communication between hosts at fairly high speed. If this is not fast
> enough then you need to go to something faster (and more expensive) like
> the PCI-Express system from Dolphin. Going over the web for "huge amounts"
> of data is going to be limited to the bandwidth of the internet
> connection, i.e. much slower than a LAN. It may be possible to have the
> individual tasks sent to the processing hosts individually, but again, it
> seems to me that this is the function of the master host.
>
> Are the applications you run all using data from a single big collection of
> data like a database? Perhaps if widely distributed you could supply the
> data set to all the hosts using something like a backup/restore model.
> Then these copies of the data set could be transferred to the individual
> hosts using something like a flash drive or SD card. A briefcase full of
> these memory devices on an airplane has a hell of a lot of bandwidth.
> Even though we've become more accustomed to just dumping everything on
> the net, when it comes to terabytes, petabytes, exabytes, zettabytes, or
> yottabytes of data, the web isn't the answer.
>
> If the whole thing has to come over the internet, would something like
> Linda or Rinda software help you?
>
> As I say, it has been many years since I worked with MPI and with the rate
> of change in this business, I may have it all wrong. I hope I'm being
> helpful rather than just cluttering up your mailbox.
>
> What MPI software are you using? Are the applications written primarily in
> FORTRAN with a mixture of other languages?
>
> Good Luck!
>
> John Matlock
>
> On Sun, Nov 29, 2015 at 10:24 AM, Martijn Slouter wrote:
>
>> Thanks for your reply, comments below:
>>
>> On Fri, Nov 27, 2015 at 10:15 AM, Konstantin Kolinko wrote:
>> > What is your goal, your expectation of Tomcat? What should these n
>> > instances do that 1 instance cannot?
>>
>> They are running CPU-intensive calculations on distributed hosts
>> ("high performance computing"), so that all hosts share the CPU and RAM
>> requirements. Tomcat will allow interaction with the MPI application
>> through the internet.
>>
>> > It is possible to start several Tomcats with the same CATALINA_BASE in
>> > parallel, but you have to
>> > ...
>> > A connector can be configured, reconfigured, started/stopped
>> > programmatically via JMX.
>>
>> Any suggestion how I can accomplish the configuration, if I start
>> tomcat with the MPI web application using "mpirun -n 2 java ..." so
>> that only the first MPI process opens the tomcat communication ports,
>> while all other MPI processes disable their communicators?
>>
>> As an alternative I can run the MPI application as a separate server
>> (tested across 16 hosts already), and use tomcat as a (serial) client
>> to this parallel server. The disadvantage is that huge amounts of data
>> need to be processed another time instead of being served directly
>> from the MPI application.
>>
>> Which solution do you suggest?
>>
>> Thank you
>> Martijn
Re: Can tomcat serve MPI (parallel) applications?
It has been a lot of years since I worked with MPI, but IIRC one "host" has to be the master (usually called mpirun or mpiexec) that distributes the tasks to the "dependent hosts" and then collects the processed results. If this is true, then using one machine as a dedicated front end makes sense to me. Are the dependent hosts connected to the master via a LAN (perhaps a Beowulf cluster) or are all of them distributed and receiving/returning data over the web as well? If co-located, then a gigabit LAN can handle the communication between hosts at fairly high speed. If this is not fast enough then you need to go to something faster (and more expensive) like the PCI-Express system from Dolphin. Going over the web for "huge amounts" of data is going to be limited to the bandwidth of the internet connection, i.e. much slower than a LAN. It may be possible to have the individual tasks sent to the processing hosts individually, but again, it seems to me that this is the function of the master host.

Are the applications you run all using data from a single big collection of data like a database? Perhaps if widely distributed you could supply the data set to all the hosts using something like a backup/restore model. Then these copies of the data set could be transferred to the individual hosts using something like a flash drive or SD card. A briefcase full of these memory devices on an airplane has a hell of a lot of bandwidth. Even though we've become more accustomed to just dumping everything on the net, when it comes to terabytes, petabytes, exabytes, zettabytes, or yottabytes of data, the web isn't the answer.

If the whole thing has to come over the internet, would something like Linda or Rinda software help you?

As I say, it has been many years since I worked with MPI and with the rate of change in this business, I may have it all wrong. I hope I'm being helpful rather than just cluttering up your mailbox.

What MPI software are you using? Are the applications written primarily in FORTRAN with a mixture of other languages?

Good Luck!

John Matlock

On Sun, Nov 29, 2015 at 10:24 AM, Martijn Slouter wrote:
> Thanks for your reply, comments below:
>
> On Fri, Nov 27, 2015 at 10:15 AM, Konstantin Kolinko wrote:
> > What is your goal, your expectation of Tomcat? What should these n
> > instances do that 1 instance cannot?
>
> They are running CPU-intensive calculations on distributed hosts
> ("high performance computing"), so that all hosts share the CPU and RAM
> requirements. Tomcat will allow interaction with the MPI application
> through the internet.
>
> > It is possible to start several Tomcats with the same CATALINA_BASE in
> > parallel, but you have to
> > ...
> > A connector can be configured, reconfigured, started/stopped
> > programmatically via JMX.
>
> Any suggestion how I can accomplish the configuration, if I start
> tomcat with the MPI web application using "mpirun -n 2 java ..." so
> that only the first MPI process opens the tomcat communication ports,
> while all other MPI processes disable their communicators?
>
> As an alternative I can run the MPI application as a separate server
> (tested across 16 hosts already), and use tomcat as a (serial) client
> to this parallel server. The disadvantage is that huge amounts of data
> need to be processed another time instead of being served directly
> from the MPI application.
>
> Which solution do you suggest?
>
> Thank you
> Martijn
Re: Can tomcat serve MPI (parallel) applications?
Thanks for your reply, comments below:

On Fri, Nov 27, 2015 at 10:15 AM, Konstantin Kolinko wrote:
> What is your goal, your expectation of Tomcat? What should these n
> instances do that 1 instance cannot?

They are running CPU-intensive calculations on distributed hosts ("high performance computing"), so that all hosts share the CPU and RAM requirements. Tomcat will allow interaction with the MPI application through the internet.

> It is possible to start several Tomcats with the same CATALINA_BASE in
> parallel, but you have to
> ...
> A connector can be configured, reconfigured, started/stopped
> programmatically via JMX.

Any suggestion how I can accomplish the configuration, if I start tomcat with the MPI web application using "mpirun -n 2 java ..." so that only the first MPI process opens the tomcat communication ports, while all other MPI processes disable their communicators?

As an alternative I can run the MPI application as a separate server (tested across 16 hosts already), and use tomcat as a (serial) client to this parallel server. The disadvantage is that huge amounts of data need to be processed another time instead of being served directly from the MPI application.

Which solution do you suggest?

Thank you
Martijn
RE: Can tomcat serve MPI (parallel) applications?
Hi Konstantin,

-----Original Message-----
From: Konstantin Kolinko [mailto:knst.koli...@gmail.com]
Sent: 27 November 2015 09:15
To: Tomcat Users List
Subject: Re: Can tomcat serve MPI (parallel) applications?

2015-11-26 23:18 GMT+03:00 Martijn Slouter:
> Hello,
> I am looking for a solution for a tomcat container, which is supposed
> to serve a web application, which is using MPI (openmpi) internally.
> (The servlet is making JNI calls to C library functions. I have
> validated that this Java-MPI connection runs without problems when NOT
> using tomcat.)
>
> In catalina.sh, I have changed the lines which actually start tomcat
> eval "\"$_RUNJAVA\"" "\"$LOGGING_CONFIG\"" $LOGGING_MANAGER ...
> into the same command preceded by mpirun:
> eval mpirun -n 2 "\"$_RUNJAVA\"" "\"$LOGGING_CONFIG\""
> $LOGGING_MANAGER ...
>
> However, in catalina.out I get errors like
> "... java.net.BindException: Address already in use ..."
> This makes sense, because both MPI processes will try to bind to the
> same address.
>
> Is there any chance to have tomcat serve a web application which is
> using native MPI functions inside one of its servlets?
>
> I am using apache-tomcat-7.0.65 on Ubuntu 15.04.

What is your goal, your expectation of Tomcat? What should these n instances do that 1 instance cannot?

It is possible to start several Tomcats with the same CATALINA_BASE in parallel, but you have to

1. Disable shutdown port (set port="-1" on the <Server> element, if I remember correctly)

   It means that these Tomcats have to shut themselves down eventually (like explicitly calling System.exit()), or you have to kill them by sending a signal (knowing the pid of the process).

2. Remove connectors, or disable them (port="-1" if I remember correctly), or configure them to autoselect a random port number (port="0" if I remember correctly)

   A connector can be configured, reconfigured, started/stopped programmatically via JMX.

3. Do not perform any writing activity in CATALINA_BASE

   - Do not deploy war files (so that Tomcat does not need to unpack them)
   - Do not deploy any new applications while Tomcat is running. Turn off the autoDeploy feature on Host. Do not use the Tomcat Manager web application.
   - Do not compile JSP pages. Turn them into servlets by precompiling them with Jasper JspC.
   - Do not write serialized session data (configure <Manager> with pathname="").
   - Turn off logging (or turn a deaf ear to it trying to concurrently write into the same log files).

Best regards,
Konstantin Kolinko

In reply to: "It is possible to start several Tomcats with the same CATALINA_BASE in parallel, but you have to 1. Disable shutdown port (set port="-1" on the <Server> element, if I remember correctly)"

The approach I've taken is to create a port standard for each JVM instance, where each protocol in use within the JVM has its own unique port number; this prevents conflicts between multiple Tomcat instances.

E.g., where there are four Tomcat instances, for the shutdown port:

tomcat0/conf/server.xml:
tomcat1/conf/server.xml:
tomcat2/conf/server.xml:
tomcat3/conf/server.xml:

The same approach applied for the HTTP connector:

tomcat0/conf/server.xml:
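
As a rough illustration of such a per-instance port standard, the sketch below derives a shutdown port and an HTTP port from the instance index. The base values are Tomcat's default shutdown port (8005) and default HTTP connector port (8080); the stride of 100 and the class name PortPlan are assumptions chosen for the example and are not the poster's actual values.

```java
/** Hypothetical per-instance port plan; the stride between instances is an assumption. */
final class PortPlan {
    static final int SHUTDOWN_BASE = 8005; // Tomcat's default shutdown port
    static final int HTTP_BASE = 8080;     // Tomcat's default HTTP connector port
    static final int STRIDE = 100;         // assumed gap between consecutive instances

    final int shutdownPort;
    final int httpPort;

    private PortPlan(int shutdownPort, int httpPort) {
        this.shutdownPort = shutdownPort;
        this.httpPort = httpPort;
    }

    /** Computes the ports for instance tomcat0, tomcat1, ... so that none overlap. */
    static PortPlan forInstance(int index) {
        if (index < 0) {
            throw new IllegalArgumentException("instance index must be >= 0: " + index);
        }
        return new PortPlan(SHUTDOWN_BASE + index * STRIDE, HTTP_BASE + index * STRIDE);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 4; i++) {
            PortPlan plan = forInstance(i);
            System.out.printf("tomcat%d/conf/server.xml: shutdown port=%d, HTTP port=%d%n",
                    i, plan.shutdownPort, plan.httpPort);
        }
    }
}
```

Each computed pair would then be written into the port attribute of the <Server> element and of the HTTP <Connector> element in the corresponding instance's server.xml.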
Re: Can tomcat serve MPI (parallel) applications?
2015-11-26 23:18 GMT+03:00 Martijn Slouter:
> Hello,
> I am looking for a solution for a tomcat container, which is supposed to
> serve a web application, which is using MPI (openmpi) internally. (The
> servlet is making JNI calls to C library functions. I have validated that
> this Java-MPI connection runs without problems when NOT using tomcat.)
>
> In catalina.sh, I have changed the lines which actually start tomcat
> eval "\"$_RUNJAVA\"" "\"$LOGGING_CONFIG\"" $LOGGING_MANAGER ...
> into the same command preceded by mpirun:
> eval mpirun -n 2 "\"$_RUNJAVA\"" "\"$LOGGING_CONFIG\"" $LOGGING_MANAGER
> ...
>
> However, in catalina.out I get errors like
> "... java.net.BindException: Address already in use ..."
> This makes sense, because both MPI processes will try to bind to the same
> address.
>
> Is there any chance to have tomcat serve a web application which is using
> native MPI functions inside one of its servlets?
>
> I am using apache-tomcat-7.0.65 on Ubuntu 15.04.

What is your goal, your expectation of Tomcat? What should these n instances do that 1 instance cannot?

It is possible to start several Tomcats with the same CATALINA_BASE in parallel, but you have to

1. Disable shutdown port (set port="-1" on the <Server> element, if I remember correctly)

   It means that these Tomcats have to shut themselves down eventually (like explicitly calling System.exit()), or you have to kill them by sending a signal (knowing the pid of the process).

2. Remove connectors, or disable them (port="-1" if I remember correctly), or configure them to autoselect a random port number (port="0" if I remember correctly)

   A connector can be configured, reconfigured, started/stopped programmatically via JMX.

3. Do not perform any writing activity in CATALINA_BASE

   - Do not deploy war files (so that Tomcat does not need to unpack them)
   - Do not deploy any new applications while Tomcat is running. Turn off the autoDeploy feature on Host. Do not use the Tomcat Manager web application.
   - Do not compile JSP pages. Turn them into servlets by precompiling them with Jasper JspC.
   - Do not write serialized session data (configure <Manager> with pathname="").
   - Turn off logging (or turn a deaf ear to it trying to concurrently write into the same log files).

Best regards,
Konstantin Kolinko
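
The "Address already in use" error from the original question, and the idea of letting the operating system pick a free port, can be reproduced outside Tomcat with plain JDK sockets. The sketch below is only an analogy: it opens two listeners inside one JVM, whereas under "mpirun -n 2" the two listeners belong to two separate JVMs, but the bind conflict is of the same kind. The class name PortConflictDemo is hypothetical.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

/** Small demonstration of a port conflict between two listeners. */
class PortConflictDemo {
    /** Returns true when opening another listener on the given port is rejected. */
    static boolean secondBindFails(int port) throws IOException {
        try (ServerSocket second = new ServerSocket(port)) {
            return false; // the port was free, so there was no conflict
        } catch (BindException e) {
            return true;  // same failure class as in the catalina.out excerpt
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the operating system to choose any free port.
        try (ServerSocket first = new ServerSocket(0)) {
            int port = first.getLocalPort();
            System.out.println("first listener bound to port " + port);
            System.out.println("second bind rejected: " + secondBindFails(port));
        }
    }
}
```

This is why each process started by mpirun needs either its own port numbers or no connector at all.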