Re: Planning for In-Situ Application and Resource Monitoring [GSoC Project]

Pierce, Marlon Wed, 04 May 2016 15:48:07 -0700

+1 for publishing to RabbitMQ. Don’t worry about XBaya as it is obsolete; 
upgrading it is another GFAC project. I suggest you focus on the SimStream to 
RabbitMQ parts first. The API changes will need additional discussion, probably 
over a hangout.

Marlon

From: Jeffery Kinnison <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, May 4, 2016 at 6:41 PM
To: "[email protected]" <[email protected]>
Subject: Re: Planning for In-Situ Application and Resource Monitoring [GSoC 
Project]

The more I look into it, the more I like using RabbitMQ within the SimStream 
program to communicate with the Airavata server. Here are the practical steps 
I have so far for implementing the project:

SimStream:

  *   Refactor to use RabbitMQ queues instead of Tornado Web Server.
  *   Define a config file to send with each job that contains information 
about how to contact the Airavata server, which scripts to run to collect the 
simulation and resource data, and any arguments to pass to the scripts. This 
will decouple data collection logic from SimStream and hopefully eliminate the 
need for long-running data collection scripts (i.e., one data point is 
collected per run of the collection script).
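
A minimal sketch of the config-driven, one-data-point-per-run idea described 
above. The JSON layout, the field names (airavata, collectors, stream, script, 
args), and the script name collect_demo.py are all assumptions made for 
illustration; none of them come from SimStream or Airavata.

```python
import json
import subprocess
import sys

# Hypothetical per-job config. Every field name here is an assumption.
EXAMPLE_CONFIG = """
{
  "airavata": {"host": "localhost", "port": 5672, "queue": "job_data"},
  "collectors": [
    {"stream": "demo", "script": "collect_demo.py", "args": ["--flag"]}
  ]
}
"""


def load_config(text):
    """Parse the job config and check that the two expected sections exist."""
    config = json.loads(text)
    for key in ("airavata", "collectors"):
        if key not in config:
            raise ValueError("missing config section: " + key)
    return config


def collect_once(collector, interpreter=sys.executable):
    """Run one collection script to completion and return one data point.

    The script is short-lived: it prints a single value and exits, so no
    long-running collection process is needed.
    """
    cmd = [interpreter, collector["script"]] + list(collector.get("args", []))
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return {"stream": collector["stream"], "value": result.stdout.strip()}
```

With this shape, SimStream would only need to loop over config["collectors"] 
on a timer and forward each returned data point; the collection logic itself 
stays in the scripts.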

Within the Airavata API Server:

  *   Extend the existing org.apache.airavata.model.job.JobModel to include 
information about contacting SimStream (queue name, valid data stream names).
  *   Add classes RabbitMQJobDataPublisher and RabbitMQJobDataConsumer 
(analogous to 
org.apache.airavata.messaging.core.impl.RabbitMQProcessLaunchPublisher and 
org.apache.airavata.messaging.core.impl.RabbitMQProcessLaunchConsumer)
  *   Extend the API Server to listen for requests for job data (requires 
identification of which job, which data from the job, in addition to 
verification that the requester should be allowed to perform this operation)
  *   Extend the API Server to send requested job data back to the gateway and 
user that issued the request.
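
To make the request-handling items above concrete, here is a rough sketch of 
the checks involved, written in Python only to keep the illustration short 
(the real API Server is Java/Thrift). The JobDataRequest fields and the two 
lookup tables are hypothetical; the actual data would come from the extended 
JobModel and from Airavata's own authorization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobDataRequest:
    """Hypothetical request for data from a running job."""
    job_id: str      # which job
    stream: str      # which data from the job
    requester: str   # user issuing the request
    gateway_id: str  # gateway to send the data back to


def validate_request(request, job_streams, job_owners):
    """Return (ok, reason) for a job-data request.

    job_streams maps job id -> set of valid data stream names.
    job_owners maps job id -> set of users allowed to read that job's data.
    Both tables are stand-ins for whatever the server actually stores.
    """
    if request.job_id not in job_streams:
        return False, "unknown job"
    if request.stream not in job_streams[request.job_id]:
        return False, "unknown stream"
    if request.requester not in job_owners.get(request.job_id, set()):
        return False, "not authorized"
    return True, "ok"
```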

Within Airavata's XBaya:

  *   Create default services for the data collection and event 
monitoring/handling aspects of the project that can be added into the workflow 
composer.
  *   Create a service that accepts custom data collection scripts to send 
along with the job.

Within the PGA:

  *   Add blades and controllers that allow users to view requested data from a 
job.
  *   Extend the experiment-related app functionality to allow users to 
retrieve data from a running job through the gateway.

I saw that there are some existing but empty or commented-out classes 
(org.apache.airavata.model.util.ComputeResourceUtil, 
org.apache.airavata.monitoring.Main) that suggest there was earlier work toward 
functionality similar to what I am suggesting. Searching JIRA for these didn't 
turn up any information, and I'm curious why those plans were abandoned, or 
whether they were even related to my project.

I'd appreciate any comments on the above!

Best,

Jeff K.

On Wed, Apr 27, 2016 at 4:43 PM, Pierce, Marlon <[email protected]> wrote:
RabbitMQ has first-class support for Python, so that should not be a problem. 
Suresh already included the link. Suresh covered most of the bases already, so 
I’ll just reiterate that Airavata’s use of AMQP/RabbitMQ and Thrift should make 
it programming-language independent. You can see how well this holds up in 
reality.

Marlon

From: Jeffery Kinnison <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, April 27, 2016 at 4:35 PM
To: "[email protected]" <[email protected]>
Subject: Re: Planning for In-Situ Application and Resource Monitoring [GSoC 
Project]

Thanks Suresh,

I was hoping that I could stick with Python for the meat of the project, not 
just because it's the language I'm most comfortable with, but also thanks to 
the fact that it's fairly ubiquitous on HPC systems.

I'll take a look at either interfacing the POC with RabbitMQ or converting it 
entirely to their Python bindings. If anyone has any alternative suggestions, 
they would be much appreciated.

Jeff K.

On Wed, Apr 27, 2016 at 4:20 PM, Suresh Marru <[email protected]> wrote:
Hi Jeff,

On Apr 27, 2016, at 4:08 PM, Jeffery Kinnison <[email protected]> wrote:

Hi Dev Team,

I'd like to develop a plan for implementing my GSoC project in conjunction with 
getting my development environment up and running. This is my first substantial 
experience with Open Source software development on this scale, so thank you in 
advance for bearing with me.

You did great during the proposal (hence you have a project), so just continue 
the same. At worst you will hear a lot of RTFM, which is a common encounter in 
open source. I will let you Google it.

The full project proposal can be found at 
https://cwiki.apache.org/confluence/display/AIRAVATA/GSoC+Proposal+-+In+Situ+Simulation+Analysis+Using+Airavata

The idea is to allow Airavata users to look behind the curtain at jobs they are 
running and enable automatic response to conditions encountered as jobs run, 
both at the system and application level. This will likely require a 
lightweight server to run alongside each job, which will communicate with the 
Airavata server.

I have a prototype for the lightweight server 
(https://github.com/jeffkinnison/simstream) written in Python, however I know 
that Apache software is typically Java-based. The question here is one of 
whether or not the prototype can be rolled into Airavata, or if I need to begin 
looking into Java-based solutions.

No, you do not need to port your simstream to Java. In fact, since your 
application daemon will need to run on HPC compute nodes, Java will not be a 
good fit there. I think you should stick to Python. For the communication with 
Airavata, one suggestion is to send an AMQP message that Airavata listens for. 
You can follow this tutorial as a start - 
https://www.rabbitmq.com/tutorials/tutorial-one-python.html. Others may have 
different suggestions.
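
A minimal sketch of that suggestion, following the pattern in the linked 
tutorial with the pika client (a third-party package, assumed installed and 
pointed at a local RabbitMQ broker). The queue name and the message fields are 
assumptions for illustration and are not an Airavata message format.

```python
import json

QUEUE_NAME = "simstream_demo"  # hypothetical queue name


def open_channel(host="localhost"):
    """Connect to a RabbitMQ broker and declare the demo queue."""
    import pika  # third-party; imported here so the rest runs without it

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    channel = connection.channel()
    channel.queue_declare(queue=QUEUE_NAME)
    return connection, channel


def publish_data_point(channel, job_id, stream, value):
    """Serialize one data point as JSON and publish it to the queue.

    Uses the default exchange, so routing_key is simply the queue name.
    """
    body = json.dumps(
        {"job_id": job_id, "stream": stream, "value": value}, sort_keys=True
    )
    channel.basic_publish(exchange="", routing_key=QUEUE_NAME, body=body)
    return body


# Usage against a running broker (not executed here):
#   connection, channel = open_channel()
#   publish_data_point(channel, "job-1", "cpu", 0.5)
#   connection.close()
```

Passing the channel in as an argument keeps the publishing logic independent 
of how the connection is made, which also makes it easy to exercise without a 
broker.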

The other initial question I have is one of how the Airavata server submits 
jobs. From what I can tell, Airavata sends batch scripts to connected computing 
resources, and my thinking right now about how to deploy the lightweight server 
is to add its startup logic to the submit scripts. Is this the correct thinking?

Yes, that's exactly right. As you might see from other discussions, the 
cloud-based submissions might not have a batch script, but it's fair to assume 
your server will be launched one way or another.
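
As a rough illustration of adding startup logic to a submit script, the sketch 
below inserts one line after the scheduler directives of a batch script. The 
directive prefixes (#SBATCH, #PBS) are common scheduler conventions, and the 
startup command in the usage note is hypothetical; this is not how Airavata 
actually generates its scripts.

```python
def add_startup(script_text, startup_line,
                directive_prefixes=("#SBATCH", "#PBS")):
    """Insert startup_line after the shebang and scheduler directives.

    Directives must stay at the top of the script, so the new line goes
    directly below the last one found and above the application launch.
    """
    lines = script_text.splitlines()
    insert_at = 0
    for index, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#!") or stripped.startswith(directive_prefixes):
            insert_at = index + 1
    lines.insert(insert_at, startup_line)
    return "\n".join(lines) + "\n"
```

For example, add_startup(script, "python -m simstream --config job.json &") 
would start the monitor in the background before the job's own commands run; 
the module invocation and flag shown here are made up for the example.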

Again, thank you for answering these questions, and I'm looking forward to 
working with everyone this summer.

Keep them coming.

Suresh

Best,
Jeff K.
