Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-11 Thread Pamidighantam, Sudhakar V
We cannot expect users or applications to change their behavior for Airavata. It
is up to us to enable applications and users as they are now.
As you have seen, several applications have system input parameters inside a
master input file; these are used by the application and
must also be used in scheduling. As I was suggesting, the memory requested for
scheduling should be higher than what the application expects.
Similarly, the time requested for scheduling should be higher than what is given in the
input, to accommodate cleanup and other post-processing.
Some schedulers allow soft and hard limits (admins may or may not enable them),
and we can think of these pairs of system parameters as
soft and hard memory and time limits.
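
To make the soft/hard pairing concrete, here is a minimal sketch (in Python, for brevity) of padding the scheduler request above what the application's input file asks for. The names Limits and scheduler_request, and the default margins (10% extra memory, 10 extra minutes), are assumptions for illustration only, not Airavata settings.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    memory_mb: int     # memory in megabytes
    walltime_min: int  # wall-clock time in minutes


def scheduler_request(app: Limits, mem_margin_pct: int = 10,
                      cleanup_min: int = 10) -> Limits:
    """Pad the application's (soft) limits to get the scheduler's (hard) limits.

    mem_margin_pct and cleanup_min are hypothetical defaults.
    """
    # Ceiling division in integer arithmetic, so the padding is never rounded down.
    extra_mb = -(-app.memory_mb * mem_margin_pct // 100)
    return Limits(
        memory_mb=app.memory_mb + extra_mb,
        walltime_min=app.walltime_min + cleanup_min,
    )


if __name__ == "__main__":
    # Values the application reads from its own input file (soft limits).
    print(scheduler_request(Limits(memory_mb=4096, walltime_min=120)))
```

The application keeps using the values from its own input file; only the request sent to the scheduler is padded.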

Thanks,
Sudhakar.


On Dec 11, 2014, at 2:02 AM, Shameera Rathnayaka <shameerai...@gmail.com> wrote:

Hi Amila,

According to my understanding, what this handle does is, read the user given 
configuration at run time. I have no idea this will effect to qsub or aprun or 
other parameter. It is better if someone explain it to me too.

We already have a way to provide the these configuration parameters with the 
experiment itself by defining ComputeResourceScheduling. But there are some use 
cases like gaussian, where gaussian users provide these configurations with the 
input file it self.  IMO here we have two options, either we can  ask gaussian 
users to adopt to the airavata way but still those configuration in input file 
is required for gaussian application(I guess , correct me if I am wrong here) 
or use airavata extension points to support this scenario. Here the handler 
address the second option.

Thanks,
Shameera.

On Thu, Dec 11, 2014 at 12:00 AM, Amila Jayasekara <thejaka.am...@gmail.com> wrote:
Also, regarding the handler that Shameera is working on ...
I guess that handler is going to change mainly "qsub" parameters or "aprun" 
parameters (Correct me if i am wrong). I think it would be more useful to write 
a handler which changes any parameter in "qsub", "aprun" or "mpiexec".

In implementation wise I would imagine there is an abstract handler with 
concrete implementation for each job scheduling command.

Thanks
-Amila

On Wed, Dec 10, 2014 at 9:17 AM, Marlon Pierce <marpi...@iu.edu> wrote:
+1 for more generalization.

We are collecting more raw material for chemistry application use cases at 
https://cwiki.apache.org/confluence/display/AIRAVATA/Use+Cases. We'll review 
them (and bio apps that we also collected previously) in a wiki document to see 
if our API mappings are correct.

Preliminarily, we see the command line arguments don't contain the full list of 
input and output files.  Additional required inputs may be passed via control 
files, environment variables, etc.  Examples include data libraries for basis 
functions, names of checkpoint files, names of output files, and so forth.  So 
we need a way to say the application may take 4 inputs, but only 1 is needed to 
construct a valid command line, for example.

On the other hand, I don't think we need the InputMetadataType that Chathuri 
introduces below. This overlaps with what is already in the compute resource 
description fields.


Marlon


On 12/8/14, 10:17 PM, Amila Jayasekara wrote:
Hi Chathuri,

I do not know anything about Gaussian. So its kind of hard for me to
understand what exactly is the meaning of the structures you introduced and
why you exactly need those structures.

A more important question is how to come up with a more abstract and
generic thrift IDLS so that you dont need to change it every time we add a
new application. Going through many example applications is certainly a
good way to understand broad requirements and helps to abstract out many
features.

Thanks
-Thejaka

On Mon, Dec 8, 2014 at 10:22 AM, Chathuri Wimalasena <kamalas...@gmail.com> wrote:

Hi Devs,

We are trying to add Gaussian application using airavata-appcatalog. While
doing that, we face some limitations of the current design.

In Gaussian there are several input files, some input files should used
when the job run command is generated, but some does not.  Those which are
not involved with job run command also need to be staged to working
directory. Such flags are not supported in current design.

Another interesting feature that in Gaussian is, in input file, we can
specify the values for memory, cpu like options. If input file includes
those parameters, we need to give priority to those values instead of the
values specified in the request.

To support these features, we need to slightly modify our thrift IDLS,
specially to InputDataObjectType struct.

Current struct is below.

struct InputDataObjectType {
 1: required string name,
 2: optional string value,
 3: optional DataType type,
 4: optional string applicationArgument,
 5: optional bool standardInput = 0,
 6: optional string userFriendlyDescription,
 7: optional string metaData
}

In order to support 1st

Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-11 Thread Shameera Rathnayaka
Hi Amila,

As I understand it, this handler reads the user-given configuration at run
time. I have no idea how this will affect qsub, aprun, or other parameters.
It would be better if someone explained that to me too.

We already have a way to provide these configuration parameters with
the experiment itself, by defining ComputeResourceScheduling. But there are
some use cases, like Gaussian, where users provide these
configurations in the input file itself. IMO we have two options here:
either we ask Gaussian users to adopt the Airavata way, although
the configuration in the input file is still required by the Gaussian application (I
guess; correct me if I am wrong here), or we use Airavata extension points to
support this scenario. The handler addresses the second option.
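
As a rough illustration of the second option, the sketch below (Python, for brevity) lets values found in the input file take priority over the scheduling values sent with the experiment. The plain dicts and the keys cpu_count, memory_mb, and walltime_min are assumptions standing in for ComputeResourceScheduling; they are not its real field names.

```python
from typing import Dict, Optional


def apply_input_overrides(scheduling: Dict[str, int],
                          from_input: Dict[str, Optional[int]]) -> Dict[str, int]:
    """Return a copy of `scheduling` where values found in the input file win.

    A value of None means the input file did not specify that setting.
    """
    merged = dict(scheduling)  # copy, so the original request is left untouched
    for key, value in from_input.items():
        if value is not None:
            merged[key] = value
    return merged


if __name__ == "__main__":
    request = {"cpu_count": 4, "memory_mb": 2048, "walltime_min": 60}
    parsed = {"cpu_count": 8, "memory_mb": None}  # input file sets only the CPU count
    print(apply_input_overrides(request, parsed))
```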

Thanks,
Shameera.

On Thu, Dec 11, 2014 at 12:00 AM, Amila Jayasekara 
wrote:

> Also, regarding the handler that Shameera is working on ...
> I guess that handler is going to change mainly "qsub" parameters or
> "aprun" parameters (Correct me if i am wrong). I think it would be more
> useful to write a handler which changes any parameter in "qsub", "aprun" or
> "mpiexec".
>
> In implementation wise I would imagine there is an abstract handler with
> concrete implementation for each job scheduling command.
>
> Thanks
> -Amila
>
> On Wed, Dec 10, 2014 at 9:17 AM, Marlon Pierce  wrote:
>
>> +1 for more generalization.
>>
>> We are collecting more raw material for chemistry application use cases
>> at https://cwiki.apache.org/confluence/display/AIRAVATA/Use+Cases. We'll
>> review them (and bio apps that we also collected previously) in a wiki
>> document to see if our API mappings are correct.
>>
>> Preliminarily, we see the command line arguments don't contain the full
>> list of input and output files.  Additional required inputs may be passed
>> via control files, environment variables, etc.  Examples include data
>> libraries for basis functions, names of checkpoint files, names of output
>> files, and so forth.  So we need a way to say the application may take 4
>> inputs, but only 1 is needed to construct a valid command line, for example.
>>
>> On the other hand, I don't think we need the InputMetadataType that
>> Chathuri introduces below. This overlaps with what is already in the
>> compute resource description fields.
>>
>>
>> Marlon
>>
>>
>> On 12/8/14, 10:17 PM, Amila Jayasekara wrote:
>>
>>> Hi Chathuri,
>>>
>>> I do not know anything about Gaussian. So its kind of hard for me to
>>> understand what exactly is the meaning of the structures you introduced
>>> and
>>> why you exactly need those structures.
>>>
>>> A more important question is how to come up with a more abstract and
>>> generic thrift IDLS so that you dont need to change it every time we add
>>> a
>>> new application. Going through many example applications is certainly a
>>> good way to understand broad requirements and helps to abstract out many
>>> features.
>>>
>>> Thanks
>>> -Thejaka
>>>
>>> On Mon, Dec 8, 2014 at 10:22 AM, Chathuri Wimalasena <
>>> kamalas...@gmail.com>
>>> wrote:
>>>
>>>  Hi Devs,

 We are trying to add Gaussian application using airavata-appcatalog.
 While
 doing that, we face some limitations of the current design.

 In Gaussian there are several input files, some input files should used
 when the job run command is generated, but some does not.  Those which
 are
 not involved with job run command also need to be staged to working
 directory. Such flags are not supported in current design.

 Another interesting feature that in Gaussian is, in input file, we can
 specify the values for memory, cpu like options. If input file includes
 those parameters, we need to give priority to those values instead of
 the
 values specified in the request.

 To support these features, we need to slightly modify our thrift IDLS,
 specially to InputDataObjectType struct.

 Current struct is below.

 struct InputDataObjectType {
  1: required string name,
  2: optional string value,
  3: optional DataType type,
  4: optional string applicationArgument,
  5: optional bool standardInput = 0,
  6: optional string userFriendlyDescription,
  7: optional string metaData
 }

 In order to support 1st requirement, we introduce 2 enums.

 enum InputValidityType{
 REQUIRED,
 OPTIONAL
 }

 enum CommandLineType{
 INCLUSIVE,
 EXCLUSIVE
 }

 Please excuse me for names. You are welcome to suggest better names.

 To support 2nd requirement, we change metaData field to a map with
 another
 enum where we define all the metadata types that can have.

 enum InputMetadataType {
  MEMORY,
  CPU
 }

 So the new InputDataObjectType would be as below.

 struct InputDataObjectType {
  1: r

Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-10 Thread Amila Jayasekara
Also, regarding the handler that Shameera is working on ...
I guess that handler is mainly going to change "qsub" or "aprun"
parameters (correct me if I am wrong). I think it would be more useful to
write a handler that can change any parameter in "qsub", "aprun", or
"mpiexec".

Implementation-wise, I would imagine an abstract handler with a
concrete implementation for each job scheduling command.
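
A minimal sketch of that shape, written in Python for brevity (the actual GFAC handlers are a separate matter): the class names are hypothetical, and the flags shown ("-l nodes=1:ppn=N" for qsub, "-n N" for aprun and mpiexec) are just one common way to pass a process count, used here only as an illustration.

```python
from abc import ABC, abstractmethod
from typing import List


class JobCommandHandler(ABC):
    """Hypothetical base class: one subclass per job scheduling command."""

    command = ""

    @abstractmethod
    def process_count_args(self, count: int) -> List[str]:
        """Translate a process count into this command's own flags."""

    def build(self, count: int, target: str) -> List[str]:
        # The shared logic lives here; only the flag syntax differs per command.
        return [self.command] + self.process_count_args(count) + [target]


class QsubHandler(JobCommandHandler):
    command = "qsub"

    def process_count_args(self, count: int) -> List[str]:
        return ["-l", "nodes=1:ppn={}".format(count)]


class AprunHandler(JobCommandHandler):
    command = "aprun"

    def process_count_args(self, count: int) -> List[str]:
        return ["-n", str(count)]


class MpiexecHandler(JobCommandHandler):
    command = "mpiexec"

    def process_count_args(self, count: int) -> List[str]:
        return ["-n", str(count)]


if __name__ == "__main__":
    for handler in (QsubHandler(), AprunHandler(), MpiexecHandler()):
        print(" ".join(handler.build(8, "./app")))
```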

Thanks
-Amila

On Wed, Dec 10, 2014 at 9:17 AM, Marlon Pierce  wrote:

> +1 for more generalization.
>
> We are collecting more raw material for chemistry application use cases at
> https://cwiki.apache.org/confluence/display/AIRAVATA/Use+Cases. We'll
> review them (and bio apps that we also collected previously) in a wiki
> document to see if our API mappings are correct.
>
> Preliminarily, we see the command line arguments don't contain the full
> list of input and output files.  Additional required inputs may be passed
> via control files, environment variables, etc.  Examples include data
> libraries for basis functions, names of checkpoint files, names of output
> files, and so forth.  So we need a way to say the application may take 4
> inputs, but only 1 is needed to construct a valid command line, for example.
>
> On the other hand, I don't think we need the InputMetadataType that
> Chathuri introduces below. This overlaps with what is already in the
> compute resource description fields.
>
>
> Marlon
>
>
> On 12/8/14, 10:17 PM, Amila Jayasekara wrote:
>
>> Hi Chathuri,
>>
>> I do not know anything about Gaussian. So its kind of hard for me to
>> understand what exactly is the meaning of the structures you introduced
>> and
>> why you exactly need those structures.
>>
>> A more important question is how to come up with a more abstract and
>> generic thrift IDLS so that you dont need to change it every time we add a
>> new application. Going through many example applications is certainly a
>> good way to understand broad requirements and helps to abstract out many
>> features.
>>
>> Thanks
>> -Thejaka
>>
>> On Mon, Dec 8, 2014 at 10:22 AM, Chathuri Wimalasena <
>> kamalas...@gmail.com>
>> wrote:
>>
>>  Hi Devs,
>>>
>>> We are trying to add Gaussian application using airavata-appcatalog.
>>> While
>>> doing that, we face some limitations of the current design.
>>>
>>> In Gaussian there are several input files, some input files should used
>>> when the job run command is generated, but some does not.  Those which
>>> are
>>> not involved with job run command also need to be staged to working
>>> directory. Such flags are not supported in current design.
>>>
>>> Another interesting feature that in Gaussian is, in input file, we can
>>> specify the values for memory, cpu like options. If input file includes
>>> those parameters, we need to give priority to those values instead of the
>>> values specified in the request.
>>>
>>> To support these features, we need to slightly modify our thrift IDLS,
>>> specially to InputDataObjectType struct.
>>>
>>> Current struct is below.
>>>
>>> struct InputDataObjectType {
>>>  1: required string name,
>>>  2: optional string value,
>>>  3: optional DataType type,
>>>  4: optional string applicationArgument,
>>>  5: optional bool standardInput = 0,
>>>  6: optional string userFriendlyDescription,
>>>  7: optional string metaData
>>> }
>>>
>>> In order to support 1st requirement, we introduce 2 enums.
>>>
>>> enum InputValidityType{
>>> REQUIRED,
>>> OPTIONAL
>>> }
>>>
>>> enum CommandLineType{
>>> INCLUSIVE,
>>> EXCLUSIVE
>>> }
>>>
>>> Please excuse me for names. You are welcome to suggest better names.
>>>
>>> To support 2nd requirement, we change metaData field to a map with
>>> another
>>> enum where we define all the metadata types that can have.
>>>
>>> enum InputMetadataType {
>>>  MEMORY,
>>>  CPU
>>> }
>>>
>>> So the new InputDataObjectType would be as below.
>>>
>>> struct InputDataObjectType {
>>>  1: required string name,
>>>  2: optional string value,
>>>  3: optional DataType type,
>>>  4: optional string applicationArgument,
>>>  5: optional bool standardInput = 0,
>>>  6: optional string userFriendlyDescription,
>>>  7: optional map metaData,
>>>  8: optional InputValidityType inputValid;
>>>  9: optional CommandLineType addedToCommandLine;
>>>  10: optional bool dataStaged = 0;
>>> }
>>>
>>> Suggestions are welcome.
>>>
>>> Thanks,
>>> Chathuri
>>>
>>>
>>>
>


Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-10 Thread Marlon Pierce

+1 for more generalization.

We are collecting more raw material for chemistry application use cases 
at https://cwiki.apache.org/confluence/display/AIRAVATA/Use+Cases. We'll 
review them (and bio apps that we also collected previously) in a wiki 
document to see if our API mappings are correct.


Preliminarily, we see the command line arguments don't contain the full 
list of input and output files.  Additional required inputs may be 
passed via control files, environment variables, etc.  Examples include 
data libraries for basis functions, names of checkpoint files, names of 
output files, and so forth.  So we need a way to say the application may 
take 4 inputs, but only 1 is needed to construct a valid command line, 
for example.
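
To illustrate the "4 inputs, but only 1 on the command line" case, here is a small sketch in Python. The AppInput class, its on_command_line flag, and the file names are all made up for the example; they are not part of the current data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AppInput:
    name: str
    value: str
    on_command_line: bool  # hypothetical flag: does this input appear in the command?


def command_line(executable: str, inputs: List[AppInput]) -> List[str]:
    """Build the command from only those inputs flagged for the command line."""
    return [executable] + [i.value for i in inputs if i.on_command_line]


def files_to_stage(inputs: List[AppInput]) -> List[str]:
    """Every input is staged to the working directory, flagged or not."""
    return [i.value for i in inputs]


if __name__ == "__main__":
    inputs = [
        AppInput("main_input", "molecule.inp", True),
        AppInput("basis_library", "basis.lib", False),
        AppInput("checkpoint", "restart.chk", False),
        AppInput("control_file", "settings.ctl", False),
    ]
    print(command_line("app", inputs))  # only the main input
    print(files_to_stage(inputs))       # all four files
```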


On the other hand, I don't think we need the InputMetadataType that 
Chathuri introduces below. This overlaps with what is already in the 
compute resource description fields.



Marlon

On 12/8/14, 10:17 PM, Amila Jayasekara wrote:

Hi Chathuri,

I do not know anything about Gaussian. So its kind of hard for me to
understand what exactly is the meaning of the structures you introduced and
why you exactly need those structures.

A more important question is how to come up with a more abstract and
generic thrift IDLS so that you dont need to change it every time we add a
new application. Going through many example applications is certainly a
good way to understand broad requirements and helps to abstract out many
features.

Thanks
-Thejaka

On Mon, Dec 8, 2014 at 10:22 AM, Chathuri Wimalasena 
wrote:


Hi Devs,

We are trying to add Gaussian application using airavata-appcatalog. While
doing that, we face some limitations of the current design.

In Gaussian there are several input files, some input files should used
when the job run command is generated, but some does not.  Those which are
not involved with job run command also need to be staged to working
directory. Such flags are not supported in current design.

Another interesting feature that in Gaussian is, in input file, we can
specify the values for memory, cpu like options. If input file includes
those parameters, we need to give priority to those values instead of the
values specified in the request.

To support these features, we need to slightly modify our thrift IDLS,
specially to InputDataObjectType struct.

Current struct is below.

struct InputDataObjectType {
 1: required string name,
 2: optional string value,
 3: optional DataType type,
 4: optional string applicationArgument,
 5: optional bool standardInput = 0,
 6: optional string userFriendlyDescription,
 7: optional string metaData
}

In order to support 1st requirement, we introduce 2 enums.

enum InputValidityType{
REQUIRED,
OPTIONAL
}

enum CommandLineType{
INCLUSIVE,
EXCLUSIVE
}

Please excuse me for names. You are welcome to suggest better names.

To support 2nd requirement, we change metaData field to a map with another
enum where we define all the metadata types that can have.

enum InputMetadataType {
 MEMORY,
 CPU
}

So the new InputDataObjectType would be as below.

struct InputDataObjectType {
 1: required string name,
 2: optional string value,
 3: optional DataType type,
 4: optional string applicationArgument,
 5: optional bool standardInput = 0,
 6: optional string userFriendlyDescription,
 7: optional map metaData,
 8: optional InputValidityType inputValid;
 9: optional CommandLineType addedToCommandLine;
 10: optional bool dataStaged = 0;
}

Suggestions are welcome.

Thanks,
Chathuri






Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-10 Thread Marlon Pierce
We have several use cases [1] of codes (not just Gaussian) that take a 
single input file, but this input file may modify or override the memory 
and CPU requirements that may be specified in other parts of the API 
call.  It may also specify the names and locations of other input and 
output files (such as checkpoint files).   These input files follow the 
application's input format specification; they aren't resource-supplied 
helper scripts (that is a different consideration). The application 
depends on the information in these input files, so if the PBS/SLURM/etc 
script specifies incompatible values, the code will crash.


So we have to map this use case to the API and implementation in order
to generate correct job execution scripts.   It is hard to capture in
the API directly, but we can use a piece of code that handles a specific
application's input file format, inspects the user-provided files, and
modifies the experiment data as appropriate (changing the memory, names
and numbers of inputs and outputs, etc). These small pieces of code
need to be application-specific and live in a pluggable location.
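
One possible shape for such plugins, sketched in Python: a registry maps an application name to an inspector function that reads the user-provided input and returns updated experiment data. Everything here is hypothetical, including the registry, the "toyapp" application, and its "key = value" input format.

```python
from typing import Callable, Dict

Inspector = Callable[[str, dict], dict]

# Hypothetical registry: application name -> input-file inspector.
_INSPECTORS: Dict[str, Inspector] = {}


def inspector(app_name: str):
    """Decorator that registers an application-specific inspector."""
    def register(func: Inspector) -> Inspector:
        _INSPECTORS[app_name] = func
        return func
    return register


@inspector("toyapp")
def inspect_toyapp(input_text: str, experiment: dict) -> dict:
    # "toyapp" is a made-up application whose input holds "key = value" lines.
    updated = dict(experiment)
    for line in input_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "memory_mb":
            updated["memory_mb"] = int(value.strip())
    return updated


def inspect_inputs(app_name: str, input_text: str, experiment: dict) -> dict:
    """Run the registered inspector, or return the data unchanged if none exists."""
    func = _INSPECTORS.get(app_name)
    return func(input_text, experiment) if func else dict(experiment)


if __name__ == "__main__":
    text = "title = demo\nmemory_mb = 512\n"
    print(inspect_inputs("toyapp", text, {"memory_mb": 256, "cpu_count": 2}))
```

The same lookup could be called from either candidate place (orchestrator validation or a GFAC handler), since it only depends on the application name and the input text.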


Two candidate places for this to happen (there may be others) are a) in
the validation step in the orchestrator, and b) in a GFAC handler.
I don't have a strong argument for one of these over the other. Other
recommendations?


Marlon

[1] https://cwiki.apache.org/confluence/display/AIRAVATA/Use+Cases

On 12/9/14, 11:03 AM, Shameera Rathnayaka wrote:

Hi Suresh,

Gaussian input file can provide environment requirement with the main input
file. For an example, %nprocshared which has the user defined process count
for that experiment and %mem  which provide the memory it require. There
are lots of command that user can provide with input file. In gaussian
input handler we need to parse this config lines and read the values set to
JobExecutionContext. This config template is specific to the Gaussian
application. Another application may have another set of configurations.
So this handler will be specific to the gaussian.

Thanks,
Shameera.

On Tue, Dec 9, 2014 at 8:18 AM, Suresh Marru  wrote:


Hi Shameera,

Can you please describe what this gaussian specific handler supposed to
do? Anything more beyond reading or editing the input file?

Suresh


On 09-Dec-2014, at 1:26 am, Shameera Rathnayaka 
wrote:

Hi All,

I am writing a new handler which is gaussian specific. I checked for a
location to put this handler code in the airavata main source code , but it
seems all handlers we have in airavata is bundle with particular provider.
Hence I was thinking to create a new project to put this code. But after
having offline chat with Marlon, decided to put this to the airavata main
source code because other developers also can works with this gaussian
handlers. So i am going to create a new module under gfac, named
"gfac-application-specific-handlers" (if you have any good suggestions
please reply) to keep all application specific handlers. When we fully
integrated gridchem applications we may end up few more application
specific handlers and those will go under this new module. WDYT?

Thanks,
Shameera.

On Mon, Dec 8, 2014 at 12:20 PM, Marlon Pierce  wrote:


That would be great. Please upload them to the Wiki.

Marlon


On 12/8/14, 11:59 AM, Pamidighantam, Sudhakar V wrote:


I would suggest that we look at several quantum chemistry applications
which have slight variations on the theme.  We have NWChem, Gamess, and
Molpro
examples to look at. I can send some input files and/or have a session
to go over the relevant sections. We can do this later today.

Thanks,
Sudhakar.


On Dec 8, 2014, at 10:23 AM, Marlon Pierce  wrote:

The more examples, the better.  I'd like to find the right balance
between understanding the problem space and making incremental progress.

Marlon

On 12/8/14, 10:38 AM, Pamidighantam, Sudhakar V wrote:


Chaturi:
Thanks for these suggestions. One question I have is whether we should
look at some of the input files in the set of applications currently under
testing to come up with these requirements.
There may be additional requirements in some of the inputs. Of course
we can incrementally update the data structures as well as we test these
applications in more depth. But I feel some significant number of
application cases should be accommodated with each update. We may target
these for rc 0.15 and depending on the time available  we can look at at
least few more applications.

Comments?

Thanks,
Sudhakar.
On Dec 8, 2014, at 9:22 AM, Chathuri Wimalasena <kamalas...@gmail.com> wrote:

Hi Devs,

We are trying to add Gaussian application using airavata-appcatalog.
While doing that, we face some limitations of the current design.

In Gaussian there are several input files, some input files should
used when the job run command is generated, but some does not.  Those which
are not involved with job run command also need to be staged to working
directory. Such flags are not supporte

Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-09 Thread Shameera Rathnayaka
Hi Suresh,

A Gaussian input file can carry environment requirements inside the main input
file. For example, %nprocshared gives the user-defined process count
for that experiment, and %mem gives the memory it requires. There
are many such commands that a user can provide in the input file. In the Gaussian
input handler we need to parse these config lines and set the values read on the
JobExecutionContext. This config template is specific to the Gaussian
application. Another application may have a different set of configurations,
so this handler will be specific to Gaussian.
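
As a rough sketch of the parsing step (in Python for brevity; it does not touch JobExecutionContext), the function below collects "%key=value" lines such as %nprocshared and %mem from an input file. The function name and the sample input are made up for illustration, and the values are returned as raw strings rather than converted, since unit handling for %mem is left out here.

```python
import re
from typing import Dict

# Matches lines of the form "%key=value", e.g. "%mem=4GB".
_PERCENT_LINE = re.compile(r"^\s*%(\w+)\s*=\s*(\S+)\s*$")


def parse_percent_settings(text: str) -> Dict[str, str]:
    """Collect every "%key=value" line, with keys lowercased and values kept raw."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        match = _PERCENT_LINE.match(line)
        if match:
            settings[match.group(1).lower()] = match.group(2)
    return settings


SAMPLE = """%NProcShared=8
%mem=4GB
#p b3lyp/6-31g(d) opt

water optimization

0 1
"""

if __name__ == "__main__":
    found = parse_percent_settings(SAMPLE)
    print(found)
    print(int(found["nprocshared"]))  # process count as an integer
```

A handler could then copy the parsed values into whatever structure the job submission step reads.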

Thanks,
Shameera.

On Tue, Dec 9, 2014 at 8:18 AM, Suresh Marru  wrote:

> Hi Shameera,
>
> Can you please describe what this gaussian specific handler supposed to
> do? Anything more beyond reading or editing the input file?
>
> Suresh
>
>
> On 09-Dec-2014, at 1:26 am, Shameera Rathnayaka 
> wrote:
>
> Hi All,
>
> I am writing a new handler which is gaussian specific. I checked for a
> location to put this handler code in the airavata main source code , but it
> seems all handlers we have in airavata is bundle with particular provider.
> Hence I was thinking to create a new project to put this code. But after
> having offline chat with Marlon, decided to put this to the airavata main
> source code because other developers also can works with this gaussian
> handlers. So i am going to create a new module under gfac, named
> "gfac-application-specific-handlers" (if you have any good suggestions
> please reply) to keep all application specific handlers. When we fully
> integrated gridchem applications we may end up few more application
> specific handlers and those will go under this new module. WDYT?
>
> Thanks,
> Shameera.
>
> On Mon, Dec 8, 2014 at 12:20 PM, Marlon Pierce  wrote:
>
>> That would be great. Please upload them to the Wiki.
>>
>> Marlon
>>
>>
>> On 12/8/14, 11:59 AM, Pamidighantam, Sudhakar V wrote:
>>
>>> I would suggest that we look at several quantum chemistry applications
>>> which have slight variations on the theme.  We have NWChem, Gamess, and
>>> Molpro
>>> examples to look at. I can send some input files and/or have a session
>>> to go over the relevant sections. We can do this later today.
>>>
>>> Thanks,
>>> Sudhakar.
>>>
>>>
>>> On Dec 8, 2014, at 10:23 AM, Marlon Pierce  wrote:
>>>
>>>  The more examples, the better.  I'd like to find the right balance
 between understanding the problem space and making incremental progress.

 Marlon

 On 12/8/14, 10:38 AM, Pamidighantam, Sudhakar V wrote:

> Chaturi:
> Thanks for these suggestions. One question I have is whether we should
> look at some of the input files in the set of applications currently under
> testing to come up with these requirements.
> There may be additional requirements in some of the inputs. Of course
> we can incrementally update the data structures as well as we test these
> applications in more depth. But I feel some significant number of
> application cases should be accommodated with each update. We may target
> these for rc 0.15 and depending on the time available  we can look at at
> least few more applications.
>
> Comments?
>
> Thanks,
> Sudhakar.
> On Dec 8, 2014, at 9:22 AM, Chathuri Wimalasena wrote:
>
> Hi Devs,
>
> We are trying to add Gaussian application using airavata-appcatalog.
> While doing that, we face some limitations of the current design.
>
> In Gaussian there are several input files, some input files should
> used when the job run command is generated, but some does not.  Those 
> which
> are not involved with job run command also need to be staged to working
> directory. Such flags are not supported in current design.
>
> Another interesting feature that in Gaussian is, in input file, we can
> specify the values for memory, cpu like options. If input file includes
> those parameters, we need to give priority to those values instead of the
> values specified in the request.
>
> To support these features, we need to slightly modify our thrift IDLS,
> specially to InputDataObjectType struct.
>
> Current struct is below.
>
> struct InputDataObjectType {
>  1: required string name,
>  2: optional string value,
>  3: optional DataType type,
>  4: optional string applicationArgument,
>  5: optional bool standardInput = 0,
>  6: optional string userFriendlyDescription,
>  7: optional string metaData
> }
>
> In order to support 1st requirement, we introduce 2 enums.
>
> enum InputValidityType{
> REQUIRED,
> OPTIONAL
> }
>
> enum CommandLineType{
> INCLUSIVE,
> EXCLUSIVE
> }
>
> Please excuse me for names. You are welcome to suggest better names.
>
> To support 2nd requirement, we change metaData fie

Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-09 Thread Suresh Marru
Hi Shameera,

Can you please describe what this Gaussian-specific handler is supposed to do?
Anything beyond reading or editing the input file?

Suresh


> On 09-Dec-2014, at 1:26 am, Shameera Rathnayaka  wrote:
> 
> Hi All, 
> 
> I am writing a new handler which is gaussian specific. I checked for a 
> location to put this handler code in the airavata main source code , but it 
> seems all handlers we have in airavata is bundle with particular provider. 
> Hence I was thinking to create a new project to put this code. But after 
> having offline chat with Marlon, decided to put this to the airavata main 
> source code because other developers also can works with this gaussian 
> handlers. So i am going to create a new module under gfac, named 
> "gfac-application-specific-handlers" (if you have any good suggestions please 
> reply) to keep all application specific handlers. When we fully integrated 
> gridchem applications we may end up few more application specific handlers 
> and those will go under this new module. WDYT?
> 
> Thanks, 
> Shameera.
> 
> On Mon, Dec 8, 2014 at 12:20 PM, Marlon Pierce wrote:
> That would be great. Please upload them to the Wiki.
> 
> Marlon
> 
> 
> On 12/8/14, 11:59 AM, Pamidighantam, Sudhakar V wrote:
> I would suggest that we look at several quantum chemistry applications which 
> have slight variations on the theme.  We have NWChem, Gamess, and Molpro
> examples to look at. I can send some input files and/or have a session to go 
> over the relevant sections. We can do this later today.
> 
> Thanks,
> Sudhakar.
> 
> 
> On Dec 8, 2014, at 10:23 AM, Marlon Pierce wrote:
> 
> The more examples, the better.  I'd like to find the right balance between 
> understanding the problem space and making incremental progress.
> 
> Marlon
> 
> On 12/8/14, 10:38 AM, Pamidighantam, Sudhakar V wrote:
> Chaturi:
> Thanks for these suggestions. One question I have is whether we should look 
> at some of the input files in the set of applications currently under testing 
> to come up with these requirements.
> There may be additional requirements in some of the inputs. Of course we can 
> incrementally update the data structures as well as we test these 
> applications in more depth. But I feel some significant number of application 
> cases should be accommodated with each update. We may target these for rc 
> 0.15 and depending on the time available  we can look at at least few more 
> applications.
> 
> Comments?
> 
> Thanks,
> Sudhakar.
> On Dec 8, 2014, at 9:22 AM, Chathuri Wimalasena wrote:
> 
> Hi Devs,
> 
> We are trying to add Gaussian application using airavata-appcatalog. While 
> doing that, we face some limitations of the current design.
> 
> In Gaussian there are several input files, some input files should used when 
> the job run command is generated, but some does not.  Those which are not 
> involved with job run command also need to be staged to working directory. 
> Such flags are not supported in current design.
> 
> Another interesting feature that in Gaussian is, in input file, we can 
> specify the values for memory, cpu like options. If input file includes those 
> parameters, we need to give priority to those values instead of the values 
> specified in the request.
> 
> To support these features, we need to slightly modify our thrift IDLS, 
> specially to InputDataObjectType struct.
> 
> Current struct is below.
> 
> struct InputDataObjectType {
>  1: required string name,
>  2: optional string value,
>  3: optional DataType type,
>  4: optional string applicationArgument,
>  5: optional bool standardInput = 0,
>  6: optional string userFriendlyDescription,
>  7: optional string metaData
> }
> 
> In order to support 1st requirement, we introduce 2 enums.
> 
> enum InputValidityType{
> REQUIRED,
> OPTIONAL
> }
> 
> enum CommandLineType{
> INCLUSIVE,
> EXCLUSIVE
> }
> 
> Please excuse me for names. You are welcome to suggest better names.
> 
> To support 2nd requirement, we change metaData field to a map with another 
> enum where we define all the metadata types that can have.
> 
> enum InputMetadataType {
>  MEMORY,
>  CPU
> }
> 
> So the new InputDataObjectType would be as below.
> 
> struct InputDataObjectType {
>  1: required string name,
>  2: optional string value,
>  3: optional DataType type,
>  4: optional string applicationArgument,
>  5: optional bool standardInput = 0,
>  6: optional string userFriendlyDescription,
>  7: optional map metaData,
>  8: optional InputValidityType inputValid;
>  9: optional CommandLineType addedToCommandLine;
>  10: optional bool dataStaged = 0;
> }
> 
> Suggestions are welcome.
> 
> Thanks,
> Chathuri
> 
> 
> 
> 
> 



Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-08 Thread Amila Jayasekara
Hi Chathuri,

I do not know anything about Gaussian, so it's kind of hard for me to
understand exactly what the structures you introduced mean and why
exactly you need them.

A more important question is how to come up with a more abstract and
generic Thrift IDL so that you don't need to change it every time we add a
new application. Going through many example applications is certainly a
good way to understand the broad requirements, and it helps to abstract out
many features.

Thanks
-Thejaka



Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-08 Thread Raminder Singh
Thanks Chathuri, the new changes look good. I would recommend adding another
field for input validation. It can be used to validate user input based on text
comparison or a regex. We can extend the orchestrator validator to use the field
to validate inputs.
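
A minimal sketch of what such a field could look like, assuming it is appended to the proposed struct. The field name validationPattern and the field id 11 are hypothetical and not part of the proposal.

```thrift
struct InputDataObjectType {
  1: required string name,
  2: optional string value,
  // ... fields 3 to 10 as in the proposal quoted in this thread ...

  // Hypothetical: a literal text or regex that the orchestrator
  // validator could compare the input value against.
  11: optional string validationPattern
}
```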

I do not fully understand the use of InputMetadataType. Metadata normally
contains extra information about the data, a validation schema, and so on.
If we restrict InputMetadataType to an enum, it can become very specific to one
application, Gaussian in this case. I do not see the benefit of converting
metaData to an InputMetadataType map.
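
One possible alternative, sketched under assumptions and not proposed anywhere in this thread: key the map by plain strings instead of an enum, so that a new application can carry new metadata without an IDL change. The key names in the comment are illustrative only.

```thrift
struct InputDataObjectType {
  1: required string name,
  2: optional string value,
  // ... other fields unchanged ...

  // Assumption: free-form string keys (for example "memory" or "cpu")
  // rather than an application-specific enum.
  7: optional map<string, string> metaData
}
```

The trade-off is that string keys are not checked by the Thrift compiler, so handlers would need to agree on key names by convention.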

I would recommend similar changes to the output schema (see AIRAVATA-1544).

Thanks
Raminder




Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-08 Thread Shameera Rathnayaka
Hi All,

I am writing a new handler that is Gaussian-specific. I looked for a
place to put this handler code in the Airavata main source tree, but it
seems all the handlers we have in Airavata are bundled with a particular
provider. Hence I was thinking of creating a new project for this code. But
after an offline chat with Marlon, I decided to put it in the Airavata main
source tree so that other developers can also work with these Gaussian
handlers. So I am going to create a new module under gfac, named
"gfac-application-specific-handlers" (if you have a better suggestion,
please reply), to hold all application-specific handlers. Once we have fully
integrated the GridChem applications, we may end up with a few more
application-specific handlers, and those will go under this new module. WDYT?

Thanks,
Shameera.

On Mon, Dec 8, 2014 at 12:20 PM, Marlon Pierce  wrote:

> That would be great. Please upload them to the Wiki.
>
> Marlon
>
>


Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-08 Thread Marlon Pierce

That would be great. Please upload them to the Wiki.

Marlon

On 12/8/14, 11:59 AM, Pamidighantam, Sudhakar V wrote:

> I would suggest that we look at several quantum chemistry applications which
> have slight variations on the theme.  We have NWChem, Gamess, and Molpro
> examples to look at. I can send some input files and/or have a session to go
> over the relevant sections. We can do this later today.
>
> Thanks,
> Sudhakar.


Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-08 Thread Pamidighantam, Sudhakar V
I would suggest that we look at several quantum chemistry applications which 
have slight variations on the theme.  We have NWChem, Gamess, and Molpro 
examples to look at. I can send some input files and/or have a session to go 
over the relevant sections. We can do this later today. 

Thanks,
Sudhakar. 


On Dec 8, 2014, at 10:23 AM, Marlon Pierce  wrote:

> The more examples, the better.  I'd like to find the right balance between 
> understanding the problem space and making incremental progress.
> 
> Marlon
> 



Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-08 Thread Marlon Pierce
The more examples, the better.  I'd like to find the right balance 
between understanding the problem space and making incremental progress.


Marlon

On 12/8/14, 10:38 AM, Pamidighantam, Sudhakar V wrote:

> Chathuri:
> Thanks for these suggestions. One question I have is whether we should look
> at some of the input files in the set of applications currently under testing
> to come up with these requirements.
> There may be additional requirements in some of the inputs. Of course, we can
> also update the data structures incrementally as we test these applications
> in more depth. But I feel a significant number of application cases should be
> accommodated with each update. We may target these for rc 0.15 and, depending
> on the time available, we can look at at least a few more applications.
>
> Comments?
>
> Thanks,
> Sudhakar.



Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-08 Thread Pamidighantam, Sudhakar V
Chathuri:
Thanks for these suggestions. One question I have is whether we should look at
some of the input files in the set of applications currently under testing to
come up with these requirements.
There may be additional requirements in some of the inputs. Of course, we can
also update the data structures incrementally as we test these applications
in more depth. But I feel a significant number of application cases should
be accommodated with each update. We may target these for rc 0.15 and, depending
on the time available, we can look at at least a few more applications.

Comments?

Thanks,
Sudhakar.



Improvements to Experiment input data model in order to support Gaussian application

2014-12-08 Thread Chathuri Wimalasena
Hi Devs,

We are trying to add the Gaussian application using airavata-appcatalog. While
doing that, we have run into some limitations of the current design.

Gaussian takes several input files. Some of them should be used when the job
run command is generated, but others should not. Those that are not part of
the job run command still need to be staged to the working directory. Such
flags are not supported in the current design.

Another interesting feature of Gaussian is that the input file can specify
values for options such as memory and CPU. If the input file includes those
parameters, we need to give them priority over the values specified in the
request.

To support these features, we need to slightly modify our Thrift IDL,
specifically the InputDataObjectType struct.

The current struct is below.

struct InputDataObjectType {
1: required string name,
2: optional string value,
3: optional DataType type,
4: optional string applicationArgument,
5: optional bool standardInput = 0,
6: optional string userFriendlyDescription,
7: optional string metaData
}

To support the first requirement, we introduce two enums.

enum InputValidityType{
REQUIRED,
OPTIONAL
}

enum CommandLineType{
INCLUSIVE,
EXCLUSIVE
}

Please excuse the names; suggestions for better ones are welcome.

To support the second requirement, we change the metaData field to a map,
together with another enum in which we define all the metadata types that an
input can have.

enum InputMetadataType {
MEMORY,
CPU
}

So the new InputDataObjectType would be as below.

struct InputDataObjectType {
1: required string name,
2: optional string value,
3: optional DataType type,
4: optional string applicationArgument,
5: optional bool standardInput = 0,
6: optional string userFriendlyDescription,
7: optional map metaData,
8: optional InputValidityType inputValid;
9: optional CommandLineType addedToCommandLine;
10: optional bool dataStaged = 0;
}
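
For reference, a sketch of how the proposal might read as one well-formed IDL fragment. The struct above shows metaData as a bare map. The key type InputMetadataType follows from the description, but the string value type is only an assumption. DataType is assumed to come from the existing model and is not redefined here. Field separators are normalized to commas.

```thrift
enum InputValidityType {
  REQUIRED,
  OPTIONAL
}

enum CommandLineType {
  INCLUSIVE,
  EXCLUSIVE
}

enum InputMetadataType {
  MEMORY,
  CPU
}

struct InputDataObjectType {
  1: required string name,
  2: optional string value,
  3: optional DataType type,  // DataType: defined in the existing model
  4: optional string applicationArgument,
  5: optional bool standardInput = 0,
  6: optional string userFriendlyDescription,
  // Assumption: enum keys, string values (value type not given above).
  7: optional map<InputMetadataType, string> metaData,
  8: optional InputValidityType inputValid,
  9: optional CommandLineType addedToCommandLine,
  10: optional bool dataStaged = 0
}
```

Note that reusing field id 7 with a different type changes its wire type, which is likely to break older clients that still expect a string there. Giving the map a new field id would avoid that.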

Suggestions are welcome.

Thanks,
Chathuri