Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-08 Thread Amila Jayasekara
Hi Chathuri,

I do not know anything about Gaussian, so it is hard for me to
understand exactly what the structures you introduced mean and
why exactly you need them.

A more important question is how to come up with a more abstract and
generic Thrift IDL so that we don't need to change it every time we add a
new application. Going through many example applications is certainly a
good way to understand the broad requirements, and it helps to abstract out
many features.

Thanks
-Thejaka



Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-08 Thread Raminder Singh
Thanks Chathuri, the new changes look good. I would recommend another field
for input validation. It could be used to validate user input based on text
comparison or a regex. We can extend the orchestrator validator to use the
field to validate inputs.
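
As a rough illustration of the validation field suggested above, here is a
minimal Python sketch. The names InputSpec, validation_regex, and
validate_input are hypothetical and do not exist in Airavata. The sketch also
assumes that the whole value must match the pattern, which is only one
possible reading of "text comparison or regex".

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class InputSpec:
    """Hypothetical stand-in for InputDataObjectType plus a validation field."""
    name: str
    value: Optional[str] = None
    validation_regex: Optional[str] = None  # assumed name for the proposed field


def validate_input(spec: InputSpec) -> bool:
    """Return True when the value satisfies the pattern, or no pattern is set."""
    if spec.validation_regex is None:
        return True
    if spec.value is None:
        return False
    # fullmatch requires the entire value to match, not just a prefix.
    return re.fullmatch(spec.validation_regex, spec.value) is not None


# A positive-integer pattern accepts "16" and rejects "sixteen".
accepted = validate_input(InputSpec("nproc", "16", r"[1-9][0-9]*"))
rejected = validate_input(InputSpec("nproc", "sixteen", r"[1-9][0-9]*"))
```

A plain text comparison can be expressed with the same field by escaping the
expected string with re.escape before storing it as the pattern.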

I do not fully understand the use of InputMetadataType. Metadata normally
contains extra information about the data, a validation schema, etc. If we
restrict InputMetadataType to an enum, it can become very specific to one
application, Gaussian in this case. I do not understand the benefit of
converting metaData to an InputMetadataType map.

I would recommend similar changes to the output schema (see AIRAVATA-1544).

Thanks
Raminder




Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-08 Thread Shameera Rathnayaka
Hi All,

I am writing a new handler that is Gaussian-specific. I looked for a
place to put this handler code in the Airavata main source code, but it
seems all the handlers we have in Airavata are bundled with a particular
provider. Hence I was thinking of creating a new project for this code. But
after an offline chat with Marlon, I have decided to put it in the Airavata
main source code so that other developers can also work with these Gaussian
handlers. So I am going to create a new module under gfac, named
"gfac-application-specific-handlers" (if you have better suggestions,
please reply), to keep all application-specific handlers. Once we have fully
integrated the GridChem applications, we may end up with a few more
application-specific handlers, and those will go under this new module. WDYT?

Thanks,
Shameera.



Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-08 Thread Marlon Pierce

That would be great. Please upload them to the Wiki.

Marlon




Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-08 Thread Pamidighantam, Sudhakar V
I would suggest that we look at several quantum chemistry applications which 
have slight variations on the theme.  We have NWChem, Gamess, and Molpro 
examples to look at. I can send some input files and/or have a session to go 
over the relevant sections. We can do this later today. 

Thanks,
Sudhakar. 





Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-08 Thread Marlon Pierce
The more examples, the better.  I'd like to find the right balance 
between understanding the problem space and making incremental progress.


Marlon




Re: Improvements to Experiment input data model in order to support Gaussian application

2014-12-08 Thread Pamidighantam, Sudhakar V
Chathuri:
Thanks for these suggestions. One question I have is whether we should look at
some of the input files in the set of applications currently under testing to
come up with these requirements.
There may be additional requirements in some of the inputs. Of course, we can
incrementally update the data structures as we test these applications in more
depth. But I feel a significant number of application cases should be
accommodated with each update. We may target these for rc 0.15 and, depending
on the time available, we can look at at least a few more applications.

Comments?

Thanks,
Sudhakar.



Improvements to Experiment input data model in order to support Gaussian application

2014-12-08 Thread Chathuri Wimalasena
Hi Devs,

We are trying to add the Gaussian application using airavata-appcatalog.
While doing that, we have run into some limitations of the current design.

Gaussian has several input files. Some of them should be used when the job
run command is generated, but others should not. Those that are not involved
in the job run command also need to be staged to the working directory. Such
flags are not supported in the current design.
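
To make the first requirement concrete, here is a minimal Python sketch of
how a handler might split inputs between the job run command and the staging
list. Everything in it is an assumption for illustration: the InputFile class
and plan_job function are hypothetical, "g09" and the file names are
placeholders, and the sketch assumes that files on the command line are also
staged.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class CommandLineType(Enum):
    INCLUSIVE = 1  # input appears in the generated job run command
    EXCLUSIVE = 2  # input is left out of the job run command


@dataclass
class InputFile:
    """Hypothetical Python view of one input and its two flags."""
    name: str
    value: str  # path of the input file
    application_argument: Optional[str] = None
    added_to_command_line: CommandLineType = CommandLineType.INCLUSIVE
    data_staged: bool = False


def plan_job(executable: str, inputs: List[InputFile]) -> Tuple[str, List[str]]:
    """Return the job run command and the files to copy to the working directory."""
    command = [executable]
    to_stage: List[str] = []
    for item in inputs:
        if item.added_to_command_line is CommandLineType.INCLUSIVE:
            if item.application_argument:
                command.append(item.application_argument)
            command.append(item.value)
            to_stage.append(item.value)  # assumption: command-line files are staged too
        elif item.data_staged:
            # Not part of the command, but still needed in the working directory.
            to_stage.append(item.value)
    return " ".join(command), to_stage


inputs = [
    InputFile("main-input", "water.com"),
    InputFile("checkpoint", "water.chk",
              added_to_command_line=CommandLineType.EXCLUSIVE,
              data_staged=True),
]
command, staged = plan_job("g09", inputs)
```

Here only water.com ends up in the command, while both files end up in the
staging list.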

Another interesting feature of Gaussian is that the input file can specify
values for options such as memory and CPU. If the input file includes those
parameters, we need to give priority to those values over the values
specified in the request.
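
The priority rule above can be sketched in Python as follows. This is only
an illustration under stated assumptions: it assumes the memory and CPU
values come from Gaussian Link 0 lines such as %mem and %nprocshared, and the
functions read_metadata and effective_resources are hypothetical names, not
Airavata code.

```python
import re
from enum import Enum
from typing import Dict


class InputMetadataType(Enum):
    MEMORY = 1
    CPU = 2


# Assumed mapping from Link 0 directive names to the metadata types.
_DIRECTIVES = {
    "mem": InputMetadataType.MEMORY,
    "nprocshared": InputMetadataType.CPU,
}
_LINK0 = re.compile(r"^\s*%(\w+)\s*=\s*(\S+)\s*$")


def read_metadata(input_text: str) -> Dict[InputMetadataType, str]:
    """Collect the memory and CPU values declared in the input file, if any."""
    found: Dict[InputMetadataType, str] = {}
    for line in input_text.splitlines():
        match = _LINK0.match(line)
        if match:
            key = _DIRECTIVES.get(match.group(1).lower())
            if key is not None:
                found[key] = match.group(2)
    return found


def effective_resources(requested: Dict[InputMetadataType, str],
                        input_text: str) -> Dict[InputMetadataType, str]:
    """Start from the request, then let values from the input file win."""
    merged = dict(requested)
    merged.update(read_metadata(input_text))
    return merged


sample = "%mem=4GB\n%nprocshared=8\n%chk=water.chk\n#p hf/6-31g(d) opt\n"
request = {InputMetadataType.MEMORY: "2GB", InputMetadataType.CPU: "4"}
resources = effective_resources(request, sample)
```

With this sample the request's 2GB and 4 are replaced by 4GB and 8. An input
file without such lines leaves the request unchanged.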

To support these features, we need to slightly modify our Thrift IDL,
specifically the InputDataObjectType struct.

The current struct is below.

struct InputDataObjectType {
1: required string name,
2: optional string value,
3: optional DataType type,
4: optional string applicationArgument,
5: optional bool standardInput = 0,
6: optional string userFriendlyDescription,
7: optional string metaData
}

To support the first requirement, we introduce two enums.

enum InputValidityType{
REQUIRED,
OPTIONAL
}

enum CommandLineType{
INCLUSIVE,
EXCLUSIVE
}

Please excuse the names. You are welcome to suggest better ones.

To support the second requirement, we change the metaData field to a map that
uses another enum, in which we define all the supported metadata types.

enum InputMetadataType {
MEMORY,
CPU
}

So the new InputDataObjectType would be as below.

struct InputDataObjectType {
1: required string name,
2: optional string value,
3: optional DataType type,
4: optional string applicationArgument,
5: optional bool standardInput = 0,
6: optional string userFriendlyDescription,
7: optional map metaData,
8: optional InputValidityType inputValid,
9: optional CommandLineType addedToCommandLine,
10: optional bool dataStaged = 0
}

Suggestions are welcome.

Thanks,
Chathuri