So let's take $working_dir as the base and upload all inputs to
$working_dir/input. Then in the job script we set the working directory
to $working_dir/output and run the application.
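Something like this, as a rough sketch (the SLURM directives, application name, and input file name are just placeholders, and $working_dir stands for the generated scratch path):

#!/bin/bash
#SBATCH -J example_job                # placeholder directives
cd $working_dir/output                # run the application from the output subdirectory
# staged inputs sit one level up, in $working_dir/input
example_app ../input/example.inp > example_app.log

The idea is that the staged inputs stay untouched and anything the application writes relative to its working directory lands under output.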
Would this be reasonably guaranteed to put all outputs in the
$working_dir/output directory, or are there codes for which this would
not work?
Marlon
On 12/23/14, 2:54 PM, Miller, Mark wrote:
We do use a flat system. However, to return files in a very controlled way, one has to know precisely the names of input and output files, and maybe even restrict the user's ability to name them. The codes we use produce files conditionally, depending on the command line, so unless you know the code super well, using explicit names can cause certain output files not to be returned at all, an obvious bummer for user and developer alike. As a result, I have elected in most cases to just return all files. This can be confusing for users, and being able to bin input and output files into coarse categories is something I wish I could do.
That was my motivation in responding....
Mark
-----Original Message-----
From: Marlon Pierce [mailto:[email protected]]
Sent: Tuesday, December 23, 2014 11:50 AM
To: [email protected]
Subject: Re: "input" and "output" subdirs in working directories
Thanks, Mark. What's your approach in CIPRES? Do you use a flat structure for
your working directories, or do you do something else?
Marlon
On 12/23/14, 2:45 PM, Miller, Mark wrote:
I am not sure of the reasoning behind the design, but off-hand it seems to me it would simplify the job of returning the "input" and "output" files as discrete entities: if you don't know which files are which, you would at least have two coarse bins that can be returned under separate banners.
Mark
-----Original Message-----
From: Marlon Pierce [mailto:[email protected]]
Sent: Tuesday, December 23, 2014 11:01 AM
To: Airavata Dev
Subject: "input" and "output" subdirs in working directories
When Airavata executes a remote command (launching a SLURM job, for example), it creates a working
directory on the target machine's scratch space and two subdirectories, "input" and
"output". Is there a good reason for creating these two subdirectories? Why not just do
all the work in the top level of the working directory? It seems unnecessary.
Also, I don't understand why these are in the GFAC module, as these should be
constructed from Registry information.
Below is background information.
--------------
Below is an example working directory.
$ cd /oasis/scratch/trestles/ogce/temp_project/gta-work-dirs/TEST_8b10aa04-95c3-4695-af77-d3b3987c7ef9/
$ ls -tlr
total 20
drwxr-xr-x 2 ogce sds128 4096 Dec 23 07:17 output
-rw-r--r-- 1 ogce sds128 831 Dec 23 07:39 1203922204.pbs
-rw------- 1 ogce sds128 28 Dec 23 07:40 Gaussian.stdout
-rw------- 1 ogce sds128 663 Dec 23 07:40 Gaussian.stderr
drwxr-xr-x 2 ogce sds128 4096 Dec 23 07:47 input
The names of these subdirectories are specified in Constants.java (as
OUTPUT_DATA_DIR_VAR_NAME and INPUT_DATA_DIR_VAR_NAME). Below are the
files in the GFAC module that use these two constants.
$ find ./modules/gfac -type f -exec grep -il "OUTPUT_DATA_DIR_VAR" {} \; | grep java | grep -v target
./modules/gfac/gfac-core/src/main/java/org/apache/airavata/gfac/Constants.java
./modules/gfac/gfac-core/src/main/java/org/apache/airavata/gfac/core/cpi/BetterGfacImpl.java
./modules/gfac/gfac-gram/src/main/java/org/apache/airavata/gfac/gram/util/GramRSLGenerator.java
./modules/gfac/gfac-local/src/main/java/org/apache/airavata/gfac/local/provider/impl/LocalProvider.java
./modules/gfac/gfac-ssh/src/main/java/org/apache/airavata/gfac/ssh/provider/impl/SSHProvider.java
So we would need to clean these up if we remove the constants.
Marlon