On 02/05/2015 16:09, Romain Reuillon wrote:
Hi Andreas,

I just pushed a first implementation of the optimisation for cluster environments in the case of storage shared with the submission node. To enable it you should add storageSharedLocally = true in your environment constructor. You should kill the dbServer when you update to this version (so it can reinitialize the db), since some files were compressed and are not anymore. There is still room for optimisation, especially concerning the output files and the directories (in input and output), which are still subject to several transformations that might be bypassed in the case of a shared storage.

I tried it on my local machine with the SshEnvironment and it's functional. Could you test it on your environments?

cheers,
Romain

On 01/05/2015 19:09, Andreas Schuh wrote:

On 1 May 2015, at 18:00, Romain Reuillon <[email protected]> wrote:

Yes, it is. Only the TMPDIR is local to each compute node and not shared. The default is in home, but you can configure where the jobs should work via an option of the environment. In the present implementation it has to be a shared storage, but I guess that $WORK is one.

On 01/05/2015 18:55, Andreas Schuh wrote:

FYI I just refreshed my memory of our college HPC cluster (it's actually using PBS, not SGE as mentioned before). From their intro document, the following information may be useful while revising the OpenMOLE storage handling:

- On the HPC system, there are two file stores available to the user: HOME and WORK.
- HOME has a relatively small quota of 10GB and is intended for storing binaries, source and modest amounts of data. It should not be written to directly by jobs.
- WORK is a larger area which is intended for staging files between jobs and for long-term data.
- These areas should be referred to using the environment variables $HOME and $WORK as their absolute locations are subject to change.
- Additionally, $TMPDIR: jobs requiring scratch space at run time should write to $TMPDIR.

On 1 May 2015, at 11:57, Andreas Schuh <[email protected]> wrote:

On 1 May 2015, at 11:49, Romain Reuillon <[email protected]> wrote:

That would be great, as I was hoping to finally be able to run my tasks to get actual results… it's been 1 month now developing the OpenMOLE workflow :( I'll be happy to test it in our environment. I have access to our lab's dedicated SLURM cluster and the department's HTCondor setup. I could also try it on our college HPC, which uses SGE and shared storage. I also agree that these options should be part of the environment specification.

Great!

OpenMOLE environments work by copying files to storages. In the general case the storage is not shared between the submission machine and the execution machines. In the case of a cluster, OpenMOLE copies everything to the shared FS using an ssh transfer to the master node (the entry point of the cluster), so it is accessible to all the computing nodes. In the particular case where the submission machine shares its FS with the computing nodes, I intend to substitute copy operations with symlink creations, so that this particular case is handled by the generic submission code of OpenMOLE.

I basically agree with you for the files in ~/.openmole: files are transferred to the nodes through the shared FS, so they have to be copied there. What could be optimized is the location of the temporary execution directory for tasks. It is also created in this folder and therefore on the shared FS, which is not actually required.
This workdir could optionally be relocated somewhere else using an environment parameter.

Not sure if I follow this solution outline, but I'm sure you have a better idea of how things are working right now and what needs to be modified. Why do files have to be copied to ~/.openmole when the original input files of the workflow (exploration SelectFileDomain) are already located on a shared FS? That the local and remote temporary directory locations can be configured via an environment variable would solve the second issue of where temporary files such as wrapper scripts and remote resources are located. The first issue is how to deal with input and output files of tasks which are located on a shared FS already and thus should not require a copy to the temporary directories.

Ok, got it, and that sounds like a good solution. So the optional symbolic links (the "link" option of "addInputFile" and "addResource") from the temporary directory/workingDir of each individual task point to the storage on the master node of the execution machines. That is why I currently encounter an unexpected copy of my files. When the storage used by the execution machines itself uses symbolic links to the storage of the submission machine (as all machines share the same FS), however, no files are actually copied.

What would have happened if I had executed the OpenMOLE console on the master node of the environment? Would OpenMOLE then already know that the submission machine and the execution machines are actually identical and thus inherently share the same storage?
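For reference, a minimal sketch of what enabling the new option could look like in an OpenMOLE script. Only storageSharedLocally is quoted from Romain's message above; the environment type, login and host are placeholders, and the exact constructor signature is an assumption rather than the released API:

    // Hypothetical sketch: a cluster environment with the shared-storage
    // optimisation enabled. Login and host are placeholders.
    val env =
      SLURMEnvironment(
        "aschuh",                   // cluster login (placeholder)
        "slurm.example.org",        // submission/master node (placeholder)
        storageSharedLocally = true // submission machine shares its FS with the compute nodes
      )

With the flag set, the environment should be able to replace the usual ssh transfers to the master node with symlinks, as described in the thread.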
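Similarly, a sketch of relocating the jobs' working area to the cluster's $WORK store rather than $HOME, as the college HPC documentation quoted above recommends. The option name workDirectory, the environment type and the paths are assumptions for illustration only:

    // Hypothetical sketch: point the environment's working/staging area at the
    // shared $WORK store instead of the small $HOME quota. Paths are placeholders.
    val hpc =
      PBSEnvironment(
        "aschuh",                               // placeholder login
        "hpc.example.ac.uk",                    // placeholder head node
        workDirectory = "/work/aschuh/openmole" // must resolve to the shared $WORK area
      )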
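The substitution Romain describes (copies replaced by symlink creations when the submission machine shares its FS with the compute nodes) can be pictured with a small self-contained sketch. This is illustrative Scala, not OpenMOLE's actual staging code:

    import java.nio.file.{Files, Path}

    // Placeholder for the generic remote transfer used when the storage is not shared.
    def sshCopy(localFile: Path, storagePath: Path): Unit = ???

    // Conceptual sketch: "uploading" a file to the environment's storage becomes
    // a symlink when submission and execution machines see the same file system.
    def stageFile(localFile: Path, storagePath: Path, storageSharedLocally: Boolean): Unit =
      if (storageSharedLocally)
        Files.createSymbolicLink(storagePath, localFile) // link in storage pointing at the original; no data moved
      else
        sshCopy(localFile, storagePath)                  // fall back to the generic ssh transfer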
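Finally, regarding the "link" option Andreas mentions: a sketch of how a task might request linking rather than copying. The method names addInputFile and addResource come from the thread; their exact signatures, the task type and the paths are assumptions for illustration:

    // Hypothetical sketch: ask OpenMOLE to link the input file and resource into the
    // task's working directory instead of copying them. Paths are placeholders.
    val image  = File("/work/aschuh/data/subject01.nii.gz")
    val script = File("/work/aschuh/tools/register.sh")

    val register = SystemExecTask("./register.sh subject01.nii.gz") // placeholder command
    register.addInputFile(image, "subject01.nii.gz", link = true)   // link, do not copy
    register.addResource(script, link = true)                       // same for the script resource

As discussed above, such links only avoid data movement once the shared-storage path itself stops copying files to the master node's storage.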
