Dan,

We're not completely out of the woods yet. I've had several successful runs 
this morning, but occasionally the program hangs on the Gmsh3D command. I'm 
going to have to let this sit for the next week, but I've added replies to 
your comments below.

Bill


On Apr 3, 2014, at 10:58 AM, Daniel Wheeler <[email protected]>
 wrote:

> On Wed, Apr 2, 2014 at 6:04 PM, Seufzer, William J. (LARC-D307)
> <[email protected]> wrote:
> 
>> Setting the tempdir in Python (#1 above) did not fix the problem, although 
>> somehow a few runs actually completed. Your advice on the /tmp directory 
>> was correct, but not complete enough for our cluster configuration.
> 
> Interesting. This makes fixing / automating the setting of the temp
> directory quite difficult. Would changing the $TMPDIR from within
> Python even work?

Do you mean setting the environment variable with os.environ? Would that be 
different from setting the variable in the PBS startup script?
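
To make the question concrete, this is roughly what I understand by setting it 
from within Python. It is only a sketch: force_tempdir is a name I made up, and 
it assumes the target directory sits on a file system every node can see. One 
difference from the PBS script that I am sure of: tempfile.gettempdir() caches 
its answer in tempfile.tempdir on first use, so changing os.environ after that 
point has no effect on Python unless tempfile.tempdir is reset as well. A 
variable exported in the PBS script is in place before Python starts, although 
whether it reaches every MPI rank is a separate matter.

```python
import os
import tempfile


def force_tempdir(path):
    """Point both $TMPDIR and Python's tempfile module at `path`.

    `force_tempdir` is a hypothetical helper, not part of FiPy.
    """
    # Assumption: `path` is on storage shared by all nodes.
    os.makedirs(path, exist_ok=True)

    # Seen by child processes started after this line (for example an
    # external mesher launched through subprocess).
    os.environ["TMPDIR"] = path

    # Overrides whatever gettempdir() may already have cached.
    tempfile.tempdir = path

    return tempfile.gettempdir()
```

Calling this at the very top of the script, before anything else imports or 
uses tempfile, would be the closest equivalent to the PBS setting.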

> 
>> 
>> Here is what I found that worked.
>> 
>> If I set the TMPDIR environment variable in the PBS script to match the 
>> directory in #1, great success!
>> 
>> If I comment out #3, I sometimes get an error that says something about not 
>> finding the end of file in the .msh file (I can try to reproduce it and get 
>> the actual message if you like). Leaving #3 in gives some comfort that all 
>> cores will start #4 at the same time.
> 
> The .msh file may be written on only one of the processes, but read
> on all of them. That is probably a bug. I'll put in a bug report for
> that.

Are you suggesting that all cores are creating a .msh file with the same name 
in the same place and writing over each other? Next week I'll try having the 
.msh file ready in advance and using Gmsh3D to read it directly. If the .msh 
file is read directly, is gmsh still invoked?
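
If the end-of-file error comes from one process reading the .msh file while 
another is still writing it (only my guess at this point), one generic guard 
is to write under a scratch name and rename into place once the file is 
complete. The sketch below is not FiPy code and not what Gmsh3D does; 
write_atomically is a hypothetical name. It assumes POSIX rename semantics, and 
on a network file system other nodes may still see the new file with some 
delay.

```python
import os
import tempfile


def write_atomically(path, text):
    """Write `text` to `path` so readers never see a half-written file.

    Hypothetical helper: the data goes to a scratch file in the same
    directory, which is then renamed over `path` in a single step.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, scratch = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before renaming
        os.replace(scratch, path)      # atomic on POSIX within one file system
    except BaseException:
        # Clean up the scratch file if anything went wrong before the rename.
        if os.path.exists(scratch):
            os.remove(scratch)
        raise
```

With only one process writing this way, the others would either find no file 
yet or find a complete one.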

> 
>> If I comment out #1 and look at the results from #2, I consistently saw 4 of 
>> 12 cores (over 3 nodes) print the $TMPDIR directory, the other cores printed 
>> their local /tmp directory. Our cluster nodes do not share /tmp space so the 
>> gettempdir() results were not necessarily the same even if the string was 
>> identical.
>> 
>> Success was found, but only after a bit of trial and error. I was surprised 
>> that some of the processes would ignore the $TMPDIR environment variable and 
>> use the local /tmp space.
> 
> That is weird.

Yes... I'm thinking of adding some lines to the PBS script to check that every 
process is picking up the correct paths to executables and libraries.
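
On the Python side, something like the following is what I have in mind, 
printing one line per process. It is a sketch and the function names are my 
own. One thing I do know about tempfile: gettempdir() only accepts a candidate 
directory if it can create a file there, and otherwise falls through to /tmp 
and the other defaults. So a $TMPDIR that is missing or unwritable on some node 
would look exactly like $TMPDIR being ignored.

```python
import os
import socket
import tempfile


def tempdir_report():
    """Collect what this process believes about its temporary directory."""
    env = os.environ.get("TMPDIR")
    chosen = tempfile.gettempdir()
    return {
        "host": socket.gethostname(),
        "pid": os.getpid(),
        "TMPDIR": env,
        "gettempdir": chosen,
        # Does the directory named by $TMPDIR exist on this node?
        "TMPDIR_is_dir": env is not None and os.path.isdir(env),
        # Can this process write to it?
        "TMPDIR_writable": env is not None and os.access(env, os.W_OK),
        # Did tempfile actually settle on $TMPDIR?
        "honoured": env is not None
        and os.path.abspath(env) == os.path.abspath(chosen),
    }


def format_report(report):
    """Render the report as a single line suitable for a job log."""
    return (
        "{host} pid={pid} TMPDIR={TMPDIR!r} gettempdir={gettempdir!r} "
        "is_dir={TMPDIR_is_dir} writable={TMPDIR_writable} "
        "honoured={honoured}"
    ).format(**report)
```

Printing format_report(tempdir_report()) near the top of the script on every 
core should show which nodes fall back to local /tmp and, with luck, why.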
> 
>> Now with TMPDIR and tempfile.tempdir both set and pointing to the same 
>> directory I do get a warning from MPI that using a network file system may 
>> not be the best solution. But in my case it works!!
> 
> Good. I hope you make good progress with your work.

Well... someday... :)

> 
> -- 
> Daniel Wheeler
> 
> _______________________________________________
> fipy mailing list
> [email protected]
> http://www.ctcms.nist.gov/fipy
>  [ NIST internal ONLY: https://email.nist.gov/mailman/listinfo/fipy ]

