Re: Parallel meshing

Daniel Wheeler Thu, 03 Apr 2014 07:59:03 -0700

On Wed, Apr 2, 2014 at 6:04 PM, Seufzer, William J. (LARC-D307)
<[email protected]> wrote:


> Setting the tempdir in Python (#1 above) did not fix the problem, although
> somehow a few runs actually completed. Your advice on the /tmp directory
> was correct, but not complete enough for our cluster configuration.

Interesting. This makes fixing / automating the setting of the temp
directory quite difficult. Would changing the $TMPDIR from within
Python even work?
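
One way to probe that question is the sketch below. It assumes Python 3 and
uses only the standard library; use_temp_dir is a hypothetical helper, not
part of FiPy. Python's tempfile module caches its choice in tempfile.tempdir,
so the cache has to be cleared after changing the environment variable. The
change only affects the current process and any child processes it starts
afterwards, so it may not help if something else has already picked a
directory.

```python
import os
import tempfile


def use_temp_dir(path):
    """Point this process (and children started later) at `path` for temp files.

    `path` is assumed to be a directory visible to every node, e.g. on a
    shared file system; that is an assumption of this sketch.
    """
    os.makedirs(path, exist_ok=True)
    # Child processes inherit os.environ, so external tools see the new value.
    os.environ["TMPDIR"] = path
    # tempfile caches its answer; clearing it forces a fresh lookup of TMPDIR.
    tempfile.tempdir = None
    return tempfile.gettempdir()
```

Calling use_temp_dir before anything creates a temporary file, and on every
process, would be the intent here.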

>
> Here is what I found that worked.
>
> If I set the TMPDIR environment variable in the PBS script to match the
> directory in #1, great success!
>
> If I comment out #3, I sometimes get an error that says something about not
> finding the end of file in the .msh file (I can try to repeat and get the
> actual message if you desire). Leaving #3 in gives comfort that all cores
> will start #4 at the same time.

The .msh file may be written on only one of the processes, but read
on all of them. That is probably a bug. I'll put in a bug report for
that.
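
The write-once, read-everywhere pattern can be sketched as follows. This is
an illustration of the general idea, not FiPy's actual code and not the fix
itself. The comm argument is assumed to be an mpi4py-style communicator
offering Get_rank() and Barrier(); write_mesh and read_mesh are hypothetical
callables supplied by the caller.

```python
def write_on_root_then_read(comm, path, write_mesh, read_mesh):
    """Write the file on rank 0 only, then let every rank read it.

    `write_mesh(path)` must flush and close the file before it returns.
    """
    if comm.Get_rank() == 0:
        write_mesh(path)
    # No rank proceeds to the read until rank 0 has finished writing.
    comm.Barrier()
    return read_mesh(path)
```

With mpi4py, comm would typically be MPI.COMM_WORLD. On a network file
system a barrier may not by itself guarantee that the other nodes already
see the complete file, so this narrows the race rather than ruling it out.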

> If I comment out #1 and look at the results from #2, I consistently saw 4 of 
> 12 cores (over 3 nodes) print the $TMPDIR directory, the other cores printed 
> their local /tmp directory. Our cluster nodes do not share /tmp space so the 
> gettempdir() results were not necessarily the same even if the string was 
> identical.
>
> Success was found, but only after a bit of trial and error. I was surprised 
> that some of the processes would ignore the $TMPDIR environment variable and 
> use the local /tmp space.

That is weird.
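
To narrow it down, each process could report what it actually sees. The
sketch below uses only the standard library; tempdir_report and
format_report are hypothetical names. Run under the MPI launcher, it prints
one line per process.

```python
import os
import socket
import tempfile


def tempdir_report():
    """Collect what this process believes about its temporary directory."""
    return {
        "host": socket.gethostname(),
        "pid": os.getpid(),
        "TMPDIR": os.environ.get("TMPDIR"),  # None if the variable is unset
        "gettempdir": tempfile.gettempdir(),
    }


def format_report(report):
    """Render the report as a single line suitable for a job log."""
    return "{host} pid={pid} TMPDIR={TMPDIR} gettempdir={gettempdir}".format(**report)


print(format_report(tempdir_report()))
```

If only the processes on one host show TMPDIR set, that would suggest the
variable is not reaching the other nodes, though nothing in this thread
confirms that.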

> Now with TMPDIR and tempfile.tempdir both set and pointing to the same 
> directory I do get a warning from MPI that using a network file system may 
> not be the best solution. But in my case it works!!

Good. I hope you make good progress with your work.

-- 
Daniel Wheeler

_______________________________________________
fipy mailing list
[email protected]
http://www.ctcms.nist.gov/fipy
  [ NIST internal ONLY: https://email.nist.gov/mailman/listinfo/fipy ]
