Dan,

We're not there yet. I got back to this issue late yesterday and the fix 
doesn't work on my cluster.

I added the workaround code that was recommended at
https://gist.github.com/wd15/9693712, but now it appears that we hang on the
line:

mesh = fp.Gmsh3D(mshFile)

As with reading the .geo file, the cores are busy, but after an hour the code
still has not gotten past the mesh = ... command.

Question: if procID == 0 runs gmsh to make the .msh file, why not just run
gmsh by hand (or from a script) and have the mesh file ready to go?
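To make the suggestion concrete, here is a minimal sketch of what I mean,
assuming gmsh is on the PATH (the helper name and file names are made up):

```python
import os
import subprocess

def ensure_msh(geo_file, msh_file, gmsh_cmd="gmsh"):
    # Generate the .msh once, up front, so every rank just reads a
    # ready-made file. "gmsh -3 in.geo -o out.msh" is the standard
    # command for building a 3-D mesh from a .geo file.
    if not os.path.exists(msh_file):
        subprocess.check_call([gmsh_cmd, "-3", geo_file, "-o", msh_file])
    return msh_file

# In the FiPy script, every rank then opens the same pre-built file, e.g.:
# mesh = fp.Gmsh3D(ensure_msh("geometry.geo", "geometry.msh"))
```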

Is there a way to change FiPy so that the user's disk space is used instead of
/tmp? Then, if one core creates the file, all the cores, on various nodes, will
see it.
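For instance, if FiPy builds its temporary mesh file through Python's
tempfile module, pointing TMPDIR at shared storage before FiPy runs might be
enough. A stdlib-only sketch of the idea (I'm assuming tempfile is what FiPy
uses, and ~/scratch is just an example of a shared directory):

```python
import os
import tempfile

# Point Python's tempfile machinery at shared storage instead of the
# node-local /tmp. TMPDIR must be set before the first temporary file
# is created.
shared = os.path.abspath(os.path.expanduser("~/scratch"))
os.makedirs(shared, exist_ok=True)
os.environ["TMPDIR"] = shared
tempfile.tempdir = None  # make tempfile re-read TMPDIR

fd, path = tempfile.mkstemp(suffix=".msh")  # now lands under ~/scratch
os.close(fd)
os.remove(path)
```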

Thanks,

Bill


On Mar 21, 2014, at 12:35 PM, Daniel Wheeler <[email protected]> wrote:

Bill,

In my case, the following error occurred when running on multiple nodes:

Traceback (most recent call last):
 File "thermX.py", line 15, in <module>
   mesh = fp.Gmsh3D(geo)
 File "/users/wd15/git/fipy/fipy/meshes/gmshMesh.py", line 1937, in __init__
   background=background)
 File "/users/wd15/git/fipy/fipy/meshes/gmshMesh.py", line 266, in openMSHFile
   fileIsTemporary=fileIsTemporary)
 File "/users/wd15/git/fipy/fipy/meshes/gmshMesh.py", line 533, in __init__
    GmshFile.__init__(self, filename=filename, communicator=communicator, mode=mode, fileIsTemporary=fileIsTemporary)
 File "/users/wd15/git/fipy/fipy/meshes/gmshMesh.py", line 294, in __init__
   self.fileobj = open(self.filename, mode=mode)
IOError: [Errno 2] No such file or directory: '/tmp/tmpi7MiWI.msh'

This is because the "msh" file is being written on only one of the
nodes and the "/tmp" directory isn't shared. Did you see this error?
It may have been hidden from you in some way, if it is indeed the same
issue you're having.
This is a fairly heinous error and points to the fact that the FiPy
test slaves should test on multiple nodes.
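For what it's worth, the same [Errno 2] failure is easy to reproduce on a
single machine with only the standard library, by deleting the temporary file
to stand in for a node whose /tmp never had it:

```python
import errno
import os
import tempfile

# Create a mesh file the way the rank-0 process does ...
fd, path = tempfile.mkstemp(suffix=".msh")
os.close(fd)
# ... then delete it: on every other node, this /tmp path never existed.
os.remove(path)

caught = None
try:
    open(path)
except IOError as exc:   # IOError: [Errno 2] No such file or directory
    caught = exc.errno
```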

Anyway, there is a workaround, which I will send to you offline.

I'll try and fix this properly when I get a chance.

On Wed, Mar 19, 2014 at 9:02 AM, Seufzer, William J. (LARC-D307)
<[email protected]> wrote:
Thanks Dan,

Yes, I ran across 4 nodes (32 cores) and my log file returned a randomized list 
of integers 0 through 31. With other information from PBS I could see the names 
of the 4 nodes that were allocated (I believe I didn't have 32 processes on one 
node).

--
Daniel Wheeler
_______________________________________________
fipy mailing list
[email protected]<mailto:[email protected]>
http://www.ctcms.nist.gov/fipy
 [ NIST internal ONLY: https://email.nist.gov/mailman/listinfo/fipy ]
