Dan,

We're not there yet. I got back to this issue late yesterday, and the fix doesn't work on my cluster.
I added the code recommended in the workaround (https://gist.github.com/wd15/9693712), but it appears now that we hang on the line:

    mesh = fp.Gmsh3D(mshFile)

As with reading the .geo file, the cores are busy, but after an hour the code does not get past the mesh = ... line.

Question: if procID == 0 runs gmsh to make the .msh file, why not just run gmsh by hand (or from a script) and have the mesh file ready to go?

Is there a way to change FiPy so that the user's disk space is used instead of /tmp? Then, if one core creates the file, all the cores on the various nodes will see it.

Thanks,
Bill

On Mar 21, 2014, at 12:35 PM, Daniel Wheeler <[email protected]> wrote:

Bill,

In my case, the following error occurred when running on multiple nodes:

    Traceback (most recent call last):
      File "thermX.py", line 15, in <module>
        mesh = fp.Gmsh3D(geo)
      File "/users/wd15/git/fipy/fipy/meshes/gmshMesh.py", line 1937, in __init__
        background=background)
      File "/users/wd15/git/fipy/fipy/meshes/gmshMesh.py", line 266, in openMSHFile
        fileIsTemporary=fileIsTemporary)
      File "/users/wd15/git/fipy/fipy/meshes/gmshMesh.py", line 533, in __init__
        GmshFile.__init__(self, filename=filename, communicator=communicator, mode=mode, fileIsTemporary=fileIsTemporary)
      File "/users/wd15/git/fipy/fipy/meshes/gmshMesh.py", line 294, in __init__
        self.fileobj = open(self.filename, mode=mode)
    IOError: [Errno 2] No such file or directory: '/tmp/tmpi7MiWI.msh'

This is because the .msh file is written on only one of the nodes, and the /tmp directory isn't shared. Did you see this error? It may have been hidden from you in some way if it is indeed the same issue as you're having. This is a fairly heinous error and points to the fact that the FiPy test slaves should test on multiple nodes. Anyway, there is a workaround, which I will send to you offline. I'll try to fix this properly when I get a chance.

On Wed, Mar 19, 2014 at 9:02 AM, Seufzer, William J.
(LARC-D307) <[email protected]> wrote:

Thanks Dan,

Yes, I ran across 4 nodes (32 cores), and my log file returned a randomized list of the integers 0 through 31. With other information from PBS I could see the names of the 4 nodes that were allocated (I believe I didn't have all 32 processes on one node).

--
Daniel Wheeler

_______________________________________________
fipy mailing list
[email protected]
http://www.ctcms.nist.gov/fipy
[ NIST internal ONLY: https://email.nist.gov/mailman/listinfo/fipy ]
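[Editor's note] Bill's question above (why not run gmsh by hand, or from a script, and have the mesh file ready to go) can be sketched roughly as below. This is a sketch, not FiPy's own mechanism: the shared path is hypothetical, and the fp.parallelComm attribute names shown in the comment vary across FiPy versions.

```python
import subprocess


def gmsh_command(geo_file, msh_file):
    """Build the gmsh invocation that meshes a .geo file in 3D.

    `gmsh -3` performs 3D meshing; `-o` names the output .msh file.
    """
    return ["gmsh", "-3", geo_file, "-o", msh_file]


# Cluster sketch: generate the mesh once, on shared disk, from rank 0,
# then point every rank at the same file, e.g.:
#
#     import fipy as fp
#     msh = "/shared/scratch/geometry.msh"   # hypothetical path all nodes see
#     if fp.parallelComm.procID == 0:        # attribute names vary by FiPy version
#         subprocess.check_call(gmsh_command("geometry.geo", msh))
#     fp.parallelComm.Barrier()              # wait until the file exists everywhere
#     mesh = fp.Gmsh3D(msh)

print(gmsh_command("geometry.geo", "/shared/scratch/geometry.msh"))
```

Pre-generating the mesh entirely outside the job (running gmsh once by hand and committing the .msh to shared disk) sidesteps the /tmp issue completely, since Gmsh3D is then handed an ordinary file every node can open.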

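[Editor's note] On the /tmp question: the tmpi7MiWI.msh name in the traceback suggests FiPy creates the file with Python's tempfile module (an assumption about this FiPy version), and tempfile honors the TMPDIR environment variable. If that holds, pointing TMPDIR at a directory on shared disk before running FiPy would make the temporary .msh visible to every node. A minimal sketch of the TMPDIR mechanism itself, using a hypothetical shared_scratch directory:

```python
import os
import tempfile

# Hypothetical stand-in for a directory on a filesystem all nodes can see.
shared = os.path.abspath("shared_scratch")
os.makedirs(shared, exist_ok=True)

# tempfile consults TMPDIR when picking a directory, but caches its choice;
# clearing tempfile.tempdir forces the cache to be rebuilt from TMPDIR.
os.environ["TMPDIR"] = shared
tempfile.tempdir = None

fd, path = tempfile.mkstemp(suffix=".msh")  # now lands in `shared`, not /tmp
os.close(fd)
print(os.path.dirname(path))
os.remove(path)
```

In a PBS job this would more naturally be `export TMPDIR=/path/on/shared/disk` in the submission script rather than code; either way, whether it actually fixes the hang depends on FiPy routing all of its temporary files through tempfile.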