You can always apply the "Process ID Scalars" filter to a Sphere source; that will show a distinct color for each process.
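For reference, the same check can be scripted. This is only a sketch: it must run under ParaView's pvpython (or the GUI's Python shell), not plain Python, and exact property names can vary between ParaView versions.

```python
# Hypothetical pvpython sketch -- requires a ParaView build; run with
# pvpython or in the GUI's Python shell, not with plain Python.
from paraview.simple import Sphere, ProcessIdScalars, Show, Render

sphere = Sphere()
# ProcessIdScalars adds a "ProcessId" point array: the MPI rank that
# produced each piece of the sphere.
pid = ProcessIdScalars(Input=sphere)
Show(pid)
Render()
# Coloring by the "ProcessId" array should then show one color per
# pvserver rank -- two colors if two servers are really connected.
```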
Utkarsh

On Fri, Nov 11, 2011 at 8:10 PM, Cook, Rich <[email protected]> wrote:
> Good catch. Indeed, the root node was on prism120, another node in the batch
> pool. When I tunneled to that host instead of the other, I got a good
> connection with 2 servers using MPI.
> Just to be sure, is there a way to query the state of the connection from
> within the client? I cannot tell from the GUI or the server output whether I
> am connected to 2 servers or 1. I am certain I launched two servers and got
> a good connection, and I can view a molecule, but... I'm paranoid. You never
> know. :-)
>
> This is going to be nasty to try to make work for our users.
>
> Thanks for the help!
> -- Rich
>
> On Nov 11, 2011, at 4:57 PM, Utkarsh Ayachit wrote:
>
>> Very peculiar. I wonder if MPI is running the root node on some other
>> node. Are you sure the process is run on the same machine? Can you
>> try putting an IP address or real hostname instead of localhost?
>>
>> Utkarsh
>>
>> On Fri, Nov 11, 2011 at 7:54 PM, Cook, Rich <[email protected]> wrote:
>>> And to clarify, if I just do serial, I get this good behavior:
>>>
>>> rcook@prism127 (~):
>>> /usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver
>>> --use-offscreen-rendering --reverse-connection --client-host=localhost
>>> Waiting for client
>>> Connection URL: csrc://localhost:11111
>>> Client connected.
>>>
>>> On Nov 11, 2011, at 4:51 PM, Cook, Rich wrote:
>>>
>>>> My bad.
>>>> In the first email I sent, I was using the wrong MPI (srun instead of mpiexec
>>>> -- mvapich instead of openmpi). So both processes were indeed getting set
>>>> to the same process ID. Please ignore that output.
>>>> The current output looks like this:
>>>>
>>>> rcook@prism127 (~): mpiexec -np 2
>>>> /usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver
>>>> --use-offscreen-rendering --reverse-connection --client-host=localhost
>>>> Waiting for client
>>>> Connection URL: csrc://localhost:11111
>>>> ERROR: In
>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkSocket.cxx,
>>>> line 481
>>>> vtkClientSocket (0xe6a060): Socket error in call to connect. Connection
>>>> refused.
>>>>
>>>> ERROR: In
>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkClientSocket.cxx,
>>>> line 53
>>>> vtkClientSocket (0xe6a060): Failed to connect to server localhost:11111
>>>>
>>>> Warning: In
>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/ParaViewCore/ClientServerCore/vtkTCPNetworkAccessManager.cxx,
>>>> line 250
>>>> vtkTCPNetworkAccessManager (0x8356f0): Connect failed. Retrying for
>>>> 59.9993 more seconds.
>>>>
>>>> ERROR: In
>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkSocket.cxx,
>>>> line 481
>>>> vtkClientSocket (0xe6a060): Socket error in call to connect. Connection
>>>> refused.
>>>>
>>>> ERROR: In
>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkClientSocket.cxx,
>>>> line 53
>>>> vtkClientSocket (0xe6a060): Failed to connect to server localhost:11111
>>>>
>>>> Warning: In
>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/ParaViewCore/ClientServerCore/vtkTCPNetworkAccessManager.cxx,
>>>> line 250
>>>> vtkTCPNetworkAccessManager (0x8356f0): Connect failed. Retrying for
>>>> 58.9972 more seconds.
>>>>
>>>> mpiexec: killing job...
>>>>
>>>>
>>>> Note the presence of only one "Waiting for client" message. Again, I apologize for
>>>> the mixup. I spoke with our MPI guru and have confirmed that MPI appears
>>>> to be working correctly and that I'm not making a mistake in how I launch
>>>> pvserver from the batch job perspective.
>>>>
>>>> Do you still want that output?
>>>>
>>>> On Nov 11, 2011, at 4:44 PM, Utkarsh Ayachit wrote:
>>>>
>>>>> That sounds very odd. If the process_id variable is indeed correctly set
>>>>> to 0 and 1 on the two processes, then how come there are two "Waiting
>>>>> for client" lines printed out in the first email that you sent?
>>>>>
>>>>> Can you change that cout line to the following to verify that both
>>>>> processes are indeed printing from the same line?
>>>>>
>>>>> cout << __LINE__ << " : Waiting for client" << endl;
>>>>>
>>>>> (This is in pvserver_common.h:58)
>>>>>
>>>>> Utkarsh
>>>>>
>>>>> On Fri, Nov 11, 2011 at 6:30 PM, Cook, Rich <[email protected]> wrote:
>>>>>> I posted the CMakeCache.txt. I have also tried to step through the code
>>>>>> using TotalView, and I can see it calling MPI_Init() etc. It looks like
>>>>>> one process correctly gets rank 0 and one gets rank 1 (by inspecting the
>>>>>> process_id variable in RealMain()).
>>>>>> If I start in serial, it connects and I can view a protein molecule
>>>>>> successfully. If I start in parallel, exactly one server tries and
>>>>>> fails to connect. Am I supposed to give any extra arguments when
>>>>>> starting in parallel?
>>>>>> This is what I'm doing:
>>>>>>
>>>>>> mpiexec -np 2
>>>>>> /usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver
>>>>>> --use-offscreen-rendering --reverse-connection --client-host=localhost
>>>>>>
>>>>>>
>>>>>> On Nov 11, 2011, at 11:11 AM, Utkarsh Ayachit wrote:
>>>>>>
>>>>>>> Can you post your CMakeCache.txt?
>>>>>>>
>>>>>>> Utkarsh
>>>>>>>
>>>>>>> On Fri, Nov 11, 2011 at 2:08 PM, Cook, Rich <[email protected]> wrote:
>>>>>>>> Hi, thanks, but you are incorrect.
>>>>>>>> I did set that variable, and it was indeed compiled with MPI, as I said.
>>>>>>>>
>>>>>>>> rcook@prism127 (IMG_private): type pvserver
>>>>>>>> pvserver is
>>>>>>>> /usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver
>>>>>>>> rcook@prism127 (IMG_private): ldd
>>>>>>>> /usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver
>>>>>>>> libmpi.so.0 => /usr/local/tools/openmpi-gnu-1.4.3/lib/libmpi.so.0
>>>>>>>> (0x00002aaaaacc9000)
>>>>>>>> libopen-rte.so.0 =>
>>>>>>>> /usr/local/tools/openmpi-gnu-1.4.3/lib/libopen-rte.so.0
>>>>>>>> (0x00002aaaaaf6c000)
>>>>>>>> libopen-pal.so.0 =>
>>>>>>>> /usr/local/tools/openmpi-gnu-1.4.3/lib/libopen-pal.so.0
>>>>>>>> (0x00002aaaab1b7000)
>>>>>>>> libdl.so.2 => /lib64/libdl.so.2 (0x00002aaaab434000)
>>>>>>>> libnsl.so.1 => /lib64/libnsl.so.1 (0x00002aaaab638000)
>>>>>>>> libutil.so.1 => /lib64/libutil.so.1 (0x00002aaaab850000)
>>>>>>>> libm.so.6 => /lib64/libm.so.6 (0x00002aaaaba54000)
>>>>>>>> libpthread.so.0 => /lib64/libpthread.so.0 (0x00002aaaabcd7000)
>>>>>>>> libc.so.6 => /lib64/libc.so.6 (0x00002aaaabef2000)
>>>>>>>> /lib64/ld-linux-x86-64.so.2 (0x00002aaaaaaab000)
>>>>>>>>
>>>>>>>> When the pvservers are running, I can see that they are the correct
>>>>>>>> binaries, and ldd confirms they are MPI-capable.
>>>>>>>>
>>>>>>>> rcook@prism120 (~): ldd
>>>>>>>> /collab/usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/lib/paraview-3.12/pvserver
>>>>>>>> | grep mpi
>>>>>>>> libmpi_cxx.so.0 =>
>>>>>>>> /usr/local/tools/openmpi-gnu-1.4.3/lib/libmpi_cxx.so.0
>>>>>>>> (0x00002aaab23bf000)
>>>>>>>> libmpi.so.0 => /usr/local/tools/openmpi-gnu-1.4.3/lib/libmpi.so.0
>>>>>>>> (0x00002aaab25da000)
>>>>>>>> libopen-rte.so.0 =>
>>>>>>>> /usr/local/tools/openmpi-gnu-1.4.3/lib/libopen-rte.so.0
>>>>>>>> (0x00002aaab287d000)
>>>>>>>> libopen-pal.so.0 =>
>>>>>>>> /usr/local/tools/openmpi-gnu-1.4.3/lib/libopen-pal.so.0
>>>>>>>> (0x00002aaab2ac7000)
>>>>>>>>
>>>>>>>>
>>>>>>>> On Nov 11, 2011, at 11:04 AM, Utkarsh Ayachit wrote:
>>>>>>>>
>>>>>>>>> Your pvserver is not built with MPI enabled. Please rebuild pvserver
>>>>>>>>> with the CMake variable PARAVIEW_USE_MPI:BOOL=ON.
>>>>>>>>>
>>>>>>>>> Utkarsh
>>>>>>>>>
>>>>>>>>> On Fri, Nov 11, 2011 at 1:54 PM, Cook, Rich <[email protected]> wrote:
>>>>>>>>>> We have a tricky firewall situation here, so I have to use reverse
>>>>>>>>>> tunneling per
>>>>>>>>>> http://www.paraview.org/Wiki/Reverse_connection_and_port_forwarding#Reverse_Connection_Over_an_ssh_Tunnel
>>>>>>>>>>
>>>>>>>>>> I'm not sure I'm doing it right. I can do it with a single server,
>>>>>>>>>> but when I try to run in parallel, it looks like something is
>>>>>>>>>> broken. My understanding is that when launched under MPI, the
>>>>>>>>>> servers should talk to each other and only one of the servers should
>>>>>>>>>> try to connect back to the client. I compiled with MPI and am
>>>>>>>>>> running in an MPI environment, but it looks as though the pvservers
>>>>>>>>>> are not talking to each other but are each trying to make their own
>>>>>>>>>> connection to the client. Below is the output. Can anyone help me
>>>>>>>>>> get this up and running? I know I'm close.
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> rcook@prism127 (IMG_private): srun -n 8
>>>>>>>>>> /usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver
>>>>>>>>>> --use-offscreen-rendering --reverse-connection
>>>>>>>>>> --client-host=localhost
>>>>>>>>>> Waiting for client
>>>>>>>>>> Connection URL: csrc://localhost:11111
>>>>>>>>>> Client connected.
>>>>>>>>>> Waiting for client
>>>>>>>>>> Connection URL: csrc://localhost:11111
>>>>>>>>>> ERROR: In
>>>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkSocket.cxx,
>>>>>>>>>> line 481
>>>>>>>>>> vtkClientSocket (0xd8ee20): Socket error in call to connect.
>>>>>>>>>> Connection refused.
>>>>>>>>>>
>>>>>>>>>> ERROR: In
>>>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkClientSocket.cxx,
>>>>>>>>>> line 53
>>>>>>>>>> vtkClientSocket (0xd8ee20): Failed to connect to server
>>>>>>>>>> localhost:11111
>>>>>>>>>>
>>>>>>>>>> Warning: In
>>>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/ParaViewCore/ClientServerCore/vtkTCPNetworkAccessManager.cxx,
>>>>>>>>>> line 250
>>>>>>>>>> vtkTCPNetworkAccessManager (0x6619a0): Connect failed. Retrying for
>>>>>>>>>> 59.9994 more seconds.
>>>>>>>>>>
>>>>>>>>>> ERROR: In
>>>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkSocket.cxx,
>>>>>>>>>> line 481
>>>>>>>>>> vtkClientSocket (0xd8ee20): Socket error in call to connect.
>>>>>>>>>> Connection refused.
>>>>>>>>>>
>>>>>>>>>> ERROR: In
>>>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkClientSocket.cxx,
>>>>>>>>>> line 53
>>>>>>>>>> vtkClientSocket (0xd8ee20): Failed to connect to server
>>>>>>>>>> localhost:11111
>>>>>>>>>>
>>>>>>>>>> Warning: In
>>>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/ParaViewCore/ClientServerCore/vtkTCPNetworkAccessManager.cxx,
>>>>>>>>>> line 250
>>>>>>>>>> vtkTCPNetworkAccessManager (0x6619a0): Connect failed. Retrying for
>>>>>>>>>> 58.9972 more seconds.
>>>>>>>>>>
>>>>>>>>>> ERROR: In
>>>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkSocket.cxx,
>>>>>>>>>> line 481
>>>>>>>>>> vtkClientSocket (0xd8ee20): Socket error in call to connect.
>>>>>>>>>> Connection refused.
>>>>>>>>>>
>>>>>>>>>> ERROR: In
>>>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkClientSocket.cxx,
>>>>>>>>>> line 53
>>>>>>>>>> vtkClientSocket (0xd8ee20): Failed to connect to server
>>>>>>>>>> localhost:11111
>>>>>>>>>>
>>>>>>>>>> Warning: In
>>>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/ParaViewCore/ClientServerCore/vtkTCPNetworkAccessManager.cxx,
>>>>>>>>>> line 250
>>>>>>>>>> vtkTCPNetworkAccessManager (0x6619a0): Connect failed. Retrying for
>>>>>>>>>> 57.9952 more seconds.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> etc. etc. etc.
>>>>>>>>>> --
>>>>>>>>>> ✐Richard Cook
>>>>>>>>>> ✇ Lawrence Livermore National Laboratory
>>>>>>>>>> Bldg-453 Rm-4024, Mail Stop L-557
>>>>>>>>>> 7000 East Avenue, Livermore, CA, 94550, USA
>>>>>>>>>> ☎ (office) (925) 423-9605
>>>>>>>>>> ☎ (fax) (925) 423-6961
>>>>>>>>>> ---
>>>>>>>>>> Information Management & Graphics Grp., Services & Development Div.,
>>>>>>>>>> Integrated Computing & Communications Dept.
>>>>>>>>>> (opinions expressed herein are mine and not those of LLNL)
_______________________________________________
Powered by www.kitware.com

Visit other Kitware open-source projects at
http://www.kitware.com/opensource/opensource.html

Please keep messages on-topic and check the ParaView Wiki at:
http://paraview.org/Wiki/ParaView

Follow this link to subscribe/unsubscribe:
http://www.paraview.org/mailman/listinfo/paraview
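The failure mode debugged in this thread — every MPI rank dialing the client instead of only the root — can be illustrated with a small stand-alone sketch. This is plain Python with threads standing in for MPI ranks (no ParaView or MPI required; all names below are invented for the illustration): in a working reverse connection, the ranks coordinate so that only rank 0 opens the socket back to the client, which is why a healthy parallel run shows a single "Waiting for client" / "Client connected" pair.

```python
import socket
import threading

def client_listener(port_holder, connections, ready):
    """Stands in for the ParaView client: listen and count incoming connections."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))            # pick any free port
    port_holder.append(srv.getsockname()[1])
    srv.listen(8)
    srv.settimeout(1.0)                   # stop once connections go quiet
    ready.set()
    while True:
        try:
            conn, _ = srv.accept()
            connections.append(conn)
        except socket.timeout:
            break
    srv.close()

def server_rank(rank, port, connections_made):
    """Stands in for one pvserver rank: only rank 0 dials back to the client."""
    if rank != 0:
        return                            # non-root ranks talk over MPI, not sockets
    s = socket.create_connection(("127.0.0.1", port))
    connections_made.append(rank)
    s.close()

port_holder, connections, connections_made = [], [], []
ready = threading.Event()
listener = threading.Thread(target=client_listener,
                            args=(port_holder, connections, ready))
listener.start()
ready.wait()

# Simulate an mpiexec -np 4 launch.
ranks = [threading.Thread(target=server_rank,
                          args=(r, port_holder[0], connections_made))
         for r in range(4)]
for t in ranks:
    t.start()
for t in ranks:
    t.join()
listener.join()

# With the rank guard in place, the client sees exactly one connection.
print(len(connections_made))  # 1
```

Remove the `if rank != 0` guard and the client would see four independent connection attempts — the analogue of the multiple "Waiting for client" lines and "Connection refused" retries in the srun output above, where the ranks were not coordinating.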
