And to clarify, if I run just a single serial pvserver, I get the expected behavior:  

rcook@prism127 (~): 
/usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver 
--use-offscreen-rendering  --reverse-connection  --client-host=localhost 
Waiting for client
Connection URL: csrc://localhost:11111
Client connected.
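
For comparing runs like the one above with the failing parallel runs quoted below, here is a minimal sketch that summarizes raw pvserver output. It assumes only the message texts that appear in this thread ("Client connected." and "Connect failed.  Retrying for N more seconds."); the function name is hypothetical, and it expects the unquoted output, not the ">"-prefixed copies in the replies.

```python
import re

# Message fragments copied from the pvserver 3.12.0 output shown in this thread.
CONNECTED = "Client connected."
RETRY = re.compile(r"Connect failed\.\s+Retrying for\s+([0-9.]+)\s+more seconds\.")


def summarize_pvserver_log(log_text):
    """Count successful connections and collect the retry countdown values."""
    # Collapse all whitespace so messages wrapped across lines still match.
    flat = " ".join(log_text.split())
    return {
        "connected": flat.count(CONNECTED),
        "retries": [float(seconds) for seconds in RETRY.findall(flat)],
    }
```

A serial run like the one above should report one connection and no retries; the failing mpiexec run would report zero connections and a decreasing list of retry timers.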

On Nov 11, 2011, at 4:51 PM, Cook, Rich wrote:

> My bad.  
> In the first email I sent, I was using the wrong MPI launcher (srun instead 
> of mpiexec, i.e. mvapich instead of openmpi), so both processes were indeed 
> getting the same process ID.  Please ignore that output.  
> The current output looks like this:  
> 
> rcook@prism127 (~): mpiexec -np 2 
> /usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver
>  --use-offscreen-rendering  --reverse-connection  --client-host=localhost 
> Waiting for client
> Connection URL: csrc://localhost:11111
> ERROR: In 
> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkSocket.cxx, 
> line 481
> vtkClientSocket (0xe6a060): Socket error in call to connect. Connection 
> refused.
> 
> ERROR: In 
> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkClientSocket.cxx,
>  line 53
> vtkClientSocket (0xe6a060): Failed to connect to server localhost:11111
> 
> Warning: In 
> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/ParaViewCore/ClientServerCore/vtkTCPNetworkAccessManager.cxx,
>  line 250
> vtkTCPNetworkAccessManager (0x8356f0): Connect failed.  Retrying for 59.9993 
> more seconds.
> 
> ERROR: In 
> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkSocket.cxx, 
> line 481
> vtkClientSocket (0xe6a060): Socket error in call to connect. Connection 
> refused.
> 
> ERROR: In 
> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkClientSocket.cxx,
>  line 53
> vtkClientSocket (0xe6a060): Failed to connect to server localhost:11111
> 
> Warning: In 
> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/ParaViewCore/ClientServerCore/vtkTCPNetworkAccessManager.cxx,
>  line 250
> vtkTCPNetworkAccessManager (0x8356f0): Connect failed.  Retrying for 58.9972 
> more seconds.
> 
> mpiexec: killing job...
> 
> 
> Note the presence of only one connecting message.  Again, I apologize for the 
> mixup.  I spoke with our MPI guru and have confirmed that MPI appears to be 
> working correctly and I'm not making a mistake in how I launch pvserver from 
> the batch job perspective.  
> 
> Do you still want that output?  
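
One way to double-check which launcher actually started each process, independent of pvserver, is to look at the rank variables the launcher exports. The sketch below is only an illustration: OMPI_COMM_WORLD_RANK (Open MPI), PMI_RANK (PMI-based launchers) and SLURM_PROCID (srun) are the usual names, but which of them are set on a given cluster is an assumption to verify, and the function name is hypothetical.

```python
import os

# Checked in order: MPI-specific variables first, because SLURM_PROCID can
# also be inherited from the batch allocation when mpiexec is used.
RANK_VARS = [
    ("Open MPI (mpiexec/mpirun)", "OMPI_COMM_WORLD_RANK"),
    ("PMI-based launcher", "PMI_RANK"),
    ("SLURM (srun)", "SLURM_PROCID"),
]


def detect_launcher(env):
    """Return (launcher_name, rank) for the first rank variable found."""
    for name, var in RANK_VARS:
        value = env.get(var)
        if value is not None:
            return name, int(value)
    return None, None


if __name__ == "__main__":
    # Each process prints what its own environment says.
    print(detect_launcher(os.environ))
```

Saved under some name and started with the same launcher line as pvserver, each process should print a different rank. If every process reports the same rank, or a launcher that does not match the MPI library the binary links against, that points at the kind of mismatch described above.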
> 
> On Nov 11, 2011, at 4:44 PM, Utkarsh Ayachit wrote:
> 
>> That sounds very odd. If process_id variable is indeed correctly set
>> to 0 and 1 on the two processes, then how come there are two "Waiting
>> for client" lines printed out in the first email that you sent?
>> 
>> Can you change that cout line to the following to verify that both
>> processes are indeed printing from the same place in the code?
>> 
>> cout << __LINE__ << " : Waiting for client" << endl;
>> 
>> (This is in pvserver_common.h: 58)
>> 
>> Utkarsh
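
Without rebuilding, a rough version of the same check can be done on the captured output by counting the "Waiting for client" banners. This is a sketch under the assumption that only a process that believes it is the root prints that banner, which is what the discussion here implies but is not confirmed from the source; the function name is hypothetical. Leading ">" quote marks are stripped so it also works on text pasted from a reply.

```python
def count_banners(log_text, banner="Waiting for client"):
    """Count lines that consist only of the banner, ignoring mail quote prefixes."""
    count = 0
    for line in log_text.splitlines():
        # Drop any leading run of '>' and spaces, then surrounding whitespace.
        if line.lstrip("> ").strip() == banner:
            count += 1
    return count
```

More than one banner in a single job's output would suggest that more than one process is acting as the root.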
>> 
>> On Fri, Nov 11, 2011 at 6:30 PM, Cook, Rich <[email protected]> wrote:
>>> I posted the CMakeCache.txt.  I also have tried to step through the code 
>>> using TotalView and I can see it calling MPI_Init() etc.  It looks like one 
>>> process correctly gets rank 0 and one gets rank 1 (by inspecting the 
>>> process_id variable in RealMain()).
>>> If I start in serial, it connects and I can view a protein molecule 
>>> successfully.  If I start in parallel, exactly one server tries and fails 
>>> to connect.  Am I supposed to give any extra arguments when starting in 
>>> parallel?
>>> This is what I'm doing:
>>> 
>>> mpiexec -np 2 
>>> /usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver
>>>  --use-offscreen-rendering  --reverse-connection  --client-host=localhost
>>> 
>>> 
>>> 
>>> On Nov 11, 2011, at 11:11 AM, Utkarsh Ayachit wrote:
>>> 
>>>> Can you post your CMakeCache.txt?
>>>> 
>>>> Utkarsh
>>>> 
>>>> On Fri, Nov 11, 2011 at 2:08 PM, Cook, Rich <[email protected]> wrote:
>>>>> Hi, thanks, but you are incorrect.
>>>>> I did set that variable and it was indeed compiled with MPI, as I said.
>>>>> 
>>>>> rcook@prism127 (IMG_private): type pvserver
>>>>> pvserver is 
>>>>> /usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver
>>>>> rcook@prism127 (IMG_private): ldd  
>>>>> /usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver
>>>>>       libmpi.so.0 => /usr/local/tools/openmpi-gnu-1.4.3/lib/libmpi.so.0 
>>>>> (0x00002aaaaacc9000)
>>>>>       libopen-rte.so.0 => 
>>>>> /usr/local/tools/openmpi-gnu-1.4.3/lib/libopen-rte.so.0 
>>>>> (0x00002aaaaaf6c000)
>>>>>       libopen-pal.so.0 => 
>>>>> /usr/local/tools/openmpi-gnu-1.4.3/lib/libopen-pal.so.0 
>>>>> (0x00002aaaab1b7000)
>>>>>       libdl.so.2 => /lib64/libdl.so.2 (0x00002aaaab434000)
>>>>>       libnsl.so.1 => /lib64/libnsl.so.1 (0x00002aaaab638000)
>>>>>       libutil.so.1 => /lib64/libutil.so.1 (0x00002aaaab850000)
>>>>>       libm.so.6 => /lib64/libm.so.6 (0x00002aaaaba54000)
>>>>>       libpthread.so.0 => /lib64/libpthread.so.0 (0x00002aaaabcd7000)
>>>>>       libc.so.6 => /lib64/libc.so.6 (0x00002aaaabef2000)
>>>>>       /lib64/ld-linux-x86-64.so.2 (0x00002aaaaaaab000)
>>>>> 
>>>>> When the pvservers are running, I can see that they are the correct 
>>>>> binaries, and ldd confirms they are MPI-capable.
>>>>> 
>>>>> rcook@prism120 (~): ldd  
>>>>> /collab/usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/lib/paraview-3.12/pvserver
>>>>>  | grep mpi
>>>>>       libmpi_cxx.so.0 => 
>>>>> /usr/local/tools/openmpi-gnu-1.4.3/lib/libmpi_cxx.so.0 
>>>>> (0x00002aaab23bf000)
>>>>>       libmpi.so.0 => /usr/local/tools/openmpi-gnu-1.4.3/lib/libmpi.so.0 
>>>>> (0x00002aaab25da000)
>>>>>       libopen-rte.so.0 => 
>>>>> /usr/local/tools/openmpi-gnu-1.4.3/lib/libopen-rte.so.0 
>>>>> (0x00002aaab287d000)
>>>>>       libopen-pal.so.0 => 
>>>>> /usr/local/tools/openmpi-gnu-1.4.3/lib/libopen-pal.so.0 
>>>>> (0x00002aaab2ac7000)
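
The ldd check above can also be scripted, which helps when comparing the linked MPI library against the launcher on several nodes. A minimal sketch, assuming the usual "name => /path (address)" ldd layout seen in this thread; the function name is hypothetical, and the path still has to be read by a person to decide which MPI implementation it belongs to.

```python
import re

# Matches "libname => /absolute/path" pairs in ldd output.
LDD_ENTRY = re.compile(r"(\S+)\s+=>\s+(/\S+)")


def mpi_libraries(ldd_text):
    """Return {library_name: resolved_path} for every libmpi* entry."""
    # Strip mail quote prefixes per line, then rejoin so that entries
    # wrapped across lines by the mailer become one run of text again.
    lines = [line.lstrip("> ") for line in ldd_text.splitlines()]
    flat = " ".join(" ".join(lines).split())
    return {
        name: path
        for name, path in LDD_ENTRY.findall(flat)
        if name.startswith("libmpi")
    }
```

For the output above this would report libmpi.so.0 and libmpi_cxx.so.0 under /usr/local/tools/openmpi-gnu-1.4.3/lib, so the job should be started with that installation's mpiexec.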
>>>>> 
>>>>> 
>>>>> On Nov 11, 2011, at 11:04 AM, Utkarsh Ayachit wrote:
>>>>> 
>>>>>> Your pvserver is not built with MPI enabled. Please rebuild pvserver
>>>>>> with CMake variable PARAVIEW_USE_MPI:BOOL=ON.
>>>>>> 
>>>>>> Utkarsh
>>>>>> 
>>>>>> On Fri, Nov 11, 2011 at 1:54 PM, Cook, Rich <[email protected]> wrote:
>>>>>>> We have a tricky firewall situation here so I have to use reverse 
>>>>>>> tunneling per 
>>>>>>> http://www.paraview.org/Wiki/Reverse_connection_and_port_forwarding#Reverse_Connection_Over_an_ssh_Tunnel
>>>>>>> 
>>>>>>> I'm not sure I'm doing it right.  I can do it with a single server, but 
>>>>>>> when I try to run in parallel, it looks like something is broken.  My 
>>>>>>> understanding is that when launched under MPI, the servers should talk 
>>>>>>> to each other and only one of the servers should try to connect back to 
>>>>>>> the client.  I compiled with MPI, and am running in an MPI environment, 
>>>>>>> but it looks as though the pvservers are not talking to each other but 
>>>>>>> are each trying to make their own connection to the client.  Below is 
>>>>>>> the output.  Can anyone help me get this up and running?  I know I'm 
>>>>>>> close.
>>>>>>> 
>>>>>>> Thanks!
>>>>>>> 
>>>>>>> rcook@prism127 (IMG_private): srun -n 8 
>>>>>>> /usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver
>>>>>>>  --use-offscreen-rendering  --reverse-connection  
>>>>>>> --client-host=localhost
>>>>>>> Waiting for client
>>>>>>> Connection URL: csrc://localhost:11111
>>>>>>> Client connected.
>>>>>>> Waiting for client
>>>>>>> Connection URL: csrc://localhost:11111
>>>>>>> ERROR: In 
>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkSocket.cxx,
>>>>>>>  line 481
>>>>>>> vtkClientSocket (0xd8ee20): Socket error in call to connect. Connection 
>>>>>>> refused.
>>>>>>> 
>>>>>>> ERROR: In 
>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkClientSocket.cxx,
>>>>>>>  line 53
>>>>>>> vtkClientSocket (0xd8ee20): Failed to connect to server localhost:11111
>>>>>>> 
>>>>>>> Warning: In 
>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/ParaViewCore/ClientServerCore/vtkTCPNetworkAccessManager.cxx,
>>>>>>>  line 250
>>>>>>> vtkTCPNetworkAccessManager (0x6619a0): Connect failed.  Retrying for 
>>>>>>> 59.9994 more seconds.
>>>>>>> 
>>>>>>> ERROR: In 
>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkSocket.cxx,
>>>>>>>  line 481
>>>>>>> vtkClientSocket (0xd8ee20): Socket error in call to connect. Connection 
>>>>>>> refused.
>>>>>>> 
>>>>>>> ERROR: In 
>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkClientSocket.cxx,
>>>>>>>  line 53
>>>>>>> vtkClientSocket (0xd8ee20): Failed to connect to server localhost:11111
>>>>>>> 
>>>>>>> Warning: In 
>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/ParaViewCore/ClientServerCore/vtkTCPNetworkAccessManager.cxx,
>>>>>>>  line 250
>>>>>>> vtkTCPNetworkAccessManager (0x6619a0): Connect failed.  Retrying for 
>>>>>>> 58.9972 more seconds.
>>>>>>> 
>>>>>>> ERROR: In 
>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkSocket.cxx,
>>>>>>>  line 481
>>>>>>> vtkClientSocket (0xd8ee20): Socket error in call to connect. Connection 
>>>>>>> refused.
>>>>>>> 
>>>>>>> ERROR: In 
>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkClientSocket.cxx,
>>>>>>>  line 53
>>>>>>> vtkClientSocket (0xd8ee20): Failed to connect to server localhost:11111
>>>>>>> 
>>>>>>> Warning: In 
>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/ParaViewCore/ClientServerCore/vtkTCPNetworkAccessManager.cxx,
>>>>>>>  line 250
>>>>>>> vtkTCPNetworkAccessManager (0x6619a0): Connect failed.  Retrying for 
>>>>>>> 57.9952 more seconds.
>>>>>>> 
>>>>>>> 
>>>>>>> etc. etc. etc.
>>>>>>> --
>>>>>>> ✐Richard Cook
>>>>>>> ✇ Lawrence Livermore National Laboratory
>>>>>>> Bldg-453 Rm-4024, Mail Stop L-557
>>>>>>> 7000 East Avenue,  Livermore, CA, 94550, USA
>>>>>>> ☎ (office) (925) 423-9605
>>>>>>> ☎ (fax) (925) 423-6961
>>>>>>> ---
>>>>>>> Information Management & Graphics Grp., Services & Development Div., 
>>>>>>> Integrated Computing & Communications Dept.
>>>>>>> (opinions expressed herein are mine and not those of LLNL)
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> Powered by www.kitware.com
>>>>>>> 
>>>>>>> Visit other Kitware open-source projects at 
>>>>>>> http://www.kitware.com/opensource/opensource.html
>>>>>>> 
>>>>>>> Please keep messages on-topic and check the ParaView Wiki at: 
>>>>>>> http://paraview.org/Wiki/ParaView
>>>>>>> 
>>>>>>> Follow this link to subscribe/unsubscribe:
>>>>>>> http://www.paraview.org/mailman/listinfo/paraview
>>>>>>> 
>>>>> 
>>> 
> 

-- 
✐Richard Cook   
✇ Lawrence Livermore National Laboratory
Bldg-453 Rm-4024, Mail Stop L-557        
7000 East Avenue,  Livermore, CA, 94550, USA
☎ (office) (925) 423-9605    
☎ (fax) (925) 423-6961
---
Information Management & Graphics Grp., Services & Development Div., Integrated 
Computing & Communications Dept.
(opinions expressed herein are mine and not those of LLNL)


