Very peculiar. I wonder if MPI is running the root process (rank 0) on some
other node. Are you sure the process is run on the same machine? Can you
try putting an IP address or real hostname instead of localhost?
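
One quick way to check this is to probe the port directly from the node
where the server process runs. Below is a minimal sketch in Python, assuming
the client (or the ssh tunnel) is expected to be listening on port 11111, as
in the connection URL csrc://localhost:11111. The helper name can_connect is
only for illustration and is not part of ParaView.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection resolves the name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and name-resolution errors.
        return False


if __name__ == "__main__":
    this_host = socket.gethostname()
    print("running on:", this_host)
    # Port 11111 is taken from the connection URL in the server output.
    for host in ("localhost", this_host):
        state = "reachable" if can_connect(host, 11111) else "not reachable"
        print(f"{host}:11111 is {state}")
```

If localhost is not reachable from the node that prints the errors but the
real hostname is (or the other way around), that would suggest the process
is not running on the machine where the tunnel ends.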

Utkarsh

On Fri, Nov 11, 2011 at 7:54 PM, Cook, Rich <[email protected]> wrote:
> And to clarify, if I just do serial, I get this good behavior:
>
> rcook@prism127 (~): 
> /usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver
>  --use-offscreen-rendering  --reverse-connection  --client-host=localhost
> Waiting for client
> Connection URL: csrc://localhost:11111
> Client connected.
>
> On Nov 11, 2011, at 4:51 PM, Cook, Rich wrote:
>
>> My bad.
>> The first email I sent I was using the wrong MPI (srun instead of mpiexec -- 
>> mvapich instead of openmpi).  So both processes were indeed getting set to 
>> the same process ID.  Please ignore that output.
>> The current output looks like this:
>>
>> rcook@prism127 (~): mpiexec -np 2 
>> /usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver
>>  --use-offscreen-rendering  --reverse-connection  --client-host=localhost
>> Waiting for client
>> Connection URL: csrc://localhost:11111
>> ERROR: In 
>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkSocket.cxx, 
>> line 481
>> vtkClientSocket (0xe6a060): Socket error in call to connect. Connection 
>> refused.
>>
>> ERROR: In 
>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkClientSocket.cxx,
>>  line 53
>> vtkClientSocket (0xe6a060): Failed to connect to server localhost:11111
>>
>> Warning: In 
>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/ParaViewCore/ClientServerCore/vtkTCPNetworkAccessManager.cxx,
>>  line 250
>> vtkTCPNetworkAccessManager (0x8356f0): Connect failed.  Retrying for 59.9993 
>> more seconds.
>>
>> ERROR: In 
>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkSocket.cxx, 
>> line 481
>> vtkClientSocket (0xe6a060): Socket error in call to connect. Connection 
>> refused.
>>
>> ERROR: In 
>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkClientSocket.cxx,
>>  line 53
>> vtkClientSocket (0xe6a060): Failed to connect to server localhost:11111
>>
>> Warning: In 
>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/ParaViewCore/ClientServerCore/vtkTCPNetworkAccessManager.cxx,
>>  line 250
>> vtkTCPNetworkAccessManager (0x8356f0): Connect failed.  Retrying for 58.9972 
>> more seconds.
>>
>> mpiexec: killing job...
>>
>>
>> Note the presence of only one connecting message.  Again, I apologize for 
>> the mixup.  I spoke with our MPI guru and have confirmed that MPI appears to 
>> be working correctly and I'm not making a mistake in how I launch pvserver 
>> from the batch job perspective.
>>
>> Do you still want that output?
>>
>> On Nov 11, 2011, at 4:44 PM, Utkarsh Ayachit wrote:
>>
>>> That sounds very odd. If process_id variable is indeed correctly set
>>> to 0 and 1 on the two processes, then how come there are two "Waiting
>>> for client" lines printed out in the first email that you sent?
>>>
>>> Can you change that cout line to the following to verify that both
>>> processes are indeed printing out from the same line?
>>>
>>> cout << __LINE__ << " : Waiting for client" << endl;
>>>
>>> (This is in pvserver_common.h: 58)
>>>
>>> Utkarsh
>>>
>>> On Fri, Nov 11, 2011 at 6:30 PM, Cook, Rich <[email protected]> wrote:
>>>> I posted the CMakeCache.txt.  I also have tried to step through the code 
>>>> using TotalView and I can see it calling MPI_init() etc.  It looks like 
>>>> one process correctly gets rank 0 and one gets rank 1 (by inspecting 
>>>> process_id variable in RealMain())
>>>> If I start in serial, it connects and I can view a protein molecule 
>>>> successfully.  If I start in parallel, exactly one server tries and fails 
>>>> to connect.  Am I supposed to give any extra arguments when starting in 
>>>> parallel?
>>>> This is what I'm doing:
>>>>
>>>> mpiexec -np 2 
>>>> /usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver
>>>>  --use-offscreen-rendering  --reverse-connection  --client-host=localhost
>>>>
>>>>
>>>>
>>>> On Nov 11, 2011, at 11:11 AM, Utkarsh Ayachit wrote:
>>>>
>>>>> Can you post your CMakeCache.txt?
>>>>>
>>>>> Utkarsh
>>>>>
>>>>> On Fri, Nov 11, 2011 at 2:08 PM, Cook, Rich <[email protected]> wrote:
>>>>>> Hi, thanks, but you are incorrect.
>>>>>> I did set that variable and it was indeed compiled with MPI, as I said.
>>>>>>
>>>>>> rcook@prism127 (IMG_private): type pvserver
>>>>>> pvserver is 
>>>>>> /usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver
>>>>>> rcook@prism127 (IMG_private): ldd  
>>>>>> /usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver
>>>>>>       libmpi.so.0 => /usr/local/tools/openmpi-gnu-1.4.3/lib/libmpi.so.0 
>>>>>> (0x00002aaaaacc9000)
>>>>>>       libopen-rte.so.0 => 
>>>>>> /usr/local/tools/openmpi-gnu-1.4.3/lib/libopen-rte.so.0 
>>>>>> (0x00002aaaaaf6c000)
>>>>>>       libopen-pal.so.0 => 
>>>>>> /usr/local/tools/openmpi-gnu-1.4.3/lib/libopen-pal.so.0 
>>>>>> (0x00002aaaab1b7000)
>>>>>>       libdl.so.2 => /lib64/libdl.so.2 (0x00002aaaab434000)
>>>>>>       libnsl.so.1 => /lib64/libnsl.so.1 (0x00002aaaab638000)
>>>>>>       libutil.so.1 => /lib64/libutil.so.1 (0x00002aaaab850000)
>>>>>>       libm.so.6 => /lib64/libm.so.6 (0x00002aaaaba54000)
>>>>>>       libpthread.so.0 => /lib64/libpthread.so.0 (0x00002aaaabcd7000)
>>>>>>       libc.so.6 => /lib64/libc.so.6 (0x00002aaaabef2000)
>>>>>>       /lib64/ld-linux-x86-64.so.2 (0x00002aaaaaaab000)
>>>>>>
>>>>>> When the pvservers are running, I can see that they are the correct 
>>>>>> binaries, and ldd confirms they are MPI-capable.
>>>>>>
>>>>>> rcook@prism120 (~): ldd  
>>>>>> /collab/usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/lib/paraview-3.12/pvserver
>>>>>>  | grep mpi
>>>>>>       libmpi_cxx.so.0 => 
>>>>>> /usr/local/tools/openmpi-gnu-1.4.3/lib/libmpi_cxx.so.0 
>>>>>> (0x00002aaab23bf000)
>>>>>>       libmpi.so.0 => /usr/local/tools/openmpi-gnu-1.4.3/lib/libmpi.so.0 
>>>>>> (0x00002aaab25da000)
>>>>>>       libopen-rte.so.0 => 
>>>>>> /usr/local/tools/openmpi-gnu-1.4.3/lib/libopen-rte.so.0 
>>>>>> (0x00002aaab287d000)
>>>>>>       libopen-pal.so.0 => 
>>>>>> /usr/local/tools/openmpi-gnu-1.4.3/lib/libopen-pal.so.0 
>>>>>> (0x00002aaab2ac7000)
>>>>>>
>>>>>>
>>>>>> On Nov 11, 2011, at 11:04 AM, Utkarsh Ayachit wrote:
>>>>>>
>>>>>>> Your pvserver is not built with MPI enabled. Please rebuild pvserver
>>>>>>> with CMake variable PARAVIEW_USE_MPI:BOOL=ON.
>>>>>>>
>>>>>>> Utkarsh
>>>>>>>
>>>>>>> On Fri, Nov 11, 2011 at 1:54 PM, Cook, Rich <[email protected]> wrote:
>>>>>>>> We have a tricky firewall situation here so I have to use reverse 
>>>>>>>> tunneling per 
>>>>>>>> http://www.paraview.org/Wiki/Reverse_connection_and_port_forwarding#Reverse_Connection_Over_an_ssh_Tunnel
>>>>>>>>
>>>>>>>> I'm not sure I'm doing it right.  I can do it with a single server, 
>>>>>>>> but when I try to run in parallel, it looks like something is broken.  
>>>>>>>> My understanding is that when launched under MPI, the servers should 
>>>>>>>> talk to each other and only one of the servers should try to connect 
>>>>>>>> back to the client.  I compiled with MPI, and am running in an MPI 
>>>>>>>> environment, but it looks as though the pvservers are not talking to 
>>>>>>>> each other but are each trying to make their own connection to the 
>>>>>>>> client.  Below is the output.  Can anyone help me get this up and 
>>>>>>>> running?  I know I'm close.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> rcook@prism127 (IMG_private): srun -n 8 
>>>>>>>> /usr/global/tools/Kitware/Paraview/3.12.0-OSMesa/chaos_4_x86_64_ib/bin/pvserver
>>>>>>>>  --use-offscreen-rendering  --reverse-connection  
>>>>>>>> --client-host=localhost
>>>>>>>> Waiting for client
>>>>>>>> Connection URL: csrc://localhost:11111
>>>>>>>> Client connected.
>>>>>>>> Waiting for client
>>>>>>>> Connection URL: csrc://localhost:11111
>>>>>>>> ERROR: In 
>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkSocket.cxx,
>>>>>>>>  line 481
>>>>>>>> vtkClientSocket (0xd8ee20): Socket error in call to connect. 
>>>>>>>> Connection refused.
>>>>>>>>
>>>>>>>> ERROR: In 
>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkClientSocket.cxx,
>>>>>>>>  line 53
>>>>>>>> vtkClientSocket (0xd8ee20): Failed to connect to server localhost:11111
>>>>>>>>
>>>>>>>> Warning: In 
>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/ParaViewCore/ClientServerCore/vtkTCPNetworkAccessManager.cxx,
>>>>>>>>  line 250
>>>>>>>> vtkTCPNetworkAccessManager (0x6619a0): Connect failed.  Retrying for 
>>>>>>>> 59.9994 more seconds.
>>>>>>>>
>>>>>>>> ERROR: In 
>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkSocket.cxx,
>>>>>>>>  line 481
>>>>>>>> vtkClientSocket (0xd8ee20): Socket error in call to connect. 
>>>>>>>> Connection refused.
>>>>>>>>
>>>>>>>> ERROR: In 
>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkClientSocket.cxx,
>>>>>>>>  line 53
>>>>>>>> vtkClientSocket (0xd8ee20): Failed to connect to server localhost:11111
>>>>>>>>
>>>>>>>> Warning: In 
>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/ParaViewCore/ClientServerCore/vtkTCPNetworkAccessManager.cxx,
>>>>>>>>  line 250
>>>>>>>> vtkTCPNetworkAccessManager (0x6619a0): Connect failed.  Retrying for 
>>>>>>>> 58.9972 more seconds.
>>>>>>>>
>>>>>>>> ERROR: In 
>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkSocket.cxx,
>>>>>>>>  line 481
>>>>>>>> vtkClientSocket (0xd8ee20): Socket error in call to connect. 
>>>>>>>> Connection refused.
>>>>>>>>
>>>>>>>> ERROR: In 
>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/VTK/Common/vtkClientSocket.cxx,
>>>>>>>>  line 53
>>>>>>>> vtkClientSocket (0xd8ee20): Failed to connect to server localhost:11111
>>>>>>>>
>>>>>>>> Warning: In 
>>>>>>>> /nfs/tmp2/rcook/ParaView/3.12.0/ParaView-3.12.0/ParaViewCore/ClientServerCore/vtkTCPNetworkAccessManager.cxx,
>>>>>>>>  line 250
>>>>>>>> vtkTCPNetworkAccessManager (0x6619a0): Connect failed.  Retrying for 
>>>>>>>> 57.9952 more seconds.
>>>>>>>>
>>>>>>>>
>>>>>>>> etc. etc. etc.
>>>>>>>> --
>>>>>>>> ✐Richard Cook
>>>>>>>> ✇ Lawrence Livermore National Laboratory
>>>>>>>> Bldg-453 Rm-4024, Mail Stop L-557
>>>>>>>> 7000 East Avenue,  Livermore, CA, 94550, USA
>>>>>>>> ☎ (office) (925) 423-9605
>>>>>>>> ☎ (fax) (925) 423-6961
>>>>>>>> ---
>>>>>>>> Information Management & Graphics Grp., Services & Development Div., 
>>>>>>>> Integrated Computing & Communications Dept.
>>>>>>>> (opinions expressed herein are mine and not those of LLNL)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Powered by www.kitware.com
>>>>>>>>
>>>>>>>> Visit other Kitware open-source projects at 
>>>>>>>> http://www.kitware.com/opensource/opensource.html
>>>>>>>>
>>>>>>>> Please keep messages on-topic and check the ParaView Wiki at: 
>>>>>>>> http://paraview.org/Wiki/ParaView
>>>>>>>>
>>>>>>>> Follow this link to subscribe/unsubscribe:
>>>>>>>> http://www.paraview.org/mailman/listinfo/paraview
>>>>>>>>