Hi, 

I'm trying to use addprocs() to add remote workers on another Windows
machine.

I'm using an SSH server for Windows (Bitvise) with a modified Cluster
Manager, and have successfully used this method in another environment.
So I know that it works, although one difference is Windows 7 (works) vs
Windows 8.1 (does not work), but I don't think this should be a problem.

Now, I don't expect anyone to troubleshoot my particular setup / 
environment / customisation.
Rather I was hoping for some high level help with further diagnosis.

I can confirm that the Windows command to launch the remote worker is
executed: the remote SSH server shows a successful connection and login,
and Windows Task Manager shows that a Julia process has started.
Then the following error occurs on the local machine, after which the
remote session is terminated.

Error evaluating c:\Users\Greg\Julia6\src\Launcher.jl:
connect: connection timed out (ETIMEDOUT)
 in wait at task.jl:284
 in wait at task.jl:194
 in stream_wait at stream.jl:263
 in wait_connected at stream.jl:301
 in Worker at multi.jl:113
 in create_worker at multi.jl:1064
 in start_cluster_workers at multi.jl:1028

I guess my first question is which side (local or remote) is failing.
It seems to me that the local Julia process is waiting for some
confirmation of connection? Does that sound right?
If so, are there any suggestions on how to further diagnose the problem?

When the ssh command to start a remote Julia worker is executed from the
Windows command line, I get the following:
julia_worker:9009#192.168.1.107

Then after about 60s:
Master process (id 1) could not connect within 60.0 seconds.
exiting.

Presumably this is the expected behaviour, since the remote worker process
is not communicating with the master Julia process?
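
One way to narrow down which side is failing would be a plain TCP probe
from the local machine while the manually started worker is still waiting.
The sketch below is independent of Julia and is only an illustration: the
helper names parse_worker_line and can_connect are hypothetical, and it
assumes the worker's output line has the form port#host as shown above.

```python
import socket


def parse_worker_line(line):
    """Split a line like 'julia_worker:9009#192.168.1.107' into (host, port).

    Assumes the 'julia_worker:<port>#<host>' layout seen in the output above.
    """
    prefix = "julia_worker:"
    line = line.strip()
    if not line.startswith(prefix) or "#" not in line:
        raise ValueError("unexpected worker line: %r" % line)
    port_text, host = line[len(prefix):].split("#", 1)
    return host, int(port_text)


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

For example, can_connect(*parse_worker_line("julia_worker:9009#192.168.1.107"))
could be run locally within the 60 seconds before the worker exits. A False
result would suggest that the port is not reachable from the local machine
(for instance because of a firewall rule on the remote host, which is only a
guess), while True would point back at the Julia side.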

Maybe the remote Julia.exe command is not receiving the --worker argument 
properly?

As I said, my method works in another environment (which incidentally seems 
like magic to me). 
I'm not really sure what is different here.
So any suggestions would be appreciated.

Thanks, Greg
