How can I start 2 workers on each node, using Julia 0.3.11?

[count*][user@]host[:port] [bind_addr[:port]]

I have a machine file with only one node (one line). These examples are the 
forms that work, but each of them adds only one worker per node. I'm using 
the default port for now and no separate bind address:


   - Only host:

555.555.555.555

   - User and host:

root@555.555.555.555


The way I understand

[count*][user@]host[:port] [bind_addr[:port]]

is that `count` is an integer, while `*` means zero or more repetitions in 
regex notation. At first glance there seems to be no need for a space 
between the count and the `user@host` part, but I have tried several forms 
and none of them work:

* Use `2` as `count`, separated by a space, with `my_file` being either:

2 555.555.555.555

or

2 root@555.555.555.555

[root@example ~]# julia --machinefile my_file
ssh: connect to host 2 port 22: Invalid argument


It seems to me it tries to use the 2 as the host address :(
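
One form not covered by the attempts above, under the assumption that the `*` in the documented syntax is a literal character rather than a regex quantifier, would be to write the count, the `*`, and the host together with no spaces. This is only a sketch of that reading of the manual, not a confirmed answer:

```shell
# Write a one-line machine file asking for 2 workers on a single node.
# Assumption: the `*` is typed literally, directly between count and host.
printf '%s\n' '2*root@555.555.555.555' > my_file

# Show the resulting file.
cat my_file
```

If this reading is right, starting `julia --machinefile my_file` and calling `nprocs()` at the prompt should report 3 (the master plus 2 workers).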

Could anyone please give me an example of a machine file that specifies 
the worker count?

Thanks in advance, cheers! 





On Friday, September 25, 2015, 16:42:59 (UTC-5), Ismael VC wrote:
>
> Hello everyone!
>
> I am trying to set up a Julia cluster with 20 nodes. This is the very 
> first time I've tried something like this. I have looked around for 
> examples, but the documentation has not been very helpful for me:
>
> *Julia can be started in parallel mode with either the -p or 
> the --machinefile options. -p n will launch an additional n worker 
> processes, while --machinefile file will launch a worker for each line in 
> file file. The machines defined in file must be accessible via a 
> passwordless ssh login, with Julia installed at the same location as the 
> current host. Each machine definition takes the 
> form [count*][user@]host[:port] [bind_addr[:port]] . user defaults to 
> current user, port to the standard ssh port. count is the number of workers 
> to spawn on the node, and defaults to 1. The 
> optional bind-to bind_addr[:port] specifies the ip-address and port that 
> other workers should use to connect to this worker.*
>
> This is what I think I have understood so far:
>
> OK, listing the machines in a machine file is easy. I have a file like 
> this:
>
> n user@555.555.555.555
> n user@555.555.555.556
> n user@555.555.555.555
>
>
> *The machines defined in file must be accessible via a 
> passwordless ssh login,*
>
> This is the part that is most difficult for me: it says that machines 
> must be accessible via passwordless ssh
>
> * with Julia installed at the same location as the current host.*
>
> I understand this to mean that I need to install Julia on every node in 
> the same location, so I have 20 nodes with the same software and hardware 
> stacks. Does this mean that the nodes must all run the same operating 
> system, or only have the same word size (32/64 bits)?
>
> Right now I have *20 CentOS 6.7 (64 bits)* nodes with *julia-0.3.11* 
> installed from the *generic linux binaries (64 bits)*, all of them 
> installed at */opt/julia-0.3.11/bin* (added to the PATH and already 
> exported in /etc/profile).
>
> Now my plan is to use my laptop *(Windows 7 64 bits, julia-0.3.11 
> 64 bits)* as the master node and control the cluster from it, so as far 
> as I understand, I'll need to run (leaving the passphrase blank):
>
> ssh-keygen -t rsa
>
>
> From my Windows laptop (I plan to install Arch Linux soon), in order to 
> create my ssh key and then:
>
>
> cat ~/.ssh/id_rsa.pub | ssh user@hostname 'cat >> .ssh/authorized_keys'
>
>
>
> To every node? So do I have to be running an ssh server on every one of 
> them? (I understand I'll need it at the master node.) This is where I 
> simply don't understand anymore. I haven't seen any tutorial or article 
> about this, just that paragraph in the manual. I know there is 
> ClusterManagers.jl, but that sounds even more complicated for me right now.
>
>
> I also want to help David Sanders set up another cluster (once I have 
> this figured out) in his lab at the Faculty of Science, UNAM. I promise to 
> improve the documentation around this topic once I understand it.
>
>
> What do you guys think, do I have it all wrong?
>
>
> If anyone can help me, I'll be very grateful. Thanks in advance!
>
>
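
On the key-copy step in the quoted message: assuming the answer is yes and every node does run an ssh server, repeating that command for 20 nodes could be scripted roughly as below. The function name `print_key_commands` and the two addresses are placeholders for illustration only. The function only prints the commands, so nothing is sent anywhere until the output is reviewed.

```shell
# Print one key-copy command per node given as an argument.
# This is a dry run: the commands are printed, not executed.
print_key_commands() {
  for node in "$@"; do
    printf "cat ~/.ssh/id_rsa.pub | ssh %s 'cat >> .ssh/authorized_keys'\n" "$node"
  done
}

# Placeholder addresses; the real list would have all 20 nodes.
print_key_commands user@555.555.555.555 user@555.555.555.556
```

Piping the printed lines to `sh` would run them one after another, asking for each node's password once while the key is being installed.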
