Hi,

Make sure that all the paths are valid on all the nodes, so definitely
no relative paths.
And of course, the binaries need to be executable on all nodes as well.
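
A quick sanity check along these lines can be sketched in Python. This is
only an illustration and not part of Moses: the function name check_binaries
and the example paths are hypothetical, and the script would have to be run
on each node (for example as a submitted job) to tell you anything about
that node.

```python
import os

def check_binaries(paths):
    """Return (path, problem) pairs for binaries that would fail on this node."""
    problems = []
    for p in paths:
        if not os.path.isabs(p):
            # Relative paths depend on the job's working directory.
            problems.append((p, "relative path"))
        elif not os.path.exists(p):
            # E.g. the shared directory is not mounted on this node.
            problems.append((p, "missing on this node"))
        elif not os.access(p, os.X_OK):
            # File is visible but the current user cannot execute it.
            problems.append((p, "not executable"))
    return problems

if __name__ == "__main__":
    # Hypothetical example paths; substitute the ones from your own config.
    for path, problem in check_binaries(["/mosesdecoder/bin/moses", "bin/moses"]):
        print(f"{path}: {problem}")
```

An empty result only says the files are reachable and executable by the
current user; it does not prove that the shared libraries a binary needs
are installed on that node.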

-phi

On Thu, Oct 29, 2015 at 10:12 AM, Vincent Nguyen <[email protected]> wrote:

>
> OK guys, this is not easy ...
> I fought to get the prerequisites working, but now at least jobs start
> .....
>
> and crash.
>
> I'll post the details of the preliminary steps later; they could be useful.
>
> My crash happens when lmplz starts.
>
> I have a shared mount point on my nodes, and all binaries are visible from
> the nodes, including the lmplz program.
>
> But I was wondering: do I actually need to install some packages on the
> nodes themselves? I mean packages that do not live under the /mosesdecoder/
> folder.
>
>
> thanks,
>
> V
>
>
>
> On 29/10/2015 13:26, Philipp Koehn wrote:
>
> Hi,
>
> these machine names are just there for convenience.
>
> If you want experiment.perl to submit jobs via qsub,
> all you have to do is to run experiment.perl with the
> additional switch "-cluster".
>
> You can also put the head node's name into the
> experiment.machines file, then you do not need to
> use the switch anymore.
>
> -phi
>
> On Wed, Oct 28, 2015 at 10:20 AM, Vincent Nguyen <[email protected]> wrote:
>
>> Hi there,
>>
>> I need some clarification before I mess up some files.
>> I just set up an SGE cluster with a master + 2 nodes.
>>
>> To make it clear, let's say my cluster name is "default", my master
>> head node is "master", and my 2 other nodes are "node1" and "node2".
>>
>>
>> for EMS :
>>
>> I opened the default experiment.machines file and I see :
>>
>> cluster: townhill seville hermes lion seville sannox lutzow frontend
>> multicore-4: freddie
>> multicore-8: tyr thor odin crom
>> multicore-16: saxnot vali vili freyja bragi hoenir
>> multicore-24: syn hel skaol saga buri loki sif magni
>> multicore-32: gna snotra lofn thrud
>>
>> What are townhill and the others? Names of machines / nodes? Names of
>> several clusters?
>> Should I just put "default" or "master node1 node2"?
>>
>> multicore-X: should I put machine names here?
>> If my 3 machines have 8 cores each, then
>> multicore-8: master node1 node2
>> right?
>>
>>
>> then in the config file for EMS:
>>
>> #generic-parallelizer = $moses-script-dir/ems/support/generic-parallelizer.perl
>> #generic-parallelizer = $moses-script-dir/ems/support/generic-multicore-parallelizer.perl
>>
>> Which one should I take if my nodes are multicore? Still the first one?
>>
>>
>> ### cluster settings (if run on a cluster machine)
>> # number of jobs to be submitted in parallel
>> #
>> #jobs = 10
>> Should I count approximately 1 job per core, summed over all the cores of
>> my 3 machines?
>>
>> # arguments to qsub when scheduling a job
>> #qsub-settings = ""
>> Can this stay empty?
>>
>> # project for priviledges and usage accounting
>> #qsub-project = iccs_smt
>> Is this a standard value?
>>
>> # memory and time
>> #qsub-memory = 4
>> #qsub-hours = 48
>> 4 what? GB?
>>
>> ### multi-core settings
>> # when the generic parallelizer is used, the number of cores
>> # specified here
>> cores = 4
>> Is this ignored if generic-parallelizer.perl is chosen?
>>
>>
>> Is there a way to put more load on one specific node?
>>
>> Many thanks,
>> V.
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>
>
>
