OK guys, this is not easy stuff...
I fought to get the prerequisites working, but now at least jobs
start...
and crash.
I'll post the details of the preliminary steps later; they could be useful.
My crash happens when lplmz starts.
I have a shared directory mounted on my nodes, and all binaries are visible
from the nodes, including the lplmz program.
But I was wondering: do I actually need to install some packages on the
nodes themselves? I mean packages that do not live under the /mosesdecoder/
folder.
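One way to test this, sketched below in Python, is to run ldd against the binary on each node and list any shared libraries the dynamic linker cannot resolve there. This assumes the crash comes from a missing system library, which nothing in this thread confirms. The function names are hypothetical, and the path passed to check_binary must be the real location of lplmz on the share.

```python
import subprocess
from typing import List


def missing_libraries(ldd_output: str) -> List[str]:
    """Return the libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # ldd prints an unresolved dependency as "libfoo.so.1 => not found"
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(path: str) -> List[str]:
    """Run ldd on a binary and list its unresolved shared libraries.

    Run this on the compute node itself, not on the master: the point is
    to see what that node's dynamic linker can find.
    """
    result = subprocess.run(["ldd", path], capture_output=True, text=True, check=False)
    return missing_libraries(result.stdout)
```

An empty list from check_binary on every node would suggest the crash has another cause (for example memory limits or paths in the job environment).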
Thanks,
V
On 29/10/2015 13:26, Philipp Koehn wrote:
Hi,
these machine names are just there for convenience.
If you want experiment.perl to submit jobs via qsub,
all you have to do is to run experiment.perl with the
additional switch "-cluster".
You can also put the head node's name into the
experiment.machines file, then you do not need to
use the switch anymore.
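A minimal Python sketch of the idea, assuming the file format shown in this thread (a "cluster:" line and "multicore-N:" lines, each followed by host names). It illustrates how a host name could be matched against experiment.machines; it is not the actual experiment.perl code, and detect_mode is a hypothetical name.

```python
import re


def detect_mode(machines_text, hostname):
    """Classify a host from experiment.machines-style text.

    Returns ("cluster", None) if the host is on the "cluster:" line,
    ("multicore", N) if it is on a "multicore-N:" line,
    and ("local", None) if it is not listed at all.
    """
    for line in machines_text.splitlines():
        if ":" not in line:
            continue
        key, _, hosts = line.partition(":")
        key = key.strip()
        if hostname not in hosts.split():
            continue
        if key == "cluster":
            return ("cluster", None)
        match = re.fullmatch(r"multicore-(\d+)", key)
        if match:
            return ("multicore", int(match.group(1)))
    return ("local", None)
```

With a file containing only the line "cluster: master", detect_mode would classify "master" as a cluster host, which corresponds to running with -cluster from the head node.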
-phi
On Wed, Oct 28, 2015 at 10:20 AM, Vincent Nguyen <[email protected]> wrote:
Hi there,
I need some clarification before screwing up some files.
I just set up an SGE cluster with a master + 2 nodes.
To make it clear, let's say my cluster name is "default", my master
head node is "master", and my 2 other nodes are "node1" and "node2".
For EMS:
I opened the default experiment.machines file and I see:
cluster: townhill seville hermes lion seville sannox lutzow frontend
multicore-4: freddie
multicore-8: tyr thor odin crom
multicore-16: saxnot vali vili freyja bragi hoenir
multicore-24: syn hel skaol saga buri loki sif magni
multicore-32: gna snotra lofn thrud
What are townhill and the others? Names of machines/nodes? Names of several
clusters?
Should I just put "default" or "master node1 node2"?
multicore-X: should I put machine names here?
If my 3 machines have 8 cores each, it would be
multicore-8: master node1 node2
right?
Then in the config file for EMS:
#generic-parallelizer = $moses-script-dir/ems/support/generic-parallelizer.perl
#generic-parallelizer = $moses-script-dir/ems/support/generic-multicore-parallelizer.perl
Which one should I take if my nodes are multicore? Still the first
one?
### cluster settings (if run on a cluster machine)
# number of jobs to be submitted in parallel
#
#jobs = 10
Should I count approximately 1 job per core, over the total cores of my 3
machines?
# arguments to qsub when scheduling a job
#qsub-settings = ""
Can this stay empty?
# project for priviledges and usage accounting
#qsub-project = iccs_smt
Is this a standard value?
# memory and time
#qsub-memory = 4
#qsub-hours = 48
4 what? GB?
### multi-core settings
# when the generic parallelizer is used, the number of cores
# specified here
cores = 4
Is this ignored if generic-parallelizer.perl is chosen?
Is there a way to put more load on one specific node?
Many thanks,
V.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support