Hi, make sure that all the paths are valid on all the nodes --- so definitely no relative paths. And of course, the binaries need to be executable on all nodes as well.
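A minimal sketch of such a check, assuming Python is available on every node. The function name `check_binaries` and the example path in the comment are hypothetical and not part of Moses. Run it on each node against the binary paths from your config to see which ones are relative, missing, or not executable there.

```python
import os


def check_binaries(paths):
    """Return a list of (path, problem) pairs for this node.

    An empty list means every path is absolute, exists,
    and is executable by the current user.
    """
    problems = []
    for path in paths:
        if not os.path.isabs(path):
            # Relative paths resolve differently depending on the job's working directory.
            problems.append((path, "not an absolute path"))
        elif not os.path.exists(path):
            # Typically a share that is not mounted at the same place on this node.
            problems.append((path, "does not exist"))
        elif not os.access(path, os.X_OK):
            problems.append((path, "not executable"))
    return problems


# Example call (hypothetical path, replace with the ones from your config):
# print(check_binaries(["/path/to/mosesdecoder/bin/lmplz"]))
```

Note that this only checks the files themselves. A binary can still fail to start on a node if shared libraries it depends on are not installed there.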
-phi

On Thu, Oct 29, 2015 at 10:12 AM, Vincent Nguyen <[email protected]> wrote:

>
> OK guys, not easy stuff ...
> I fought to get the prerequisites working, but now at least jobs start
> .....
>
> and crash.
>
> I'll post the details of the preliminary steps later, could be useful.
>
> My crash is when lmplz starts.
>
> I have a share point mounted on my nodes and all binaries are visible from
> the nodes, including the lmplz program.
>
> But I was thinking, do I need to actually install some packages on the
> nodes themselves? I mean packages that do not fall under the /mosesdecoder/
> folder?
>
>
> thanks,
>
> V
>
>
>
> On 29/10/2015 13:26, Philipp Koehn wrote:
>
> Hi,
>
> these machine names are just there for convenience.
>
> If you want experiment.perl to submit jobs per qsub,
> all you have to do is to run experiment.perl with the
> additional switch "-cluster".
>
> You can also put the head node's name into the
> experiment.machines file, then you do not need to
> use the switch anymore.
>
> -phi
>
> On Wed, Oct 28, 2015 at 10:20 AM, Vincent Nguyen <[email protected]> wrote:
>
>> Hi there,
>>
>> I need some clarification before screwing up some files.
>> I just set up an SGE cluster with a master + 2 nodes.
>>
>> To make it clear, let's say my cluster name is "default", my master
>> head node is "master", and my 2 other nodes are "node1" and "node2".
>>
>>
>> For EMS:
>>
>> I opened the default experiment.machines file and I see:
>>
>> cluster: townhill seville hermes lion seville sannox lutzow frontend
>> multicore-4: freddie
>> multicore-8: tyr thor odin crom
>> multicore-16: saxnot vali vili freyja bragi hoenir
>> multicore-24: syn hel skaol saga buri loki sif magni
>> multicore-32: gna snotra lofn thrud
>>
>> What are townhill and the others? Names of machines / nodes? Names of
>> several clusters?
>> Should I just put "default" or "master node1 node2"?
>>
>> multicore-X: should I put machine names here?
>> If my 3 machines are 8 cores each:
>> multicore-8: master node1 node2
>> Right?
>>
>>
>> Then in the config file for EMS:
>>
>> #generic-parallelizer = $moses-script-dir/ems/support/generic-parallelizer.perl
>> #generic-parallelizer = $moses-script-dir/ems/support/generic-multicore-parallelizer.perl
>>
>> Which one should I take if my nodes are multicore? Still the first one?
>>
>>
>> ### cluster settings (if run on a cluster machine)
>> # number of jobs to be submitted in parallel
>> #
>> #jobs = 10
>> Should I count approx. 1 job per core over the total cores of my 3 machines?
>>
>> # arguments to qsub when scheduling a job
>> #qsub-settings = ""
>> Can this stay empty?
>>
>> # project for priviledges and usage accounting
>> #qsub-project = iccs_smt
>> Standard value?
>>
>> # memory and time
>> #qsub-memory = 4
>> #qsub-hours = 48
>> 4 what? GB?
>>
>> ### multi-core settings
>> # when the generic parallelizer is used, the number of cores
>> # specified here
>> cores = 4
>> Is this ignored if generic-parallelizer.perl is chosen?
>>
>>
>> Is there a way to put more load on one specific node?
>>
>> Many thanks,
>> V.
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
