OK guys, this is not easy stuff...
I fought to get the prerequisites working, but now at least the jobs start...

and crash.

I'll post the details of the preliminary steps later; they could be useful.

My crash happens when lplmz starts.

I have a shared directory mounted on my nodes, and all the binaries are visible from the nodes, including the lplmz program.

But I was wondering: do I actually need to install some packages on the nodes themselves? I mean packages that do not live under the /mosesdecoder/ folder.
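
A minimal sketch of one way to check this from a node, assuming the nodes run Linux and have `ldd` and `awk` available. It only reports shared libraries that the dynamic linker cannot resolve on that host; it says nothing about other missing pieces such as Perl modules. The function names are made up for illustration.

```shell
# Print the names of shared libraries that ldd reports as "not found".
# Reads ldd output on stdin.
filter_missing() {
    awk '/not found/ { print $1 }'
}

# Report unresolved libraries for every executable file in a directory.
# Run this on each compute node, not only on the head node.
check_dir() {
    for bin in "$1"/*; do
        [ -f "$bin" ] && [ -x "$bin" ] || continue
        missing=$(ldd "$bin" 2>/dev/null | filter_missing | tr '\n' ' ')
        [ -n "$missing" ] && printf '%s: %s\n' "$bin" "$missing"
    done
    return 0
}
```

Calling `check_dir` with the directory that holds the shared binaries, once per node, prints one line for each binary that has unresolved libraries. No output suggests that the libraries those binaries link against are present on that node.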


thanks,

V


On 29/10/2015 13:26, Philipp Koehn wrote:
Hi,

these machine names are just there for convenience.

If you want experiment.perl to submit jobs per qsub,
all you have to do is to run experiment.perl with the
additional switch "-cluster".

You can also put the head node's name into the
experiment.machines file, then you do not need to
use the switch anymore.
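
As a rough sketch of the first option: the helper below only builds and prints the command line, it does not run anything. The `MOSES_SCRIPTS` variable, its default path, the helper name `ems_command`, and the config file name `config.toy` are assumptions for illustration; the `-cluster` switch is the one named above.

```shell
# Assumption: MOSES_SCRIPTS points at the mosesdecoder scripts directory.
MOSES_SCRIPTS="${MOSES_SCRIPTS:-$HOME/mosesdecoder/scripts}"

# Print the experiment.perl command line for a given config file.
# Pass "cluster" as the second argument to append the -cluster switch,
# so that jobs are submitted through qsub.
ems_command() {
    config="$1"
    mode="$2"
    cmd="$MOSES_SCRIPTS/ems/experiment.perl -config $config -exec"
    if [ "$mode" = "cluster" ]; then
        cmd="$cmd -cluster"
    fi
    printf '%s\n' "$cmd"
}

# Second option, no switch needed: list the head node in experiment.machines.
# With the host names used in the question quoted below, the line would be:
#   cluster: master
```

For example, `ems_command config.toy cluster` prints the command to run on the head node.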

-phi

On Wed, Oct 28, 2015 at 10:20 AM, Vincent Nguyen <[email protected]> wrote:

    Hi there,

    I need some clarification before I mess up some files.
    I have just set up an SGE cluster with a master and 2 nodes.

    To make it clear, let's say my cluster name is "default", my master
    head node is "master", and my 2 other nodes are "node1" and "node2".


    For EMS:

    I opened the default experiment.machines file and I see:

    cluster: townhill seville hermes lion seville sannox lutzow frontend
    multicore-4: freddie
    multicore-8: tyr thor odin crom
    multicore-16: saxnot vali vili freyja bragi hoenir
    multicore-24: syn hel skaol saga buri loki sif magni
    multicore-32: gna snotra lofn thrud

    What are townhill and the others? Machine/node names, or the names of
    several clusters?
    Should I just put "default", or "master node1 node2"?

    multicore-X: should I put machine names here?
    If my 3 machines have 8 cores each, is this right:
    multicore-8: master node1 node2


    Then, in the config file for EMS:

    #generic-parallelizer = $moses-script-dir/ems/support/generic-parallelizer.perl
    #generic-parallelizer = $moses-script-dir/ems/support/generic-multicore-parallelizer.perl

    Which one should I take if my nodes are multicore? Still the first
    one?


    ### cluster settings (if run on a cluster machine)
    # number of jobs to be submitted in parallel
    #
    #jobs = 10
    Should I count approximately 1 job per core, over the total number of
    cores of my 3 machines?

    # arguments to qsub when scheduling a job
    #qsub-settings = ""
    Can this stay empty?

    # project for priviledges and usage accounting
    #qsub-project = iccs_smt
    Is this a standard value?

    # memory and time
    #qsub-memory = 4
    #qsub-hours = 48
    4 what? GB?

    ### multi-core settings
    # when the generic parallelizer is used, the number of cores
    # specified here
    cores = 4
    Is this ignored if generic-parallelizer.perl is chosen?


    Is there a way to put more load on one specific node?

    Many thanks,
    V.


    _______________________________________________
    Moses-support mailing list
    [email protected]
    http://mailman.mit.edu/mailman/listinfo/moses-support


