So we're clear, it runs correctly on the local machine but not when you run it through SGE? In that case, I suspect it's library version differences.
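A quick way to narrow that down (only a sketch; "sgenode1" and the /netshr paths are taken from your output below): check that the binary is executable on a compute node and that it resolves the same shared libraries there as on the machine where it works, for example:

  # on the machine where lmplz runs fine
  ldd /netshr/mosesdecoder/bin/lmplz

  # on an SGE node (via ssh or an interactive qrsh session)
  ssh sgenode1 'test -x /netshr/mosesdecoder/bin/lmplz && ldd /netshr/mosesdecoder/bin/lmplz'
  ssh sgenode1 'lsb_release -d; ldconfig -p | grep -E "tcmalloc|boost"'

If libtcmalloc, Boost, or libstdc++ resolve to different versions (or are missing) on the node, rebuilding Moses against the libraries installed on the nodes, or installing the matching packages there, is the usual fix.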
On 10/29/2015 03:09 PM, Vincent Nguyen wrote:
>
> I get this error:
>
> moses@sgenode1:/netshr/working-en-fr$ /netshr/mosesdecoder/bin/lmplz
> --text /netshr/working-en-fr/lm/europarl.truecased.7 --order 5 --arpa
> /netshr/working-en-fr/lm/europarl.lm.7 --prune 0 0 1 -T
> /netshr/working-en-fr/lm -S 20%
> === 1/5 Counting and sorting n-grams ===
> Reading /netshr/working-en-fr/lm/europarl.truecased.7
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> tcmalloc: large alloc 2755821568 bytes == 0x25d28000 @
> ****************************************************************************************************
> Segmentation fault (core dumped)
> moses@sgenode1:/netshr/working-en-fr$
>
> I installed libgoogle-perftools-dev, but I get the same error.
> Just to be clear: are all the packages below needed only to build Moses,
> or do I need specific packages to run one binary or another?
> Confused....
>
> Ubuntu
>
> Install the following packages using the command
>
> sudo apt-get install [package name]
>
> Packages:
>
> g++
> git
> subversion
> automake
> libtool
> zlib1g-dev
> libboost-all-dev
> libbz2-dev
> liblzma-dev
> python-dev
> graphviz
> imagemagick
> libgoogle-perftools-dev (for tcmalloc)
>
> On 29/10/2015 15:18, Philipp Koehn wrote:
>> Hi,
>>
>> make sure that all the paths are valid on all the nodes --- so
>> definitely no relative paths.
>> And of course, the binaries need to be executable on all nodes as well.
>>
>> -phi
>>
>> On Thu, Oct 29, 2015 at 10:12 AM, Vincent Nguyen <[email protected]> wrote:
>>
>> OK guys, not easy stuff...
>> I fought to get the prerequisites working, but now at least jobs start...
>>
>> and crash.
>>
>> I'll post the details of the preliminary steps later; they could be useful.
>>
>> My crash happens when lmplz starts.
>>
>> I have a shared mount point on my nodes, and all the binaries are
>> visible from the nodes, including the lmplz program.
>>
>> But I was wondering: do I need to actually install some packages on
>> the nodes themselves? I mean packages that do not fall under the
>> /mosesdecoder/ folder?
>>
>> thanks,
>>
>> V
>>
>> On 29/10/2015 13:26, Philipp Koehn wrote:
>>> Hi,
>>>
>>> these machine names are just there for convenience.
>>>
>>> If you want experiment.perl to submit jobs per qsub,
>>> all you have to do is to run experiment.perl with the
>>> additional switch "-cluster".
>>>
>>> You can also put the head node's name into the
>>> experiment.machines file; then you do not need to
>>> use the switch anymore.
>>>
>>> -phi
>>>
>>> On Wed, Oct 28, 2015 at 10:20 AM, Vincent Nguyen <[email protected]> wrote:
>>>
>>> Hi there,
>>>
>>> I need some clarification before screwing up some files.
>>> I just set up an SGE cluster with a master + 2 nodes.
>>>
>>> To make it clear, let's say my cluster name is "default", my master
>>> head node is "master", and my two other nodes are "node1" and "node2".
>>>
>>> For EMS:
>>>
>>> I opened the default experiment.machines file and I see:
>>>
>>> cluster: townhill seville hermes lion seville sannox lutzow frontend
>>> multicore-4: freddie
>>> multicore-8: tyr thor odin crom
>>> multicore-16: saxnot vali vili freyja bragi hoenir
>>> multicore-24: syn hel skaol saga buri loki sif magni
>>> multicore-32: gna snotra lofn thrud
>>>
>>> What are townhill and the others? Machine/node names? Names of
>>> several clusters?
>>> Should I just put "default", or "master node1 node2"?
>>>
>>> multicore-X: should I put machine names here?
>>> If my 3 machines have 8 cores each:
>>> multicore-8: master node1 node2
>>> right?
>>>
>>> Then, in the config file for EMS:
>>>
>>> #generic-parallelizer =
>>> $moses-script-dir/ems/support/generic-parallelizer.perl
>>> #generic-parallelizer =
>>> $moses-script-dir/ems/support/generic-multicore-parallelizer.perl
>>>
>>> Which one should I take if my nodes are multicore? Still the first one?
>>>
>>> ### cluster settings (if run on a cluster machine)
>>> # number of jobs to be submitted in parallel
>>> #
>>> #jobs = 10
>>> Should I count roughly one job per core over the total cores of my
>>> 3 machines?
>>>
>>> # arguments to qsub when scheduling a job
>>> #qsub-settings = ""
>>> Can this stay empty?
>>>
>>> # project for privileges and usage accounting
>>> #qsub-project = iccs_smt
>>> Standard value?
>>>
>>> # memory and time
>>> #qsub-memory = 4
>>> #qsub-hours = 48
>>> 4 what? GB?
>>>
>>> ### multi-core settings
>>> # when the generic parallelizer is used, the number of cores
>>> # specified here
>>> cores = 4
>>> Is this ignored if generic-parallelizer.perl is chosen?
>>>
>>> Is there a way to put more load on one specific node?
>>>
>>> Many thanks,
>>> V.
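For reference, here is a minimal sketch of the two files discussed in the quoted thread, for the setup described there (head node "master", two additional 8-core nodes "node1" and "node2"). The machine names come from the question itself; the numeric values are only illustrative, and the points the thread leaves open (which parallelizer to use, the unit of qsub-memory) are flagged as assumptions in the comments.

  # experiment.machines: listing the head node here means the -cluster
  # switch is no longer needed, as Philipp describes above
  cluster: master
  multicore-8: master node1 node2

  # EMS config, cluster-related settings (illustrative values)

  # assumption: the plain, qsub-based parallelizer for cluster runs
  generic-parallelizer = $moses-script-dir/ems/support/generic-parallelizer.perl

  # number of jobs submitted in parallel; for example roughly one per
  # core across the three machines, minus some headroom
  jobs = 20

  # extra arguments to qsub; can stay empty if your queue defaults suffice
  qsub-settings = ""

  # only needed if your SGE installation enforces project-based accounting
  #qsub-project = iccs_smt

  # memory and time per job; the unit of qsub-memory is not settled in
  # the thread above (it is commonly read as GB)
  qsub-memory = 4
  qsub-hours = 48

  # cores per machine when the generic parallelizer is used
  cores = 8

With the cluster: entry in place you can run experiment.perl as usual; otherwise add the -cluster switch, e.g.

  experiment.perl -config <your config file> -exec -cluster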
