I've also tried running Moses with an SRI language model binarized with 
compile-lm. When I run the decoder, I get a segmentation fault:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[EMAIL PROTECTED]:~$ ~/moses/moses-cmd/src/moses -config ~/ESCA/model/moses.ini 
-input-file ~/ESCA/tuning/input > ~/ESCA/evaluation/output
Defined parameters (per moses.ini or switch):
        config: /home/esca/ESCA/model/moses.ini
        distortion-file: 0-0 msd-bidirectional-fe 6 
/home/esca/ESCA/model/reordering
        distortion-limit: 6
        input-factors: 0
        input-file: /home/esca/ESCA/tuning/input
        lmodel-file: 1 0 5 /home/esca/ESCA/lm/ca.blm
        mapping: 0 T 0
        ttable-file: 0 0 5 /home/esca/ESCA/model/phrase-table
        ttable-limit: 20
        weight-d: 0.3 0.3 0.3 0.3 0.3 0.3 0.3
        weight-l: 0.5000
        weight-t: 0.2 0.2 0.2 0.2 0.2
        weight-w: -1
Loading lexical distortion models...
have 1 models
Creating lexical reordering...
weights: 0.300 0.300 0.300 0.300 0.300 0.300
binary file loaded, default OFF_T: -1
Created lexical orientation reordering
Start loading LanguageModel /home/esca/ESCA/lm/ca.blm : [1.000] seconds
In LanguageModelIRST::Load: nGramOrder = 5
Loading LM file (no MAP)
blmt
loadbin()
loading 321187 1-grams
loading 4548952 2-grams
loading 2785668 3-grams
loading 2501764 4-grams
loading 1741048 5-grams
done
OOV code is 37189
IRST: m_unknownId=37189
Fallo de segmentación (core dumped) #SEGMENTATION FAULT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

I am using binarized phrase and reordering tables, but they worked fine 
when I built them with my old SRILM system.

Thanks for your help.

Regards,

             Miguel

Miguel José Hernández Vidal wrote:
> Hi all,
>
> I am trying to build my LM with the IRST toolkit. First, I added <s> 
> tags with 'add-start-end.sh'; my data is already tokenized and 
> lowercased.
>
> When I run 'build-lm.sh' it appears to work fine, but when the 
> process finishes there is no output file. Here's the log:
>
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  
>
> [EMAIL PROTECTED]:~/irstlm/bin$ bash build-lm.sh -i ~/corpus/tag.es -o 
> ~/corpus/ca.lm -n 3 -k 5 -s kneser-ney
> Cleaning temporary directory stat
> Extracting dictionary from training corpus
> Splitting dictionary into 5 lists
> Extracting n-gram statistics for each word list
> dict.000
> dict.001
> dict.002
> dict.003
> dict.004
> Estimating language models for each word list
> dict.000
> dict.001
> dict.002
> dict.003
> dict.004
> Merging language models into /home/esca/corpus/ca.lm
> Cleaning temporary directory stat
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  
>
>
> I've tried different corpus sizes, but that didn't work either. 
> By the way, I am running the scripts under Ubuntu 7.04 (32-bit).
>
> Regards,
>
>                Miguel
>

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
