On Sun, 23 Jan 2011 at 07:28 +0300, Hèctor Alòs i Font wrote:
> I'm not able to generate a new prob file. I keep hitting the same
> problem when the programme begins to run Kupiec's algorithm (i.e. just
> after it finishes generating the fr.crp file). The error still
> occurs with different training corpora. The dictionary has more than
> 40,000 entries and the corpora have 800-900 M words (with one of them I
> could generate the current prob file six months ago). Coverage is c.
> 96%. This is the output I receive:
>
> make -f fr-eo-unsupervised.make
>
> Generating fr-tagger-data/fr.dic
> This may take some time. Please, take a cup of coffee and come back later.
> apertium-validate-dictionary apertium-eo-fr.fr.dix
> apertium-validate-tagger apertium-eo-fr.fr.tsx
> lt-expand apertium-eo-fr.fr.dix | grep -v "__REGEXP__" | grep -v ":<:" |\
> awk 'BEGIN{FS=":>:|:"}{print $1 ".";}' | apertium-destxt >fr.dic.expanded
> lt-proc -a fr-eo.automorf.bin <fr.dic.expanded | \
> apertium-filter-ambiguity apertium-eo-fr.fr.tsx > fr-tagger-data/fr.dic
> rm fr.dic.expanded;
> apertium-destxt < fr-tagger-data/fr.crp.txt | lt-proc fr-eo.automorf.bin > fr-tagger-data/fr.crp
> apertium-validate-tagger apertium-eo-fr.fr.tsx
> apertium-tagger -t 8 \
> fr-tagger-data/fr.dic \
> fr-tagger-data/fr.crp \
> apertium-eo-fr.fr.tsx \
> fr-eo.prob;
> Calculating ambiguity classes...
>
> 90 states and 420 ambiguity classes
> Kupiec's initialization of transition and emission probabilities...
> make: *** [fr-eo.prob] Error 1
>
>
> Any idea?
> Thanks in advance.
> Hèctor
>
> PS
> The fr.crp file generated at the beginning of the process seems very
> small to me: just 390 lines. If it is meant to be a list of all ambiguous
> forms, it should contain thousands of them.
Hey Hèctor, it could be that that file is the ambiguity class file...
Can you upload the corpus somewhere so that we can download it and
check it ourselves? Alternatively, could you run the training with
apertium-tagger through gdb to find out exactly where in the code it
fails?
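
For the gdb route, something along these lines might work. It is only a
rough sketch: the tagger arguments are copied from the make output quoted
above, the log file name tagger-gdb.log is just a made-up name for
illustration, and it assumes gdb is installed and that apertium-tagger was
built with debugging symbols (otherwise the backtrace will not be very
informative).

```shell
#!/bin/sh
# Tagger invocation copied from the make output quoted above.
TAGGER_CMD="apertium-tagger -t 8 fr-tagger-data/fr.dic fr-tagger-data/fr.crp apertium-eo-fr.fr.tsx fr-eo.prob"

# -batch   : run the given commands and then exit gdb
# -ex run  : start the program
# -ex bt   : print a backtrace if the program stopped on a crash
# --args   : everything after this is the program and its arguments
GDB_CMD="gdb -batch -ex run -ex bt --args $TAGGER_CMD"

if command -v gdb >/dev/null 2>&1 && command -v apertium-tagger >/dev/null 2>&1; then
    # tagger-gdb.log is a hypothetical file name; the log can be pasted to the list.
    $GDB_CMD 2>&1 | tee tagger-gdb.log
else
    echo "gdb or apertium-tagger not found; would run: $GDB_CMD"
fi
```

If the tagger simply exits with a non-zero status rather than crashing, gdb
will report that there is no stack; in that case an interactive gdb session
with a breakpoint on exit may be needed to see where it stops.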
Fran