On 3 January 2012 14:05, Bartias <[email protected]> wrote:
> Sure - new pastes
>
> http://codepad.org/K4DMgnhN - the "working" version
>
>
> http://codepad.org/hmmsQRNp - the wrong version
>
You also haven't been including the commands you used to generate
the binaries, and I think this is where you're going wrong.
Here are the commands I used to compile them. I used a different
naming scheme, because I think the naming is what's confusing you.
$ lt-comp lr apertium-pl-en.en.dix en-analysis.bin
main@standard 9 9
$ lt-comp lr apertium-pl-en.pl.dix pl-analysis.bin
main@standard 7 7
These would normally be en-pl.automorf.bin and pl-en.automorf.bin, respectively.
$ lt-comp rl apertium-pl-en.en.dix en-generation.bin
main@standard 9 9
$ lt-comp rl apertium-pl-en.pl.dix pl-generation.bin
main@standard 7 7
These would normally be pl-en.autogen.bin and en-pl.autogen.bin, respectively.
$ lt-comp rl apertium-pl-en.pl-en.dix en-to-pl.bin
main@standard 7 6
$ lt-comp lr apertium-pl-en.pl-en.dix pl-to-en.bin
main@standard 7 6
These would normally be en-pl.autobil.bin and pl-en.autobil.bin, respectively.
The convention is that 'en-pl' or 'pl-en' corresponds to the overall
direction of the language pair. The generator made from the Polish
dictionary is 'en-pl', because it's part of the English to Polish
pipeline, and vice versa.
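To make the convention concrete, here is a rough sketch that prints
(it does not run them) the three lt-comp commands for one direction,
using the conventional names. The function name compile_cmds is made up
for illustration; the dictionary names and the lr/rl directions are the
ones from the commands above.

```shell
# Print the lt-comp commands for one direction of the pl-en pair.
# Usage: compile_cmds SRC TGT     e.g. compile_cmds en pl
# (compile_cmds is a hypothetical helper, not part of lttoolbox.)
compile_cmds() {
    src=$1
    tgt=$2
    # The bilingual dictionary is written pl -> en, so reading it
    # left-to-right (lr) gives pl-en and right-to-left (rl) gives en-pl.
    if [ "$src" = pl ]; then
        bidir=lr
    else
        bidir=rl
    fi
    # Analyser: built from the SOURCE language dictionary.
    echo "lt-comp lr apertium-pl-en.$src.dix $src-$tgt.automorf.bin"
    # Bilingual dictionary in the direction of translation.
    echo "lt-comp $bidir apertium-pl-en.pl-en.dix $src-$tgt.autobil.bin"
    # Generator: built from the TARGET language dictionary.
    echo "lt-comp rl apertium-pl-en.$tgt.dix $src-$tgt.autogen.bin"
}
```

Running `compile_cmds en pl` lists the Polish dictionary for the
generator, which is the point: everything named 'en-pl' belongs to the
English to Polish pipeline.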
$ apertium-preprocess-transfer apertium-pl-en.pl-en.t1x transfer.bin
So, transfer pl->en
$ echo domy |lt-proc pl-analysis.bin |gawk 'BEGIN{RS="$";
FS="/";}{nf=split($1,COMPONENTS,"^"); for(i = 1; i<nf; i++) printf
COMPONENTS[i]; if($2 != "") printf("^%s$",$2);}'|apertium-transfer
apertium-pl-en.pl-en.t1x transfer.bin pl-to-en.bin |lt-proc -g
en-generation.bin
houses
Transfer en->pl
$ echo houses|lt-proc en-analysis.bin|gawk 'BEGIN{RS="$";
FS="/";}{nf=split($1,COMPONENTS,"^"); for(i = 1; i<nf; i++) printf
COMPONENTS[i]; if($2 != "") printf("^%s$",$2);}'|apertium-transfer
apertium-pl-en.pl-en.t1x transfer.bin en-to-pl.bin |lt-proc -g
pl-generation.bin
domy
(I use the same transfer rules only because this is a toy example).
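In case the gawk stage in those pipelines looks opaque: it stands in
for a tagger by dropping the surface form and keeping only the first
analysis of each word. Here is the same filter as a shell function, as
a sketch only. The name first_analysis and the tags in the sample input
are my own inventions, not taken from your dictionaries, and I print
with an explicit "%s" so that a '%' in the text is not taken as a
format.

```shell
# Keep only the first analysis of each lexical unit in lt-proc output.
# (first_analysis is a hypothetical name used for illustration.)
first_analysis() {
    awk 'BEGIN { RS = "$"; FS = "/" }   # one record per ^...$ unit
    {
        # $1 is any text between words plus "^surface-form".
        nf = split($1, parts, "^")
        # Print what came before the "^" (blanks etc.), not the surface form.
        for (i = 1; i < nf; i++) printf "%s", parts[i]
        # $2 is the first analysis; wrap it back up as ^analysis$.
        if ($2 != "") printf "^%s$", $2
    }'
}

# Example with made-up tags:
#   printf '%s\n' '^houses/house<n><pl>/house<vblex><inf>$' | first_analysis
# prints: ^house<n><pl>$
```

It can replace the inline gawk program in the commands above, e.g.
`lt-proc pl-analysis.bin | first_analysis | apertium-transfer ...`.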
Now, here are two broken variations:
$ echo houses|lt-proc en-analysis.bin|gawk 'BEGIN{RS="$";
FS="/";}{nf=split($1,COMPONENTS,"^"); for(i = 1; i<nf; i++) printf
COMPONENTS[i]; if($2 != "") printf("^%s$",$2);}'|apertium-transfer
apertium-pl-en.pl-en.t1x transfer.bin en-to-pl.bin |lt-proc -g
en-generation.bin
#dom
Here, we get '#dom' because the English generation binary does not
know how to generate 'domy', it only has the word 'house'.
$ echo houses|lt-proc en-analysis.bin|gawk 'BEGIN{RS="$";
FS="/";}{nf=split($1,COMPONENTS,"^"); for(i = 1; i<nf; i++) printf
COMPONENTS[i]; if($2 != "") printf("^%s$",$2);}'|apertium-transfer
apertium-pl-en.pl-en.t1x transfer.bin pl-to-en.bin |lt-proc -g
pl-generation.bin
@house
Here, we get '@house', because the Polish-to-English binary does not
know how to translate from house to dom (only from dom to house).
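If it helps while debugging, the two failure markers can be told apart
mechanically. This is just a sketch: the function name diagnose and
the wording of the messages are mine, and it only covers the two
markers shown above.

```shell
# Report which pipeline stage a marked output word points to.
# (diagnose is a hypothetical helper for illustration.)
diagnose() {
    case $1 in
        '#'*)
            # The generator has no entry for the word it was handed,
            # e.g. the English generator was asked for a Polish word.
            echo "generator: check the lt-proc -g binary"
            ;;
        '@'*)
            # The bilingual dictionary had no translation for the word
            # in this direction.
            echo "bilingual: check the autobil binary direction"
            ;;
        *)
            echo "no marker"
            ;;
    esac
}
```

For the two broken runs, `diagnose '#dom'` points at the generator and
`diagnose '@house'` points at the bilingual dictionary.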
Does that make anything clearer for you?
--
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you