On 3 January 2012 14:05, Bartias <[email protected]> wrote:
> Sure - new pastes
>
> http://codepad.org/K4DMgnhN - the "working" version
>
>
> http://codepad.org/hmmsQRNp - the wrong version
>

You also haven't been including the commands you used to generate the
binaries, and I think that is where you're going wrong.

Here are the commands I used to compile them - I used a different
naming scheme, because I think that's what's confusing you.

$ lt-comp lr apertium-pl-en.en.dix en-analysis.bin
main@standard 9 9
$ lt-comp lr apertium-pl-en.pl.dix pl-analysis.bin
main@standard 7 7

These would normally be en-pl.automorf.bin and pl-en.automorf.bin, respectively.

$ lt-comp rl apertium-pl-en.en.dix en-generation.bin
main@standard 9 9
$ lt-comp rl apertium-pl-en.pl.dix pl-generation.bin
main@standard 7 7

These would normally be pl-en.autogen.bin and en-pl.autogen.bin, respectively.

$ lt-comp rl apertium-pl-en.pl-en.dix en-to-pl.bin
main@standard 7 6
$ lt-comp lr apertium-pl-en.pl-en.dix pl-to-en.bin
main@standard 7 6

These would normally be en-pl.autobil.bin and pl-en.autobil.bin, respectively.

The convention is that 'en-pl' or 'pl-en' names the overall direction
of the pipeline a binary belongs to, not the language of the
dictionary it was built from. The generator made from the Polish
dictionary is 'en-pl', because it's part of the English to Polish
pipeline, and vice versa.
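
To make the convention concrete, here is a small sketch that only
prints (it does not run) the lt-comp commands for one direction, using
the conventional names. The function name compile_commands is made up
for this example, and it assumes the file prefix apertium-pl-en with
'pl' on the left side of the bilingual dictionary, as in the commands
above.

```shell
#!/bin/sh
# Hypothetical helper: print the lt-comp commands for one direction.
# $1 = source language, $2 = target language (en or pl).
compile_commands() {
    src=$1
    trg=$2
    pair=apertium-pl-en   # assumption: file prefix used in this thread
    # The bidix is read left-to-right only when the source language
    # is its left-hand side (pl); otherwise it is read right-to-left.
    if [ "$src" = pl ]; then bidir=lr; else bidir=rl; fi
    # Analyser: built from the SOURCE language dictionary.
    echo "lt-comp lr $pair.$src.dix $src-$trg.automorf.bin"
    # Bilingual lookup: built from the shared pl-en dictionary.
    echo "lt-comp $bidir $pair.pl-en.dix $src-$trg.autobil.bin"
    # Generator: built from the TARGET language dictionary.
    echo "lt-comp rl $pair.$trg.dix $src-$trg.autogen.bin"
}

compile_commands en pl
```

For 'en pl' the last line printed uses apertium-pl-en.pl.dix to build
en-pl.autogen.bin, which is the point of the convention: all three
binaries of one pipeline share the same direction prefix.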

$ apertium-preprocess-transfer apertium-pl-en.pl-en.t1x transfer.bin

So, transfer pl->en

$ echo domy |lt-proc pl-analysis.bin |gawk 'BEGIN{RS="$";
FS="/";}{nf=split($1,COMPONENTS,"^"); for(i = 1; i<nf; i++) printf
COMPONENTS[i]; if($2 != "") printf("^%s$",$2);}'|apertium-transfer
apertium-pl-en.pl-en.t1x transfer.bin pl-to-en.bin |lt-proc -g
en-generation.bin
houses

Transfer en->pl

$ echo houses|lt-proc en-analysis.bin|gawk 'BEGIN{RS="$";
FS="/";}{nf=split($1,COMPONENTS,"^"); for(i = 1; i<nf; i++) printf
COMPONENTS[i]; if($2 != "") printf("^%s$",$2);}'|apertium-transfer
apertium-pl-en.pl-en.t1x transfer.bin en-to-pl.bin |lt-proc -g
pl-generation.bin
domy


(I use the same transfer rules only because this is a toy example).
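
A note on the gawk step in the pipelines above: it stands in for a
tagger. It drops the surface form from each lexical unit and keeps
only the first analysis, so that apertium-transfer gets unambiguous
input. The sketch below does roughly the same with sed; it is not the
script used above, the function name first_analysis is made up, and
the tags in the sample input are an assumption (the real dictionaries
may use different ones).

```shell
#!/bin/sh
# Keep only the first analysis of each lexical unit and drop the
# surface form:  ^surface/analysis1/analysis2$  ->  ^analysis1$
first_analysis() {
    sed 's|\^[^/$]*/\([^/$]*\)[^$]*\$|^\1$|g'
}

# Hypothetical analyser output with two competing analyses.
printf '%s\n' '^domy/dom<n><pl><nom>/dom<n><pl><acc>$' | first_analysis
```

With that sample input the function prints ^dom<n><pl><nom>$, i.e. the
second analysis is simply discarded.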

Now, here are two broken variations:
$ echo houses|lt-proc en-analysis.bin|gawk 'BEGIN{RS="$";
FS="/";}{nf=split($1,COMPONENTS,"^"); for(i = 1; i<nf; i++) printf
COMPONENTS[i]; if($2 != "") printf("^%s$",$2);}'|apertium-transfer
apertium-pl-en.pl-en.t1x transfer.bin en-to-pl.bin |lt-proc -g
en-generation.bin
#dom

Here, we get '#dom' because the English generation binary does not
know how to generate 'domy'; it only has the word 'house'. The '#'
marks a lexical unit that the generator could not produce.

$ echo houses|lt-proc en-analysis.bin|gawk 'BEGIN{RS="$";
FS="/";}{nf=split($1,COMPONENTS,"^"); for(i = 1; i<nf; i++) printf
COMPONENTS[i]; if($2 != "") printf("^%s$",$2);}'|apertium-transfer
apertium-pl-en.pl-en.t1x transfer.bin pl-to-en.bin |lt-proc -g
pl-generation.bin
@house

Here, we get '@house' because the Polish-to-English bilingual binary
(pl-to-en.bin) does not know how to translate from house to dom (only
from dom to house). The '@' marks a lemma that the bilingual lookup
could not translate.
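
If it helps, the two markers can be turned into a quick check on the
output of a pipeline. This is only a sketch: the function name
diagnose and the wording of the messages are made up, and it looks at
a single output word.

```shell
#!/bin/sh
# Hypothetical helper: map the leading marker of an output word to
# the pipeline stage that is most likely using the wrong binary.
diagnose() {
    case "$1" in
        '#'*) echo "generation failed: check the binary given to lt-proc -g" ;;
        '@'*) echo "bilingual lookup failed: check the binary given to apertium-transfer" ;;
        *)    echo "ok" ;;
    esac
}

diagnose '#dom'
diagnose '@house'
diagnose 'domy'
```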

Does that make anything clearer for you?

-- 
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
