Hi,

You are right, your affix table needs some optimization. The second example
is much better, because analyzing multilevel suffixes takes more time and
zero suffixes have the biggest overhead. Word analysis checks *all* zero
affix rules for every input word and every suggestion candidate. There are
too many zero affixes in your affix table:

$ cat fr.aff | LC_ALL=C awk 'BEGIN{FS="[ \t/]*"}/^[SP]FX/ && NF>4{print$4}' |
  sort | uniq -c | sort -nrk 1 | head
    386 s
 * 385 0*
    321 ais
    222 ons
    209 ions
    209 iez
    207 ez
    157 ait
    151 is
    148 aient
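
For anyone who prefers a script to the one-liner, here is a rough Python
sketch of the same count. It is only an approximation of the awk pipeline:
the function name count_affix_strings is my own, and it assumes that every
rule line has at least five whitespace-separated fields (header lines such as
"SFX S. Y 1" have four). The sample rules are copied from the quoted message.

```python
from collections import Counter


def count_affix_strings(aff_text):
    """Count the 'affix to append' field over all SFX/PFX rule lines."""
    counts = Counter()
    for line in aff_text.splitlines():
        fields = line.split()
        # Assumption: rule lines look like "SFX flag strip add condition ...",
        # while header lines ("SFX flag Y count") have only four fields.
        if len(fields) < 5 or fields[0] not in ("SFX", "PFX"):
            continue
        # Drop continuation flags, e.g. "0/S." -> "0".
        add = fields[3].split("/", 1)[0]
        counts[add] += 1
    return counts


sample = """\
SFX S. Y 1
SFX S.   0          s                   [^sxz]              /pl
SFX F. N 3
SFX F.   0          0                   .                   .fem
SFX F.   e          0/S.                [éiï]e              .mas
SFX F.   rice       eur/S.              [dt]rice            .mas
"""

counts = count_affix_strings(sample)
# Most frequent appended strings first, like "sort -nrk 1" in the shell version.
for add, n in counts.most_common():
    print(f"{n:7d} {add}")
```

A row with "0" in the first position of the output is a zero affix, so a
large number next to "0" points at the rules worth restructuring first.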

It would also be better to reduce the redundant suffixes (with different
stripping characters) of irregular words by (1) using pseudoroots with
NEEDAFFIX flags, (2) generating the forms from common suffixed forms, or
simply (3) adding new dictionary items. Examples:

(1) Flag x defines the -s and -d suffixes; "ha" is a pseudoroot (flag ! is
the NEEDAFFIX flag):
shake/x  -> shake, shakes, shaked
ha/x! -> has, had
have -> have

(2) Flag y defines a rule that strips "d" and appends "s":
shaked/y -> shaked, shakes
shake -> shake
had/y -> had, has
have -> have

(3) The irregular forms are listed as plain dictionary items:
shake/x
had
has
have
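
To check that the three layouts above really accept the same words, here is
a small Python model of suffix expansion. It is a deliberately simplified
sketch, not how Hunspell works internally: the expand function and its
arguments are hypothetical, flags are single characters, and conditions,
prefixes and multilevel suffixes are ignored. The word forms (including
"shaked") are the illustrative ones from the examples.

```python
def expand(entries, rules, needaffix="!"):
    """Generate surface forms from (stem, flags) dictionary entries.

    rules maps a flag to a list of (strip, add) pairs; an empty strip
    means nothing is removed. A stem carrying the needaffix flag is
    never emitted on its own, only in affixed form.
    """
    words = set()
    for stem, flags in entries:
        if needaffix not in flags:
            words.add(stem)
        for flag in flags:
            for strip, add in rules.get(flag, []):
                if stem.endswith(strip):
                    words.add(stem[: len(stem) - len(strip)] + add)
    return words


# (1) pseudoroot "ha" with the NEEDAFFIX flag "!"
rules_x = {"x": [("", "s"), ("", "d")]}
words1 = expand([("shake", "x"), ("ha", "x!"), ("have", "")], rules_x)

# (2) generate "-s" forms from the suffixed "-d" forms
rules_y = {"y": [("d", "s")]}
words2 = expand(
    [("shaked", "y"), ("shake", ""), ("had", "y"), ("have", "")], rules_y
)

# (3) irregular forms as separate dictionary items
words3 = expand([("shake", "x"), ("had", ""), ("has", ""), ("have", "")], rules_x)

print(sorted(words1))
print(words1 == words2 == words3)
```

In this model all three layouts give the same six forms, so the choice
between them is about table size and analysis cost, not about coverage.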

The Hungarian dictionary uses (2) for irregular nouns and (3) for many
irregular verbs.

Regards,

László



2008/8/20 Olivier R. <[EMAIL PROTECTED]>

> What is the best way to define affix rules?
>
>
> For example:
>
> Flag S. defines how to make plural forms in French:
> One rule.
>
> SFX S. Y 1
> SFX S.   0          s                   [^sxz]              /pl
>
>
> Other flags often call S. to generate their plural inflections, e.g.:
>
> SFX F. N 36
> SFX F.   0          0                   .                   .fem
> SFX F.   0          s                   [eë]                .fem/pl
> SFX F.   e          0/S.                [éiï]e              .mas
> SFX F.   rice       eur/S.              [dt]rice            .mas
> SFX F.   e          0/S.                de                  .mas
> SFX F.   fe         0/S.                ffe                 .mas
> SFX F.   he         0/S.                [^è]che             .mas
> SFX F.   èche       ec/S.               èche                .mas
> SFX F.   e          0/S.                [ut]he              .mas
> SFX F.   e          0/S.                ke                  .mas
> SFX F.   e          0/S.                ale                 .mas
> SFX F.   e          0/S.                [iouû]le            .mas
> SFX F.   le         0/S.                [eiu]lle            .mas
> SFX F.   e          0/S.                [aiou]ne            .mas
> SFX F.   ne         0/S.                [aeo]nne            .mas
> SFX F.   gne        n/S.                igne                .mas
> SFX F.   e          0/S.                [aiuûy]re           .mas
> SFX F.   ère        er/S.               ère                 .mas
> SFX F.   e          0                   [^us]se             .mas.inv
> SFX F.   sse        0/S.                [^eo].esse          .mas
> SFX F.   resse      ur/S.               eresse              .mas
> SFX F.   oresse     eur/S.              oresse              .mas
> SFX F.   se         0                   [^e]sse             .mas.inv
> SFX F.   e          0                   [^eo]use            .mas.inv
> SFX F.   se         r/S.                euse                .mas
> SFX F.   e          0/S.                [^èt]te             .mas
> SFX F.   te         0/S.                tte                 .mas
> SFX F.   ète        et/S.               ète                 .mas
> SFX F.   e          0/S.                [^gq]ue             .mas
> SFX F.   ue         0/S.                gue                 .mas
> SFX F.   que        0/S.                cque                .mas
> SFX F.   que        c/S.                [^c]que             .mas
> SFX F.   ève        ef/S.               ève                 .mas
> SFX F.   ve         f/S.                [iïu]ve             .mas
> SFX F.   ë          0/S.                uë                  .mas
> SFX F.   üe         u/S.                üe                  .mas
>
>
> But I could write F. differently, like this:
>
> SFX F' Y 68
> SFX F'   0          0                   .                   .fem
> SFX F'   0          s                   [eë]                .fem/pl
> SFX F'   e          0                   [éiï]e              .mas
> SFX F'   e          s                   [éiï]e              .mas/pl
> SFX F'   rice       eur                 [dt]rice            .mas
> SFX F'   rice       eurs                [dt]rice            .mas/pl
> SFX F'   e          0                   de                  .mas
> SFX F'   e          s                   de                  .mas/pl
> SFX F'   fe         0                   ffe                 .mas
> SFX F'   fe         s                   ffe                 .mas/pl
> SFX F'   he         0                   [^è]che             .mas
> SFX F'   he         s                   [^è]che             .mas/pl
> SFX F'   èche       ec                  èche                .mas
> SFX F'   èche       ecs                 èche                .mas/pl
> SFX F'   e          0                   [ut]he              .mas
> SFX F'   e          s                   [ut]he              .mas/pl
> SFX F'   e          0                   ke                  .mas
> SFX F'   e          s                   ke                  .mas/pl
> SFX F'   e          0                   ale                 .mas
> SFX F'   e          s                   ale                 .mas/pl
> SFX F'   e          0                   [iouû]le            .mas
> SFX F'   e          s                   [iouû]le            .mas/pl
> SFX F'   le         0                   [eiu]lle            .mas
> SFX F'   le         s                   [eiu]lle            .mas/pl
> SFX F'   e          0                   [aiou]ne            .mas
> SFX F'   e          s                   [aiou]ne            .mas/pl
> SFX F'   ne         0                   [aeo]nne            .mas
> SFX F'   ne         s                   [aeo]nne            .mas/pl
> SFX F'   gne        n                   igne                .mas
> SFX F'   gne        ns                  igne                .mas/pl
> SFX F'   e          0                   [aiuûy]re           .mas
> SFX F'   e          s                   [aiuûy]re           .mas/pl
> SFX F'   ère        er                  ère                 .mas
> SFX F'   ère        ers                 ère                 .mas/pl
> SFX F'   e          0                   [^us]se             .mas.inv
> SFX F'   sse        0                   [^eo].esse          .mas
> SFX F'   sse        s                   [^eo].esse          .mas/pl
> SFX F'   resse      ur                  eresse              .mas
> SFX F'   resse      urs                 eresse              .mas/pl
> SFX F'   oresse     eur                 oresse              .mas
> SFX F'   oresse     eurs                oresse              .mas/pl
> SFX F'   se         0                   [^e]sse             .mas.inv
> SFX F'   e          0                   [^eo]use            .mas.inv
> SFX F'   se         r                   euse                .mas
> SFX F'   se         rs                  euse                .mas/pl
> SFX F'   e          0                   [^èt]te             .mas
> SFX F'   e          s                   [^èt]te             .mas/pl
> SFX F'   te         0                   tte                 .mas
> SFX F'   te         s                   tte                 .mas/pl
> SFX F'   ète        et                  ète                 .mas
> SFX F'   ète        ets                 ète                 .mas/pl
> SFX F'   ète        et                  ète                 .mas
> SFX F'   ète        ets                 ète                 .mas/pl
> SFX F'   e          0                   [^gq]ue             .mas
> SFX F'   e          s                   [^gq]ue             .mas/pl
> SFX F'   ue         0                   gue                 .mas
> SFX F'   ue         s                   gue                 .mas/pl
> SFX F'   que        0                   cque                .mas
> SFX F'   que        s                   cque                .mas/pl
> SFX F'   que        c                   [^c]que             .mas
> SFX F'   que        cs                  [^c]que             .mas/pl
> SFX F'   ève        ef                  ève                 .mas
> SFX F'   ève        efs                 ève                 .mas/pl
> SFX F'   ve         f                   [iïu]ve             .mas
> SFX F'   ve         fs                  [iïu]ve             .mas/pl
> SFX F'   ë          0                   uë                  .mas
> SFX F'   ë          s                   uë                  .mas/pl
> SFX F'   üe         u                   üe                  .mas
> SFX F'   üe         us                  üe                  .mas/pl
>
> F' does the same thing as F.
>
> Which of the two is better?
>
>
> Regards,
> Olivier
