Hi Eduardo,

Eduardo Santana wrote:

>It's a little difficult to understand.
>
>I had read the affix.readme before. But it doesn't say
>anything about the PFX or REP flag. Should we use them
>in our aff file?
>  
>
The PFX flag is similar to the SFX flag, except that it describes parts
of words that are added in front of a word rather than at the end.
In Portuguese, examples of prefixes would be "re", "de", "a", "con",
"pro", "contra" (you can probably think of more and better ones).
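
As a rough illustration of what a single PFX rule does, here is a small
Python sketch. The rule layout shown in the comment follows the affix
file format as I understand it (characters to strip, characters to add,
condition); the flag letter "A" and the helper name apply_prefix are
made up for this example and are not part of myspell.

```python
# Hypothetical affix file fragment (flag letter "A" chosen arbitrarily):
#   PFX A Y 1
#   PFX A 0 re .
# Meaning: strip nothing ("0"), add "re", for any word (condition ".").
import re


def apply_prefix(word, strip, add, condition):
    """Return the prefixed form, or None if the rule does not apply."""
    strip = "" if strip == "0" else strip
    # For a prefix rule the condition is checked at the start of the word.
    if not re.match(condition, word) or not word.startswith(strip):
        return None
    return add + word[len(strip):]


print(apply_prefix("fazer", "0", "re", "."))  # refazer
```

A dictionary entry carrying this flag would then accept both "fazer"
and "refazer".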

The REP command is useful because the normal suggestion mechanism can
only handle relatively simple spelling errors (one letter wrong, one
letter missing, one letter too many, two letters exchanged).
With the REP command you can instruct the suggestion mechanism to also
try replacing a specified character sequence with another one. This
helps, for example, with letter combinations that are often misspelled
because they sound similar, or with whole words that are often
misspelled.
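
To make the idea concrete, here is a minimal Python sketch of how a
replacement table could produce extra suggestions. The two pairs in the
table, the tiny word set, and the function name rep_candidates are
assumptions chosen for illustration; this is not the actual myspell
code.

```python
# Hypothetical REP table (the pairs are illustrative only):
#   REP 2
#   REP ss ç
#   REP ç ss
REP_TABLE = [("ss", "ç"), ("ç", "ss")]


def rep_candidates(word, table=REP_TABLE):
    """Yield each word obtained by replacing one occurrence of a listed sequence."""
    for wrong, right in table:
        start = word.find(wrong)
        while start != -1:
            yield word[:start] + right + word[start + len(wrong):]
            start = word.find(wrong, start + 1)


# Stand-in for the real dictionary lookup.
known_words = {"cabeça", "passo"}
suggestions = [c for c in rep_candidates("cabessa") if c in known_words]
print(suggestions)  # ['cabeça']
```

Getting from "cabessa" to "cabeça" takes more than one single-letter
change, so it is the kind of correction the simple mechanism alone
would miss.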

>Also, the .aff file has no comment! Can we add comment
>to that?
>
>Isn't there better documentation somewhere?
>  
>
Some more explanation:

The reason to use prefixes and suffixes is that in most languages,
including Portuguese, many words are derived from a basic form in a
regular way. For example: andar, ando, andei, anda, andam, andamos,
andando, ...
Instead of listing each of these forms, an affix-compressed dictionary
needs to list only the basic form, together with flags that indicate
which other forms are recognized. This makes the dictionary much
smaller.
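
The expansion of one flagged entry can be sketched in Python as
follows. The flag letter "V" and this particular rule set are invented
for the example; a real affix file for Portuguese would need many more
rules and conditions.

```python
# Hypothetical SFX rules for an invented flag "V".
# Each rule is (strip, add, condition); the condition is matched at the word end.
import re

SFX_V = [
    ("ar", "o", "ar"),
    ("ar", "ei", "ar"),
    ("ar", "a", "ar"),
    ("ar", "am", "ar"),
    ("ar", "amos", "ar"),
    ("ar", "ando", "ar"),
]


def expand(base, rules):
    """Return the base form plus every form the rules derive from it."""
    forms = [base]
    for strip, add, condition in rules:
        if re.search(condition + "$", base) and base.endswith(strip):
            forms.append(base[: len(base) - len(strip)] + add)
    return forms


# The single dictionary entry "andar/V" stands for all of these:
print(expand("andar", SFX_V))
# ['andar', 'ando', 'andei', 'anda', 'andam', 'andamos', 'andando']
```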

The affix file specifies the rules for making these other forms, and is
typically based on observations about the grammar of the language in
question. The more commonly a prefix or suffix is used in a language,
the more efficiency is gained by including it in the affix file. So
there is no "right" or "wrong" set of affixes; it is just that if the
set of affixes is cleverly chosen, the dictionary file will be
compressed more efficiently.

You can automatically generate the affix-compressed dictionary from a
plain list of words that contains *all* the word forms, together with
an affix file, using the "munch" utility that is included in myspell.
The reverse is the "unmunch" utility, which you can use to get the full
plain word list back.
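
The principle behind this compression can be sketched in a few lines of
Python. This is only a simplified illustration with one invented flag
"V" and suffix rules without conditions; it is not how the real munch
utility is implemented, and the function name munch_sketch is made up.

```python
def derive(base, rules):
    """Return the forms that the (strip, add) suffix rules derive from base."""
    return [base[: len(base) - len(strip)] + add
            for strip, add in rules if base.endswith(strip)]


def munch_sketch(words, rules, flag):
    """Replace groups of derived forms by a single flagged base entry."""
    words = set(words)
    # A word qualifies as a base if every form it derives is in the list.
    bases = [w for w in sorted(words)
             if derive(w, rules) and all(d in words for d in derive(w, rules))]
    covered = {d for b in bases for d in derive(b, rules)}
    plain = [w for w in words if w not in bases and w not in covered]
    return sorted([b + "/" + flag for b in bases] + plain)


full_list = ["andar", "ando", "andei", "anda", "casa"]
rules_v = [("ar", "o"), ("ar", "ei"), ("ar", "a")]
print(munch_sketch(full_list, rules_v, "V"))  # ['andar/V', 'casa']
```

Five words shrink to two entries here; expanding "andar/V" with the
same rules gives the original forms back.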

I hope this makes it a bit clearer. Don't hesitate to ask if you have
further questions!

Atenciosamente,

Simon Brouwer.

>>> nl.openoffice.org <<<

