Hi Eduardo,

Eduardo Santana wrote:
>It's a little difficult to understand.
>
>I had read the affix.readme before. But it doesn't say
>anything about the PFX or REP flag. Should we use them
>in our aff file?

The PFX flag is similar to the SFX flag, except that it describes parts of words that are added in front of a word rather than at the end. In Portuguese, examples of prefixes would be "re", "de", "a", "con", "pro", "contra" (you can probably think of more and better ones).

The REP command is useful because the normal suggestion mechanism can only handle relatively simple spelling errors (one letter wrong, one letter missing, one letter too many, two letters exchanged). With the REP command you can instruct the suggestion mechanism to also try replacing a specified character sequence with another one. This helps with, for example, letter combinations that are often misspelled because they sound similar, or whole words that are often misspelled.

>Also, the .aff file has no comments! Can we add comments
>to that?
>
>Isn't there better documentation somewhere?

Some more explanation:

The reason to use prefixes and suffixes is that in most languages, including Portuguese, many words are derived from a basic form in a regular way. For example: andar, ando, andei, anda, andam, andamos, andando, ...

Instead of listing each of these forms, an affix-compressed dictionary needs to list only the basic form, together with flags that indicate which other forms are recognized. Storing one entry plus a few flags instead of every derived form is what makes the dictionary smaller.

The affix file specifies the rules for making these other forms, and is typically based on observations about the grammar of the language in question. The more commonly a prefix or suffix is used in a language, the more space is saved by including it in the affix file. So there is no "right" or "wrong" set of affixes; it is just that if the set of affixes is cleverly chosen, the dictionary file will be compressed more efficiently.

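To make the two mechanisms more concrete, here is a minimal sketch in Python of how a flagged dictionary entry expands into word forms and how a REP table yields extra suggestion candidates. This is not MySpell's actual code. The flag name "V", the rule set, the REP pairs, and the function names expand and rep_candidates are all made up for illustration; only the field layout (strip, add, condition) follows the SFX lines of an affix file.

```python
import re

# Hypothetical rules for regular -ar verbs, as they might appear in an .aff file:
#   SFX V Y 6
#   SFX V ar o    ar
#   SFX V ar ei   ar
#   ... and so on.
# Each tuple is (strip, add, condition); an empty strip means "remove nothing".
SUFFIX_RULES = {
    "V": [
        ("ar", "o", "ar"),
        ("ar", "ei", "ar"),
        ("ar", "a", "ar"),
        ("ar", "am", "ar"),
        ("ar", "amos", "ar"),
        ("ar", "ando", "ar"),
    ],
}

# Hypothetical REP table: sequences that sound alike and get confused.
REP_TABLE = [("ss", "ç"), ("ç", "ss")]


def expand(entry, rules=SUFFIX_RULES):
    """Expand a dictionary entry such as 'andar/V' into all its word forms."""
    word, _, flags = entry.partition("/")
    forms = [word]
    for flag in flags:
        for strip, add, condition in rules.get(flag, []):
            # The rule applies only if the word ends with the condition pattern.
            if not re.search(condition + "$", word):
                continue
            stem = word[:-len(strip)] if strip else word
            forms.append(stem + add)
    return forms


def rep_candidates(word, table=REP_TABLE):
    """Return the candidates made by applying each REP pair at each position."""
    candidates = []
    for wrong, right in table:
        start = word.find(wrong)
        while start != -1:
            candidates.append(word[:start] + right + word[start + len(wrong):])
            start = word.find(wrong, start + 1)
    return candidates


print(expand("andar/V"))
# ['andar', 'ando', 'andei', 'anda', 'andam', 'andamos', 'andando']
print(rep_candidates("cabessa"))
# ['cabeça']
```

A real checker would keep only those REP candidates that are themselves accepted by the dictionary; the sketch leaves that step out.
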
You can generate the affix-compressed dictionary automatically from a plain word list that contains *all* the word forms, together with an affix file, using the "munch" utility that is included in MySpell. The reverse is the "unmunch" utility, which you can use to get the full plain word list back.

I hope this makes it a bit clearer. Don't hesitate to ask if you have further questions!

Kind regards,

Simon Brouwer.

>>> nl.openoffice.org <<<
