Thanks, Pablo, for your points.

On 7 April 2016 at 11:07, Pablo Saratxaga <pa...@walon.org> wrote:
> On Wed, Apr 06, 2016 at 02:55:52PM +0200, Juan Martorell wrote:
>
> > It is quite common to attach some pronouns to the verb thus including
> > information about direct and/or indirect object, or passive/impersonal
> > voice. Combinations are huge, some like:
>
> but for a proper synthetisation, the verb itself has to be correctly
> tagged first, so to know if a pronoun can be added, or if two can be added.
>
> blind automatic generation will lead to a huge mass of incorrect forms.

I'd like to see how huge it will be. Some rare or infrequent forms will surely do no harm, but I'm not sure how harmful some incorrect forms can be for our purpose.

> For example "morirteme" would be wrong.

This counterexample is not the best, IMHO. Google gives some results. The correct spelling includes a diacritic (*morírteme*), but the point is that this word, though rare, is grammatically correct and makes full sense in its context. Consider:

"*Por muy enfermo que estés, ni se te ocurra morírteme ahora.*" (However ill you are, don't you dare die on me now.)

"*Con el viento que hace, si sales así vestido vas a morírteme de frío. Ponte una chaqueta, anda.*" (With this wind, if you go out dressed like that you are going to die of cold on me. Put a jacket on, come on.)

> With the prefixes it is even more difficult, as the adequateness of
> a prefix depends not only on grammatical properties, but also on
> meaning and usage.
>
> For example, while desforestar, deshacer are ok; desmorir, descaer are odd.
> I think automatic use of prefixes (that is, add them to *ALL* verbs) would
> be wrong.

Agreed to some extent. Even though "*desforestar*" is valid, "*deforestar*" is the preferred spelling. "*descaer*" is in the RAE's dictionary, with "*decaer*" being the most used spelling. Following your example, even though *desmorir* is not in the RAE's dictionary, it may be a neologism with figurative content that conveys sense to the reader.
I mean, for religious or philosophical texts, *desmorir* (to *undie*) can make sense when talking about alternative timelines, where "*resucitar*" (to resurrect) makes less sense, being active and losing the undo sense of *undying*.

Bottom line, the point of grammar proofreading is more the syntax than spelling or semantics, so it would be worthwhile to allow some flexibility while giving a mild warning on rare forms. This setting may be tuned via category activation.

This is a good case for statistical insertion:

1. produce the word
2. check it against the word database created from a large corpus
3. decide its insertion based on its frequency

> My approach would be to define some tags to apply to verbs (nouns, etc)
> that can accept a given prefix.

This is compatible with statistical insertion, IMO.
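
To make the three steps above concrete, here is a minimal sketch in Python. It is not LanguageTool code: the function names, the `accept_at` threshold, the three outcome labels and the toy frequency counts are all assumptions made up for illustration. The clitic attachment is deliberately simplified to infinitives, where adding two pronouns moves the written accent onto the infinitive ending (*morir* + *te* + *me* gives *morírteme*).

```python
from collections import Counter

# Maps the stressed vowel of an -ar/-er/-ir ending to its accented form.
ACCENTED = str.maketrans("aei", "áéí")


def attach_clitics(infinitive, clitics):
    """Step 1: produce the word (simplified; infinitives only)."""
    stem = infinitive
    if len(clitics) >= 2:
        # With two enclitics the stress falls on the antepenultimate
        # syllable, so the vowel of the infinitive ending needs an accent.
        stem = infinitive[:-2] + infinitive[-2].translate(ACCENTED) + infinitive[-1]
    return stem + "".join(clitics)


def classify(word, corpus_counts, accept_at=5):
    """Steps 2 and 3: look the word up and decide based on its frequency.

    accept_at is a hypothetical threshold; a real one would be tuned
    against the corpus size.
    """
    count = corpus_counts[word]  # a Counter returns 0 for unseen words
    if count >= accept_at:
        return "accept"   # frequent enough: insert silently
    if count > 0:
        return "warn"     # attested but rare: insert, flag with a mild warning
    return "reject"       # never seen in the corpus: do not insert


# Toy counts standing in for a database built from a large corpus.
# The numbers are invented for the example.
corpus_counts = Counter({"morirte": 12, "morírteme": 2})

for clitics in (["te"], ["te", "me"]):
    word = attach_clitics("morir", clitics)
    print(word, classify(word, corpus_counts))
```

The "warn" outcome is where the flexibility mentioned above would live: rare but attested forms are kept and could be reported under a category that users switch on or off. Tags stating which verbs accept a given prefix or pronoun could be applied before step 1 to limit what gets generated at all.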
------------------------------------------------------------------------------
_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel