On Sun, 13 Nov 2011 at 00:18 +0000, Jimmy O'Regan wrote:
> On 12 November 2011 21:15, Kevin Donnelly <[email protected]> wrote:
> > [SNIP]
>
> > In effect, you are splitting words artificially (not along
> > linguistically-accepted lines) on input, so that you can put them
> > back together again at lookup. It would be simpler just to enter
> > and look up a full-form word.
>
> Given your complaints, elsewhere in the email, about Spanish enclitic
> pronouns, I have to wonder to what, specifically, are you referring
> here?
>
> As you mention full-form words, perhaps you're not aware that
> paradigms are not obligatory? We could just as easily stick full-form
> lists in XML, and they will compile just as well as entries with
> paradigms. What's more, the compiled binary representations of both
> will be identical.

In fact, this is what we do for Maltese verbs.

> So if your concern is that where there is an entry that consists of,
> say, the string "deput" + the paradigm "bab/y__n", that the runtime
> first looks up "deput", then looks up some abstract representation of
> the paradigm... let me assure you that this is not the case.
>
> If, on the other hand, you're referring to how we segment something
> like dímelo into decir+me+lo... saying that it's "not along
> linguistically-accepted lines" may be a neat rhetorical device, but
> it's not true.

I think he might rather be referring to, e.g., splitting "man" into
"m" + "an"/"en", or "Haus" into "H" + "aus"/"äuser", etc., rather than
to enclitic pronouns.

Fran
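
To illustrate the quoted point that a stem + paradigm entry and a full-form list end up equivalent once compiled, here is a rough Python sketch. It is a toy model, not lttoolbox code: the contents of the "bab/y__n" paradigm (y / ies), the tag strings, and the names expand and build_lookup are all assumptions made up for the example, and a plain dictionary stands in for the compiled transducer.

```python
# Toy model only: this is not lttoolbox code, and the paradigm contents
# below are assumed for illustration (baby/babies-style noun inflection).
PARADIGMS = {
    "bab/y__n": [("y", "y<n><sg>"), ("ies", "y<n><pl>")],
}


def expand(stem, paradigm_name, paradigms=PARADIGMS):
    """Return the set of (surface form, analysis) pairs for stem + paradigm."""
    return {
        (stem + surface_suffix, stem + analysis_suffix)
        for surface_suffix, analysis_suffix in paradigms[paradigm_name]
    }


# The same entries written out by hand as a full-form list.
FULL_FORMS = {
    ("deputy", "deputy<n><sg>"),
    ("deputies", "deputy<n><pl>"),
}


def build_lookup(pairs):
    """Stand-in for compilation: map each surface form to its analyses."""
    table = {}
    for surface, analysis in pairs:
        table.setdefault(surface, set()).add(analysis)
    return table


from_paradigm = build_lookup(expand("deput", "bab/y__n"))
from_full_forms = build_lookup(FULL_FORMS)

# Both routes give the same table; lookup never sees "deput" on its own.
print(from_paradigm == from_full_forms)
```

In this sketch the paradigm is expanded before the table is built, so at lookup time only whole words such as "deputies" are matched, which is the behaviour the quoted message describes.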
