Re: [Apertium-stuff] Dictionaries, coverage and other dull tasks

Jimmy O'Regan Sat, 12 Nov 2011 16:19:00 -0800

On 12 November 2011 21:15, Kevin Donnelly <[email protected]> wrote:

[SNIP]


> In effect, you are splitting words artificially (not along linguistically-
> accepted lines) on input, so that you can put them back together again at
> lookup.  It would be simpler just to enter and look up a full-form word.

Given your complaints, elsewhere in the email, about Spanish enclitic
pronouns, I have to wonder what, specifically, you are referring to
here.

As you mention full-form words, perhaps you're not aware that
paradigms are not obligatory? We could just as easily stick full-form
lists in XML, and they will compile just as well as entries with
paradigms. What's more, the compiled binary representations of both
will be identical.

So if your concern is that, where there is an entry that consists of,
say, the string "deput" + the paradigm "bab/y__n", the runtime
first looks up "deput", then looks up some abstract representation of
the paradigm... let me assure you that this is not the case: paradigms
are resolved when the dictionary is compiled, not at lookup.
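
To make that concrete, here is a rough sketch in Python. It is a toy
model, not lttoolbox's actual code or data structures: the function
name, the plain lookup table, and the contents given for "bab/y__n"
are all assumptions for illustration. It only shows the idea that a
stem + paradigm entry and the equivalent full-form entries come out
the same once compiled.

```python
# Toy model of dictionary compilation. NOT lttoolbox itself: the real
# compiler builds a finite-state transducer, not a Python dict.

# Assumed contents of the paradigm, for illustration only:
# each pair is (surface suffix, analysis suffix).
PARADIGMS = {
    "bab/y__n": [("y", "y<n><sg>"), ("ies", "y<n><pl>")],
}


def compile_entries(entries):
    """Expand entries into a surface form -> analysis table.

    Each entry is (left, right, paradigm). With paradigm=None the entry
    is a full form: `left` is the surface form, `right` the analysis.
    """
    table = {}
    for left, right, paradigm in entries:
        if paradigm is None:
            table[left] = right
        else:
            # The paradigm is expanded here, at compile time.
            for surface_suffix, analysis_suffix in PARADIGMS[paradigm]:
                table[left + surface_suffix] = right + analysis_suffix
    return table


# One entry using a paradigm...
with_paradigm = compile_entries([("deput", "deput", "bab/y__n")])

# ...versus the same words entered as full forms.
full_forms = compile_entries([
    ("deputy", "deputy<n><sg>", None),
    ("deputies", "deputy<n><pl>", None),
])

# The compiled results are identical, so lookup cannot tell them apart.
assert with_paradigm == full_forms
```

Lookup then only ever consults the compiled table; nothing at that
stage knows whether an entry was written with a paradigm or not.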

If, on the other hand, you're referring to how we segment something
like dímelo into decir+me+lo... saying that it's "not along
linguistically-accepted lines" may be a neat rhetorical device, but
it's not true.

-- 
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you

