On Tue, 13 Nov 2012 at 11:09 +0100, Per Tunedal wrote:
> Hi again,
>
> "This left-to-right, longest-match way of functioning makes it very easy
> to treat (variable or invariable) multi-word units (MWUs), for input: if a
> MWU is not complete, the acceptance state reached will correspond to a
> smaller unit, which will be clipped and whose transduction will be output
> (for example, if the dictionary contains "George" and the MWU "George
> Washington", when reading "George W. Bush" the MWU "George Washington"
> will abort at the ".", the transduction of "George" will be output and the
> analyser will be ready to process the remaining text, " W. Bush")."
>
> In that case, how does it work with abbreviations? The transducer tries
> the MWU "t.ex."? It knows not to split on punctuation marks, if they
> can be found in a MWU?

Simplifying, yes.

> If so, I might name the tags for the abbreviations with whatever I find
> suitable, couldn't I? And multi-word expressions with other punctuation
> marks would work as well, I presume.

Yes, you can in theory name them what you like. But the norm is to use
<abbr> like I have done.

Fran

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
