Hi again,

"This left-to-right, longest-match way of functioning makes it very easy
to treat (variable or invariable) multi-word units (MWUs), for input: if
a MWU is not complete, the acceptance state reached will correspond to a
smaller unit, which will be clipped and whose transduction will be output
(for example, if the dictionary contains 'George' and the MWU 'George
Washington', when reading 'George W. Bush' the MWU 'George Washington'
will abort at the '.', the transduction of 'George' will be output and
the analyser will be ready to process the remaining text, ' W. Bush')."
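
To check my reading of the quoted passage, here is a minimal Python
sketch of left-to-right longest match. It is not Apertium's code: the toy
dictionary, the tag strings and the names `lrlm_analyse` and
`is_boundary` are invented for illustration, and a plain dict stands in
for the compiled transducer.

```python
# Toy dictionary: surface form -> analysis. Entries and tags are invented.
DICTIONARY = {
    "George": "George<np>",
    "George Washington": "George Washington<np>",
    "Bush": "Bush<np>",
}


def is_boundary(text, pos):
    """A match may end only at end of input or before a non-alphanumeric char."""
    return pos == len(text) or not text[pos].isalnum()


def lrlm_analyse(text, dictionary=DICTIONARY):
    """Return (surface, analysis) pairs; analysis is None for unknown material."""
    max_len = max(len(entry) for entry in dictionary)
    out = []
    i = 0
    while i < len(text):
        best = None
        # Try the longest candidate first and stop at the first hit.
        for end in range(min(len(text), i + max_len), i, -1):
            if text[i:end] in dictionary and is_boundary(text, end):
                best = end
                break
        if best is None:
            # Nothing matches here: pass through one unknown word
            # (or one non-alphanumeric character) unanalysed.
            end = i + 1
            while text[i].isalnum() and end < len(text) and text[end].isalnum():
                end += 1
            out.append((text[i:end], None))
            i = end
        else:
            out.append((text[i:best], dictionary[text[i:best]]))
            i = best
    return out


print(lrlm_analyse("George W. Bush"))
# [('George', 'George<np>'), (' ', None), ('W', None), ('.', None),
#  (' ', None), ('Bush', 'Bush<np>')]
```

With "George Washington" as input the same function returns the whole MWU
as one unit, so the shorter entry 'George' is only used as a fallback.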

In that case, how does it work with abbreviations? Does the transducer
try to match the MWU "t.ex."? Does it know not to split on punctuation
marks when they can be part of a MWU?

If so, I could name the tags for the abbreviations whatever I find
suitable, couldn't I? And multi-word expressions containing other
punctuation marks would work as well, I presume.
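
This is how I picture the abbreviation case, as a small Python sketch.
Everything in it is an assumption on my part: the entries, the tag name
"abbr" and the function `tokenise` are hypothetical, and an ordered
regular expression stands in for the real transducer. Entries are sorted
longest first so that the alternation prefers the longest dictionary
match at each position.

```python
import re

# Hypothetical entries: surface form -> tag. The tag name is an arbitrary label.
ABBREVIATIONS = {
    "t.ex.": "abbr",
    "o.s.v.": "abbr",
    "bl.a.": "abbr",
}

# Dictionary entries come first (longest first), then plain words,
# then any single non-space character such as a sentence-final period.
_entries = sorted(ABBREVIATIONS, key=len, reverse=True)
TOKEN_RE = re.compile("|".join(re.escape(e) for e in _entries) + r"|\w+|\S")


def tokenise(text):
    """Return (token, tag) pairs; tag is None when the token is not an entry."""
    return [(m.group(), ABBREVIATIONS.get(m.group()))
            for m in TOKEN_RE.finditer(text)]


print(tokenise("Vi kan t.ex. sova."))
# [('Vi', None), ('kan', None), ('t.ex.', 'abbr'), ('sova', None), ('.', None)]
```

The periods inside "t.ex." stay with the entry, while the final period,
which belongs to no entry, comes out as a token of its own.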

Yours,
Per Tunedal

On Tue, Nov 13, 2012, at 9:57, Francis Tyers wrote:
> El dt 13 de 11 de 2012 a les 09:31 +0100, en/na Per Tunedal va escriure:
--snip--
> > Secondly, I'm curious how Apertium handles word splitting. The periods
> > in abbreviations must be handled somehow, mustn't they? I just thought
> > about simple scripts for aligning, like Bligner, or even OmegaT. They
> > split sentences at punctuation marks. Thus, they have a list of what not
> > to split, i.e. the abbreviations for the languages concerned. That's
> > why I started this thread. How does Apertium know not to split? Does the
> > tagger look for the tag <abbr>? Is this a standard solution for
> > Apertium? Or do I have to add it in each language pair somehow?
> 
> Left-to-right longest match with tokenise-as-you-analyse.
> 
> http://www.dlsi.ua.es/~mlf/docum/garrido02p.pdf
> 
> Section 3 describes it.
> 
> Fran
> 
> 
> ------------------------------------------------------------------------------
> Monitor your physical, virtual and cloud infrastructure from a single
> web console. Get in-depth insight into apps, servers, databases, vmware,
> SAP, cloud infrastructure, etc. Download 30-day Free Trial.
> Pricing starts from $795 for 25 servers or applications!
> http://p.sf.net/sfu/zoho_dev2dev_nov
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
