Hi again,
please see my comments below.
Per Tunedal
On Sun, Nov 18, 2012, at 14:22, Francis Tyers wrote:
> El dg 18 de 11 de 2012 a les 12:01 +0100, en/na Per Tunedal va escriure:
> > Hi,
> > I've started some work on abbreviations for the pair Swedish - Danish
> > (sv-da). I imagined that this would be a well-defined, small task of
> > limited scope, but it isn't.
> >
> > Some questions:
> >
> > 1. Internationally accepted standard abbreviations
> >
> > I'm talking about abbreviations for countries (e.g. SE for Sweden), for
> > languages (e.g. sv for Swedish), for currencies (e.g. SEK for svenska
> > kronor - Swedish crowns), for measurement units (e.g. kg or m) and
> > possibly some other domains. These abbreviations are common to all
> > languages and it seems unnecessary to add them to all language pairs.
> > Couldn't they be included by default in Apertium? Now they are marked as
> > unknown words and thus get unnecessary attention when post editing.
>
> There are complications to "including them by default": for example, "kg" and
> "m" are not the same in all languages. For some languages, codes may
> have unexpected ambiguity with other words, thus decreasing the
> performance of the tagger. So no, I don't think it's a good idea to
> include them by default.
>
> A good idea would perhaps be to have a page on the Wiki with a list
> that people can copy, paste and check. Or a script in the
> trunk/apertium-tools which autogenerates dictionary entries for common
> abbreviations which can then be manually checked before being added.
>
Is this a suitable GCI task?
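
To make the idea concrete, here is a rough sketch of what such a script
might look like. It is only an illustration under assumptions: the
function names are made up, the tag name "abbr" is a guess at what the
pair's dictionaries use for abbreviations, and the entry layout follows
the usual bilingual dictionary form as I understand it. The output is
meant to be checked by hand before anything is added.

```python
from xml.sax.saxutils import escape, quoteattr


def bidix_entry(left, right, tag="abbr"):
    """Build one bilingual dictionary entry as an XML string.

    The tag name "abbr" is an assumption; use whatever symbol the
    pair's dictionaries actually define for abbreviations.
    """
    symbol = "<s n=%s/>" % quoteattr(tag)
    # Escape XML special characters, then mark spaces with <b/>.
    l_text = escape(left).replace(" ", "<b/>")
    r_text = escape(right).replace(" ", "<b/>")
    return "<e><p><l>%s%s</l><r>%s%s</r></p></e>" % (
        l_text, symbol, r_text, symbol)


def generate_entries(pairs):
    """Turn (source, target) abbreviation pairs into candidate entries."""
    return [bidix_entry(src, trg) for src, trg in pairs]


if __name__ == "__main__":
    # Hypothetical candidate list; a real one would come from a wiki
    # page or a standards table and still need manual review.
    candidates = [("kg", "kg"), ("SEK", "SEK"), ("km", "km")]
    for entry in generate_entries(candidates):
        print(entry)
```

This only covers the easy case where the abbreviation is the same on
both sides. Ambiguity with ordinary words, as Fran points out, would
still have to be caught in the manual check.
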
> > 2. Abbreviations that have no equivalent in the other language of
> > the pair and have to be translated by an expression.
> > How should they be treated?
>
> No idea, give some examples.
Well, you might often abbreviate a word or an expression in one
language, but never dream of abbreviating it in the other (or be
reluctant to do so). Example:
Danish "ejd." = Swedish "egendom". Furthermore, there are phenomena in
one country that simply don't exist in the other, e.g. Danish
"D." = Swedish "-" (we don't have any "Dannebrogsordenen" in Sweden!)
or even Danish "AB" = Swedish "-". It doesn't exist, so I have to use
the Danish full word "andelsboligforening" or some related Swedish
phenomenon like "bostadsrättsförening" (which can be abbreviated to
"bfr.") or "bostadsförening" (very rare, as only a few old ones exist,
because it is illegal to create new ones).
>
> > 3. Abbreviations that are used as prefixes.
> >
> > Many abbreviations are used just as prefixes, like "e-" (e.g. e-post =
> > e-mail) and "id-" (e.g. id-kort = ID, identity card).
> > How to treat them?
>
> This is a form of derivation. Add the derived words as dictionary
> entries.
>
In analogy with what you've already said about derivation. I see.
> Fran
>
>
> ------------------------------------------------------------------------------
> Monitor your physical, virtual and cloud infrastructure from a single
> web console. Get in-depth insight into apps, servers, databases, vmware,
> SAP, cloud infrastructure, etc. Download 30-day Free Trial.
> Pricing starts from $795 for 25 servers or applications!
> http://p.sf.net/sfu/zoho_dev2dev_nov
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff