Per Tunedal <[email protected]>
writes:

> Hi,
>
> On Sat, Aug 11, 2012, at 00:53, Jacob Nordfalk wrote:
>> 2012/8/10 Per Tunedal <[email protected]>

[...]

> What about the translation in the other direction, nb/nn to sv? Is there
> the same need for a Constraint Grammar? Or can I do without it? My
> original plans were to start developing translation in that direction.
> As a native Swede I would find it much easier to translate from
> Norwegian to Swedish: I wouldn't have to check that much in
> dictionaries. Besides, professional translators always translate into
> their mother tongue.

Likewise, people who work on MT are advised to begin by translating
into their mother tongue :) (that's also the reason for the state of
da→sv: no Swedes have worked on it).

You could simply copy the CGs over unchanged from apertium-nb-nn.

> Another conclusion from the discussion is that I need to create very
> large dictionaries, to make up for the fact that far fewer words are
> exactly the same in Norwegian and Swedish, compared to Norwegian bokmål
> (nb) and Norwegian nynorsk (nn). On the other hand: someone wrote that
> for comprehension, only a short list of difficult words is needed. My
> own conclusion is that it might turn out to be very useful with a "pair"
> Norwegian (nb/nn) to Swedish (sv) containing the most frequent words +
> words that are known to cause difficulties (including "false friends").
> Does anyone know how to figure out what words to include in the
> latter list? Collect personal experiences from experts like you?
>
> BTW I ran a few words from the nb frequency wordlist in
> Apertium-caffeine to translate with the da-sv pair. I expected to get a
> very low percentage of unknown words, given my experience with the hand
> cream translation. Unfortunately I got as many as 40 % unknown words.
>
> I planned to translate say the first 1500 or so words on the frequency
> list, to get the most important unknown words to work with for a start.
> I expected to get only a few hundred of them, now I'm not so sure any
> longer.

On deciding what words to work on first:

1. get all "closed category" words done (pronouns, determiners, question
words)

2. then make a frequency list of open category words (nouns, verbs,
adjectives, adverbs) and start adding from the top

Of course, if you have a list of false friends, give them priority, but
most such lists are short, so that part will hardly take up much time.
The main work in creating a related-languages pair is adding open
category translations.
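
As a rough illustration of step 2, here is a minimal Python sketch, not
anything shipped with Apertium. The closed-category set is a tiny
made-up sample (a real one would come from the monolingual dictionary),
and the function names are my own. The second function assumes the
translator output marks unknown words with a leading "*", so you can
measure how much of a text is still uncovered.

```python
import re
from collections import Counter

# Hypothetical, tiny sample of closed-category words; a real list would be
# far longer and would come from the monolingual dictionary.
CLOSED_CLASS = {"jeg", "du", "det", "en", "et", "hva", "hvem", "og", "i"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def open_class_frequencies(text, closed_class=CLOSED_CLASS):
    """Return (word, count) pairs, most frequent first, skipping closed-class words."""
    counts = Counter(t for t in tokenize(text) if t not in closed_class)
    # Sort by descending count, then alphabetically so ties come out stable.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def unknown_ratio(translated):
    """Share of whitespace-separated tokens that start with '*' (assumed unknown marker)."""
    tokens = translated.split()
    if not tokens:
        return 0.0
    unknown = sum(1 for t in tokens if t.startswith("*"))
    return unknown / len(tokens)
```

Feed open_class_frequencies() a reasonably large corpus and add entries
from the top of the result; rerun unknown_ratio() on translated text
now and then to see whether the share of unknown words is going down.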

> What about word order? I found a translation on my sun cream that has
> different word order for Danish (da) and Norwegian bokmål (nb). Just a
> coincidence or a fact to take into account?

What matters is whether it's different in nb and sv …

I doubt the grammatical constructions found in the Sun Cream Corpus are
very representative of normal language use; typically, when you've
worked a while on steps 1 and 2 above, you run some text (e.g. news
articles, Wikipedia) through the translator and find the most common
"odd" or plain wrong grammatical constructions; these you can fix with
transfer rules.

> Another concern of mine: will the solution (3) with separate
> monolingual dictionaries and a common bilingual dictionary work "out of
> the box" with Apertium, Apertium-caffeine and the OmegaT-plugin? Or does
> this solution imply some changes to the code? Apertium would have to
> find out somehow which monolingual dictionary to look into, wouldn't it?
> I intend to start to play around with Swedish (sv), Danish (da) and
> Norwegian bokmål (nb): can I test drive my dictionaries and rules?

It will work out of the box; it is the solution used by e.g. en-ca to
translate from ca to either en_GB or en_US.




-- 
Kevin Brubeck Unhammer

GPG: 0x766AC60C


_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
