Hi,

On Sat, Aug 11, 2012, at 00:53, Jacob Nordfalk wrote:
> 2012/8/10 Per Tunedal <[email protected]>
> 
> > >
> > > As Danish is a kind of old Norwegian bokmaal, maybe we could include that
> > > language too.
> > > Then all three languages could benefit from the combined work.
> >
> > You're right! My hand cream example comes to mind. I quote again:
> >
> > "On my hand cream I can read "N/D Intensivt mykgjørende/blødgørende
> > og pleiende håndkrem/håndcreme.""
> >
> 
> :-)
> 
> 
> >
> > Just for fun I tried to translate a text in Norwegian bokmål (nb) to
> > Swedish with the da-sv pair and just about half the words were marked
> > as unknown.
> >
> 
> Note that da->sv hasn't been developed or released at all.
> 

Yet it's in Apertium-caffeine! And quite funny to play around with.
What's the problem with the direction da-sv compared to sv-da? Why
hasn't it been officially released?

> 
> WRT whether CG is available for OmegaT or not, I think this is too early
> to say anything but that the problem can be solved, and there are many
> ways
> it could be solved:
> - CG Java port,
> - Java port through LanguageTool,
> - Using a locally installed CG (binaries exist for Windows - see
> http://beta.visl.sdu.dk/cg3/single/#windows) as an external program
> - Make OmegaT use the Apertium web service
> 
> Apart from that, Apertium's built-in rule set might be satisfactory.
> It might also be that Francis' work on lexical disambiguation rules could
> be applied (which I like because it has a much more Apertium-developer
> friendly syntax than CG)
> 

I have interpreted the Constraint Grammar discussion as an indication
that there are a lot of ambiguities to resolve when translating from
Swedish to Norwegian, and thus that the translation quality might be poor
if Constraint Grammar isn't included in the project. As the Apertium Wiki
wasn't accessible yesterday, I haven't studied how far I can get with
Apertium's built-in rule set.

What about the translation in the other direction, nb/nn to sv? Is there
the same need for a Constraint Grammar? Or can I do without it? My
original plan was to start developing translation in that direction.
As a native Swede I would find it much easier to translate from
Norwegian to Swedish: I wouldn't have to check dictionaries that much.
Besides, professional translators always translate into their mother
tongue.

Another conclusion from the discussion is that I need to create very
large dictionaries, because far fewer words are exactly the same in
Norwegian and Swedish than in Norwegian bokmål (nb) and Norwegian
nynorsk (nn). On the other hand, someone wrote that for comprehension
only a short list of difficult words is needed. My own conclusion is
that a Norwegian (nb/nn) to Swedish (sv) "pair" containing the most
frequent words plus words that are known to cause difficulties
(including "false friends") might turn out to be very useful. Does
anyone know how to figure out which words to include in the latter
list? Collect personal experiences from experts like you?
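
As a minimal sketch of how such a seed list could be put together
(assuming the frequency list is already sorted with the most frequent
word first; the function name, the cut-off and the sample words are only
illustrations, not an existing Apertium tool):

```python
def build_seed_list(frequency_words, false_friends, top_n=1500):
    """Merge the top_n most frequent words with a hand-collected list of
    difficult words, keeping order and dropping duplicates.

    frequency_words: words sorted by descending frequency (assumption).
    false_friends:   words known to cause trouble, collected by hand.
    """
    seed = []
    seen = set()
    for word in list(frequency_words[:top_n]) + list(false_friends):
        key = word.lower()
        if key not in seen:
            seen.add(key)
            seed.append(key)
    return seed


# Illustrative data only: "rolig" and "rar" look the same in Norwegian
# and Swedish but mean different things.
sample = build_seed_list(["og", "i", "rolig", "det"], ["rolig", "rar"], top_n=3)
print(sample)
```

The frequent words come first so that the dictionary work can start from
the top of the list; the difficult words are appended only if they are
not already covered.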

BTW I ran a few words from the nb frequency wordlist through
Apertium-caffeine with the da-sv pair. I expected to get a very low
percentage of unknown words, given my experience with the hand cream
translation. Unfortunately I got as many as 40% unknown words.

I planned to translate say the first 1500 or so words on the frequency
list, to get the most important unknown words to work with for a start.
I expected to get only a few hundred of them; now I'm not so sure.
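
To put a number on the unknown words, something like the following could
be run on the translated output. This is only a sketch: it assumes that
unknown words come out prefixed with an asterisk (*), and the function
names are made up for illustration.

```python
def unknown_words(translated):
    """Return the tokens flagged as unknown, without the leading '*'.

    Assumption: the translator marks each unknown word with a leading
    asterisk, e.g. "*hvorfor".
    """
    return [
        tok.lstrip("*")
        for tok in translated.split()
        if tok.startswith("*") and len(tok) > 1
    ]


def unknown_rate(translated):
    """Share of whitespace-separated tokens that are flagged as unknown."""
    tokens = translated.split()
    if not tokens:
        return 0.0
    return len(unknown_words(translated)) / len(tokens)


# Made-up output line with two flagged words out of five.
line = "jag *ikke vet *hvorfor det"
print(unknown_words(line), unknown_rate(line))
```

Running this over the translated frequency list would give both the
percentage and the list of words to add to the dictionaries first.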

What about word order? I found a translation on my sun cream that has
different word order for Danish (da) and Norwegian bokmål (nb). Just a
coincidence or a fact to take into account?

Another concern of mine: will the solution (3) with separate
monolingual dictionaries and a common bilingual dictionary work "out of
the box" with Apertium, Apertium-caffeine and the OmegaT-plugin? Or does
this solution imply some changes to the code? Apertium would have to
find out somehow which monolingual dictionary to look into, wouldn't it?
I intend to start to play around with Swedish (sv), Danish (da) and
Norwegian bokmål (nb): can I test drive my dictionaries and rules?

> 
> We are near the end of the project period of Google Summer of Code and
> therefore I think the right advice to give to Per is: Get started, and we
> can promise that something will be ready to make it usable from Omega-T
> when you need it.
> 

Sounds great! But I'm a very impatient person. I might ponder over some
problem for a long time, occasionally even for years, but as soon as I
find a feasible solution I want to implement it immediately! 

  Yours,
  Per Tunedal


_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff