Hi Kevin
Thanks for the advice! It sounds very doable the way you put it. What do you
think would be the best way to go about it? Make a set of transfer rules for
nb-da first, then a CG for the same, and then work on adding nn support
afterwards? I'm sure there are lots of quirks and idiomatic expressions that my
nb-da t1x wouldn't catch with nn as the source language.



Jonas Fromseier Mortensen
stud. BA,  Linguistics
University of Copenhagen
lst...@alumni.ku.dk
+45 27 44 10 05


On 27/04/2013, at 20.36, Kevin Brubeck Unhammer wrote:

Jonas Fromseier Mortensen
<lst...@alumni.ku.dk> writes:

Hi Fran

I've submitted my proposal on the wiki.
http://wiki.apertium.org/wiki/User:Jonasfromseier/GSoC_2013_Application:_%22Danish-Norwegian_(Bokm%C3%A5l)_language_pair%22

   1) Instead of just making it nb-da, make it no-da with support for
   analysis/generation of both Bokmål and Nynorsk. Unhammer might have
   more ideas on this.

Would it go no>nb>da or should the platform do both simultaneously? As
I'm brand new to this I'd rather be realistic and do one language at a
time.


   2) Take the Oslo-Bergen constraint grammar for Bokmål[1] and
   "convert/port" it to Danish. I'm sure many of the rules could be
   reused, but they would need to be adapted to Danish words/tags.


That sounds like a great idea! I'll incorporate that.

   3) For generating the bilingual dictionary try using cognates.


Not sure how this is done yet. Is there a script?
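
One generic way to propose cognate candidates is to pair words whose spelling is very similar. The following is only a minimal sketch under that assumption, not an existing Apertium script: the function name `cognate_candidates`, the word lists and the 0.6 threshold are all made up for illustration, and any candidates would still need manual checking before going into a bidix.

```python
from difflib import SequenceMatcher


def cognate_candidates(source_words, target_words, threshold=0.6):
    """Pair each source word with its most similar target word.

    A pair is kept only if the similarity ratio reaches the threshold.
    The threshold value is an arbitrary assumption, not a tuned setting.
    """
    pairs = {}
    for src in source_words:
        # Pick the target word with the highest string similarity.
        best = max(
            target_words,
            key=lambda tgt: SequenceMatcher(None, src, tgt).ratio(),
        )
        if SequenceMatcher(None, src, best).ratio() >= threshold:
            pairs[src] = best
    return pairs


# Tiny hand-picked word lists, for illustration only.
nb_words = ["bok", "uke", "kjærlighet", "jente"]
da_words = ["bog", "uge", "kærlighed", "hund"]

candidates = cognate_candidates(nb_words, da_words)
for nb, da in candidates.items():
    print(nb, "->", da)
```

Here "jente" gets no candidate because nothing in the Danish list is close enough, which is the kind of gap that would have to be filled by hand.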

One of my own, which you mentioned on IRC:
4) Bidirectionality:
Do students normally finish a bidirectional pair during GSoC? I'd be worried
about doing grammaticality judgements for generated nb text. I think
you need a native speaker for that. It'd be hard for me to judge
whether the form is obscure, especially since Norwegian and Danish
were so close a hundred years ago and some forms are still used but
considered archaic. I don't want the generated Norwegian text to be a
hybrid.

Yeah, I'd say do no→da first.

Basically my proposal is to do a rock-solid nb>da pair for starters,
including porting the CG, extending the monodices and bidix and then
see if I have time for Nynorsk support and bidirectionality. How does
that sound?

Supporting nn→da as well should present only minor additions, mostly to
your make system. You'd have two modes generated, nn→da and nb→da, which
would share everything that comes _after_ bidix (i.e. structural transfer
and da generation). On the nn/nb side of the pipeline, you could
probably snatch the monodixes, prob files and CGs from apertium-nn-nb
without changes.
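
The idea of two modes with a shared tail can be sketched roughly as follows. This is only an illustration of the structure described above: the stage labels are descriptive placeholders rather than real commands or file names, and the order of the source-side stages is an assumption.

```python
# Stages after bidix lookup are identical for both modes.
SHARED_TAIL = ["structural transfer", "da generation"]


def build_mode(source):
    """Return the ordered stage list for a source variety ('nn' or 'nb')."""
    source_side = [
        f"{source} monodix analysis",
        f"{source} CG disambiguation",
        f"{source} prob-file tagging",
        f"{source}-da bidix lookup",
    ]
    return source_side + SHARED_TAIL


modes = {f"{src}-da": build_mode(src) for src in ("nn", "nb")}

for name, stages in modes.items():
    print(name, "=", " | ".join(stages))
```

Only the first four stages differ between the two modes, so work on transfer rules and Danish generation would benefit both at once.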

The only remaining thing is bidix. Here I would keep one master bidix
from no (nn+nb) to da, which is processed by an XSLT script into two
different bidixes before compilation. Most entries would be the same,
but some would be marked nn-only or nb-only. This kind of thing happens
in a lot of apertium pairs, and should be no trouble to set up.
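
To make the master-bidix idea concrete, here is a rough sketch of the filtering step, written in Python rather than XSLT purely for illustration. The `variant` attribute and the function name `split_bidix` are hypothetical, not an Apertium convention, and the dictionary is a trimmed fragment (a real bidix also needs symbol definitions and so on).

```python
import xml.etree.ElementTree as ET

# Trimmed, made-up master bidix: unmarked entries are shared,
# entries with the hypothetical 'variant' attribute are nn-only or nb-only.
MASTER = """<dictionary>
  <section id="main" type="standard">
    <e><p><l>hus<s n="n"/></l><r>hus<s n="n"/></r></p></e>
    <e variant="nn"><p><l>ikkje<s n="adv"/></l><r>ikke<s n="adv"/></r></p></e>
    <e variant="nb"><p><l>ikke<s n="adv"/></l><r>ikke<s n="adv"/></r></p></e>
  </section>
</dictionary>"""


def split_bidix(master_xml, variant):
    """Return the bidix for one variant ('nn' or 'nb') as an XML string."""
    root = ET.fromstring(master_xml)
    for section in root.iter("section"):
        # findall returns a separate list, so removing while looping is safe.
        for entry in section.findall("e"):
            marked = entry.get("variant")
            if marked is None:
                continue  # shared entry, keep as-is
            if marked != variant:
                section.remove(entry)  # belongs to the other variant
            else:
                del entry.attrib["variant"]  # drop the marker before compiling
    return ET.tostring(root, encoding="unicode")


nn_bidix = split_bidix(MASTER, "nn")
nb_bidix = split_bidix(MASTER, "nb")
print(nn_bidix)
```

Each output would then be compiled as its own bidix, one per mode.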



--
Kevin Brubeck Unhammer

Written with baby on lap, please excuse my brevity.


------------------------------------------------------------------------------
Try New Relic Now & We'll Send You this Cool Shirt
New Relic is the only SaaS-based application performance monitoring service
that delivers powerful full stack analytics. Optimize and monitor your
browser, app, & servers with just a few lines of code. Try New Relic
and get this awesome Nerd Life shirt! http://p.sf.net/sfu/newrelic_d2d_apr
_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff


