Supporting nn→da as well should present only minor additions, mostly to your make system. You'd have two modes generated, nn→da and nb→da, which would share everything that comes _after_ bidix (ie. structural transfer and da generation). On the nn/nb side of the pipeline, you could probably snatch the monodixes, prob files and CG's from apertium-nn-nb without changes.
The only remaining thing is bidix. Here I would keep one master bidix from no (nn+nb) to da, which is processed by an XSLT script into two different bidixes before compilation. Most entries would be the same, but some would be marked nn-only or nb-only. This kind of thing happens in a lot of apertium pairs, and should be no trouble to set up.

Hi Kevin

Thanks for the advice! It sounds very doable the way you put it. What do you think would be the best way to go about it? Make a set of transfer rules for nb-da first, CG for the same, and then work on adding nn support after? I'm sure there are lots of quirks and idiomatic expressions that my nb-da t1x wouldn't catch with nn as SL.

Jonas Fromseier Mortensen
stud. BA, Linguistics
University of Copenhagen
lst...@alumni.ku.dk
+45 27 44 10 05

On 27/04/2013, at 20.36, Kevin Brubeck Unhammer wrote:

Jonas Fromseier Mortensen <lst...@alumni.ku.dk> writes:

Hi Fran

I've submitted my proposal on the wiki.
http://wiki.apertium.org/wiki/User:Jonasfromseier/GSoC_2013_Application:_%22Danish-Norwegian_(Bokm%C3%A5l)_language_pair%22

1) Instead of just making it nb-da, make it no-da with support for analysis/generation of both Bokmål and Nynorsk. Unhammer might have more ideas on this.

Would it go no>nb>da or should the platform do both simultaneously? As I'm brand new to this I'd rather be realistic and do one language at a time.

2) Take the Oslo-Bergen constraint grammar for Bokmål[1] and "convert/port" it to Danish. I'm sure many of the rules could be reused, but they would need to be adapted to Danish words/tags.

That sounds like a great idea! I'll incorporate that.

3) For generating the bilingual dictionary try using cognates.

Not sure how this is done yet. Is there a script?

my own that you mentioned on IRC:

4) bidirectionality: Do students normally finish a bidirectional pair GSoC?
I'd be worried about doing grammaticality judgements for generated nb text. I think you need a native speaker for that. It'd be hard for me to judge whether the form is obscure, especially since Norwegian and Danish were so close a hundred years ago and some forms are still used but considered archaic. I don't want the generated Norwegian text to be a hybrid.

Yeah, I'd say do no→da first.

Basically my proposal is to do a rock-solid nb>da pair for starters, including porting the CG, extending the monodices and bidix and then see if I have time for nynorsk support and bidirectionality. How does that sound?

Supporting nn→da as well should present only minor additions, mostly to your make system. You'd have two modes generated, nn→da and nb→da, which would share everything that comes _after_ bidix (ie. structural transfer and da generation). On the nn/nb side of the pipeline, you could probably snatch the monodixes, prob files and CG's from apertium-nn-nb without changes.

The only remaining thing is bidix. Here I would keep one master bidix from no (nn+nb) to da, which is processed by an XSLT script into two different bidixes before compilation. Most entries would be the same, but some would be marked nn-only or nb-only. This kind of thing happens in a lot of apertium pairs, and should be no trouble to set up.

--
Kevin Brubeck Unhammer

Written with baby on lap, please excuse my brevity.
_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
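For readers following the thread, the master-bidix idea Kevin describes could look roughly like the sketch below. It is only an illustration under several assumptions: Python stands in for the XSLT script he mentions, the attribute name `variant` is hypothetical (the thread does not say how entries would be marked nn-only or nb-only), and the three dictionary entries are invented examples rather than real apertium data.

```python
import xml.etree.ElementTree as ET

# Hypothetical marker attribute; the thread does not name the real one.
VARIANT_ATTR = "variant"

# Invented master bidix: one shared entry, one nb-only, one nn-only.
MASTER = """<dictionary>
  <section id="main" type="standard">
    <e><p><l>hus<s n="n"/></l><r>hus<s n="n"/></r></p></e>
    <e variant="nb"><p><l>ikke<s n="adv"/></l><r>ikke<s n="adv"/></r></p></e>
    <e variant="nn"><p><l>ikkje<s n="adv"/></l><r>ikke<s n="adv"/></r></p></e>
  </section>
</dictionary>"""


def split_bidix(master_xml, variant):
    """Return the bidix for one source variant ("nn" or "nb").

    Unmarked entries are kept for both variants; entries marked for the
    other variant are dropped; the marker attribute is stripped from the
    entries that remain.
    """
    root = ET.fromstring(master_xml)
    # findall returns a list, so removing entries inside the loop is safe.
    for section in root.findall(".//section"):
        for entry in list(section.findall("e")):
            marked = entry.get(VARIANT_ATTR)
            if marked is not None and marked != variant:
                section.remove(entry)
            else:
                entry.attrib.pop(VARIANT_ATTR, None)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    for variant in ("nn", "nb"):
        print("---", variant, "→ da ---")
        print(split_bidix(MASTER, variant))
```

In a setup like the one discussed, a step of this kind would presumably run from the make system once per mode, with each generated bidix compiled separately while structural transfer and da generation stay shared.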