Hi Fran

I've submitted my proposal on the wiki.
http://wiki.apertium.org/wiki/User:Jonasfromseier/GSoC_2013_Application:_%22Danish-Norwegian_(Bokm%C3%A5l)_language_pair%22

1) Instead of just making it nb-da, make it no-da with support for
analysis/generation of both Bokmål and Nynorsk. Unhammer might have more
ideas on this.

Would it go no>nb>da or should the platform do both simultaneously? As I'm 
brand new to this I'd rather be realistic and do one language at a time.


2) Take the Oslo-Bergen constraint grammar for Bokmål[1] and
"convert/port" it to Danish. I'm sure many of the rules could be reused,
but they would need to be adapted to Danish words/tags.


That sounds like a great idea! I'll incorporate that.

3) For generating the bilingual dictionary try using cognates.

Not sure how this is done yet. Is there a script?
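
For illustration, one simple way to approach cognate matching is to score
string similarity between lemma lists from the two monolingual dictionaries
and keep the closest pairs as candidates for manual review. The sketch below
is not an existing Apertium script; the function names, the sample word lists
and the threshold are all hypothetical, and plain character similarity is
only a rough stand-in for real cognate detection.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two lowercased word forms."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cognate_candidates(nb_lemmas, da_lemmas, threshold=0.8):
    """Pair each Bokmål lemma with its most similar Danish lemma.

    Only pairs scoring at or above the (hypothetical) threshold are kept;
    everything returned is a candidate that still needs human checking.
    """
    pairs = []
    for nb in nb_lemmas:
        # Pick the Danish lemma with the highest similarity to this lemma.
        best = max(da_lemmas, key=lambda da: similarity(nb, da))
        score = similarity(nb, best)
        if score >= threshold:
            pairs.append((nb, best, round(score, 2)))
    return pairs


if __name__ == "__main__":
    # Tiny made-up sample; real input would come from the monodixes.
    nb = ["bok", "kjøpe", "gutt", "uke"]
    da = ["bog", "købe", "dreng", "uge"]
    for nb_word, da_word, score in cognate_candidates(nb, da, threshold=0.6):
        print(f"{nb_word}\t{da_word}\t{score}")
```

Non-cognate pairs such as gutt/dreng fall below the threshold and are left
for manual entry, so the output would only seed the bidix, not complete it.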

And one point of my own, which you mentioned on IRC:
4) Bidirectionality:
Do students normally finish a bidirectional pair in one GSoC? I'd be worried
about making grammaticality judgements on generated nb text. I think you need
a native speaker for that. It'd be hard for me to judge whether a form is
obscure, especially since Norwegian and Danish were so close a hundred years
ago and some forms are still in use but considered archaic. I don't want the
generated Norwegian text to be a hybrid.


Basically, my proposal is to build a rock-solid nb>da pair for starters,
including porting the CG and extending the monodixes and bidix, and then see
whether I have time for Nynorsk support and bidirectionality. How does that sound?




Jonas Fromseier Mortensen
stud. BA,  Linguistics
University of Copenhagen
[email protected]
+45 27 44 10 05


On 23/04/2013, at 00.13, Francis Tyers wrote:

On Mon, 22 Apr 2013 at 20:12 +0000, Jonas Fromseier Mortensen wrote:
Hi everybody
I would like to propose the idea of starting a nb-da (Norwegian Bokmål
Danish) language pair for GSoC. I'm a linguistics student with some
coding experience (Python, XML). Does this sound like an idea that
could be taken on?

Sounds like a nice idea for a GSOC project. We already have an nn-nb and
a da-sv pair, so this would make a nice pair to include. Some ideas for
you to think about:

1) Instead of just making it nb-da, make it no-da with support for
analysis/generation of both Bokmål and Nynorsk. Unhammer might have more
ideas on this.

2) Take the Oslo-Bergen constraint grammar for Bokmål[1] and
"convert/port" it to Danish. I'm sure many of the rules could be reused,
but they would need to be adapted to Danish words/tags.

3) For generating the bilingual dictionary try using cognates.

Kevin Unhammer's 2009 Nynorsk-Bokmål project should serve as an
inspiration. :)

Fran

1.
https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-nn-nb/apertium-nn-nb.nb-nn.rlx


_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff