Hi Fran,

I've submitted my proposal on the wiki:
http://wiki.apertium.org/wiki/User:Jonasfromseier/GSoC_2013_Application:_%22Danish-Norwegian_(Bokm%C3%A5l)_language_pair%22

Some replies to your ideas:

> 1) Instead of just making it nb-da, make it no-da with support for
> analysis/generation of both Bokmål and Nynorsk. Unhammer might have
> more ideas on this.

Would it go no>nb>da, or should the platform handle both simultaneously? As I'm brand new to this, I'd rather be realistic and do one language at a time.

> 2) Take the Oslo-Bergen constraint grammar for Bokmål[1] and
> "convert/port" it to Danish. I'm sure many of the rules could be
> reused, but they would need to be adapted to Danish words/tags.

That sounds like a great idea! I'll incorporate that.

> 3) For generating the bilingual dictionary try using cognates.

I'm not sure how this is done yet. Is there a script?

And my own point, which you mentioned on IRC:

4) Bidirectionality: Do students normally finish a bidirectional pair during GSoC? I'd be worried about making grammaticality judgements for generated nb text. I think you need a native speaker for that. It would be hard for me to judge whether a form is obscure, especially since Norwegian and Danish were so close a hundred years ago, and some forms are still used but considered archaic. I don't want the generated Norwegian text to be a hybrid.

Basically, my proposal is to do a rock-solid nb>da pair for starters, including porting the CG and extending the monodices and bidix, and then see if I have time for Nynorsk support and bidirectionality.

How does that sound?

Jonas Fromseier Mortensen
stud. BA, Linguistics
University of Copenhagen
[email protected]
+45 27 44 10 05

On 23/04/2013, at 00.13, Francis Tyers wrote:

> On Mon, 22 Apr 2013 at 20:12 +0000, Jonas Fromseier Mortensen wrote:
> > Hi everybody
> >
> > I would like to propose the idea of starting a nb-da (Norwegian
> > Bokmål Danish) language pair for GSoC. I'm a linguistics student
> > with some coding experience (Python, XML). Does this sound like an
> > idea that could be taken on?
>
> Sounds like a nice idea for a GSoC project. We already have an nn-nb
> and a da-sv pair, so this would make a nice pair to include.
>
> Some ideas for you to think about:
>
> 1) Instead of just making it nb-da, make it no-da with support for
> analysis/generation of both Bokmål and Nynorsk. Unhammer might have
> more ideas on this.
>
> 2) Take the Oslo-Bergen constraint grammar for Bokmål[1] and
> "convert/port" it to Danish. I'm sure many of the rules could be
> reused, but they would need to be adapted to Danish words/tags.
>
> 3) For generating the bilingual dictionary try using cognates.
>
> Kevin Unhammer's 2009 Nynorsk-Bokmål project should serve as an
> inspiration. :)
>
> Fran
>
> 1. https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-nn-nb/apertium-nn-nb.nb-nn.rlx
>
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
