Hi everyone,
My name is Chris, and I'm a graduate student in Linguistics and Computer
Science in the US. I have several ideas for potential Apertium projects, so
I wanted to bounce them off you and hopefully get some feedback.
First, I could adopt a language pair. There appears to be no German-Turkish
(de-tr) pair yet, and since I am an advanced speaker of both languages,
creating that pair could be a good project. That said, I would really
prefer to do something more programming-intensive.
Second, I think building a module for corpus-based language model learning,
using a vector space model with grammatical features, could be useful and
fun. The ideas page suggests that this is needed primarily for Romance
languages, though. I have good theoretical knowledge, but I am not an
advanced speaker of any Romance language, so without significant guidance
it might be difficult to know which features of a particular language
would benefit from language-model feedback. Even so, this project could be
a great learning experience.
Finally, it seems to me that the language model system described above
(especially one using n-gram probabilities) could be combined with the
proposed new module for multiword specification to create a system for
automatically identifying and tagging multiwords.
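To make that last idea a bit more concrete, here is a rough sketch of one
simple way n-gram statistics could flag multiword candidates: rank adjacent
word pairs by pointwise mutual information (PMI). This is only an
illustration under my own assumptions, not an existing Apertium module. The
function name multiword_candidates and the min_count threshold are
hypothetical, and a real system would need a large corpus and longer n-grams.

```python
import math
from collections import Counter

def multiword_candidates(tokens, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration; not part of Apertium.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scored = []
    for (w1, w2), count in bigrams.items():
        # Skip rare pairs: PMI is unreliable for low counts.
        if count < min_count:
            continue
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # High PMI: the pair co-occurs more often than chance predicts.
        scored.append(((w1, w2), math.log2(p_pair / (p_w1 * p_w2))))

    # Highest-scoring pairs first.
    return sorted(scored, key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    text = "in spite of the rain we went out in spite of the cold"
    for pair, score in multiword_candidates(text.split()):
        print(pair, round(score, 2))
```

Pairs that score high could then be proposed as multiword entries for a
human to review, rather than being tagged blindly.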
Of course, all of these ideas need refining, but I wanted to put them out
there to see what you think. Any feedback you have would be great!
CH
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff