Thank you for your reply. The project seems cool to work on for GSoC 2023,
and I would like to participate in it. I see there are two tasks on the
page; could you tell me where to start?

On Fri, 24 Feb 2023 at 08:20, Kevin Brubeck Unhammer <unham...@fsfe.org>
wrote:

> > I'd like to participate in Google Summer of Code 2023 at Apertium.
> > In particular, I'm interested in adding a new language pair, and I am
> > thinking of adding Japanese-English as I speak Japanese. I took an
> > online summer school at Tokyo University on natural language
> > processing before.
> > Could you tell me more about the project?
>
> Hi,
>
> Getting some support for Japanese would be great! I'm not sure if you
> saw the whole IRC discussion, but what we really need in that regard is
> support for the *tokenisation* step, where our regular methods[1] fail
> us, since the text might have no spaces and lots of
> tokenisation-ambiguity. There has been some prior work[2] and it's
> already listed as a potential GSoC project[3].
>
> Support for anything-Japanese depends on tokenisation. It's also a big
> enough job that it would qualify as a full GSoC project, so if you were
> hoping for jpn-eng in a summer you will be disappointed (but having a
> toy language pair to test with would help!). On the other hand, if we
> get good spaceless tokenisation we open up the possibility for not just
> Japanese, but Thai, Lao, Chinese etc. – and of course all those writing
> systems used before the invention of the space character :)
>
> regards,
> Kevin
>
> [1] https://wiki.apertium.org/wiki/LRLM
> [2] http://hdl.handle.net/10066/20002
> [3]
> https://wiki.apertium.org/wiki/Task_ideas_for_Google_Code-in/Tokenisation_for_spaceless_orthographies
> _______________________________________________
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>