Hi all,

One thing that might be worth considering in improving Japanese support in Apertium is that we currently do not have a good generic solution for word tokenisation. This especially affects languages like Japanese, where tokenisation based on spaces and punctuation works much worse than it does for European languages. If you are interested in formulating a project that solves the tokenisation problem, I think it would fit Apertium's GSoC quite well, and if others agree I could (co-)mentor.
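To make the tokenisation issue concrete, here is a minimal sketch in Python. It is not Apertium's actual tokeniser; `naive_tokenise` is a hypothetical helper that only splits on whitespace and punctuation, which is roughly the kind of approach I mean above.

```python
import re

def naive_tokenise(text):
    """Hypothetical helper (not an Apertium API): split on whitespace and
    treat every punctuation character as a separate token."""
    # \w+ matches a run of letters/digits (Unicode-aware in Python 3),
    # [^\w\s] matches a single punctuation character.
    return re.findall(r"\w+|[^\w\s]", text)

# English: spaces already mark the word boundaries.
print(naive_tokenise("I am a student."))

# Japanese: there are no spaces, so the whole clause comes back as one token.
print(naive_tokenise("私は学生です。"))
```

For the English sentence this yields one token per word, while the Japanese sentence comes back as a single unsegmented chunk plus the final "。", although a reader would expect it to be split into something like 私 / は / 学生 / です.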
On Mon, Feb 24, 2020 at 06:12:28AM +0900, Tomohiro Akazawa wrote:
> Thank you for your reply.
> Considering there are many resources for English and Japanese, possibly I
> should change my plan.
> Thank you
>
> On Sun, 23 Feb 2020, 23:58 Hèctor Alòs i Font, <hectora...@gmail.com> wrote:
>
> > Hi Tomohiro,
> >
> > Maybe it is not the 2019 version of the application form, but the 2020 one
> > (if Apertium is elected by Google as a partner organisation) should not be
> > very different of this one:
> > http://wiki.apertium.org/wiki/Top_tips_for_GSOC_applications
> > Essentially, for a pair like English and Japanese the main questions
> > probably will be:
> >
> > * reasons why Google and Apertium should sponsor it,
> > * a description of how and who it will benefit in society,
> >
> > (essentially because both English and Japanese are resourceful languages).
> > Imho, Okinawan-Japanese would be a much more Apertium-like proposal. But,
> > of course, I may be wrong. I should maybe add that for building a
> > translator it is not absolutely necessary to be proficient in the source
> > language. If you can read it and you have access to grammars, dictionaries
> > and informants, this is usually enough. But, of course, the more you know
> > the source language (not only the target one), the better.
> >
> > Hèctor
> >
> > Message from Tomohiro Akazawa <tomohiroakaz...@gmail.com> on Sun, 23
> > Feb 2020 at 14:27:
> >
> >> Hello.
> >> My name is Tomohiro and I am a student of the University of Tokyo in
> >> Japan.
> >> Seeing the Apertium's idea list for GSoC 2020, I found "Adopt an
> >> unreleased language pair" interesting.
> >> Do you think it is possible to make the language pair between English
> >> and Japanese?
> >> Thank you very much.
> >> _______________________________________________
> >> Apertium-stuff mailing list
> >> Apertium-stuff@lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/apertium-stuff

--
Doktor Tommi A Pirinen, Computational Linguist,
<https://flammie.github.io/purplemonkeydishwasher/>,
Universität Hamburg, Hamburger Zentrum für Sprachkorpora <http://hzsk.de>.
CLARIN-D Entwickler.
President of ACL SIGUR SIG for Uralic languages <http://gtweb.uit.no/sigur/>.
I tend to follow inline-posting style in desktop e-mail messages.