Hi all,
one thing worth considering in improving support for Japanese in
Apertium is that we currently do not have a good generic solution for
word tokenisation. This especially affects languages like Japanese,
which is written without spaces between words, so space- and
punctuation-based tokenisation works much worse than it does for
European languages. If you'd be interested in formulating a project
that solves the tokenisation problem, I think it would fit Apertium's
GSoC quite well, and if others agree I could (co-)mentor.
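
To make the tokenisation point concrete, here is a minimal Python sketch.
It is not Apertium code: the function names and the five-entry toy lexicon
are made up for illustration, and greedy longest match is only one simple
baseline, not a proposal for what the project should implement.

```python
import re


def space_punct_tokenise(text):
    """Split on whitespace; each punctuation mark becomes its own token."""
    return re.findall(r"\w+|[^\w\s]", text)


def longest_match_tokenise(text, lexicon):
    """Greedy left-to-right longest match against a (toy) lexicon.

    Characters not covered by any lexicon entry are emitted one by one.
    """
    tokens = []
    max_len = max(map(len, lexicon))
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical toy lexicon, just enough for the example sentence.
TOY_LEXICON = {"私", "は", "本", "を", "読む"}

english = "I read a book."
japanese = "私は本を読む。"

print(space_punct_tokenise(english))
print(space_punct_tokenise(japanese))
print(longest_match_tokenise(japanese, TOY_LEXICON))
```

The space- and punctuation-based splitter separates the English words
but returns the whole Japanese sentence as a single token plus the full
stop, while the lexicon-based pass recovers the individual words. A real
solution would also have to deal with ambiguous segmentations and
unknown words, which this sketch ignores.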

On Mon, Feb 24, 2020 at 06:12:28AM +0900, Tomohiro Akazawa wrote:
> Thank you for your reply.
> Considering there are many resources for English and Japanese, possibly I
> should change my plan.
> Thank you



> On Sun, 23 Feb 2020, 23:58 Hèctor Alòs i Font, <hectora...@gmail.com> wrote:
> 
> > Hi Tomohiro,
> >
> > Maybe it is not the 2019 version of the application form, but the 2020 one
> > (if Apertium is elected by Google as a partner organisation) should not be
> > very different from this one:
> > http://wiki.apertium.org/wiki/Top_tips_for_GSOC_applications
> > Essentially, for a pair like English and Japanese the main questions
> > probably will be:
> >
> >     * reasons why Google and Apertium should sponsor it,
> >     * a description of how and who it will benefit in society,
> >
> > (essentially because both English and Japanese are well-resourced languages).
> > Imho, Okinawan-Japanese would be a much more Apertium-like proposal. But,
> > of course, I may be wrong. I should maybe add that for building a
> > translator it is not absolutely necessary to be proficient in the source
> > language. If you can read it and you have access to grammars, dictionaries
> > and informants, this is usually enough. But, of course, the more you know
> > the source language (not only the target one), the better.
> >
> > Hèctor
> >
> > Message from Tomohiro Akazawa <tomohiroakaz...@gmail.com> on Sun, 23
> > Feb 2020 at 14:27:
> >
> >>  Hello.
> >> My name is Tomohiro and I am a student of the University of Tokyo in
> >> Japan.
> >>  Seeing Apertium's idea list for GSoC 2020, I found "Adopt an
> >> unreleased language pair" interesting.
> >>  Do you think it is possible to make the language pair between English
> >> and Japanese?
> >> Thank you very much.
> >> _______________________________________________
> >> Apertium-stuff mailing list
> >> Apertium-stuff@lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
> >>
> >




-- 
Doktor Tommi A Pirinen, Computational Linguist,
<https://flammie.github.io/purplemonkeydishwasher/>, Universität
Hamburg, Hamburger Zentrum für Sprachkorpora <http://hzsk.de>. CLARIN-D
Entwickler.  President of ACL SIGUR SIG for Uralic languages
<http://gtweb.uit.no/sigur/>.
I tend to follow inline-posting style in desktop e-mail messages.


