Hi Jonathan,

Thank you for your feedback. There seem to be enough implementations for Japanese.
--
Tomohiro

On Wed, 26 Feb 2020 at 22:26, Jonathan Washington <jonathan.n.washing...@gmail.com> wrote:

> Hi Tommi, all,
>
> A couple of years ago, a Swarthmore student implemented an algorithm for
> tokenisation of spaceless orthographies using morphological transducers.
> She used a fork of a prototype Japanese transducer developed by another of
> my students to evaluate it.
>
> The work is available at the following URLs:
>
> https://scholarship.tricolib.brynmawr.edu/handle/10066/20002
>
> https://github.com/chanlon1/tokenisation
>
> https://github.com/chanlon1/apertium-jpn
>
> --
> Jonathan
>
> On Wed, Feb 26, 2020, 06:38 Tomohiro Akazawa <tomohiroakaz...@gmail.com>
> wrote:
>
>> Thank you for your reply.
>> If "improving the support of Japanese on Apertium" could be a new
>> project on GSoC, I would find the problems of the current version of
>> Apertium and figure out the solutions for them.
>> Thank you.
>>
>> On Wed, 26 Feb 2020 at 0:47, Tommi A Pirinen <tommi.antero.piri...@uni-hamburg.de> wrote:
>>
>>> Hi all,
>>> one thing that might be worth considering in improving support of
>>> Japanese in Apertium is that we currently do not have any good
>>> generic solution for word tokenisation. This especially affects
>>> languages like Japanese, where space- and punctuation-based
>>> tokenisation works much worse than for European languages. If you'd be
>>> interested in formulating a project solving the tokenisation problem,
>>> I think it would fit Apertium GSoC quite well, and if others agree I
>>> could (co-)mentor.
>>>
>>> On Mon, Feb 24, 2020 at 06:12:28AM +0900, Tomohiro Akazawa wrote:
>>> > Thank you for your reply.
>>> > Considering there are many resources for English and Japanese,
>>> > possibly I should change my plan.
>>> > Thank you
>>>
>>> > On Sun, 23 Feb 2020, 23:58 Hèctor Alòs i Font, <hectora...@gmail.com>
>>> > wrote:
>>> >
>>> > > Hi Tomohiro,
>>> > >
>>> > > Maybe it is not the 2019 version of the application form, but the
>>> > > 2020 one (if Apertium is elected by Google as a partner
>>> > > organisation) should not be very different from this one:
>>> > > http://wiki.apertium.org/wiki/Top_tips_for_GSOC_applications
>>> > > Essentially, for a pair like English and Japanese the main questions
>>> > > probably will be:
>>> > >
>>> > > * reasons why Google and Apertium should sponsor it,
>>> > > * a description of how and who it will benefit in society,
>>> > >
>>> > > (essentially because both English and Japanese are well-resourced
>>> > > languages).
>>> > > Imho, Okinawan-Japanese would be a much more Apertium-like proposal.
>>> > > But, of course, I may be wrong. I should maybe add that for building
>>> > > a translator it is not absolutely necessary to be proficient in the
>>> > > source language. If you can read it and you have access to grammars,
>>> > > dictionaries and informants, this is usually enough. But, of course,
>>> > > the better you know the source language (not only the target one),
>>> > > the better.
>>> > >
>>> > > Hèctor
>>> > >
>>> > > Message from Tomohiro Akazawa <tomohiroakaz...@gmail.com> on Sun,
>>> > > 23 Feb 2020 at 14:27:
>>> > >
>>> > >> Hello.
>>> > >> My name is Tomohiro and I am a student at the University of Tokyo in
>>> > >> Japan.
>>> > >> Looking at Apertium's idea list for GSoC 2020, I found "Adopt an
>>> > >> unreleased language pair" interesting.
>>> > >> Do you think it is possible to make the language pair between
>>> > >> English and Japanese?
>>> > >> Thank you very much.
>>> --
>>> Doktor Tommi A Pirinen, Computational Linguist,
>>> <https://flammie.github.io/purplemonkeydishwasher/>, Universität
>>> Hamburg, Hamburger Zentrum für Sprachkorpora <http://hzsk.de>. CLARIN-D
>>> Entwickler. President of ACL SIGUR SIG for Uralic languages
>>> <http://gtweb.uit.no/sigur/>.
>>> I tend to follow inline-posting style in desktop e-mail messages.
_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff