Jonathan,

Yes, I will look at the resources more thoroughly and see what I can do. Thank you very much for your advice.
--
Tomohiro

On Wed, Feb 26, 2020, 22:50 Jonathan Washington <jonathan.n.washing...@gmail.com> wrote:

> Hi Tomohiro,
>
> Actually, my point was that there is still a lot to be done. The work I
> pointed you to is a proof of concept more than anything, and it has not
> been integrated into Apertium.
>
> If I were you, and interested in participating in GSoC, I would have a
> look at those resources and try to get them running, and figure out how
> they work and what the limitations are. That will give you a good idea of
> what still needs to be done.
>
> --
> Jonathan
>
> On Wed, Feb 26, 2020, 08:41 Tomohiro Akazawa <tomohiroakaz...@gmail.com>
> wrote:
>
>> Hi Jonathan,
>>
>> Thank you for your feedback.
>> There seem to be enough implementations for Japanese.
>>
>> --
>> Tomohiro
>>
>> On Wed, Feb 26, 2020, 22:26 Jonathan Washington
>> <jonathan.n.washing...@gmail.com> wrote:
>>
>>> Hi Tommi, all,
>>>
>>> A couple years ago, a Swarthmore student implemented an algorithm for
>>> tokenisation of spaceless orthographies using morphological transducers.
>>> She used a fork of a prototype Japanese transducer developed by another of
>>> my students to evaluate it.
>>>
>>> The work is available at the following urls:
>>>
>>> https://scholarship.tricolib.brynmawr.edu/handle/10066/20002
>>>
>>> https://github.com/chanlon1/tokenisation
>>>
>>> https://github.com/chanlon1/apertium-jpn
>>>
>>> --
>>> Jonathan
>>>
>>> On Wed, Feb 26, 2020, 06:38 Tomohiro Akazawa <tomohiroakaz...@gmail.com>
>>> wrote:
>>>
>>>> Thank you for your reply.
>>>> If "improving the support of Japanese on Apertium" could be a new
>>>> project on GSoC, I would find the problems of the current version of
>>>> Apertium and figure out the solutions for them.
>>>> Thank you.
>>>>
>>>> On Wed, Feb 26, 2020, 0:47 Tommi A Pirinen
>>>> <tommi.antero.piri...@uni-hamburg.de> wrote:
>>>>
>>>>> Hi all,
>>>>> one thing that might be worth considering in improving support of
>>>>> Japanese in Apertium is that we currently do not have any good
>>>>> generic solution for word tokenisation. This especially affects
>>>>> languages like Japanese, where space- and punctuation-based
>>>>> tokenisation works much worse than for European languages. If you'd be
>>>>> interested in formulating a project solving the tokenisation problem,
>>>>> I think it would fit Apertium GSoC quite well, and if others agree I
>>>>> could (co-)mentor
>>>>>
>>>>> On Mon, Feb 24, 2020 at 06:12:28AM +0900, Tomohiro Akazawa wrote:
>>>>> > Thank you for your reply.
>>>>> > Considering there are many resources for English and Japanese,
>>>>> > possibly I should change my plan.
>>>>> > Thank you
>>>>>
>>>>> > On Sun, 23 Feb 2020, 23:58 Hèctor Alòs i Font, <hectora...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > > Hi Tomohiro,
>>>>> > >
>>>>> > > Maybe it is not the 2019 version of the application form, but the
>>>>> > > 2020 one (if Apertium is elected by Google as a partner
>>>>> > > organisation) should not be very different from this one:
>>>>> > > http://wiki.apertium.org/wiki/Top_tips_for_GSOC_applications
>>>>> > > Essentially, for a pair like English and Japanese the main
>>>>> > > questions probably will be:
>>>>> > >
>>>>> > > * reasons why Google and Apertium should sponsor it,
>>>>> > > * a description of how and who it will benefit in society,
>>>>> > >
>>>>> > > (essentially because both English and Japanese are well-resourced
>>>>> > > languages).
>>>>> > > Imho, Okinawan-Japanese would be a much more Apertium-like
>>>>> > > proposal. But, of course, I may be wrong. I should maybe add that
>>>>> > > for building a translator it is not absolutely necessary to be
>>>>> > > proficient in the source language.
>>>>> > > If you can read it and you have access to grammars, dictionaries
>>>>> > > and informants, this is usually enough. But, of course, the better
>>>>> > > you know the source language (not only the target one), the better.
>>>>> > >
>>>>> > > Hèctor
>>>>> > >
>>>>> > > Message from Tomohiro Akazawa <tomohiroakaz...@gmail.com> on
>>>>> > > Sun, 23 Feb 2020 at 14:27:
>>>>> > >
>>>>> > >> Hello.
>>>>> > >> My name is Tomohiro and I am a student at the University of Tokyo
>>>>> > >> in Japan.
>>>>> > >> Seeing Apertium's idea list for GSoC 2020, I found "Adopt an
>>>>> > >> unreleased language pair" interesting.
>>>>> > >> Do you think it is possible to make a language pair between
>>>>> > >> English and Japanese?
>>>>> > >> Thank you very much.
>>>>> > >> _______________________________________________
>>>>> > >> Apertium-stuff mailing list
>>>>> > >> Apertium-stuff@lists.sourceforge.net
>>>>> > >> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>>>>>
>>>>> --
>>>>> Doktor Tommi A Pirinen, Computational Linguist,
>>>>> <https://flammie.github.io/purplemonkeydishwasher/>, Universität
>>>>> Hamburg, Hamburger Zentrum für Sprachkorpora <http://hzsk.de>.
>>>>> CLARIN-D Entwickler. President of ACL SIGUR SIG for Uralic languages
>>>>> <http://gtweb.uit.no/sigur/>.
>>>>> I tend to follow inline-posting style in desktop e-mail messages.
_______________________________________________ Apertium-stuff mailing list Apertium-stuff@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/apertium-stuff