Hi Tomohiro,

Actually, my point was that there is still a lot to be done.  The work I
pointed you to is a proof of concept more than anything, and it has not
been integrated into Apertium.

If I were you, and interested in participating in GSoC, I would have a look
at those resources and try to get them running, and figure out how they
work and what the limitations are.  That will give you a good idea of what
still needs to be done.

--
Jonathan

On Wed, Feb 26, 2020, 08:41 Tomohiro Akazawa <tomohiroakaz...@gmail.com>
wrote:

> Hi Jonathan,
>
> Thank you for your feedback.
> There seem to be enough implementations for Japanese.
>
> --
> Tomohiro
>
> On Wed, Feb 26, 2020, 22:26 Jonathan Washington <jonathan.n.washing...@gmail.com> wrote:
>
>> Hi Tommi, all,
>>
>> A couple of years ago, a Swarthmore student implemented an algorithm for
>> tokenisation of spaceless orthographies using morphological transducers.
>> She used a fork of a prototype Japanese transducer developed by another of
>> my students to evaluate it.
>>
>> The work is available at the following URLs:
>>
>> https://scholarship.tricolib.brynmawr.edu/handle/10066/20002
>>
>> https://github.com/chanlon1/tokenisation
>>
>> https://github.com/chanlon1/apertium-jpn
>>
>> --
>> Jonathan
>>
>> On Wed, Feb 26, 2020, 06:38 Tomohiro Akazawa <tomohiroakaz...@gmail.com>
>> wrote:
>>
>>> Thank you for your reply.
>>> If "improving the support of Japanese in Apertium" could be a new GSoC
>>> project, I would identify the problems in the current version of
>>> Apertium and work out solutions for them.
>>> Thank you.
>>>
>>> On Wed, Feb 26, 2020, 0:47 Tommi A Pirinen
>>> <tommi.antero.piri...@uni-hamburg.de> wrote:
>>>
>>>> Hi all,
>>>> one thing that might be worth considering in improving support of
>>>> Japanese in Apertium is that we currently do not have any good
>>>> generic solution for word tokenisation. This especially affects
>>>> languages like Japanese, where space- and punctuation-based
>>>> tokenisation works much worse than for European languages. If you'd be
>>>> interested in formulating a project solving the tokenisation problem,
>>>> I think it would fit Apertium GSoC quite well, and if others agree I
>>>> could (co-)mentor.
>>>>
>>>> On Mon, Feb 24, 2020 at 06:12:28AM +0900, Tomohiro Akazawa wrote:
>>>> > Thank you for your reply.
>>>> > Considering there are many resources for English and Japanese,
>>>> > possibly I should change my plan.
>>>> > Thank you.
>>>>
>>>>
>>>>
>>>> > On Sun, 23 Feb 2020, 23:58 Hèctor Alòs i Font, <hectora...@gmail.com>
>>>> > wrote:
>>>> >
>>>> > > Hi Tomohiro,
>>>> > >
>>>> > > Maybe it is not the 2019 version of the application form, but the
>>>> > > 2020 one (if Apertium is selected by Google as a partner organisation)
>>>> > > should not be very different from this one:
>>>> > > http://wiki.apertium.org/wiki/Top_tips_for_GSOC_applications
>>>> > > Essentially, for a pair like English and Japanese the main questions
>>>> > > will probably be:
>>>> > >
>>>> > >     * reasons why Google and Apertium should sponsor it,
>>>> > >     * a description of how and who it will benefit in society,
>>>> > >
>>>> > > (essentially because both English and Japanese are well-resourced
>>>> > > languages). Imho, Okinawan-Japanese would be a much more Apertium-like
>>>> > > proposal. But, of course, I may be wrong. I should maybe add that for
>>>> > > building a translator it is not absolutely necessary to be proficient
>>>> > > in the source language. If you can read it and you have access to
>>>> > > grammars, dictionaries and informants, this is usually enough. But, of
>>>> > > course, the more you know the source language (not only the target
>>>> > > one), the better.
>>>> > >
>>>> > > Hèctor
>>>> > >
>>>> > > On Sun, 23 Feb 2020 at 14:27, Tomohiro Akazawa
>>>> > > <tomohiroakaz...@gmail.com> wrote:
>>>> > >
>>>> > >> Hello.
>>>> > >> My name is Tomohiro and I am a student at the University of Tokyo in
>>>> > >> Japan.
>>>> > >> Looking at Apertium's idea list for GSoC 2020, I found "Adopt an
>>>> > >> unreleased language pair" interesting.
>>>> > >> Do you think it is possible to make a language pair between English
>>>> > >> and Japanese?
>>>> > >> Thank you very much.
>>>> > >> _______________________________________________
>>>> > >> Apertium-stuff mailing list
>>>> > >> Apertium-stuff@lists.sourceforge.net
>>>> > >> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>>>> > >>
>>>> > >
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Doktor Tommi A Pirinen, Computational Linguist,
>>>> <https://flammie.github.io/purplemonkeydishwasher/>, Universität
>>>> Hamburg, Hamburger Zentrum für Sprachkorpora <http://hzsk.de>. CLARIN-D
>>>> Entwickler.  President of ACL SIGUR SIG for Uralic languages
>>>> <http://gtweb.uit.no/sigur/>.
>>>> I tend to follow inline-posting style in desktop e-mail messages.
>>>>
>>>
>>
>
