G'day...

https://github.com/GavinWz/Apertium does not solve the challenge. The point
is to categorize all of Unicode, not just ASCII. I would recommend using
ICU for it.

And the code is C. We use C++.

-- Tino Didriksen


On Fri, 28 Feb 2020 at 02:43, 杨伟哲 <gavinwzma...@gmail.com> wrote:

> Hi list,
>
> I’m interested in the “Robust tokenisation in lttoolbox”[1] GSoC project,
> and I’m currently writing the proposal.
>
> I have completed the code challenge listed in the project and put the
> code on Pastebin[2]. However, I’m not quite clear on where this project
> starts. I would much appreciate it if you could point me to somewhere
> (e.g. a GitHub repo related to this project) to get started. I will also
> try to learn and solve issues there if possible.
>
> Bio: I’m a Chinese undergraduate in Software Engineering. In my freshman
> year, I joined the high-performance computing center[3] of the university
> as a research assistant. Through research and learning during that period,
> I have gained a deep understanding of software architecture and open
> source projects.
>
>
> [1] http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code/Robust_tokenisation
>
> [2] https://github.com/GavinWz/Apertium
>
> [3] http://cs.wfu.edu.cn/2014/0603/c1227a33048/page.htm
>
>
> Regards,
>
> Weizhe Yang
>
_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
