On 11 March 2013 18:04, sphinx jiang <[email protected]> wrote:
> Hi,
>
> I would like to suggest an idea for Apertium GSOC program. Several days age
> I talked to Jimmy, and was enlightened by the idea "Segmentation by itself".
>
Sorry, I wasn't clear enough. The idea is "segmentation". I said that
segmentation by itself would probably make a good project, where "by
itself" was intended to mean that the project would just be segmentation.
In practice, you will also have to work on a language pair where this can
be used. zh_ZH-zh_TW is a perfect candidate, because segmentation is not
strictly necessary for this language pair - i.e., you use it to demonstrate
that segmentation is working, without _needing_ to. In that regard, you
will need to also allot some time to developing that language pair, though
it will not be the primary focus of the project.

> The Hierarchical HMM for segmentation ports, especially the
> "imdict-chinese-analyzer", which is for Chinese segment, wrote in Java, I
> think it can be transplant to C++, and used for Apertium . Then we can
> fulfill the program translate Chinese-ZH to Chinese -TW by self segment.
>
> Is my idea possible to achieve? I am looking forward to your reply~~

A straightforward port will not be sufficient. The module will, at the
very least, also need to handle the Apertium stream format. Your proposal
should take this into account.

--
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you
