Re: [Apertium-stuff] Registration for wiki page

2020-03-27 Thread Ayush
Hey, I want to submit a proposal for the robust tokenisation task and am requesting a wiki page with the name ayushPradhan or ayush0209. Thanks and regards, Ayush Pradhan From: Flammie A Pirinen Sent: 23 March 2020 05:56 PM To: apertium-stuff@lists.sourceforge.net Subject: Re: [Apertium-stuff]

[Apertium-stuff] GSoC Proposal Evaluation

2020-03-27 Thread Rajarshi Roychoudhury
Hi, I have made some changes to the proposal to fix some errors and to incorporate the minor changes in the GSoC timeline. Kindly review the proposal and give me your feedback on whether it is ready to be submitted. http://wiki.apertium.org/wiki/User:Rroychoudhury/GSoC_2020_Proposal Best

Re: [Apertium-stuff] Please review my proposal draft

2020-03-27 Thread 杨伟哲
IMO the most difficult part of tokenizing CJK, especially Chinese, will be word segmentation, because characters and words are not separated by delimiters. They always appear as a continuous string of characters and words. Another problem is that in Chinese, the same sentence can be interpreted

Re: [Apertium-stuff] Guidance for hin-pan language pair development

2020-03-27 Thread Priyank Modi
Hi all, I've completed the preliminary draft of my proposal and would really appreciate your comments/suggestions on it: http://wiki.apertium.org/wiki/Pmodi/GSOC_2020_proposal:_Hindi-Punjabi Francis (firstly, sorry for cc'ing you personally), since you have been managing the repo, could you

Re: [Apertium-stuff] Please review my proposal draft

2020-03-27 Thread Tommi A Pirinen
On Fri, Mar 27, 2020 at 09:58:53AM +0800, 杨伟哲 wrote: > > Of course, as a Chinese student, I would also be very happy to work > on the CJK. We can keep communicating about the tweaks of the plan > and the other details. Awesome, could you perhaps then make even a small example of how apertium

Re: [Apertium-stuff] GSOC 2020 idea

2020-03-27 Thread Rajarshi Roychoudhury
I have modified the proposal to explain the process better. Kindly take a look at it. The bilingual dictionary still needs some work; I didn't have time to complete it as I was busy determining the sentiment tag. I will try to incorporate it as soon as possible. Please suggest if any

Re: [Apertium-stuff] GSOC 2020 idea

2020-03-27 Thread Rajarshi Roychoudhury
The sentiment tags will help to form more detailed and diverse patterns, which can help to form better rules for disambiguation, lexical selection and reordering. As for those languages where SentiWordNet does not exist, a linguist will be able to determine sentiment polarity. Since I have the

Re: [Apertium-stuff] GSOC 2020 idea

2020-03-27 Thread Tanmai Khanna
Hey, I have one doubt: I didn't quite understand how sentiment analysis would fix the examples given for mistranslation. Also, what about languages for which a SentiWordNet doesn't exist? Thanks and Regards, Tanmai On Fri, Mar 27, 2020 at 3:56 PM Rajarshi Roychoudhury <

Re: [Apertium-stuff] GSOC 2020 idea

2020-03-27 Thread Rajarshi Roychoudhury
Hi, I have finished writing my proposal, written code for sentiment analysis with character embeddings as a coding challenge, added words to the monolingual and bilingual dictionaries and designed a constraint grammar. I am working on building the bidix and lrx files for now. Would be very

Re: [Apertium-stuff] GSoC--Apertium Website Development

2020-03-27 Thread Mohit Kumar Verma
Hi, so now I have my GSoC proposal. Please let me know what is missing and what I should exclude. Here is the link: http://wiki.apertium.org/wiki/User:Yaimgr8/GSoC_2020_Proposal Thanks On Fri, Mar 27, 2020 at 3:27 AM Mohit Kumar Verma wrote: > Thanks. This is what I was