Thanks Rajani. That would be great.

@Bishisht:
-------------------
I am adding you to the members list. You can commit the code here:
svn checkout https://nepaliwikipediatranslator.googlecode.com/svn/trunk/nepaliwikipediatranslator --username [email protected]

Alternatively, you can browse the code here:
http://code.google.com/p/nepaliwikipediatranslator/source/browse/#svn%2Ftrunk

I think we should talk off-list for more.
-------------------

On Fri, Apr 13, 2012 at 10:29 PM, Rajesh Pandey <[email protected]> wrote:

> Sorry, I was away.
> You can start by getting the code from
> code.google.com/p/nepaliwikipediatranslator
>
> On Fri, Apr 13, 2012 at 8:56 PM, Bishisht Bhatta <[email protected]> wrote:
>
>> Can you give me some starting points? I don't know where to start. Sorry.
>> But I want to do it. Please let me know asap. :D I'm damn happy to start
>> this, bro.
>>
>> On Fri, Apr 13, 2012 at 08:14, Rajesh Pandey <[email protected]> wrote:
>>
>>> This is great.
>>> You can start right away. Just let me know if you need any help getting
>>> started. I really need a few people to join the project so that we can
>>> work together. Right now I want someone to start translating the web
>>> version of the translator into php/perl/python/java code.
>>>
>>> On Fri, Apr 13, 2012 at 1:22 PM, Bishisht Bhatta <[email protected]> wrote:
>>>
>>>> Hello there, I am also interested in this project, and I am also taking
>>>> the course through coursera.org. Please let me know how we can
>>>> start this project. I am very eager and want to start asap.
>>>>
>>>> On Thu, Apr 12, 2012 at 03:48, Rajesh Pandey <[email protected]> wrote:
>>>>
>>>>> @pravin: Thanks Pravin.
>>>>> I am extracting them and creating files such as:
>>>>> http://code.google.com/p/nepaliwikipediatranslator/source/browse/trunk/NepaliWikiPediaTranslator/bin/Debug/nounlist.txt
>>>>> from this. I will be extracting Nepali/Hindi and English texts.
>>>>>
>>>>> @Rakesh: You are welcome. Those NLP classes are great.
>>>>> I wish to get some contributors for the translator, convert it to a
>>>>> web application, and host it somewhere. Let's see.
>>>>>
>>>>> On Thu, Apr 12, 2012 at 1:55 PM, rakesh bachchan <[email protected]> wrote:
>>>>>
>>>>>> Well, I am interested in this and will be happy to find myself in the
>>>>>> group. I am also taking the online NLP class currently being run by
>>>>>> Stanford University (coursera.org).
>>>>>>
>>>>>> Thanking you
>>>>>> Rakesh Kumar Bachchan
>>>>>>
>>>>>> ------------------------------
>>>>>> From: pravin joshi <[email protected]>
>>>>>> To: FOSS Nepal <[email protected]>
>>>>>> Sent: Thursday, 12 April 2012 6:53 AM
>>>>>> Subject: [FOSS Nepal] Re: Natural language processing (Nepali)
>>>>>>
>>>>>> Just saw this mail thread. Anyway, below is Python code to extract all
>>>>>> Nepali words from the example text you gave.
>>>>>>
>>>>>> # -*- coding: utf-8 -*-
>>>>>>
>>>>>> data = """
>>>>>> <page>
>>>>>> [[en:Apple]]
>>>>>> [[ne:स्याउ]]
>>>>>> [[new:स्याउ]]
>>>>>> [[hi:सेव]]
>>>>>> [[fr:????]]
>>>>>> </page>
>>>>>> """
>>>>>>
>>>>>> def get_next_target(data):
>>>>>>     # Return the next [[ne:...]] word and the index where its link ends,
>>>>>>     # or (None, 0) when no Nepali link is left.
>>>>>>     start_link = data.find('[[ne:')
>>>>>>     if start_link == -1:
>>>>>>         return None, 0
>>>>>>     start_quote = data.find('[[ne:', start_link)
>>>>>>     end_quote = data.find(']]', start_quote + 1)
>>>>>>     nepWord = data[start_quote + 1:end_quote]
>>>>>>     # Drop the "[ne" language prefix and keep only the word itself.
>>>>>>     nepWord = nepWord.split(":")[-1]
>>>>>>     return nepWord, end_quote
>>>>>>
>>>>>> def get_all_nepData(data):
>>>>>>     # Collect every Nepali word, continuing after each link found.
>>>>>>     links = []
>>>>>>     while True:
>>>>>>         url, endpos = get_next_target(data)
>>>>>>         if url:
>>>>>>             links.append(url)
>>>>>>             data = data[endpos:]
>>>>>>         else:
>>>>>>             break
>>>>>>     return links
>>>>>>
>>>>>> if __name__ == "__main__":
>>>>>>     t = get_all_nepData(data)
>>>>>>     for i in t:
>>>>>>         print(i)
>>>>>>
>>>>>> Regarding autocomplete and word suggestion, you might want to look at
>>>>>> Bayes' theorem and using bulk text.
>>>>>> You might want to read this paper thoroughly:
>>>>>> http://norvig.com/spell-correct.html
>>>>>>
>>>>>> Pravin
>>>>>>
>>>>>> On Apr 11, 10:00 am, Rajesh Pandey <[email protected]> wrote:
>>>>>> > Hi Folks,
>>>>>> > "If any of you are interested in this, please reply so that we can
>>>>>> > work on this."
>>>>>> >
>>>>>> > I would like to form a group of a few people who are interested in
>>>>>> > data mining. If you are already involved in nlp-class.org, that
>>>>>> > would be great as well.
>>>>>> > Don't be confused by the term "data mining": the only thing we would
>>>>>> > do is extract Nepali words from the Wiktionary database dump
>>>>>> > <http://dumps.wikimedia.org/backup-index.html> and save them so that
>>>>>> > they can be used for various purposes.
>>>>>> > For instance:
>>>>>> > 1) Autocomplete
>>>>>> > 2) Nepali corpus
>>>>>> > 3) Nepali translator
>>>>>> >
>>>>>> > "Autocomplete" works by providing suggestions as we start typing: if
>>>>>> > we have a list of words, we can provide suggestions for the users.
>>>>>> >
>>>>>> > A Nepali corpus, containing words tagged as "Noun", "Adjective", etc.,
>>>>>> > can be created. I wish to use it in an open source translator for
>>>>>> > Nepali <http://code.google.com/p/nepaliwikipediatranslator>, in which
>>>>>> > I am also involved.
>>>>>> >
>>>>>> > The database dump of Wiktionary has an XML file which contains a lot
>>>>>> > of words and their English equivalents, along with equivalents in
>>>>>> > other available languages.
>>>>>> >
>>>>>> > For instance, there would be:
>>>>>> > <page>
>>>>>> > [[en:Apple]]
>>>>>> > [[ne:स्याउ]]
>>>>>> > [[new:स्याउ]]
>>>>>> > [[hi:सेव]]
>>>>>> > [[fr:????]]
>>>>>> > </page>
>>>>>> >
>>>>>> > etc.
>>>>>> > So we need to extract स्याउ and Apple, or a list of स्याउ, केरा, सुन्तला in
>>>>>> > a file.
>>>>>> > That way, we could suggest स्याउ when a user starts typing स, or
>>>>>> > suggest केरा when a user starts typing क. This is autocomplete.
>>>>>> >
>>>>>> > When we have स्याउ and Apple, we will have a Nepali translator as well.
>>>>>> >
>>>>>> > ==================
>>>>>> > Sorry for the ambiguous subject, "Natural language processing". I
>>>>>> > could have used a more specific title, or "Data mining" would have
>>>>>> > been another option. Thanks for your patience in reading this email :).
>>>>>> > ======================
>>>>>> > Want to create a web-based php/python/java application [Nepali
>>>>>> > translator] based on code.google.com/p/nepaliwikipediatranslator?
>>>>>> > You are welcome.
>>>>>> > (Not .NET, because we already have a lot of stuff in .NET, and we are
>>>>>> > looking for .NET alternatives so that we can use them on Linux easily.)
>>>>>> > ======================
>>>>>> > --
>>>>>> > Rajesh Pandey
>>>>>>
>>>>>> --
>>>>>> FOSS Nepal mailing list: [email protected]
>>>>>> http://groups.google.com/group/foss-nepal
>>>>>> To unsubscribe, e-mail: [email protected]
>>>>>>
>>>>>> Mailing List Guidelines:
>>>>>> http://wiki.fossnepal.org/index.php?title=Mailing_List_Guidelines
>>>>>> Community website: http://www.fossnepal.org/
>>>>
>>>> --
>>>> Regards,
>>>> Bishisht Bhatta
>>>> Pepsicola Townplanning-35
>>>> Kathmandu, Nepal
>>>> +977-(980-641-6309)
>>>> +977-(981-352-7344)
>>>> +977-(984-984-9525)
>>>>
>>>> ******************************************************************************************
>>>> Freelance Programmer
>>>> Cheap Webhosting and Webdesign.
>>>> Software Development and Maintenance.
>>>>
>>>> ******************************************************************************************
>>>> Computer Engineering Student, Nepal College of Information Technology
>>>> http://www.ncit.net.np/
>>>> Balkumari, Lalitpur
>>>>
>>>> ******************************************************************************************
>>>> Volunteer at Nepal Wireless Networking Project
>>>> Please visit
>>>> http://www.nepalwireless.net/
>>>> http://himanchal.org/

--
Rajesh Pandey
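
The autocomplete idea discussed in the thread (suggest स्याउ when a user types स, or केरा when a user types क) could be sketched roughly as below. This is only an illustration, assuming Python 3 and a word list already extracted into memory; the helper names `build_index` and `suggest` are hypothetical and are not part of the nepaliwikipediatranslator code. It uses a sorted list with binary search rather than the Bayes-based ranking Pravin mentions.

```python
import bisect

def build_index(words):
    """Return the unique words in sorted order so prefixes can be binary-searched."""
    return sorted(set(words))

def suggest(index, prefix, limit=5):
    """Return up to `limit` words from the sorted index that start with `prefix`."""
    # In a sorted list, all words sharing a prefix sit next to each other,
    # beginning at the position where the prefix itself would be inserted.
    start = bisect.bisect_left(index, prefix)
    matches = []
    for word in index[start:]:
        if not word.startswith(prefix):
            break
        matches.append(word)
        if len(matches) == limit:
            break
    return matches

# Hypothetical sample data: the three words used as examples in the thread.
words = ["स्याउ", "केरा", "सुन्तला"]
index = build_index(words)

print(suggest(index, "स"))
print(suggest(index, "क"))
```

Suggestions come back in code-point order, not by how common a word is. Ranking by frequency would need word counts from a larger body of text, which this sketch does not attempt.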
