Sorry, I was away.
You can start by getting the code from
code.google.com/p/nepaliwikipediatranslator


On Fri, Apr 13, 2012 at 8:56 PM, Bishisht Bhatta
<[email protected]>wrote:

> Can you give me some starting points? I don't know where to start, sorry,
> but I want to do it. Please let me know ASAP. :D I'm damn happy to start
> this, bro.
>
>
> On Fri, Apr 13, 2012 at 08:14, Rajesh Pandey <[email protected]>wrote:
>
>> This is great.
>> You can start right away. Just let me know if you need any help getting
>> started. I really need a few people to join the project so that we can work
>> together. Right now I would like someone to start porting the web
>> version of the translator to PHP/Perl/Python/Java.
>>
>>
>> On Fri, Apr 13, 2012 at 1:22 PM, Bishisht Bhatta <
>> [email protected]> wrote:
>>
>>> Hello there, I am also interested in this project, and I am also taking
>>> the course through coursera.org. Please let me know how we can
>>> start this project. I am very eager to work on this and want to start ASAP.
>>>
>>>
>>> On Thu, Apr 12, 2012 at 03:48, Rajesh Pandey <[email protected]>wrote:
>>>
>>>> @pravin: Thanks, Pravin.
>>>> I am extracting them and creating files such as:
>>>>
>>>> http://code.google.com/p/nepaliwikipediatranslator/source/browse/trunk/NepaliWikiPediaTranslator/bin/Debug/nounlist.txt
>>>> from this. I will be extracting Nepali, Hindi, and English text.
>>>>
>>>> @Rakesh: You are welcome. Those NLP classes are great. I hope to get
>>>> some contributors for the translator, convert it to a web application, and
>>>> host it somewhere. Let's see.
>>>>
>>>> On Thu, Apr 12, 2012 at 1:55 PM, rakesh bachchan <
>>>> [email protected]> wrote:
>>>>
>>>>> Well, I am interested in this and would be happy to join the
>>>>> group. I am also taking the online NLP class currently being run by
>>>>> Stanford University (coursera.org).
>>>>>
>>>>> Thanking you
>>>>> Rakesh Kumar Bachchan
>>>>>
>>>>>   ------------------------------
>>>>> *From:* pravin joshi <[email protected]>
>>>>> *To:* FOSS Nepal <[email protected]>
>>>>> *Sent:* Thursday, 12 April 2012 6:53 AM
>>>>> *Subject:* [FOSS Nepal] Re: Natural language processing (Nepali)
>>>>>
>>>>> Just saw this mail thread. Anyway, below is Python code to extract all
>>>>> Nepali words from the example text you gave.
>>>>> # -*- coding: utf-8 -*-
>>>>>
>>>>> data = """
>>>>> <page>
>>>>> [[en:Apple]]
>>>>> [[ne:स्याउ]]
>>>>> [[new:स्याउ]]
>>>>> [[hi:सेव]]
>>>>> [[fr:????]]
>>>>> </page>
>>>>> """
>>>>> def get_next_target(data):
>>>>>     # Return the next Nepali word and the index of its closing ']]',
>>>>>     # or (None, 0) when no '[[ne:' link is left.
>>>>>     start_link = data.find('[[ne:')
>>>>>     if start_link == -1:
>>>>>         return None, 0
>>>>>     end_link = data.find(']]', start_link)
>>>>>     if end_link == -1:
>>>>>         return None, 0
>>>>>     # Drop the '[[ne:' prefix and keep only the word itself.
>>>>>     nepWord = data[start_link + len('[[ne:'):end_link]
>>>>>     return nepWord, end_link
>>>>>
>>>>> def get_all_nepData(data):
>>>>>     links = []
>>>>>     while True:
>>>>>         url, endpos = get_next_target(data)
>>>>>         if url:
>>>>>             links.append(url)
>>>>>             data = data[endpos:]
>>>>>         else:
>>>>>             break
>>>>>     return links
>>>>>
>>>>> if __name__ == "__main__":
>>>>>     t = get_all_nepData(data)
>>>>>     for i in t:
>>>>>         print(i)
>>>>>
>>>>> Regarding autocomplete and word suggestion, you might want to look at
>>>>> Bayes' theorem applied to a large body of text. You might want to read this
>>>>> article thoroughly: http://norvig.com/spell-correct.html
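
One possible reading of that advice, as a rough sketch only: count how often each word occurs in a bulk text and rank the candidates for a typed prefix by that count, so the most likely word comes first. This covers only the word-frequency part of the approach. The sample `corpus` string and the function name `suggest` are made up for illustration and are not code from the project.

```python
# -*- coding: utf-8 -*-
from collections import Counter

# Hypothetical bulk text; a real run would read a large Nepali corpus.
corpus = u"स्याउ केरा स्याउ सुन्तला स्याउ केरा कागती"

# Word frequencies stand in for the prior probability P(word).
counts = Counter(corpus.split())


def suggest(prefix, limit=3):
    # Keep words that start with the prefix, most frequent first;
    # ties are broken by code point order so the result is stable.
    matches = [w for w in counts if w.startswith(prefix)]
    matches.sort(key=lambda w: (-counts[w], w))
    return matches[:limit]


if __name__ == "__main__":
    for word in suggest(u"क"):
        print(word)
```

With the sample text above, केरा is ranked before कागती because it occurs more often.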
>>>>>
>>>>> Pravin
>>>>>
>>>>> On Apr 11, 10:00 am, Rajesh Pandey <[email protected]> wrote:
>>>>> > Hi Folks,
>>>>> > *"If any of you are interested in this, please reply so that we can
>>>>> > work on this."*
>>>>> >
>>>>> > I would like to form a small group of people who are interested in
>>>>> > data mining. If you are already involved in nlp-class.org, that would be
>>>>> > great as well.
>>>>> > Don't be confused by the term "data mining": the only thing we would do
>>>>> > is extract Nepali words from the Wiktionary database
>>>>> > dump <http://dumps.wikimedia.org/backup-index.html> and save them so
>>>>> > that they can be used for various purposes.
>>>>> > For instance:
>>>>> > 1) Autocomplete
>>>>> > 2) Nepali corpus
>>>>> > 3) Nepali translator
>>>>> >
>>>>> > "Autocomplete" works by providing suggestions as we start typing: if
>>>>> > we have a list of words, we can offer suggestions to users.
>>>>> >
>>>>> > A Nepali corpus, containing words tagged as "Noun", "Adjective",
>>>>> > etc., can be created. I wish to use it in an "open source translator
>>>>> > for Nepali" <http://code.google.com/p/nepaliwikipediatranslator>,
>>>>> > in which I am also involved.
>>>>> >
>>>>> > The database dump of Wiktionary has an XML file which contains a lot of
>>>>> > words and their English equivalents, along with equivalents in other
>>>>> > available languages.
>>>>> >
>>>>> > For instance, there would be:
>>>>> > <page>
>>>>> > [[en:Apple]]
>>>>> > [[ne:स्याउ]]
>>>>> > [[new:स्याउ]]
>>>>> > [[hi:सेव]]
>>>>> > [[fr:????]]
>>>>> > </page>
>>>>> >
>>>>> > etc.
>>>>> > So we need to extract स्याउ and Apple, or a list of स्याउ, केरा,
>>>>> > सुन्तला, into a file, so that we can suggest स्याउ when a user starts
>>>>> > typing स, or suggest केरा when a user starts typing क. This is autocomplete.
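
A minimal sketch of that prefix lookup, assuming the extracted words have already been loaded into a sorted list. The `words` list and the function name `complete` are hypothetical and only illustrate the idea; they are not part of the existing translator code.

```python
# -*- coding: utf-8 -*-
import bisect

# Hypothetical word list; in practice it would be read from the extracted file.
words = sorted([u"स्याउ", u"केरा", u"सुन्तला", u"कागती"])


def complete(prefix):
    # Binary-search the first word that is >= prefix, then walk forward
    # while the words still start with the prefix.
    start = bisect.bisect_left(words, prefix)
    result = []
    for word in words[start:]:
        if not word.startswith(prefix):
            break
        result.append(word)
    return result


if __name__ == "__main__":
    for word in complete(u"स"):
        print(word)
```

Keeping the list sorted means every word sharing a prefix sits in one contiguous run, so a lookup does not have to scan the whole list.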
>>>>> >
>>>>> > When we have स्याउ and Apple, we will have a Nepali translator as well.
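
As an illustration of the pairing step, here is a small sketch that builds an English-to-Nepali lookup table from pages written in the simplified form shown above. The `dump` sample, the second page in it, and the function name `build_dictionary` are assumptions made for this example; the real Wiktionary dump is likely more involved.

```python
# -*- coding: utf-8 -*-
import re

# Sample in the same simplified form as the <page> example above.
dump = u"""
<page>
[[en:Apple]]
[[ne:स्याउ]]
[[hi:सेव]]
</page>
<page>
[[en:Banana]]
[[ne:केरा]]
</page>
"""


def build_dictionary(text):
    # Map each English word to the Nepali word found on the same page.
    pairs = {}
    for page in re.findall(r"<page>(.*?)</page>", text, re.DOTALL):
        # Each link looks like [[lang:word]]; collect them per language code.
        links = dict(re.findall(r"\[\[(\w+):([^\]]+)\]\]", page))
        if "en" in links and "ne" in links:
            pairs[links["en"]] = links["ne"]
    return pairs


if __name__ == "__main__":
    for english, nepali in sorted(build_dictionary(dump).items()):
        print(english + u" -> " + nepali)
```

Pages that lack either an `en` or an `ne` link are skipped, so the table only holds complete pairs.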
>>>>> >
>>>>> > ==================
>>>>> > Sorry for the ambiguous subject, "Natural language processing". I could
>>>>> > have used a more specific title; "Data mining" would have been another
>>>>> > option. Thanks for your patience in reading this email :).
>>>>> > ======================
>>>>> > Want to create a web-based PHP/Python/Java application [Nepali translator]
>>>>> > based on code.google.com/p/nepaliwikipediatranslator? You are welcome.
>>>>> > (Not .NET, because we already have a lot of stuff in .NET, and we are
>>>>> > looking for .NET alternatives so that we can use them on Linux easily.)
>>>>> > ======================
>>>>> > --
>>>>> > Rajesh Pandey
>>>>>
>>>>> --
>>>>> FOSS Nepal mailing list: [email protected]
>>>>> http://groups.google.com/group/foss-nepal
>>>>> To unsubscribe, e-mail: [email protected]
>>>>>
>>>>> Mailing List Guidelines:
>>>>> http://wiki.fossnepal.org/index.php?title=Mailing_List_Guidelines
>>>>> Community website: http://www.fossnepal.org/
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Rajesh Pandey
>>>>
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Bishisht Bhatta
>>> Pepsicola Townplanning-35
>>> Kathmandu,Nepal
>>> +977-(980-641-6309)
>>> +977-(981-352-7344)
>>> +977-(984-984-9525)
>>>
>>>
>>> ******************************************************************************************
>>> Freelance Programmer
>>> Cheap Webhosting and Webdesign.
>>> Software Development and Maintenance.
>>>
>>> ******************************************************************************************
>>> Computer Engineering Student, Nepal College of Information Technology
>>> http://www.ncit.net.np/
>>> Balkumari, Lalitpur
>>>
>>>
>>> ******************************************************************************************
>>> Volunteer at Nepal Wireless Networking Project
>>> Please visit
>>> http://www.nepalwireless.net/
>>> http://himanchal.org/
>>>
>>>
>>
>>
>>
>> --
>> Rajesh Pandey
>>
>
>
>



-- 
Rajesh Pandey
