Hi Folks,
*"If any of you are interested in this, please reply so that we can work
on it."*

I would like to form a small group of people who are interested in data
mining. If you are already taking part in nlp-class.org, that would be
great as well.
Do not be put off by the term "data mining": the only thing we would do is
extract Nepali words from the Wiktionary database dump
<http://dumps.wikimedia.org/backup-index.html> and save them so that they
can be used for various purposes.
For instance:
1) Autocomplete
2) Nepali corpus
3) Nepali translator

"Autocomplete" works by offering suggestions as the user starts typing. If
we have a list of words, we can suggest the ones that begin with what the
user has typed so far.
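
To make the idea concrete, here is a minimal sketch of prefix-based
suggestions in Python. It assumes the word list is already extracted and
kept sorted in memory; the function name suggest and the limit parameter
are only illustrative, not part of any existing project.

```python
import bisect

def suggest(sorted_words, prefix, limit=5):
    """Return up to `limit` words from `sorted_words` that start with `prefix`."""
    # In a sorted list, all words sharing a prefix sit next to each other,
    # so a binary search finds where that run begins.
    start = bisect.bisect_left(sorted_words, prefix)
    matches = []
    for word in sorted_words[start:]:
        if not word.startswith(prefix) or len(matches) >= limit:
            break
        matches.append(word)
    return matches

# The three example words from this email, sorted by code point.
WORDS = sorted(["स्याउ", "केरा", "सुन्तला"])

print(suggest(WORDS, "स"))
print(suggest(WORDS, "क"))
```

A sorted list with binary search is probably enough for a first version; a
trie would be an alternative if the list grows very large.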

A Nepali corpus, in which words are tagged as "Noun", "Adjective", etc.,
can also be created. I wish to use it in an open source translator for
Nepali <http://code.google.com/p/nepaliwikipediatranslator>, in which I am
also involved.

The database dump of Wiktionary has an XML file which contains a lot of
words and their English equivalents along with equivalents in other
available languages.
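
Since the dump is one large XML file, it is probably best to read it page
by page instead of loading it all at once. Below is a rough sketch using
Python's standard xml.etree.ElementTree.iterparse. The element names
(page, title, text) and the tiny SAMPLE document are assumptions made for
illustration; the real dump should be checked before relying on them.

```python
import io
import xml.etree.ElementTree as ET

def iter_page_texts(xml_file):
    """Yield the combined text of each <page> element, one page at a time."""
    for _event, elem in ET.iterparse(xml_file, events=("end",)):
        # Drop any XML namespace prefix such as "{http://...}page".
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "page":
            yield "".join(elem.itertext())
            # Free the memory used by this page before moving on.
            elem.clear()

# Hypothetical miniature dump, only to show how the function is called.
SAMPLE = (
    "<mediawiki><page><title>Apple</title>"
    "<text>[[en:Apple]] [[ne:स्याउ]]</text></page></mediawiki>"
).encode("utf-8")

for text in iter_page_texts(io.BytesIO(SAMPLE)):
    print(text)
```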

For instance, a page would contain something like:
<page>
[[en:Apple]]
[[ne:स्याउ]]
[[new:स्याउ]]
[[hi:सेव]]
[[fr:????]]
</page>

and so on.
So we need to extract the pair स्याउ and Apple, or a list such as स्याउ,
केरा, सुन्तला, into a file. We could then suggest स्याउ when a user starts
typing स, or suggest केरा when a user starts typing क. This is
autocomplete.
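
The extraction step could be sketched roughly as follows, assuming the
links look like [[code:word]] as in the example page above. The regular
expression and the function name extract_pair are my own guesses and would
need adjusting to the real dump.

```python
import re

# Matches links of the form [[lang:word]], e.g. [[ne:स्याउ]].
LINK = re.compile(r"\[\[([a-z-]+):([^\]|]+)\]\]")

def extract_pair(page_text, source="en", target="ne"):
    """Return (source_word, target_word) for one page, or None if either is missing."""
    links = dict(LINK.findall(page_text))
    if source in links and target in links:
        return links[source], links[target]
    return None

# The example page from this email.
PAGE = """<page>
[[en:Apple]]
[[ne:स्याउ]]
[[new:स्याउ]]
[[hi:सेव]]
[[fr:????]]
</page>"""

print(extract_pair(PAGE))
```

Pages without a Nepali entry return None and can simply be skipped.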

Once we have pairs such as स्याउ and Apple, we also have the basis for a
Nepali translator.
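
As a very small illustration, the extracted pairs could feed a
word-for-word lookup like the one below. This is only a sketch of the
dictionary idea, not how the existing translator project works, and the
function names are made up for this example.

```python
def build_dictionary(pairs):
    """Build an English-to-Nepali lookup from (english, nepali) pairs."""
    en_to_ne = {}
    for english, nepali in pairs:
        # Keep the first Nepali word seen for each English word.
        en_to_ne.setdefault(english.lower(), nepali)
    return en_to_ne

def translate_word(dictionary, word):
    """Return the Nepali equivalent, or the word unchanged if it is unknown."""
    return dictionary.get(word.lower(), word)

dictionary = build_dictionary([("Apple", "स्याउ")])
print(translate_word(dictionary, "apple"))
```

A real translator needs much more than this (grammar, word order, multiple
senses), but the word list is the starting point.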

==================
Sorry for the ambiguous subject, "Natural language processing". I could
have used a more specific title; "Data mining" would have been another
option. Thanks for your patience in reading this email :).
======================
Want to create a web-based PHP/Python/Java application [Nepali translator]
based on code.google.com/p/nepaliwikipediatranslator? You are welcome.
(Not .NET, because we already have a lot of stuff in .NET, and we are
looking for .NET alternatives so that we can use them on Linux easily.)
======================
-- 
Rajesh Pandey

-- 
FOSS Nepal mailing list: [email protected]
http://groups.google.com/group/foss-nepal
To unsubscribe, e-mail: [email protected]

Mailing List Guidelines: 
http://wiki.fossnepal.org/index.php?title=Mailing_List_Guidelines
Community website: http://www.fossnepal.org/
