Re: Language Detection Library/Code
On Tue, Dec 28, 2010 at 12:42 AM, Shashwat Anand anand.shash...@gmail.com wrote:
> Regarding dictionary lookup+n-gram approach I didn't quite understand what you wanted to say.

Run trigram analysis first. If it identifies multiple languages as matches within the error margin, split the text into words and look up each word in the respective dictionaries to get a second opinion.

Katie
-- 
CoderStack http://www.coderstack.co.uk/python-jobs
The Software Developer Job Board
-- 
http://mail.python.org/mailman/listinfo/python-list
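The "second opinion" step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `second_opinion`, the `ranked` list of (language, score) pairs, the `margin` value, and the tiny word sets are all hypothetical and do not come from any existing library.

```python
def second_opinion(text, ranked, dictionaries, margin=0.1):
    """Pick a language from trigram scores, consulting word lists on a near tie.

    ranked:       list of (language, score) pairs sorted best-first, as a
                  trigram detector might return them (hypothetical interface).
    dictionaries: mapping of language -> set of known lowercase words.
    margin:       score gap below which two languages count as tied
                  (an arbitrary illustrative value).
    """
    if not ranked:
        return None
    best_lang, best_score = ranked[0]
    # Languages whose score is within the error margin of the best one.
    candidates = [lang for lang, score in ranked if best_score - score <= margin]
    if len(candidates) == 1:
        return best_lang
    # Near tie: count how many words of the text each candidate's dictionary knows.
    words = text.lower().split()
    hits = {
        lang: sum(word in dictionaries.get(lang, ()) for word in words)
        for lang in candidates
    }
    # max() returns the first maximal item, so equal hit counts fall back
    # to the original trigram ranking.
    return max(candidates, key=lambda lang: hits[lang])


# Toy word lists; real dictionaries would be far larger.
DICTS = {
    "it": {"la", "casa", "bella"},
    "fr": {"la", "maison", "belle"},
}
```

For example, `second_opinion("la casa bella", [("fr", 0.42), ("it", 0.40), ("de", 0.05)], DICTS)` treats French and Italian as tied on trigram score and lets the word lists decide.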
Language Detection Library/Code
Can anyone suggest a *language detection library* in Python that works on a phrase of, say, 2-5 words?

-- 
~l0nwlf
Re: Language Detection Library/Code
On Mon, Dec 27, 2010 at 7:10 PM, Shashwat Anand anand.shash...@gmail.com wrote:
> Can anyone suggest a language detection library in python which works on a phrase of say 2-5 words.

Generally such libraries work by bigram/trigram frequency analysis, which means you're going to have a fairly high error rate with such short phrases. If you're only dealing with a handful of languages, it may make more sense to combine an existing library with a simple dictionary lookup model to improve accuracy.

Katie
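To illustrate what trigram frequency analysis means here, below is a minimal sketch, not the implementation of any particular library. The sample texts in `SAMPLES` are toy assumptions (a usable profile would need a much larger corpus per language), and the names `trigrams`, `similarity` and `rank_languages` are made up for this example. Cosine similarity is just one possible way to compare profiles.

```python
from collections import Counter


def trigrams(text):
    """Count character trigrams, padding the text with one space on each side."""
    padded = " " + " ".join(text.lower().split()) + " "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def similarity(a, b):
    """Cosine similarity between two trigram Counters (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = (sum(c * c for c in a.values()) * sum(c * c for c in b.values())) ** 0.5
    return dot / norm if norm else 0.0


# Toy training text per language (an assumption for illustration only).
SAMPLES = {
    "de": "der die das und ist nicht ich habe ein eine mit für auf",
    "fr": "le la les et est ne pas je suis un une avec pour sur",
    "it": "il lo la e è non io sono un una con per su che",
}
PROFILES = {lang: trigrams(text) for lang, text in SAMPLES.items()}


def rank_languages(phrase):
    """Return (language, score) pairs sorted from best to worst match."""
    probe = trigrams(phrase)
    scores = {lang: similarity(probe, profile) for lang, profile in PROFILES.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

With only a few words in the phrase, very few trigrams are available to compare, which is why the scores for related languages can end up close together.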
Re: Language Detection Library/Code
On Tue, Dec 28, 2010 at 6:03 AM, Katie T ka...@coderstack.co.uk wrote:
> On Mon, Dec 27, 2010 at 7:10 PM, Shashwat Anand anand.shash...@gmail.com wrote:
>> Can anyone suggest a language detection library in python which works on a phrase of say 2-5 words.
>
> Generally such libraries work by bi/trigram frequency analysis, which means you're going to have a fairly high error rate with such small phrases. If you're only dealing with a handful of languages it may make more sense to combine an existing library with a simple dictionary lookup model to improve accuracy.
>
> Katie

In fact, I'm dealing with very few languages: German, French, Italian, Portuguese and Russian. I read papers mentioning bigram/trigram frequency but was unable to find any library. 'guess-language' doesn't perform well at all. The cld (Compact Language Detector) module of Google Chrome performs well, but it is not a standalone library (I hope someone ports it).

Regarding the dictionary lookup + n-gram approach, I didn't quite understand what you wanted to say.
Re: Language Detection Library/Code
Hi,

I have already developed a language detection library in Python. Here is the link:

http://code.google.com/p/langdet/

With regards,
Santhosh V.Kumar