Cool. Sounds like you are ahead of the game.  

On Oct 10, 2013, at 13:15, Dean Jones <[email protected]> wrote:

> On 10 October 2013 12:46, Ted Dunning <[email protected]> wrote:
>> For language detection, you are going to have a hard time doing better than
>> one of the standard packages for the purpose.  See here:
>> 
>> http://blog.mikemccandless.com/2011/10/accuracy-and-performance-of-googles.html
> 
> Thanks for the pointer, Ted. I'm a big fan of the Tika project; we use
> it for content extraction already. For various reasons, though, we have
> rolled our own language detector (mainly because neither of these packages
> covers all of the languages we need to identify: language-detection
> doesn't do Catalan, and Tika doesn't do Welsh).
> 
> Dean.
