Tika language extraction

2010-06-10 Thread Sandhya Agarwal
Hello, It is observed that Tika does not extract the Content-Language for documents encoded in UTF-8. For natively encoded documents, it works fine. Any idea how we can resolve this? Thanks, Sandhya

Re: Tika language extraction

2010-06-10 Thread Ken Krugler
Hi Sandhya,

> It is observed that Tika does not extract the Content-Language for documents encoded in UTF-8. For natively encoded documents, it works fine. Any idea how we can resolve this?

I would post this question to the u...@tika.apache.org mailing list, and include more details on