Re: Having Problem in Word Count and Language Detaction

2013-10-26 Thread Chris Mattmann
Hi Animesh,

Please detail your issue here on dev@tika.apache.org and I'm sure someone can help.

Cheers,
Chris

-----Original Message-----
From: Animesh Kumar animesh.sa...@gmail.com
Date: Wednesday, October 23, 2013 9:15 PM
To: dev-ow...@tika.apache.org dev-ow...@tika.apache.org
Subject: Fwd:

Re: Having Problem in Word Count and Language Detaction

2013-10-26 Thread Oleg Tikhonov
Hi Animesh,

My wild guess is that the N-gram profile for Chinese wasn't trained very well. Try recreating the Chinese language profile. Have a look here: http://www.ibm.com/developerworks/opensource/tutorials/os-apache-tika/section6.html

Hope it helps.

On Sat, Oct 26, 2013 at 8:48 PM, Chris Mattmann
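To illustrate why the quality of an N-gram profile matters, here is a minimal sketch of profile-based language detection. It is not Tika's implementation: the function names (`ngram_profile`, `distance`, `detect`), the choice of character trigrams, the Euclidean distance, and the tiny training strings are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngram_profile(text, n=3):
    """Return relative frequencies of the character n-grams in text."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.items()} if total else {}


def distance(p, q):
    """Euclidean distance between two n-gram frequency profiles."""
    keys = set(p) | set(q)
    return math.sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))


def detect(text, profiles, n=3):
    """Return the language whose profile is closest to the text's profile."""
    sample = ngram_profile(text, n)
    return min(profiles, key=lambda lang: distance(sample, profiles[lang]))


# Hypothetical, very small training corpora; a usable profile needs far more text.
profiles = {
    "en": ngram_profile("the quick brown fox jumps over the lazy dog and the cat"),
    "zh": ngram_profile("我们今天去公园散步然后回家吃饭我们明天去学校上课"),
}
```

In a sketch like this, a profile built from too little or unrepresentative text shares few N-grams with real input, so its distance to that input stays large and another language can win. That is the kind of failure retraining the profile on a larger corpus is meant to address.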

Re: Having Problem in Word Count and Language Detaction

2013-10-26 Thread Oleg Tikhonov
This one is better: https://issues.apache.org/jira/browse/TIKA-546

On Sat, Oct 26, 2013 at 10:05 PM, Oleg Tikhonov o...@apache.org wrote:

Hi Animesh, my wild guess is that the N-gram profile for Chinese wasn't trained very well. Try recreating the Chinese language profile. Have a look here: