Alex Ott
Thu, 03 Dec 2009 23:49:55 -0800
Re

The current snapshot of Tika (0.6) processes this file correctly, returning 在么 as text.
Li Leon at "Fri, 4 Dec 2009 11:04:58 +0800" wrote:
 LL> Hi all,
 LL>
 LL> I'm using the following command to filter out the attached doc which is in Chinese. The doc was filtered fine but only with gibberish output.
 LL> Any ideas?
 LL>
 LL> "type "chinese char.doc" | java -jar "tika-app-0.4.jar" -x"

--
With best wishes, Alex Ott, MBA
http://alexott.blogspot.com/
http://xtalk.msk.su/~ott/
http://alexott-ru.blogspot.com/