Hi all,

I just started using Tika. I tried extracting English words from HTML files,
and it works fine.
But after I integrated a Chinese word tokenizer into Solr and searched
again, many English words that previously matched no longer produce hits.

Does Tika already have a solution for extracting Chinese content from an
HTML file?

Thanks in advance.
