Hi,
To be absolutely sure that only Tika is used you should also remove the
parse-html plugin from plugin.includes. Make sure all references to the
parse-html plugin are removed from parse-plugins.xml. (Looking at your
snippet, this seems to be the case.)
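
As a quick sanity check of the plugin.includes change, here is a minimal Python sketch. It assumes that plugin.includes is a regular expression that must match a whole plugin id for the plugin to be loaded, and that Python's `re` behaves like Java's regex engine for a pattern this simple. The two property values and the helper name `enabled_plugins` are made up for illustration; they are not taken from your configuration.

```python
import re


def enabled_plugins(plugin_includes, plugin_ids):
    """Return the plugin ids that the plugin.includes regex would activate.

    Assumption: a plugin is activated only if the regex matches its whole id.
    """
    pattern = re.compile(plugin_includes)
    return [pid for pid in plugin_ids if pattern.fullmatch(pid)]


# Hypothetical plugin.includes values, before and after removing parse-html.
before = "protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)"
after = "protocol-http|urlfilter-regex|parse-tika|index-(basic|anchor)"

candidates = ["parse-html", "parse-tika"]

print(enabled_plugins(before, candidates))
print(enabled_plugins(after, candidates))
```

With the second value only parse-tika should remain in the list, which is the state the advice above is aiming for.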
With Tika itself or Boilerpipe I'm
Sorry, I think it works. I was testing with 'parsechecker', which doesn't
apply 'regexnormalizer' rules by default.
So, case solved, thanks a lot!
On Sunday, September 9, 2012, Sebastian Nagel wrote:
Redirects are filtered and normalized. It works for 1.4/1.5 and should work
for trunk as well.
One subtlety:
Please unsubscribe me.
Can you please let me know how you solved your problem?
I am getting the same error you had:
the index contains the PDF files' names but not the content in those