On Fri, 5 Feb 2016, Steven White wrote:
I went over to Tika's home page and tried to figure out which JARs I
need (so I don't have to use the Tika JARs that come with Solr). I looked
around and couldn't find a "dist" of the JARs.
There isn't one - it's expected that you'll be using Maven
unning Solr
> inside foreign Application Servers. So everything should work out of box.
>
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> *Fr
:44 PM
To: user@tika.apache.org
Subject: Re: Using Tika that comes with Solr 5.2
Nick, that would be a good thing to do: changing Ignore to Warn, otherwise
newcomers will have no clue why this isn't working.
Another question to the team regarding this topic.
I see JARs under solr\contrib\morphlines-cell\lib\ and
solr\contrib\morphlines-core\lib\ The ones under "morphlines-cel
On Wed, 3 Feb 2016, Uwe Schindler wrote:
The reason for this behaviour is part of Tika: if a parser cannot load
because classes it refers to are missing, it is automatically
disabled. Because you were missing the actual PDF/PowerPoint/… classes, this
is what happens for all those parsers.
I wond
.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
From: Steven White [mailto:swhite4...@gmail.com]
Sent: Wednesday, February 03, 2016 2:48 PM
To: user@tika.apache.org
Subject: Re: Using Tika that comes with Solr 5.2
Thanks everyone. After posting about this issue, I found my issue. I was
missing a whole set of Tika JARs that are found under Solr:
\solr\contrib\extraction\lib\
Steve
On Wed, Feb 3, 2016 at 8:29 AM, Nick Burch wrote:
On Tue, 2 Feb 2016, Steven White wrote:
What I'm finding is that Tika will not extract the raw text from PDF,
PowerPoint, etc. files, but it will from raw text files.
I'd suggest you try some of the steps in the troubleshooting page:
http://wiki.apache.org/tika/Troubleshooting%20Tika
Probably st
The problem (I think) is that tika-parsers.jar includes just the Tika parsers
(wrappers) around a boatload of actual parsers/dependencies (POI, PDFBox, etc).
If you are using jars, I’d recommend the tika-app.jar which includes all
dependencies.
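
One quick way to tell whether the underlying parser libraries are actually
present is to probe the classpath for a class from each of them. The sketch
below is only an illustration: the class name ClasspathCheck and the helper
isOnClasspath are made up for this example, and the two probed class names
(one from PDFBox, one from POI) are assumptions about which classes the
libraries ship, so adjust them to the versions you have.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ClasspathCheck {

    // Returns true if the named class can be found by this class's loader.
    // The class is only looked up, not initialized.
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or a class whose own dependencies are missing.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed representative classes; not an exhaustive list of what Tika needs.
        Map<String, String> probes = new LinkedHashMap<>();
        probes.put("PDFBox", "org.apache.pdfbox.pdmodel.PDDocument");
        probes.put("POI", "org.apache.poi.poifs.filesystem.POIFSFileSystem");

        for (Map.Entry<String, String> probe : probes.entrySet()) {
            String status = isOnClasspath(probe.getValue()) ? "found" : "MISSING";
            System.out.println(probe.getKey() + ": " + status);
        }
    }
}
```

Run it with the same classpath as your application. If a library shows up as
MISSING, the Tika parsers that wrap it will most likely be disabled without
any error.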
From: Steven White [mailto:swhite4...@gmail.com]
S