Looks like a great example to put on the website too ;)

------------------------
Chris Mattmann
[email protected]




-----Original Message-----
From: Nick Burch <[email protected]>
Reply-To: <[email protected]>
Date: Thursday, June 26, 2014 5:23 AM
To: "[email protected]" <[email protected]>
Subject: RE: Question re installing Tika

>On Thu, 26 Jun 2014, Richard wrote:
>> Have you by any chance programmatically looped through a
>> directory full of PDFs and used Tika to extract the contents of each
>> into separate text or XML files? If so, what do you recommend
>> for the extraction?
>
>For a proof of concept, how about something simple like a bash for loop
>and the tika app?
>
>for i in *.pdf; do
>  j="${i%.pdf}"   # strip only the trailing .pdf from the name
>  java -jar tika-app.jar --text "$i" > "$j.txt"
>  java -jar tika-app.jar --xml "$i" > "$j.xml"
>done
>
>Nick
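
For anyone who wants to reuse the loop above on an arbitrary directory, here is a minimal sketch of it wrapped in a shell function. The function name extract_pdfs and the TIKA_JAR variable are made up for illustration, not part of Tika; the only Tika pieces assumed are the --text and --xml options of tika-app already used in the loop. It skips the case where the directory holds no PDFs and reports failed conversions on stderr rather than stopping.

```shell
#!/usr/bin/env bash
# Sketch only: convert every PDF in a directory to .txt and .xml with tika-app.
# TIKA_JAR is a hypothetical variable name; point it at your tika-app jar.
TIKA_JAR="${TIKA_JAR:-tika-app.jar}"

extract_pdfs() {
  local dir="${1:-.}" pdf base
  for pdf in "$dir"/*.pdf; do
    # With no matches the glob stays literal, so skip paths that do not exist.
    [ -e "$pdf" ] || continue
    base="${pdf%.pdf}"   # strip only the trailing .pdf
    java -jar "$TIKA_JAR" --text "$pdf" > "$base.txt" \
      || echo "text extraction failed: $pdf" >&2
    java -jar "$TIKA_JAR" --xml "$pdf" > "$base.xml" \
      || echo "XML extraction failed: $pdf" >&2
  done
}
```

Calling extract_pdfs /path/to/pdfs writes the .txt and .xml files next to each PDF; with no argument it works on the current directory. Note that each file starts two JVMs, so for a large collection this is likely to be slow and is best treated as a proof of concept.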

