Thanks very much, Chris ... it's all working now.
Have you by any chance programmatically looped through a
directory full of PDFs and used Tika to extract the contents of each into
separate text or XML files? If so, how would you recommend doing the extraction?
Kind regards
Richard 
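
For the batch extraction asked about above, one possible approach is to call the same tika-app jar once per PDF and save what it prints. The following is only a minimal sketch, not something confirmed in this thread: it uses Python as the driver, assumes the jar has been renamed to tika-app.jar and sits in the current directory, and relies on the --text and --xml switches that "java -jar tika-app.jar --help" lists. The function names and the folder names "pdfs" and "extracted" are hypothetical.

```python
import subprocess
from pathlib import Path

# Assumption: the jar was renamed to tika-app.jar and is in the current directory.
TIKA_JAR = Path("tika-app.jar")


def build_command(jar, pdf, as_xml=False):
    """Build the tika-app command line for a single PDF."""
    mode = "--xml" if as_xml else "--text"  # XHTML output or plain text
    return ["java", "-jar", str(jar), mode, str(pdf)]


def output_path(pdf, out_dir, as_xml=False):
    """Map e.g. pdfs/report.pdf to <out_dir>/report.txt (or report.xml)."""
    suffix = ".xml" if as_xml else ".txt"
    return Path(out_dir) / (Path(pdf).stem + suffix)


def extract_all(pdf_dir, out_dir, as_xml=False):
    """Run Tika once per *.pdf in pdf_dir and write one output file per PDF."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        # tika-app prints the extracted content to stdout; capture it as bytes.
        result = subprocess.run(
            build_command(TIKA_JAR, pdf, as_xml),
            capture_output=True,
            check=True,  # raise if java or Tika exits with an error
        )
        target = output_path(pdf, out_dir, as_xml)
        target.write_bytes(result.stdout)
        written.append(target)
    return written


# Hypothetical usage: extract_all("pdfs", "extracted")
```

Starting a new Java process for every file is slow for large directories, so this is best suited to a one-off run over a modest number of PDFs.
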
> Date: Mon, 16 Jun 2014 23:03:49 -0700
> Subject: Re: Question re installing Tika
> From: [email protected]
> To: [email protected]; [email protected]
> CC: [email protected]
> 
> Hi Richard,
> 
> No problem at all, my attempted answers below:
> 
> 
> -----Original Message-----
> From: Richard <[email protected]>
> Date: Monday, June 16, 2014 3:47 PM
> To: Chris Mattmann <[email protected]>, "[email protected]"
> <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Subject: RE: Question re installing Tika
> 
> >Thanks very much for responding to me, Chris. I hope you don't mind if I
> >ask a few more questions about the setup process, which I have carried out
> >so far as follows (by way of background, I have a Windows 7 64-bit PC):
> >
> >
> >1) I downloaded the tika-app-1.5.jar
> ><http://www.apache.org/dyn/closer.cgi/tika/tika-app-1.5.jar> from
> >http://tika.apache.org/download.html
> >2) A friend recommended renaming it to tika-app.jar, which I have
> >done, and I placed it in my c:\Users\Myusername directory
> >3) I added the environment variable JAVA_HOME (as a system variable).
> >4) I then brought up the cmd window, changed directory to
> >c:\Users\Myusername and typed in  "java -jar tika-app.jar"
> >
> >
> >However the gui does not appear.
> 
> Yep, if you type java -jar tika-app.jar --help, you'll see the command
> line output and the switches.
> I believe to pull the GUI up you need to do:
> 
> java -jar tika-app.jar --gui
> 
> > 
> >
> >
> >I have the latest version of Java: Version 7 Update 60 but I was
> >wondering if I needed the Java SDK to run this?
> >
> >
> >Many thanks again for your help
> 
> No problem, see above :)
> 
> Cheers,
> Chris
> 
> >
> >
> >Richard
> >
> >
> >> From: [email protected]
> >> To: [email protected]; [email protected]
> >> CC: [email protected]
> >> Subject: Re: Question re installing Tika
> >> Date: Thu, 12 Jun 2014 03:27:20 +0000
> >> 
> >> Hi Richard,
> >> 
> >> Hope you are well, will try and answer below:
> >> 
> >> 
> >> -----Original Message-----
> >> 
> >> From: Richard <[email protected]>
> >> Date: Friday, June 6, 2014 6:07 AM
> >> To: "[email protected]" <[email protected]>,
> >> "[email protected]" <[email protected]>
> >> Subject: Question re installing Tika
> >> 
> >> >Hello
> >> > 
> >> >I am new to the Apache suite of products and, more generally, to
> >> >dealing with text in PDFs. In particular I am trying to install Tika (the
> >> >tika-app-1.5.jar) as well as Solr on my Windows 7 PC.
> >> >
> >> > 
> >> >However I am confused about how to do the Tika installation.
> >> >
> >> > 
> >> >From reading various webpages (eg
> >> >http://tika.apache.org/1.5/gettingstarted.html
> >> ><http://tika.apache.org/1.5/gettingstarted.html>) it seems I need to
> >> > 
> >> >1) 
> >> >Download the .jar from
> >> >http://tika.apache.org/download.html
> >> ><http://tika.apache.org/download.html> (do I need to put it in a
> >> >specific Windows folder?)
> >> 
> >> Nope, you don't have to put it in any specific folder; anywhere you are
> >> comfortable calling the jar from will do.
> >> 
> >> >2) 
> >> >Download Maven 2 (from http://maven.apache.org/ ) and follow the
> >> >instructions for Windows on
> >> >http://maven.apache.org/download.cgi#Installation
> >> 
> >> No need to do this unless you are building from scratch.
> >> 
> >> >3) 
> >> >Also where do I set the base directory?
> >> 
> >> You just need to put the Apache Tika tika-app-*version*.jar file into some
> >> folder, and then
> >> call it by doing java -jar /path/to/tika-app-*version*.jar --help
> >> 
> >> > 
> >> >4) 
> >> >Where do I run the command "mvn install" from? Is it the command line?
> >> 
> >> If you are building from source, then you would run this at the
> >> top-level directory containing
> >> files like pom.xml, tika-parent, tika-parsers, etc.
> >> 
> >> >
> >> >
> >> >Any help would be most gratefully received.
> >> 
> >> Cheers!
> >> 
> >> Chris
> >> 
> >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >> Chris Mattmann, Ph.D.
> >> Chief Architect
> >> Instrument Software and Science Data Systems Section (398)
> >> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >> Office: 168-519, Mailstop: 168-527
> >> Email: [email protected]
> >> WWW: http://sunset.usc.edu/~mattmann/
> >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >> Adjunct Associate Professor, Computer Science Department
> >> University of Southern California, Los Angeles, CA 90089 USA
> >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >> 
> >> 
> >> 
> >> 
> >> >
> >> 
> >
> >
> >
> >
> 
> 
                                          
