Hi Dave,
thanks for the clarification -- the documentation isn't exactly helpful
here.
However, still no success with the tika-app. Here's what I tried
(bear with me, I'm on a Windows system):
- ran java -jar tika-app-1.2.jar -t -s 9998
- said "nc localhost 9998 < somefile.pdf"
- doesn't come back.
- ran java -jar tika-app-1.2.jar -t -p 9998 [as per your mail, albeit
undocumented...]
- said "nc localhost 9998 < somefile.pdf"
- doesn't come back
- hit ^C in the nc terminal window --> got a "connection reset"
exception in the tika-app terminal window...
- using python: opened an AF_INET/SOCK_STREAM socket, connected to
localhost/9998, sent the contents of a pdf file, tried to recv(1024) -->
timeout
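A possible explanation for the hangs above (an assumption, not confirmed anywhere in this thread): the server side may keep reading until it sees end-of-input on the connection, and a client that sends the file but never closes its write side would then wait forever for a reply. The sketch below shows the Python attempt with an explicit half-close after sending. The function name extract_via_socket and its defaults are made up for illustration; only localhost and port 9998 come from the commands above.

```python
import socket


def extract_via_socket(data: bytes, host: str = "localhost", port: int = 9998,
                       timeout: float = 60.0) -> bytes:
    """Send a document to a socket server and return everything it writes back.

    Hypothetical helper: assumes the server reads until end-of-input and
    closes the connection once it has written its output.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(data)
        # Half-close: signal end-of-input while keeping the read side open.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # peer closed the connection: output is complete
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

To try it, read the PDF in binary mode, e.g. open("somefile.pdf", "rb").read(), and pass the bytes to extract_via_socket. If the call still times out, the missing half-close was not the cause.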
What am I doing wrong?
If I could rely on tika-server always giving me the text UTF-8
encoded, that would be fine with me. But since I cannot specify the
encoding there, I would rather use tika-app, because here I can
say "-eutf8".
confused,
Oliver
On 22.07.2012 01:42, Dave Meikle wrote:
Hi Oliver,
I wonder if you are confusing the Tika Application in server mode
(-s or -p option), which allows socket-level communication, with the
Tika Server, which allows RESTful communication.
Using the Tika Application you can use the server mode to perform
extraction via a TCP socket. For example, the commands below will
extract the contents of the sample file:
java -jar tika-app-1.2.jar -t -s 9998 &
nc localhost 9998 < samplefile.pdf
With the Tika Server you can access Tika through RESTful calls. For
example, the commands below will extract the contents of a sample file
via an HTTP PUT:
java -jar tika-server-1.2.jar &
curl -T samplefile.pdf http://localhost:9998/tika
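For a client without curl, the same HTTP PUT can be sketched in Python with the standard library. This is a minimal illustration, not an official client: the function name put_document is hypothetical, the application/octet-stream content type is an assumption, and the default URL is simply the one from the curl command above.

```python
import urllib.request


def put_document(data: bytes, url: str = "http://localhost:9998/tika",
                 timeout: float = 60.0) -> bytes:
    """PUT raw document bytes to the given URL and return the response body."""
    request = urllib.request.Request(
        url,
        data=data,
        method="PUT",
        # Assumption: a generic binary content type is acceptable to the server.
        headers={"Content-Type": "application/octet-stream"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

The body is returned as raw bytes on purpose, so the caller decides how to decode it rather than the helper guessing at the encoding.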
You can find out more about the JSR-311 Tika Server here:
http://wiki.apache.org/tika/TikaJAXRS
Cheers,
Dave
On 21 Jul 2012, at 17:08, Oliver Steinau wrote:
Hi,
I want to run tika-app-1.2.jar in server mode (using java -jar
tika-app-1.2.jar -t -s 9998), but it doesn't respond to any request.
Using netstat I see that it listens on port 9998 and accepts an
incoming request, but other than that nothing happens.
Running it in GUI or CLI mode works just fine, though.
I downloaded an old version of the tika-server
(tika-server-1.0-20110309.180805-5.jar), and running this works just
fine. However, I would of course like to use the newest version...
I'm on a 64bit Windows 7 system, and "java -version" says:
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
Any help would be greatly appreciated!
Thanks,
Oliver