Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Tika Wiki" for change 
notification.

The "TikaJAXRS" page has been changed by maxcom:
http://wiki.apache.org/tika/TikaJAXRS?action=diff&rev1=7&rev2=8

Comment:
update tika-server documentation

  
  = Services =
  
+ All services use HTTP "PUT" requests. The original file must be sent in the 
request body without any additional encoding (do not use multipart/form-data 
or other containers).
+ 
+ You may optionally specify the content type in the "Content-Type" header. If 
you do not specify a mime type, Tika will use its detectors to guess it.
+ 
+ You may specify an additional identifier in the URL after the resource name, 
like "/tika/my-file-i-sent-to-tika-resource" for the "/tika" resource. Tika 
server uses this name only for logging, so you may put a file name, UUID or 
any other identifier there (do not forget to URL-encode any special characters).
+ 
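The URL-encoding reminder above can be sketched as a small shell helper. This is only an illustration: the function name `urlencode` is hypothetical and not part of tika-server, and it assumes the standard `od` and `tr` utilities are available. It percent-encodes every byte outside the RFC 3986 unreserved set.

```shell
# Hypothetical helper (not part of tika-server): percent-encode an identifier
# so it can be appended to a resource URL such as /tika/<identifier>.
urlencode() {
    # Dump the argument as lowercase hex bytes, one per line, then emit each
    # byte either literally (unreserved characters) or as %XX.
    printf '%s' "$1" | od -An -v -tx1 | tr -s ' ' '\n' | while read -r hex; do
        [ -n "$hex" ] || continue
        case $hex in
            3[0-9]|4[1-9a-f]|5[0-9a]|6[1-9a-f]|7[0-9a]|2d|2e|5f|7e)
                # 0-9, A-Z, a-z, '-', '.', '_', '~' are passed through as-is
                printf "\\$(printf '%03o' "$((0x$hex))")" ;;
            *)
                printf '%%%s' "$(printf '%s' "$hex" | tr 'a-f' 'A-F')" ;;
        esac
    done
    echo
}

# Usage sketch (needs a running server on the port used in this page):
# curl -T "my report.doc" "http://localhost:9998/tika/$(urlencode 'my report.doc')"
```

Since the server uses the identifier only for logging, a UUID that needs no encoding at all is an equally reasonable choice.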
+ Resources may return the following HTTP codes:
+ 
+ * 200 OK - request completed successfully
+ * 204 No Content - request completed successfully, result is empty
+ * 422 Unprocessable Entity - unsupported mime type, encrypted document, etc.
+ * 500 Error - error while processing the document
+ 
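A client script can branch on the codes listed above. The following is a minimal sketch, not an official client: the function name `describe_status` and the output file `out.txt` are made up for illustration, and the commented usage assumes a server running on localhost:9998 as in the other examples on this page.

```shell
# Hypothetical helper: map a tika-server HTTP status code to the meaning
# given in the list above.
describe_status() {
    case $1 in
        200) echo "ok: request completed successfully" ;;
        204) echo "empty: request completed successfully, result is empty" ;;
        422) echo "unprocessable: unsupported mime type, encrypted document, etc." ;;
        500) echo "error: error while processing the document" ;;
        *)   echo "unexpected status: $1" ;;
    esac
}

# Usage sketch (needs a running server): curl's -w '%{http_code}' prints the
# status code after the transfer, while -o keeps the response body separate.
# code=$(curl -s -o out.txt -w '%{http_code}' -T price.xls http://localhost:9998/tika)
# describe_status "$code"
```

Treating 204 separately from 200 lets a script skip empty results without counting them as failures.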
  == Metadata Resource ==
  
  {{{
@@ -39, +52 @@

  
  {{{
  $ curl -X PUT --data-binary @zipcode.csv http://localhost:9998/meta --header "Content-Type: text/csv"
+ $ curl -T price.xls http://localhost:9998/meta
  }}}
  
  Returns:
@@ -68, +82 @@

  
  {{{
  $ curl -X PUT --data-binary @GeoSPARQL.pdf http://localhost:9998/tika --header "Content-type: application/pdf"
+ $ curl -T price.xls http://localhost:9998/tika
  }}}
  
  == Unpacker Resource ==
@@ -75, +90 @@

  {{{
  /unpacker
  }}}
- HTTP PUTs an embedded document type to the /unpacker service and you get back 
a zip of the extracted text for each resource filename in the original PUT 
embedded document type.
+ HTTP PUT a document with embedded resources to the /unpacker service and you 
get back a zip or tar archive of the extracted text for each resource filename 
in the original document.
+ 
+ The default return type is ZIP (without internal compression). Use the 
"Accept" header to request a TAR archive instead.
  
  Some Example calls with cURL:
  
@@ -85, +102 @@

  $ curl -X PUT --data-binary @foo.zip http://localhost:9998/unpacker --header "Content-type: application/zip"
  }}}
  
+ === PUT doc file and get back met file tar ===
+ 
+ {{{
+ $ curl -T Doc1_ole.doc -H "Accept: application/x-tar" http://localhost:9998/unpacker > /var/tmp/x.tar
+ }}}
+ 
+ == "All" Resource ==
+ 
+ Get text, metadata and attachments in one request.
+ 
+ {{{
+ $ curl -T Doc1_ole.doc http://localhost:9998/all > /var/tmp/x.zip
+ }}}
+ 
+ Text is stored in the "__TEXT__" file, metadata CSV in "__METADATA__". Use 
the "Accept" header if you want TAR output.
+ 
