Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Tika Wiki" for change 
notification.

The "TikaJAXRS" page has been changed by HaydenYoung:
https://wiki.apache.org/tika/TikaJAXRS?action=diff&rev1=17&rev2=18

  
  = Extracting A Document From A URL =
  
- It is possible to use remote files with TikaJAXRS by downloading it via its 
URL first then piping it to the appropriate service:
+ It is possible to use a remote file with TikaJAXRS by first downloading it 
via its URL and then piping it to the appropriate service:
  {{{
- curl "http://url/to/my.file" | curl -X PUT -T - http://localhost:9998/meta
+ $ curl -s "http://url/to/my.file" | curl -X PUT -T - http://localhost:9998/meta
- curl "http://url/to/my.file" | curl -X PUT -T - http://localhost:9998/tika
+ $ curl -s "http://url/to/my.file" | curl -X PUT -T - http://localhost:9998/tika
  }}}
  
  The caveat with the above is that it fetches the entire file, so large files 
such as video can take some time to download. With services such as "meta" it 
may be faster to fetch only the remote file's HTTP headers first using cURL:
  {{{
- curl -I http://url/to/my.file
+ $ curl -I http://url/to/my.file
  }}}
- If the file's contents is suitable for extraction (E.g. it is a PDF, word 
processing document or some other text file), send it on to TikaJAXRS:
+ If the file's content is suitable for extraction (e.g. its content type is 
PDF, a word processing document or some other text format), send it on to 
TikaJAXRS:
  {{{
- curl "http://url/to/my.file" | curl -X PUT -T - http://localhost:9998/tika
+ $ curl -s "http://url/to/my.file" | curl -X PUT -T - http://localhost:9998/tika
  }}}
+ While cURL's header output is not as cleanly formatted as the output of 
TikaJAXRS's "meta" service, the faster response may outweigh this drawback.
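
The header check described above could be scripted along these lines. This is a minimal sketch, not part of TikaJAXRS: the function names (remote_content_type, is_extractable_type, extract_if_suitable) and the list of accepted content types are assumptions chosen for illustration. It also assumes the remote server answers HEAD requests with a Content-Type header.

```shell
#!/bin/sh
# Sketch only: function names and the accepted content types below are
# illustrative assumptions, not something TikaJAXRS defines.

# Print the Content-Type of a URL using a HEAD request (no body is fetched).
# -L follows redirects, so the last Content-Type header seen is the one kept.
remote_content_type() {
    curl -sIL "$1" |
        tr -d '\r' |
        awk -F': *' 'tolower($1) == "content-type" { ctype = $2 } END { print ctype }'
}

# Succeed (exit status 0) if the content type looks like a document or text.
is_extractable_type() {
    case "$1" in
        application/pdf*|application/msword*) return 0 ;;
        application/vnd.openxmlformats-officedocument.*) return 0 ;;
        application/vnd.oasis.opendocument.*) return 0 ;;
        text/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Download the file and pipe it to the "tika" service only when the headers
# suggest it is suitable; otherwise skip the download entirely.
extract_if_suitable() {
    url=$1
    ctype=$(remote_content_type "$url")
    if is_extractable_type "$ctype"; then
        curl -s "$url" | curl -s -X PUT -T - http://localhost:9998/tika
    else
        echo "Skipping $url (content type: ${ctype:-unknown})" >&2
        return 1
    fi
}
```

With these functions loaded, running extract_if_suitable "http://url/to/my.file" would either print the extracted text or skip the download with a message on standard error.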
  
