The "TikaJAXRS" page has been changed by HaydenYoung:
https://wiki.apache.org/tika/TikaJAXRS?action=diff&rev1=16&rev2=17

  $ curl -T Doc1_ole.doc -H "Accept: application/x-tar" http://localhost:9998/unpacker > /var/tmp/x.tar
  }}}

  == "All" resource ==
- Get text, metadata and attachements in one request.
+ Get text, metadata and attachments in one request.
  {{{
  $ curl -T Doc1_ole.doc http://localhost:9998/all > /var/tmp/x.zip
  }}}

  Text is stored in the {{{__TEXT__}}} file, metadata CSV in {{{__METADATA__}}}. Use the "Accept" header if you want TAR output.

+ = Extracting A Document From A URL =
+
+ It is possible to use a remote file with TikaJAXRS by downloading it via its URL first, then piping it to the appropriate service:
+ {{{
+ curl "http://url/to/my.file" | curl -X PUT -T - http://localhost:9998/meta
+ curl "http://url/to/my.file" | curl -X PUT -T - http://localhost:9998/tika
+ }}}
+
+ The caveat with the above is that it fetches the entire file, so large files such as video can take some time to download. With services such as "meta" it may be faster to fetch a remote file's HTTP headers first using cURL:
+ {{{
+ curl -I http://url/to/my.file
+ }}}
+ If the file's contents are suitable for extraction (e.g. it is a PDF, word processing document or some other text file), send it on to TikaJAXRS:
+ {{{
+ curl "http://url/to/my.file" | curl -X PUT -T - http://localhost:9998/tika
+ }}}
+
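The header check described above could be scripted roughly as follows. This is only a sketch, not part of TikaJAXRS: the function names (`parse_content_type`, `is_extractable`, `fetch_and_extract`) are hypothetical, the list of "suitable" MIME types is an illustrative assumption, and the server address `http://localhost:9998` is taken from the examples on the page.

```shell
#!/bin/sh
# Sketch only: helper names and the MIME type list are assumptions.

# Read raw HTTP response headers on stdin and print the last Content-Type
# value, lowercased. "Last" matters when redirects yield several header blocks.
parse_content_type() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "content-type" { ct = $2 } END { print tolower(ct) }'
}

# Succeed (exit 0) if the given Content-Type looks worth sending to Tika.
# The patterns below are an illustrative selection, not an exhaustive list.
is_extractable() {
  case "$1" in
    application/pdf*|application/msword*|application/vnd.openxmlformats-officedocument.*|text/*)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Issue a HEAD request first; download and forward the body to the "tika"
# service only if the reported type is suitable.
fetch_and_extract() {
  url=$1
  tika=${2:-http://localhost:9998}   # assumed server address
  ct=$(curl -sIL "$url" | parse_content_type)
  if is_extractable "$ct"; then
    curl -sfL "$url" | curl -s -X PUT -T - "$tika/tika"
  else
    echo "Skipping $url (Content-Type: ${ct:-unknown})" >&2
    return 1
  fi
}
```

With the functions loaded into a shell, `fetch_and_extract "http://url/to/my.file"` would send the file on only when the HEAD response reports a matching type. Note that some servers report a missing or generic Content-Type for HEAD requests, so a skipped file is not necessarily unsuitable.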
