Tim Allison created TIKA-3527:
---------------------------------
Summary: Add simple URLFetcher to tika-core
Key: TIKA-3527
URL: https://issues.apache.org/jira/browse/TIKA-3527
Project: Tika
Issue Type: Task
Reporter: Tim Allison
In 1.x, users could send a URL, including a file URL, to tika-server and have
tika-server fetch the bytes. In 2.x, we created the tika-pipes modules,
included a file fetcher in tika-core, and put the http-fetcher in its own module
because of its dependency on httpclient.
To smooth the transition to 2.x, it might be useful to add a URLFetcher that
uses Java's built-in URL.openConnection() functionality. I'd want to
prohibit the file protocol because of its history as a vulnerability.
If folks want to fetch files, they would have to explicitly choose a different
fetcher and specify a base path.
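
As a rough illustration of the idea, here is a minimal sketch using only the
JDK. The class name SimpleUrlFetcher, the parseAndCheck helper, and the timeout
values are hypothetical, and the class deliberately does not implement any
tika-pipes interface. It also assumes an allow-list (http and https only),
which is stricter than just prohibiting the file protocol and would need to be
relaxed if other protocols are wanted.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; not the actual tika-core implementation.
public class SimpleUrlFetcher {

    // Assumption: allow only http/https, so file:, jar:, ftp:, etc. are all rejected.
    private static final Set<String> ALLOWED_PROTOCOLS =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList("http", "https")));

    // Illustrative timeouts in milliseconds; real values would be configurable.
    private static final int CONNECT_TIMEOUT_MS = 10_000;
    private static final int READ_TIMEOUT_MS = 30_000;

    /** Parses the fetch key as a URL and rejects any protocol not on the allow-list. */
    public static URL parseAndCheck(String fetchKey) throws IOException {
        // new URL(String) is deprecated in recent JDKs but still available;
        // a MalformedURLException thrown here is itself an IOException.
        URL url = new URL(fetchKey);
        String protocol = url.getProtocol().toLowerCase(Locale.ROOT);
        if (!ALLOWED_PROTOCOLS.contains(protocol)) {
            throw new IOException("Protocol not allowed: " + protocol);
        }
        return url;
    }

    /** Opens a stream on the URL; the caller is responsible for closing it. */
    public static InputStream fetch(String fetchKey) throws IOException {
        URL url = parseAndCheck(fetchKey);
        URLConnection connection = url.openConnection();
        connection.setConnectTimeout(CONNECT_TIMEOUT_MS);
        connection.setReadTimeout(READ_TIMEOUT_MS);
        return connection.getInputStream();
    }
}
```

With this sketch, SimpleUrlFetcher.fetch("file:///etc/passwd") fails with an
IOException before any connection is opened, while an https URL proceeds to
URL.openConnection(). Checking the protocol on the parsed URL, rather than on
the raw string, avoids having to second-guess how the JDK normalizes the input.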
--
This message was sent by Atlassian Jira
(v8.3.4#803005)