[ https://issues.apache.org/jira/browse/TIKA-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17542682#comment-17542682 ]
Tom Brisland commented on TIKA-3776:
------------------------------------
Yeah, sounds good if that's the way you want to go, though it does seem a shame
to have a duplicate method. I think the ideal solution would be for
TikaInputStream.get not to overwrite the filename if it's already set, but
I didn't want to risk breakage.
I went with passing an empty Metadata because it matches the behaviour in
S3Fetcher, so there was at least a precedent.
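To make the three options concrete, here is a rough sketch that models them with plain Java maps rather than Tika's own classes. Everything in it is an assumption for illustration: the class name, the method names, and the key constant are hypothetical stand-ins, and none of it is the actual Tika implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the filename handling discussed above; not Tika code.
class FilenameSketch {
    // Stand-in for the metadata key that holds the filename (assumed name).
    static final String RESOURCE_NAME = "resourceName";

    // Current behaviour as described in the issue: wrapping the temporary
    // file always writes the temp file's name into the metadata it is given.
    static void wrapOverwriting(String tmpFileName, Map<String, String> metadata) {
        metadata.put(RESOURCE_NAME, tmpFileName);
    }

    // The "ideal" variant: only set the filename when none is present,
    // so a caller-supplied name survives.
    static void wrapPreserving(String tmpFileName, Map<String, String> metadata) {
        metadata.putIfAbsent(RESOURCE_NAME, tmpFileName);
    }

    // The workaround: hand the overwriting call a throwaway metadata object,
    // leaving the caller's metadata untouched.
    static Map<String, String> wrapWithEmptyMetadata(String tmpFileName) {
        Map<String, String> scratch = new HashMap<>();
        wrapOverwriting(tmpFileName, scratch);
        return scratch;
    }
}
```

With a caller map that already holds "report.docx", the first method replaces it with the temp file name, while the other two leave it alone. The third avoids changing shared behaviour, which is why it carries less risk of breakage.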
> HttpFetcher overwrites filename passed in
> -----------------------------------------
>
> Key: TIKA-3776
> URL: https://issues.apache.org/jira/browse/TIKA-3776
> Project: Tika
> Issue Type: Bug
> Reporter: Tom Brisland
> Assignee: Tim Allison
> Priority: Major
>
> The HttpFetcher spools file content to a temporary file with
> `TikaInputStream.get()` and passes through the existing metadata.
> TikaInputStream overwrites the filename with that of the temporary file.
> This means that a filename passed to the detect endpoint as a header is
> ignored, and the detection results are incorrect.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)