[
https://issues.apache.org/jira/browse/TIKA-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jukka Zitting resolved TIKA-1260.
---------------------------------
Resolution: Not A Problem
Fix Version/s: (was: 1.5)
What you're seeing is the result of Tika using the file name as a hint about
the file's type. If the file name ends in {{.txt}} or a similar suffix, the
file should probably be treated as a text file, even if it is empty. Only when
no such hint is available does Tika fall back to
{{application/octet-stream}}. See:
{code}
$ touch empty.txt
$ java -jar tika-app-1.5.jar --detect empty.txt
text/plain
$ java -jar tika-app-1.5.jar --detect < empty.txt
application/octet-stream
{code}
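The fallback order shown by the two commands above can be sketched in a few lines of Java. This is only a simplified illustration and not Tika's actual implementation: the class name {{NameHintDetectionSketch}}, the three-entry suffix table, and the single {{%PDF-}} magic check are all assumptions made for the example. It assumes content is checked first, the file name suffix second, and {{application/octet-stream}} is the last resort.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch only; NOT Tika's real detector.
final class NameHintDetectionSketch {

    // Hypothetical suffix table; a real MIME registry is far larger.
    private static final Map<String, String> SUFFIX_HINTS = Map.of(
            ".txt", "text/plain",
            ".html", "text/html",
            ".pdf", "application/pdf");

    static final String FALLBACK = "application/octet-stream";

    // name may be null, e.g. when the bytes arrive on standard input.
    static String detect(String name, byte[] content) {
        // 1. Content check (one hypothetical magic number for the sketch).
        //    A zero-byte file can never match here.
        if (content.length >= 5
                && new String(content, 0, 5, StandardCharsets.US_ASCII).equals("%PDF-")) {
            return "application/pdf";
        }
        // 2. File name hint, if a name is available.
        if (name != null) {
            String lower = name.toLowerCase(Locale.ROOT);
            int dot = lower.lastIndexOf('.');
            if (dot >= 0) {
                String hint = SUFFIX_HINTS.get(lower.substring(dot));
                if (hint != null) {
                    return hint;
                }
            }
        }
        // 3. Nothing to go on.
        return FALLBACK;
    }

    public static void main(String[] args) {
        System.out.println(detect("empty.txt", new byte[0])); // text/plain
        System.out.println(detect(null, new byte[0]));        // application/octet-stream
    }
}
```

With the real library, the same distinction would presumably show up between detecting from a file (name available) and detecting from a bare stream (no name), for example {{Tika.detect(File)}} versus {{Tika.detect(InputStream)}} on the {{org.apache.tika.Tika}} facade.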
> Detection result for zero-byte files is text/plain
> --------------------------------------------------
>
> Key: TIKA-1260
> URL: https://issues.apache.org/jira/browse/TIKA-1260
> Project: Tika
> Issue Type: Bug
> Components: detector
> Affects Versions: 1.5
> Environment: Linux Mint 16
> Reporter: Johan van der Knijff
> Priority: Minor
> Labels: empty, zero-length
>
> Running Tika with the -d (detection) option, any zero-byte files are
> identified as "text/plain". I'm wondering if this is the intended behavior? I
> know the Unix File tool reports "inode/x-empty" in such cases. Perhaps Tika
> should do this as well?