[ https://issues.apache.org/jira/browse/TIKA-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pavel Micka updated TIKA-1632:
------------------------------
Description:
In our environment we encounter many compressed streams. One format that Tika
1.8 does not currently detect is ZLIB. According to my sources and experience,
the following magic values cover the majority of ZLIB streams:
<mime-type type="application/zlib">
  <_comment>Zlib Compressed Archive</_comment>
  <magic priority="45">
    <match value="\x78\x01" type="string" offset="0" />
    <match value="\x78\x9c" type="string" offset="0" />
    <match value="\x78\xda" type="string" offset="0" />
  </magic>
</mime-type>
The header format is well described here:
http://stackoverflow.com/questions/9050260/what-does-a-zlib-header-look-like
The original RFC is here:
http://tools.ietf.org/html/rfc1950
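
For illustration only, here is a minimal Python sketch of what the three
patterns above check, next to the more general two-byte header test from
RFC 1950 (compression method 8, window size field at most 7, and the
16-bit header value divisible by 31). The function names are hypothetical
and this is not how Tika implements detection; it only shows the byte
logic behind the proposed magic.

```python
import zlib

# The three two-byte prefixes proposed in the issue.
# 0x78 = deflate (CM=8) with a 32K window (CINFO=7).
PROPOSED_MAGICS = (b"\x78\x01", b"\x78\x9c", b"\x78\xda")


def matches_proposed_magic(data: bytes) -> bool:
    """Return True if data starts with one of the three proposed prefixes."""
    return data[:2] in PROPOSED_MAGICS


def has_valid_zlib_header(data: bytes) -> bool:
    """Check the two-byte zlib header as described in RFC 1950.

    This is a broader test than the three fixed prefixes: it accepts any
    CMF/FLG pair that satisfies the header rules.
    """
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    if cmf & 0x0F != 8:   # CM: only 8 (deflate) is defined
        return False
    if cmf >> 4 > 7:      # CINFO: values above 7 are not allowed
        return False
    # FCHECK: CMF*256 + FLG must be a multiple of 31
    return (cmf * 256 + flg) % 31 == 0


if __name__ == "__main__":
    sample = zlib.compress(b"hello world")
    print(sample[:2].hex(), has_valid_zlib_header(sample))
```

Note that 0x78 0x5E is also a valid zlib header under the RFC 1950 rules
but is not among the three proposed patterns, which is consistent with
the wording "majority" above. The two-byte magic is short, so false
positives on unrelated data are possible with either check.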
> ZLIB magic detection support
> ----------------------------
>
> Key: TIKA-1632
> URL: https://issues.apache.org/jira/browse/TIKA-1632
> Project: Tika
> Issue Type: Improvement
> Components: detector
> Reporter: Pavel Micka
> Priority: Minor