[ https://issues.apache.org/jira/browse/TIKA-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Micka updated TIKA-1632:
------------------------------
    Description: 
In our environment we encounter many compressed streams. One format that 
Tika 1.8 does not currently support is ZLIB. According to my sources and 
experience, these magic values cover the majority of ZLIB archives:

    <mime-type type="application/zlib">
        <_comment>Zlib Compressed Archive</_comment>
        <magic priority="45">
            <match value="\x78\x01" type="string" offset="0" />
            <match value="\x78\x9c" type="string" offset="0" />
            <match value="\x78\xda" type="string" offset="0" />
        </magic>
    </mime-type>

The header format is well described here:
http://stackoverflow.com/questions/9050260/what-does-a-zlib-header-look-like
The original RFC is here:
http://tools.ietf.org/html/rfc1950
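
For reference, the three magic values above are particular instances of the
two-byte header defined in RFC 1950 (CMF followed by FLG). The following is a
minimal sketch in Python, not Tika code, of the header rules as I read them in
the RFC; the function name looks_like_zlib is made up for illustration. It
only inspects the first two bytes, so it can produce false positives on
arbitrary binary data, and it does not look at the FDICT bit.

```python
import zlib


def looks_like_zlib(header: bytes) -> bool:
    """Heuristic check of the two-byte zlib header (RFC 1950, section 2.2).

    Hypothetical helper for illustration only; not part of Tika.
    """
    if len(header) < 2:
        return False
    cmf, flg = header[0], header[1]
    # CM (low nibble of CMF) must be 8, meaning "deflate".
    if cmf & 0x0F != 8:
        return False
    # CINFO (high nibble of CMF) above 7 is not allowed by the RFC.
    if cmf >> 4 > 7:
        return False
    # FCHECK: CMF and FLG read as a big-endian 16-bit number
    # must be a multiple of 31.
    return (cmf * 256 + flg) % 31 == 0


if __name__ == "__main__":
    # The three magics proposed in this issue all pass the check.
    for magic in (b"\x78\x01", b"\x78\x9c", b"\x78\xda"):
        print(magic.hex(), looks_like_zlib(magic))
    # Output of Python's own zlib module passes as well.
    print(looks_like_zlib(zlib.compress(b"hello")))
```

A first byte of 0x78 corresponds to deflate with a 32K window, which is why
all three proposed magics start with it. Streams written with a smaller
window, or with a preset dictionary, can start with other byte pairs that
still satisfy the rules above, so the three values cover the common cases
rather than every valid header.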

  was:
In our environment we encounter many compressed streams, one of them (which is 
currently not supported by Tika) is ZLIB. According to my sources and 
experience the magics that cover majority of ZLIB archives are these:

    <mime-type type="application/zlib">
        <_comment>Zlib Compressed Archive</_comment>
        <magic priority="45">
            <match value="\x78\x01" type="string" offset="0" />
            <match value="\x78\x9c" type="string" offset="0" />
            <match value="\x78\xda" type="string" offset="0" />
        </magic>
    </mime-type>

Well described here:
http://stackoverflow.com/questions/9050260/what-does-a-zlib-header-look-like
Original RFC here:
http://tools.ietf.org/html/rfc1950


> ZLIB magic detection support
> ----------------------------
>
>                 Key: TIKA-1632
>                 URL: https://issues.apache.org/jira/browse/TIKA-1632
>             Project: Tika
>          Issue Type: Improvement
>          Components: detector
>            Reporter: Pavel Micka
>            Priority: Minor
>
> In our environment we encounter many compressed streams. One format that 
> Tika 1.8 does not currently support is ZLIB. According to my sources and 
> experience, these magic values cover the majority of ZLIB archives:
>     <mime-type type="application/zlib">
>         <_comment>Zlib Compressed Archive</_comment>
>         <magic priority="45">
>             <match value="\x78\x01" type="string" offset="0" />
>             <match value="\x78\x9c" type="string" offset="0" />
>             <match value="\x78\xda" type="string" offset="0" />
>         </magic>
>     </mime-type>
> The header format is well described here:
> http://stackoverflow.com/questions/9050260/what-does-a-zlib-header-look-like
> The original RFC is here:
> http://tools.ietf.org/html/rfc1950



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
