Pavel Micka created TIKA-1632:
---------------------------------

             Summary: ZLIB magic detection support
                 Key: TIKA-1632
                 URL: https://issues.apache.org/jira/browse/TIKA-1632
             Project: Tika
          Issue Type: Improvement
          Components: detector
            Reporter: Pavel Micka
            Priority: Minor


In our environment we encounter many compressed streams. One format that Tika
does not currently detect is ZLIB. According to my sources and experience,
the following magic values cover the majority of ZLIB archives:

    <mime-type type="application/zlib">
        <_comment>Zlib Compressed Archive</_comment>
        <magic priority="45">
            <match value="\x78\x01" type="string" offset="0" />
            <match value="\x78\x9c" type="string" offset="0" />
            <match value="\x78\xda" type="string" offset="0" />
        </magic>
    </mime-type>

The ZLIB header is well described here:
http://stackoverflow.com/questions/9050260/what-does-a-zlib-header-look-like
The original RFC is here:
http://tools.ietf.org/html/rfc1950
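
To illustrate why these three two-byte values are reasonable magics, here is a
minimal Python sketch of the header rule from RFC 1950. It is not Tika code and
not a proposed implementation. The function and constant names are hypothetical
and chosen only for this example. The rule it checks is: the low nibble of the
first byte (CM) is 8 for deflate, the high nibble (CINFO) is at most 7, and the
first two bytes read as a big-endian 16-bit number are a multiple of 31.

```python
# Hypothetical helper names; this is an illustrative sketch, not Tika's detector.

# The three two-byte prefixes proposed in the mime-type definition above.
PROPOSED_MAGICS = (b"\x78\x01", b"\x78\x9c", b"\x78\xda")


def matches_proposed_magic(data: bytes) -> bool:
    """Return True if data starts with one of the three proposed magics."""
    return data[:2] in PROPOSED_MAGICS


def is_valid_zlib_header(data: bytes) -> bool:
    """Check the first two bytes against the RFC 1950 header rules."""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    if cmf & 0x0F != 8:   # CM: compression method, 8 means deflate
        return False
    if cmf >> 4 > 7:      # CINFO: log2(window size) - 8, at most 7
        return False
    # FCHECK: CMF and FLG as a 16-bit big-endian value must divide by 31.
    return (cmf * 256 + flg) % 31 == 0


if __name__ == "__main__":
    for magic in PROPOSED_MAGICS:
        print(magic.hex(), is_valid_zlib_header(magic))
```

All three proposed magics pass the structural check. The check also accepts
other valid headers, for example 78 5e, so the three magics are a subset chosen
for coverage of common streams rather than an exhaustive list.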



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
