[ 
https://issues.apache.org/jira/browse/TIKA-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17816191#comment-17816191
 ] 

Tim Allison commented on TIKA-3784:
-----------------------------------

[~nick] (and cc [~tom_1st] from TIKA-4194), I agree that detecting these files 
would probably be best handled by a container detector. When I run ASN1Dump on 
one of the p12 files, I get this:

{noformat}
Sequence
    Integer(3)
    Sequence
        ObjectIdentifier(1.2.840.113549.1.7.1)
        Tagged [CONTEXT 0]
            DER Octet String[2603] 
    Sequence
        Sequence
            Sequence
                ObjectIdentifier(2.16.840.1.101.3.4.2.3)
                NULL
            DER Octet String[64] 
        DER Octet String[64] 
        Integer(1000000)
{noformat}

Is there anything in there I can use to detect p12?
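
One candidate, going only by the dump above: the file opens with a SEQUENCE, then Integer(3), then a nested SEQUENCE whose first element is the OID 1.2.840.113549.1.7.1 (pkcs7-data). The following is a rough sketch of checking for that prefix on raw bytes, not a tested Tika detector. The class and method names are hypothetical, and it assumes the inner content type is always pkcs7-data as in this sample; files that use a different content type there would not match.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Tika: sniffs the leading bytes of a stream
// for the structure seen in the dump (SEQUENCE, INTEGER 3, SEQUENCE, OID pkcs7-data).
final class Pkcs12Sniffer {
    // Encoded OID 1.2.840.113549.1.7.1: tag 0x06, length 9, then the value bytes.
    private static final byte[] PKCS7_DATA_OID = {
        0x06, 0x09, 0x2A, (byte) 0x86, 0x48, (byte) 0x86,
        (byte) 0xF7, 0x0D, 0x01, 0x07, 0x01
    };
    // Encoded INTEGER with value 3 (the version seen in the dump).
    private static final byte[] VERSION_3 = {0x02, 0x01, 0x03};

    /** Returns the offset just past the tag and length bytes at pos, or -1 on mismatch. */
    static int skipHeader(byte[] b, int pos, int expectedTag) {
        if (pos < 0 || pos + 2 > b.length || (b[pos] & 0xFF) != expectedTag) {
            return -1;
        }
        int first = b[pos + 1] & 0xFF;
        if (first < 0x80) {
            return pos + 2;            // short-form length: one byte
        }
        int n = first & 0x7F;          // long form: n length bytes follow (0 = indefinite)
        if (n > 4 || pos + 2 + n > b.length) {
            return -1;
        }
        return pos + 2 + n;
    }

    /** True if prefix occurs in b at position pos. */
    static boolean startsWith(byte[] b, int pos, byte[] prefix) {
        if (pos < 0 || pos + prefix.length > b.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOfRange(b, pos, pos + prefix.length), prefix);
    }

    /** Checks the first bytes of a file for the prefix described above. */
    static boolean looksLikePkcs12(byte[] head) {
        int pos = skipHeader(head, 0, 0x30);                   // outer SEQUENCE
        if (!startsWith(head, pos, VERSION_3)) {
            return false;
        }
        pos = skipHeader(head, pos + VERSION_3.length, 0x30);  // nested SEQUENCE
        return startsWith(head, pos, PKCS7_DATA_OID);
    }
}
```

Because the check requires the version integer to be exactly 3 and then looks for the OID, a DER key that starts with a version of 0, 1 or 2 would be rejected at the first comparison.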

> Detector returns "application/x-x509-key" when scanning a .p12 file
> -------------------------------------------------------------------
>
>                 Key: TIKA-3784
>                 URL: https://issues.apache.org/jira/browse/TIKA-3784
>             Project: Tika
>          Issue Type: Bug
>          Components: detector
>    Affects Versions: 1.26
>            Reporter: Matthias Hofbauer
>            Priority: Critical
>
> We are using tika to check if the MIME type of the file extensions matches 
> with the MIME type of the file content.
> After our upgrade from tika-core 1.22 to 1.26 our logic does not work anymore 
> for certificates of type .p12, .pfx, .cer, .der.
> For the .p12 and .pfx extension the MIME type is "application/x-pkcs12" but 
> the tika detector returns "application/x-x509-key" instead.
> After checking the tika-mimetype.xml and comparing it to my .p12 file I found 
> the following MIME magic which explains why I got these types back.
> {code:xml}
> <mime-type type="application/x-x509-key;format=der">
>     <sub-class-of type="application/x-x509-key"/>
>     <!-- These are just a bunch of magic integers as defined by the key format... -->
>     <!-- Always seem to have a version integer as their first entry, -->
>     <!--  normally 00, 01 or 02, check for that -->
>     <magic priority="40">
>       <match value="0x3081FF020100" type="string"
>               mask="0xFFFF00FFFFFC" offset="0"/>
>       <match value="0x3082FFFF020100" type="string"
>               mask="0xFFFF0000FFFFFC" offset="0"/>
>     </magic>
> </mime-type> {code}
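
To illustrate why a .p12 file can satisfy the quoted magic: the mask 0xFC on the last byte ignores the two low bits of the version integer, so a version of 3 looks the same as 0 after masking. The sketch below assumes plain masked-comparison semantics (both the data and the pattern are ANDed with the mask before comparing); it is not Tika's actual matching code, and the class name and the sample header bytes are hypothetical.

```java
// Hypothetical demo of a masked byte-pattern comparison at offset 0.
final class MagicMaskDemo {
    /** True if (data & mask) equals (value & mask) for every byte of the pattern. */
    static boolean maskedMatch(byte[] data, byte[] value, byte[] mask) {
        if (data.length < value.length || value.length != mask.length) {
            return false;
        }
        for (int i = 0; i < value.length; i++) {
            if ((data[i] & mask[i]) != (value[i] & mask[i])) {
                return false;
            }
        }
        return true;
    }

    /** Converts a hex string with an even number of digits into bytes. */
    static byte[] hex(String s) {
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(s.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }
}
```

With the second pattern from the XML, `maskedMatch(hex("30820A60020103"), hex("3082FFFF020100"), hex("FFFF0000FFFFFC"))` returns true: the two length bytes are masked out entirely, and the final byte 0x03 becomes 0x00 under the 0xFC mask. The sample header here is made up, but it has the same shape as the start of the dumped file (SEQUENCE with a two-byte length, then Integer(3)).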



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
