[jira] [Commented] (TIKA-1997) Problem in Tika().detect for xml file signed in CADES

2022-07-26 Thread Adrien Nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17571451#comment-17571451
 ] 

Adrien Nguyen commented on TIKA-1997:
-

[~NguyenKhacChinh] [~roberto.benedetti] 

I see that the config file also specifies  for 
pkcs7-signature and   for application/x-dbf

If the filename had the correct extension, would it take higher priority than 
the magic value?

Is there a way to change that priority, or to disable application/x-dbf entirely, in 
our own app?

> Problem in Tika().detect for xml file signed in CADES
> -
>
> Key: TIKA-1997
> URL: https://issues.apache.org/jira/browse/TIKA-1997
> Project: Tika
>  Issue Type: Sub-task
>  Components: detector
>Affects Versions: 1.13
> Environment: JDK 1.7
>Reporter: Michele Andreano
>Priority: Blocker
> Attachments: test.xml.p7m
>
>
> When I submit an XML file signed in P7M format to Tika, I expect it to return the 
> MIME type application/pkcs7-mime; instead it returns 
> application/pkcs7-signature.
> How is that possible?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (TIKA-1997) Problem in Tika().detect for xml file signed in CADES

2019-04-18 Thread Roberto Benedetti (JIRA)


[ 
https://issues.apache.org/jira/browse/TIKA-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16821195#comment-16821195
 ] 

Roberto Benedetti commented on TIKA-1997:
-

I think your problem is that {{application/x-dbf}} has a higher priority than 
{{application/pkcs7-signature}}, and the regular expression for 
{{application/x-dbf}} accidentally matches your file.
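
The effect described above can be illustrated with a small sketch. This is not Tika's actual detection code: the {{MagicPrioritySketch}} class, the {{Candidate}} type, the {{pick}} method and the priority numbers are all hypothetical, chosen only to show how a higher-priority rule that happens to match wins over a more specific, lower-priority one.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not Tika's real classes or priority values.
final class MagicPrioritySketch {

    /** A magic rule that matched the file, with the priority it was declared with. */
    static final class Candidate {
        final String mediaType;
        final int priority;

        Candidate(String mediaType, int priority) {
            this.mediaType = mediaType;
            this.priority = priority;
        }
    }

    /** Returns the media type of the matching rule with the highest priority. */
    static String pick(List<Candidate> matches) {
        Candidate best = null;
        for (Candidate c : matches) {
            if (best == null || c.priority > best.priority) {
                best = c;
            }
        }
        // Fall back to the generic binary type when nothing matched.
        return best == null ? "application/octet-stream" : best.mediaType;
    }

    public static void main(String[] args) {
        // Assumed priorities, picked only to reproduce the reported behaviour.
        List<Candidate> matches = Arrays.asList(
                new Candidate("application/pkcs7-signature", 50),
                new Candidate("application/x-dbf", 60));
        System.out.println(pick(matches));
    }
}
```

With both rules matching, the sketch selects the x-dbf candidate purely because of its larger priority number, regardless of which pattern is the more specific one.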

 



[jira] [Commented] (TIKA-1997) Problem in Tika().detect for xml file signed in CADES

2019-04-18 Thread Chinh Nguyen (JIRA)


[ 
https://issues.apache.org/jira/browse/TIKA-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16821015#comment-16821015
 ] 

Chinh Nguyen commented on TIKA-1997:


Currently, with Tika 1.20, the attached PKCS7 file is detected as a DBF file:

{quote}
java -jar tika-app-1.20.jar -d test.xml.p7m
application/x-dbf
{quote}



[jira] [Commented] (TIKA-1997) Problem in Tika().detect for xml file signed in CADES

2019-01-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/TIKA-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16752361#comment-16752361
 ] 

ASF GitHub Bot commented on TIKA-1997:
--

dedabob commented on pull request #267: TIKA-1997 Problem in Tika().detect for 
xml file signed in CADES
URL: https://github.com/apache/tika/pull/267
 
 
   Proper detection of the application/pkcs7-mime and application/pkcs7-signature 
types.
   No changes to application/x-pkcs7-certificates because they break some tests.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (TIKA-1997) Problem in Tika().detect for xml file signed in CADES

2019-01-23 Thread Tim Allison (JIRA)


[ 
https://issues.apache.org/jira/browse/TIKA-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16749970#comment-16749970
 ] 

Tim Allison commented on TIKA-1997:
---

Please do share example files. Thank you for the ping, the references and the 
explanation.



[jira] [Commented] (TIKA-1997) Problem in Tika().detect for xml file signed in CADES

2019-01-18 Thread Roberto Benedetti (JIRA)


[ 
https://issues.apache.org/jira/browse/TIKA-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16746737#comment-16746737
 ] 

Roberto Benedetti commented on TIKA-1997:
-

Updated references are:
 * [RFC-5652, Cryptographic Message Syntax 
(CMS)|https://tools.ietf.org/html/rfc5652]
 * [RFC-5751, Secure/Multipurpose Internet Mail Extensions (S/MIME) Version 3.2 
Message Specification|https://tools.ietf.org/html/rfc5751]
 * [RFC-7468, Textual Encodings of PKIX, PKCS, and CMS 
Structures|https://tools.ietf.org/html/rfc7468]

Tika looks for "pkcs7-signedData" OID at the beginning of the file and, if 
found, returns "application/pkcs7-signature".

There are, however, three media types with that OID at the beginning, namely:
 * "application/pkcs7-signature", extension ".p7s", when the signed content is 
not present (detached signature)
 * "application/pkcs7-mime; smime-type=signed-data", extension ".p7m", when the 
signed content is present
 * "application/pkcs7-mime; smime-type=certs-only", extension ".p7c" (".p7b" 
not mentioned but can be found too), when there are only certificates and 
(optionally) CRLs

Extension ".p7m" is also used when the OID at the beginning is 
"pkcs7-envelopedData" and the media type is "application/pkcs7-mime; 
smime-type=enveloped-data".

Extension ".p7z" is used when the OID at the beginning is 
"id-smime-ct-compressedData" and the media type is "application/pkcs7-mime; 
smime-type=compressed-data".
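
As a rough sketch of what "the OID at the beginning" means at the byte level, the following reads only the outer ContentInfo header of a binary (DER/BER) file and names the content type. The {{CmsContentTypeSniffer}} class and its {{sniff}} method are hypothetical and not part of Tika; the byte values are the standard DER encodings of the three OIDs and are not taken from this thread.

```java
import java.util.Arrays;

// Hypothetical sketch: inspects only the outer ContentInfo header of a DER/BER file.
final class CmsContentTypeSniffer {

    // DER content bytes of the three OIDs (without the tag and length octets).
    private static final byte[] SIGNED_DATA =      // 1.2.840.113549.1.7.2
            {0x2A, (byte) 0x86, 0x48, (byte) 0x86, (byte) 0xF7, 0x0D, 0x01, 0x07, 0x02};
    private static final byte[] ENVELOPED_DATA =   // 1.2.840.113549.1.7.3
            {0x2A, (byte) 0x86, 0x48, (byte) 0x86, (byte) 0xF7, 0x0D, 0x01, 0x07, 0x03};
    private static final byte[] COMPRESSED_DATA =  // 1.2.840.113549.1.9.16.1.9
            {0x2A, (byte) 0x86, 0x48, (byte) 0x86, (byte) 0xF7, 0x0D,
             0x01, 0x09, 0x10, 0x01, 0x09};

    /** Returns the content type name, or null if the prefix is not a recognised ContentInfo. */
    static String sniff(byte[] prefix) {
        if (prefix.length < 2 || prefix[0] != 0x30) {   // outer SEQUENCE tag
            return null;
        }
        int pos = 1;
        int first = prefix[pos++] & 0xFF;
        if (first > 0x80) {
            pos += first & 0x7F;                        // long form: skip the length octets
        }
        // Short form (< 0x80) and indefinite form (== 0x80) need no further skipping.
        if (pos + 2 > prefix.length || prefix[pos] != 0x06) {   // OBJECT IDENTIFIER tag
            return null;
        }
        int oidLength = prefix[pos + 1] & 0xFF;
        int start = pos + 2;
        if (oidLength >= 0x80 || start + oidLength > prefix.length) {
            return null;
        }
        byte[] oid = Arrays.copyOfRange(prefix, start, start + oidLength);
        if (Arrays.equals(oid, SIGNED_DATA)) {
            return "pkcs7-signedData";
        }
        if (Arrays.equals(oid, ENVELOPED_DATA)) {
            return "pkcs7-envelopedData";
        }
        if (Arrays.equals(oid, COMPRESSED_DATA)) {
            return "id-smime-ct-compressedData";
        }
        return null;
    }
}
```

Note that a result of "pkcs7-signedData" still leaves the three signed-data cases listed above indistinguishable; telling them apart would require parsing further into the structure.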

Furthermore, the label in the textual encoding is always PKCS7 (i.e. the file 
begins with "-----BEGIN PKCS7-----").
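
For the textual encoding, a check along these lines could read the label from the first RFC 7468 encapsulation boundary, which is written with five dashes on each side, as in {{-----BEGIN PKCS7-----}}. This is a minimal sketch under that assumption; {{PemLabelSketch}} and {{firstLabel}} are hypothetical names, and the method deliberately looks at the first line only.

```java
// Hypothetical sketch: extracts the label from a leading "-----BEGIN <label>-----" line.
final class PemLabelSketch {

    private static final String PREFIX = "-----BEGIN ";
    private static final String SUFFIX = "-----";

    /** Returns the label of the first line if it is a BEGIN boundary, otherwise null. */
    static String firstLabel(String text) {
        // Take the first line only, tolerating both LF and CRLF line endings.
        String firstLine = text.split("\\r?\\n", 2)[0].trim();
        if (firstLine.length() >= PREFIX.length() + SUFFIX.length()
                && firstLine.startsWith(PREFIX)
                && firstLine.endsWith(SUFFIX)) {
            return firstLine.substring(PREFIX.length(), firstLine.length() - SUFFIX.length());
        }
        return null;
    }
}
```

Because the label is the same for every PKCS7 structure, a match here only says that the payload is some CMS object; the Base64 body still has to be decoded and inspected to learn which one.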

I can provide examples, built using openssl, but to support those media types 
Tika would need to:
 * return parameters in the media type when detecting streams
 * return different extensions based on media type parameters
 * further inspect streams when "-----BEGIN PKCS7" or pkcs7-signedData is 
found (as it does for XML streams)
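
The second requirement, choosing an extension from the media type parameters, could be sketched as follows. The mapping is the one listed earlier in this comment; the {{SmimeExtensionSketch}} class and the {{extensionFor}} method are hypothetical and are not Tika API. Returning null for a bare "application/pkcs7-mime" is a choice made for this sketch only.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: maps a media type string, including parameters, to a file extension.
final class SmimeExtensionSketch {

    private static final Map<String, String> BY_SMIME_TYPE = new HashMap<>();
    static {
        BY_SMIME_TYPE.put("signed-data", ".p7m");
        BY_SMIME_TYPE.put("enveloped-data", ".p7m");
        BY_SMIME_TYPE.put("certs-only", ".p7c");
        BY_SMIME_TYPE.put("compressed-data", ".p7z");
    }

    /** Returns the extension for the given media type, or null if it cannot be decided. */
    static String extensionFor(String mediaType) {
        String[] parts = mediaType.split(";");
        String base = parts[0].trim().toLowerCase(Locale.ROOT);
        if (base.equals("application/pkcs7-signature")) {
            return ".p7s";   // detached signature
        }
        if (!base.equals("application/pkcs7-mime")) {
            return null;     // not one of the media types discussed here
        }
        // Look for the smime-type parameter among the remaining parts.
        for (int i = 1; i < parts.length; i++) {
            String[] keyValue = parts[i].split("=", 2);
            if (keyValue.length == 2 && keyValue[0].trim().equalsIgnoreCase("smime-type")) {
                return BY_SMIME_TYPE.get(keyValue[1].trim().toLowerCase(Locale.ROOT));
            }
        }
        return null;         // no smime-type parameter: left undecided in this sketch
    }
}
```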

 



[jira] [Commented] (TIKA-1997) Problem in Tika().detect for xml file signed in CADES

2017-06-08 Thread Alessandro Scaldaferro (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16042465#comment-16042465
 ] 

Alessandro Scaldaferro commented on TIKA-1997:
--

In https://www.ietf.org/rfc/rfc2633.txt I've found the following info:

   MIME Type                                  File Extension
   Application/pkcs7-mime (signedData,        .p7m
   envelopedData)

   Application/pkcs7-mime (degenerate         .p7c
   signedData "certs-only" message)

   Application/pkcs7-signature                .p7s

It looks like .p7m files are signed files (the original file plus the signature data), 
while .p7s files are "signature files" containing only the signature data, not 
the original file.




[jira] [Commented] (TIKA-1997) Problem in Tika().detect for xml file signed in CADES

2016-09-19 Thread Nick Burch (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504023#comment-15504023
 ] 

Nick Burch commented on TIKA-1997:
--

Running your file through the openssl tool {{asn1parse}} shows that the first 
object in your file is of type {{pkcs7-signedData}}. It also shows 
the signature from {{INFOCERT SPA}}. So it does look to be a signed PKCS7 
file, and hence Tika appears to be doing the right thing.

Unless I've misunderstood something about PKCS7 files and/or the asn1 dump 
output?



[jira] [Commented] (TIKA-1997) Problem in Tika().detect for xml file signed in CADES

2016-09-15 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493792#comment-15493792
 ] 

Tim Allison commented on TIKA-1997:
---

[~gagravarr], any recommendations on this one?



[jira] [Commented] (TIKA-1997) Problem in Tika().detect for xml file signed in CADES

2016-06-07 Thread Michele Andreano (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15318100#comment-15318100
 ] 

Michele Andreano commented on TIKA-1997:


An example file is attached to this issue.

> Problem in Tika().detect for xml file signed in CADES
> -
>
> Key: TIKA-1997
> URL: https://issues.apache.org/jira/browse/TIKA-1997
> Project: Tika
>  Issue Type: Sub-task
>  Components: detector
>Affects Versions: 1.13
> Environment: JDK 1.7
>Reporter: Michele Andreano
> Fix For: 1.13
>
> Attachments: test.xml.p7m
>
>
> When I submit an XML file signed in P7M format to Tika, I expect it to return the 
> MIME type application/pkcs7-mime; instead it returns 
> application/pkcs7-signature.
> How is that possible?


