[jira] [Commented] (TIKA-1379) error in Tika().detect for xml files with xades signature

2015-10-20 Thread Alessandro De Angelis (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14965174#comment-14965174
 ] 

Alessandro De Angelis commented on TIKA-1379:
---------------------------------------------

...

> error in Tika().detect for xml files with xades signature
> ---------------------------------------------------------
>
>              Key: TIKA-1379
>              URL: https://issues.apache.org/jira/browse/TIKA-1379
>          Project: Tika
>       Issue Type: Bug
>       Components: detector
> Affects Versions: 1.4
>         Reporter: Alessandro De Angelis
>           Labels: new-parser
>          Fix For: 1.12
>
>
> We tried to get the MIME type of an XML file with an embedded XAdES signature.
> The result is "text/html", not the expected "text/xml" or
> "application/xml".
> Here is an example of the XML file:
> {code}
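> <VERBALI ad_cod="D69017" batch_id="0" cds_cod="D69" data_app="2013-09-23">
> <VERBALE Id="1" tipologia="Verbale esame">
>   <VERB_NUM>00094853 0003 2</VERB_NUM>
>   <DATA_APP>2013-09-23</DATA_APP>
>   <DATA_ESA>2013-09-23</DATA_ESA>
>   <AD_COD>D69017</AD_COD>
>   <AD>FILOSOFIA DELLA SCIENZA</AD>
>   <CDS_COD>D69</CDS_COD>
>   <CDS>TEATRO E ARTI VISIVE</CDS>
>   <TIPO_ESA></TIPO_ESA>
>   <MAT>1233456</MAT>
>   <NOME>PAOLINO</NOME>
>   <COGNOME>PAPERINO</COGNOME>
>   <VOTO>23.0</VOTO>
>   <VOTODECOD>23</VOTODECOD>
>   <CAUSALE></CAUSALE>
>   <TIPO_MODULO></TIPO_MODULO>
>   <IMG_PATH></IMG_PATH>
>   <AA_SES_ID>2012</AA_SES_ID>
>   <AD_CFU>6.0</AD_CFU>
>   <NOTA></NOTA>
>   <ATENEO>9</ATENEO>
>   <ATENEO_DES>جامعة البندقية - TEST</ATENEO_DES>
>   <TIPO_DOCUMENTO>Verbale_3</TIPO_DOCUMENTO>
>   <TITOLARE_PROCEDIMENTO>QUI QUO QUA</TITOLARE_PROCEDIMENTO>
>   <AD_STU_COD>D69017</AD_STU_COD>
>   <AD_STU>FILOSOFIA DELLA SCIENZA</AD_STU>
>   <CDS_STU_COD>D69</CDS_STU_COD>
>   <CDS_STU>TEATRO E ARTI VISIVE</CDS_STU>
>   <DOCENTE>QUI QUO QUA</DOCENTE>
> <DATA_DOCUMENTO>26-09-2013 09:55:53 CEST(+0200)</DATA_DOCUMENTO>
> <SOFTWARE_DI_CREAZIONE>
>   <NOME>3</NOME>
>   <VERSIONE>11.09.03</VERSIONE>
> </SOFTWARE_DI_CREAZIONE>
> </VERBALE><ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#" Id="sig08744308748201048377">
> <ds:SignedInfo>
> <ds:CanonicalizationMethod Algorithm="http://www.w3.org/2006/12/xml-c14n11"></ds:CanonicalizationMethod>
> <ds:SignatureMethod Algorithm="http://www.w3.org/2001/04/xmldsig-more#rsa-sha256"></ds:SignatureMethod>
> <ds:Reference URI="">
> <ds:Transforms>
> <ds:Transform Algorithm="http://www.w3.org/2002/06/xmldsig-filter2">
> <dsig-xpath:XPath xmlns:dsig-xpath="http://www.w3.org/2002/06/xmldsig-filter2" Filter="subtract">/descendant::ds:Signature</dsig-xpath:XPath>
> </ds:Transform>
> <ds:Transform Algorithm="http://www.w3.org/TR/1999/REC-xslt-19991116">
> <xsl:stylesheet xmlns:kion="http://www.kion.it/webesse3/multilingua" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" exclude-result-prefixes="kion" version="1.0">
>   <kion:ml module="FirmaDigitale" target="kion"></kion:ml>
>   <xsl:output method="xml"></xsl:output>
>   <xsl:variable name="mostra_ad_figlie" select="1"></xsl:variable>
>   <xsl:variable name="verbale_root" select="/VERBALI/VERBALE"></xsl:variable>
>   <xsl:variable name="sostituzione_root" select="/VERBALI/VERBALE/SOSTITUZIONE_DOCUMENTO"></xsl:variable>
>   <xsl:variable name="RAGG_ROOT" select="/VERBALI/VERBALE/RAGGRUPPAMENTO"></xsl:variable>
>   <xsl:variable name="COMM_ROOT" select="/VERBALI/VERBALE/COMMISSIONE"></xsl:variable>
>
>   <xsl:template match="/">
>   <html>
>   <head>
>   <meta content="text/html;charset=UTF-8" http-equiv="Content-Type"></meta>
>   <xsl:choose>
>   <xsl:when test="$sostituzione_root">
>   <title>Dichiarazione conformità Verbale Esame</title>
>   </xsl:when>
>   <xsl:otherwise>
>   <title>Verbalizzazione esame</title>
>   </xsl:otherwise>
>   </xsl:choose>
>   <style type="text/css">
> td  {font-family: Arial; font-size:10pt;}
> div {font-family: Arial; font-size:10pt;}
> pre {font-family: Arial; font-size:10pt;}
>   </style>
>   </head>
>   <body>
>   <table>
>   <xsl:choose>
>   <xsl:when test="$sostituzione_root">
>   <tr><td align="center" colspan="2"><big><strong><xsl:value-of select="$verbale_root/ATENEO_DES"></xsl:value-of></strong></big><br></br></td></tr>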
>colspan="2">DICHIARAZIONE DI 
> CONFORMITÀ
>colspan="2">Il sottoscritto  select="$verbale_root/TITOLARE_PROCEDIMENTO">, docente di 
> 
>  
>   
>   
>     
>   
>test="$sostituzione_root/MOTIVAZIONE">
>   
> PREMESSO CHE
>   
>  
>   
>  select="$sostituzione_root/MOTIVAZIONE">
>   
>  
>   
> 
>   
>   
>   
>   
> DICHIARA
>    
> 
>

[jira] [Commented] (TIKA-1379) error in Tika().detect for xml files with xades signature

2015-03-20 Thread Tyler Palsulich (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372133#comment-14372133
 ] 

Tyler Palsulich commented on TIKA-1379:
---------------------------------------

The file is still detected as text/html. Should we update the magic to detect 
it as xml?
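
To make the question concrete, here is a minimal sketch of the difference between matching HTML markup anywhere in the content and looking only at the root element. This is not Tika's detector and does not use the Tika API. The class name `RootElementSniffer`, both method names, and the shortened sample document are hypothetical. It also assumes, without confirmation in this issue, that the `<html>` element inside the embedded XSLT is what leads to the text/html result.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration only; Tika's real detection logic is different.
final class RootElementSniffer {

    // Content-matching heuristic: "<html" anywhere in the data wins.
    static String naiveDetect(byte[] data) {
        String text = new String(data, StandardCharsets.UTF_8)
                .trim().toLowerCase(Locale.ROOT);
        if (text.contains("<html")) {
            return "text/html";
        }
        if (text.startsWith("<")) {
            return "application/xml";
        }
        return "application/octet-stream";
    }

    // Root-element heuristic: parse only as far as the first start tag.
    static String rootElementDetect(byte[] data) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Do not load DTDs or external entities from untrusted input.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
        try {
            XMLStreamReader reader =
                    factory.createXMLStreamReader(new ByteArrayInputStream(data));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        return "html".equalsIgnoreCase(reader.getLocalName())
                                ? "text/html"
                                : "application/xml";
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Not well-formed XML up to the root element; fall through.
        }
        return "application/octet-stream";
    }

    public static void main(String[] args) {
        // Shortened, hypothetical stand-in for the attached sample.
        String sample = "<VERBALI><VERBALE Id=\"1\"><NOME>PAOLINO</NOME></VERBALE>"
                + "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
                + "<xsl:template match=\"/\"><html><head><title>Verbalizzazione esame</title></head></html>"
                + "</xsl:template></xsl:stylesheet></VERBALI>";
        byte[] data = sample.getBytes(StandardCharsets.UTF_8);
        System.out.println("content match: " + naiveDetect(data));
        System.out.println("root element:  " + rootElementDetect(data));
    }
}
```

On the shortened sample, the content-matching method reports text/html while the root-element method reports application/xml, which is the behaviour the reporter expected. Whether a change to the magic patterns or a root-element check is the better fit for Tika is left open here.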

 error in Tika().detect for xml files with xades signature
 ---------------------------------------------------------

              Key: TIKA-1379
              URL: https://issues.apache.org/jira/browse/TIKA-1379
          Project: Tika
       Issue Type: Bug
       Components: detector
 Affects Versions: 1.4
         Reporter: Alessandro De Angelis
           Labels: new-parser
          Fix For: 1.8


 We tried to get the MIME type of an XML file with an embedded XAdES signature.
 The result is "text/html", not the expected "text/xml" or
 "application/xml".
 Here is an example of the XML file:
 {code}
 <VERBALI ad_cod="D69017" batch_id="0" cds_cod="D69" data_app="2013-09-23">
 <VERBALE Id="1" tipologia="Verbale esame">
   <VERB_NUM>00094853 0003 2</VERB_NUM>
   <DATA_APP>2013-09-23</DATA_APP>
   <DATA_ESA>2013-09-23</DATA_ESA>
   <AD_COD>D69017</AD_COD>
   <AD>FILOSOFIA DELLA SCIENZA</AD>
   <CDS_COD>D69</CDS_COD>
   <CDS>TEATRO E ARTI VISIVE</CDS>
   <TIPO_ESA></TIPO_ESA>
   <MAT>1233456</MAT>
   <NOME>PAOLINO</NOME>
   <COGNOME>PAPERINO</COGNOME>
   <VOTO>23.0</VOTO>
   <VOTODECOD>23</VOTODECOD>
   <CAUSALE></CAUSALE>
   <TIPO_MODULO></TIPO_MODULO>
   <IMG_PATH></IMG_PATH>
   <AA_SES_ID>2012</AA_SES_ID>
   <AD_CFU>6.0</AD_CFU>
   <NOTA></NOTA>
   <ATENEO>9</ATENEO>
   <ATENEO_DES>جامعة البندقية - TEST</ATENEO_DES>
   <TIPO_DOCUMENTO>Verbale_3</TIPO_DOCUMENTO>
   <TITOLARE_PROCEDIMENTO>QUI QUO QUA</TITOLARE_PROCEDIMENTO>
   <AD_STU_COD>D69017</AD_STU_COD>
   <AD_STU>FILOSOFIA DELLA SCIENZA</AD_STU>
   <CDS_STU_COD>D69</CDS_STU_COD>
   <CDS_STU>TEATRO E ARTI VISIVE</CDS_STU>
   <DOCENTE>QUI QUO QUA</DOCENTE>
 <DATA_DOCUMENTO>26-09-2013 09:55:53 CEST(+0200)</DATA_DOCUMENTO>
 <SOFTWARE_DI_CREAZIONE>
   <NOME>3</NOME>
   <VERSIONE>11.09.03</VERSIONE>
 </SOFTWARE_DI_CREAZIONE>
 </VERBALE><ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#" Id="sig08744308748201048377">
 <ds:SignedInfo>
 <ds:CanonicalizationMethod Algorithm="http://www.w3.org/2006/12/xml-c14n11"></ds:CanonicalizationMethod>
 <ds:SignatureMethod Algorithm="http://www.w3.org/2001/04/xmldsig-more#rsa-sha256"></ds:SignatureMethod>
 <ds:Reference URI="">
 <ds:Transforms>
 <ds:Transform Algorithm="http://www.w3.org/2002/06/xmldsig-filter2">
 <dsig-xpath:XPath xmlns:dsig-xpath="http://www.w3.org/2002/06/xmldsig-filter2" Filter="subtract">/descendant::ds:Signature</dsig-xpath:XPath>
 </ds:Transform>
 <ds:Transform Algorithm="http://www.w3.org/TR/1999/REC-xslt-19991116">
 <xsl:stylesheet xmlns:kion="http://www.kion.it/webesse3/multilingua" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" exclude-result-prefixes="kion" version="1.0">
   <kion:ml module="FirmaDigitale" target="kion"></kion:ml>
   <xsl:output method="xml"></xsl:output>
   <xsl:variable name="mostra_ad_figlie" select="1"></xsl:variable>
   <xsl:variable name="verbale_root" select="/VERBALI/VERBALE"></xsl:variable>
   <xsl:variable name="sostituzione_root" select="/VERBALI/VERBALE/SOSTITUZIONE_DOCUMENTO"></xsl:variable>
   <xsl:variable name="RAGG_ROOT" select="/VERBALI/VERBALE/RAGGRUPPAMENTO"></xsl:variable>
   <xsl:variable name="COMM_ROOT" select="/VERBALI/VERBALE/COMMISSIONE"></xsl:variable>

   <xsl:template match="/">
   <html>
   <head>
   <meta content="text/html;charset=UTF-8" http-equiv="Content-Type"></meta>
   <xsl:choose>
   <xsl:when test="$sostituzione_root">
   <title>Dichiarazione conformità Verbale Esame</title>
   </xsl:when>
   <xsl:otherwise>
   <title>Verbalizzazione esame</title>
   </xsl:otherwise>
   </xsl:choose>
   <style type="text/css">
 td  {font-family: Arial; font-size:10pt;}
 div {font-family: Arial; font-size:10pt;}
 pre {font-family: Arial; font-size:10pt;}
   </style>
   </head>
   <body>
   <table>
   <xsl:choose>
   <xsl:when test="$sostituzione_root">
   <tr><td align="center" colspan="2"><big><strong><xsl:value-of select="$verbale_root/ATENEO_DES"></xsl:value-of></strong></big><br></br></td></tr>