[ 
https://issues.apache.org/jira/browse/ANY23-26?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254789#comment-13254789
 ] 

Lewis John McGibbney edited comment on ANY23-26 at 4/16/12 4:23 PM:
--------------------------------------------------------------------

Initial WIP. This breaks HCardExtractorTest#testImgSrcDataUrl and 
#testObjectDataDataUri. 

I've attached my failing tests, along with the two HTML documents on which the 
tests currently fail. Both appear to fail in either 
AbstractExtractorTestCase#assertExtract or 
HCardExtractorTest#assertDefaultVCard. 

For reference, we only use Tika core and parsers in the following two classes:

./core/src/main/java/org/apache/any23/mime/TikaMIMETypeDetector.java
./core/src/main/java/org/apache/any23/encoding/TikaEncodingDetector.java
                
      was (Author: lewismc):
    Initial WIP. This breaks HCardExtractorTest#testImgSrcDataUrl and 
#testObjectDataDataUri. 

I've attached my failing tests, along with the two HTML documents on which the 
tests currently fail. Both appear to fail in either 
AbstractExtractorTestCase#assertExtract or 
HCardExtractorTest#assertDefaultVCard. 

For reference, we only use Tika core and parsers in the following two classes:

./core/src/main/java/org/apache/any23/mime/TikaMIMETypeDetector.java:import org.apache.tika.mime.MimeTypes;
./core/src/main/java/org/apache/any23/encoding/TikaEncodingDetector.java:import org.apache.tika.parser.txt.CharsetDetector;
                  
> Upgrade dependency to Apache Tika 1.1
> -------------------------------------
>
>                 Key: ANY23-26
>                 URL: https://issues.apache.org/jira/browse/ANY23-26
>             Project: Apache Any23
>          Issue Type: Improvement
>    Affects Versions: 0.7.0
>            Reporter: Lewis John McGibbney
>             Fix For: 0.8.0
>
>         Attachments: 14-img-src-data-url.html, 19-object-data-data-uri.html, 
> ANY23-26.patch, org.apache.any23.extractor.html.HCardExtractorTest.txt
>
>
> Upgrading to Apache Tika will hopefully provide a wealth of benefits for the 
> project. This issue should act as an umbrella issue to track these changes. 
> It would be great to delegate as much as possible to Tika if deemed suitable 
> to enhance functionality and to reduce our dependencies on external projects.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
