[
https://issues.apache.org/jira/browse/ANY23-26?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254789#comment-13254789
]
Lewis John McGibbney edited comment on ANY23-26 at 4/16/12 4:23 PM:
--------------------------------------------------------------------
Initial WIP. This breaks HCardExtractorTest#testImgSrcDataUrl and
#testObjectDataDataUri.
I've attached my failing tests, along with the two HTML documents on which the
tests currently fail. Both tests seem to fail in either
AbstractExtractorTestCase#assertExtract or
HCardExtractorTest#assertDefaultVCard...
For reference, we only use Tika core and parsers in the following two classes:
./core/src/main/java/org/apache/any23/mime/TikaMIMETypeDetector.java
./core/src/main/java/org/apache/any23/encoding/TikaEncodingDetector.java
was (Author: lewismc):
Initial WIP. This breaks HCardExtractorTest#testImgSrcDataUrl and
#testObjectDataDataUri.
I've attached my failing tests, along with the two HTML documents which the
tests currently fail on. They both seem to be failing on either
AbstractExtractorTestCase#assertExtract or
HCardExtractorTest#assertDefaultVCard...
For reference we only use Tika core and parsers on the following two classes
./core/src/main/java/org/apache/any23/mime/TikaMIMETypeDetector.java:import org.apache.tika.mime.MimeTypes;
./core/src/main/java/org/apache/any23/encoding/TikaEncodingDetector.java:import org.apache.tika.parser.txt.CharsetDetector;
> Upgrade dependency to Apache Tika 1.1
> -------------------------------------
>
> Key: ANY23-26
> URL: https://issues.apache.org/jira/browse/ANY23-26
> Project: Apache Any23
> Issue Type: Improvement
> Affects Versions: 0.7.0
> Reporter: Lewis John McGibbney
> Fix For: 0.8.0
>
> Attachments: 14-img-src-data-url.html, 19-object-data-data-uri.html,
> ANY23-26.patch, org.apache.any23.extractor.html.HCardExtractorTest.txt
>
>
> Upgrading to Apache Tika will hopefully provide a wealth of benefits for the
> project. This issue should act as an umbrella issue to track these changes.
> It would be great to delegate as much as possible to Tika if deemed suitable
> to enhance functionality and to reduce our dependencies on external projects.