Sorry, I don't understand what this output is telling me.

I.e., these 5 files are Tika's sources... but what's wrong with them?

I thought we were talking about certain emails from the Enron corpus
where Tika hits an exception or fails to extract text...

Mike McCandless

http://blog.mikemccandless.com

On Wed, Sep 7, 2011 at 1:04 PM, Steve Aulenbach <[email protected]> wrote:
> Hi Mike,
> Here you go. I ran a quick analysis on revision 1166216 and saw the
> following:
>
> Analysis Summary:
>
> Files: 510
>
> *** Warning *** File(s) Not Found 5:
>
> /tika-parsers/src/main/java/org/apache/tika/detect/ContainerAwareDetector.java
>
> /tika-parsers/src/main/java/org/apache/tika/detect/POIFSContainerDetector.java
>
> /tika-parsers/src/main/java/org/apache/tika/detect/ZipContainerDetector.java
>
> /tika-parsers/src/test/java/org/apache/tika/parser/chm/TestUtils.java
>
> /tika-parsers/target/surefire-reports/TEST-org.apache.tika.parser.chm.TestUtils.xml
>
> Thanks,
> Steve
>
>
> On Wed, Sep 7, 2011 at 6:29 AM, Michael McCandless
> <[email protected]> wrote:
>>
>> On Tue, Sep 6, 2011 at 9:29 PM, Mark Kerzner <[email protected]>
>> wrote:
>>
>> > Is anybody interested in the results of all the testing that
>> > I am doing, and if yes, how should I report my findings?
>>
>> I'm interested!  This sounds great....
>>
>> Tika should strive to have no errors on any valid documents... so if
>> you (or anyone) are hitting bugs in Tika/POI/PDFBox/etc., let's
>> characterize them, open issues, and get them fixed :)
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>
>
