[ 
https://issues.apache.org/jira/browse/TIKA-16?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith R. Bennett updated TIKA-16:
---------------------------------

    Attachment: testWORD.doc

Added sample word file which contains:

----
Sample Word .doc File

Created with
Microsoft Word X for Mac, Service Release 1
----


> Issues with data files used for testing by TestParsers.
> -------------------------------------------------------
>
>                 Key: TIKA-16
>                 URL: https://issues.apache.org/jira/browse/TIKA-16
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Keith R. Bennett
>         Attachments: testEXCEL.xls, testOO2.odt, testPDF.PDF, testPPT.PPT, 
> testWORD.doc
>
>
> The TestParsers class requires that the following test files be available:
> testPDF.PDF
> testTXT.TXT
> testRTF.RTF
> textXML.XML
> testPPT.PPT
> testWORD.doc
> testEXCEL.xls
> testOO2.odt
> testHTML.html
> Only the following are provided:
> testHTML.html
> testRTF.rtf
> testTXT.txt
> testXML.xml
> Issue #1: When specifying the file names in the source code and for the files 
> themselves, we should make sure that they are equal case-sensitively.  My 
> personal preference is for lower case extensions, but in any case they should 
> be consistent.  Therefore, where necessary, we need to rename the file or 
> change the source code so that they match.  (This is the case for *.rtf, 
> *.txt, *.xml above.)
> Issue #2: The missing files need to be added to the repository so that the 
> test does not fail.  A minimal file would suffice for the short term, but 
> ultimately it would be nice to have files that exercise the parsers to the 
> fullest.
> Issue #3: We need to agree on a directory.  The source code currently 
> specifies that they should be in src/test/resources/testFiles, but in the 
> respository they are in src/test/resources.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to