[ 
https://issues.apache.org/jira/browse/TIKA-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sachin Shaju updated TIKA-1966:
-------------------------------
    Attachment: pages.pages
                connors_20040127.key
                budget.numbers

I've tested with these files, which were created using the latest version of iOS.

> Issue in parsing iWorksDocument with Apache Tika
> ------------------------------------------------
>
>                 Key: TIKA-1966
>                 URL: https://issues.apache.org/jira/browse/TIKA-1966
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.12
>         Environment: Ubuntu 15
>            Reporter: Sachin Shaju
>         Attachments: budget.numbers, connors_20040127.key, pages.pages
>
>
> I was trying to parse an iWork document with Apache Tika, but instead of the
> document's content I am getting some other output from the content handler.
> The code snippet I used and the output I got are below.
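> import java.io.File;
> import java.io.FileInputStream;
> import java.io.IOException;
> import org.apache.tika.exception.TikaException;
> import org.apache.tika.metadata.Metadata;
> import org.apache.tika.parser.AutoDetectParser;
> import org.apache.tika.parser.ParseContext;
> import org.apache.tika.parser.Parser;
> import org.apache.tika.sax.BodyContentHandler;
> import org.xml.sax.SAXException;
>
> // Called with new File("/home/user/tika/samples/budget.numbers").
> private void parseFile(File file) {
>     // try-with-resources closes the stream once parsing finishes.
>     try (FileInputStream inputStream = new FileInputStream(file)) {
>         ParseContext context = new ParseContext();
>         // -1 disables the handler's write limit.
>         BodyContentHandler bodyHandler = new BodyContentHandler(-1);
>         Parser parser = new AutoDetectParser();
>         parser.parse(inputStream, bodyHandler, new Metadata(), context);
>         System.out.println("Contents of the file :" + bodyHandler.toString());
>     } catch (IOException | SAXException | TikaException e) {
>         e.printStackTrace();
>     }
> }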
> Output :-
> Contents of the file :
> Index/Document.iwa
> Index/ViewState.iwa
> Index/CalculationEngine.iwa
> Index/Tables/HeaderStorageBucket-2.iwa
> Index/Tables/Tile.iwa
> Index/Metadata.iwa
> Metadata/Properties.plist
> I'm able to detect the file type correctly using the Detector API, but I am
> not getting the useful content out of the document.
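
The paths in the output above look like the entry names of a ZIP archive. One way to check this, as a minimal sketch and not something confirmed in this report, is to list the entries of the sample file with the JDK's `java.util.zip` classes and compare them with what the content handler printed. The class name `ZipEntryLister` and the command-line argument are hypothetical, chosen only for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: lists the entry names of a ZIP-based file.
class ZipEntryLister {

    // Returns the names of all entries in the ZIP stream, in archive order.
    static List<String> listEntries(InputStream in) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is assumed to be a path to a sample such as budget.numbers.
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            for (String name : listEntries(in)) {
                System.out.println(name);
            }
        }
    }
}
```

If the names printed here match the lines under "Contents of the file :", that would suggest the parser is treating the document as a generic archive and emitting entry names, not extracting the text stored inside the entries.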



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
