[jira] [Commented] (TIKA-1840) No way to link slide notes to slide in PPT output.

ASF GitHub Bot (JIRA) Fri, 22 Jan 2016 02:32:18 -0800

    [ 
https://issues.apache.org/jira/browse/TIKA-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15112270#comment-15112270
 ]


ASF GitHub Bot commented on TIKA-1840:
--------------------------------------

GitHub user zetisam opened a pull request:

    https://github.com/apache/tika/pull/72

    fix for TIKA-1840 contributed by zetisam

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zetisam/tika TIKA-1840

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tika/pull/72.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #72
    
----
commit 52b82bddef7c7ae8a430c9871594295e71882055
Author: Sam Heijens <[email protected]>
Date:   2016-01-22T10:09:48Z

    fix for TIKA-1840 contributed by zetisam

----


> No way to link slide notes to slide in PPT output.
> --------------------------------------------------
>
>                 Key: TIKA-1840
>                 URL: https://issues.apache.org/jira/browse/TIKA-1840
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.11
>            Reporter: Sam H
>
> I'm integrating Apache Tika into my project, and I want to extract (text)
> information from PowerPoint slides, in both PPT and PPTX formats.
> I've noticed that with the PPT format, the slide notes are all aggregated at the
> end of the XML output, and there is no way to identify which note belongs to
> which slide.
> I began looking at the code and found the following:
> {code}
> // TODO Find the Notes for this slide and extract inline
> {code}
> in
> [HSLFExtractor.java|https://github.com/apache/tika/blob/master/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/HSLFExtractor.java]
>  on line 140.
> I would like to implement this part and contribute it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
