[jira] [Commented] (TIKA-1715) Save embedded images into another location

Damiano (JIRA) Tue, 18 Aug 2015 05:52:24 -0700

    [ 
https://issues.apache.org/jira/browse/TIKA-1715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701203#comment-14701203
 ]


Damiano commented on TIKA-1715:
-------------------------------

I would also add that the code is extremely slow. The file is only 400 kB, yet 
it takes around 22 seconds!

> Save embedded images into another location
> ------------------------------------------
>
>                 Key: TIKA-1715
>                 URL: https://issues.apache.org/jira/browse/TIKA-1715
>             Project: Tika
>          Issue Type: Test
>          Components: metadata
>    Affects Versions: 1.10
>            Reporter: Damiano
>              Labels: newbie
>
> Hello,
> I am having a strange problem dealing with embedded images.
> This is my code:
> {code:java}
>     public void getImages() throws IOException, TikaException, SAXException {
>         
>         try (InputStream stream = new FileInputStream(this.fileName)) {
>             RecursiveParserWrapper p = new RecursiveParserWrapper(
>                 new AutoDetectParser(),
>                 new BasicContentHandlerFactory(BasicContentHandlerFactory.HANDLER_TYPE.IGNORE, -1)
>             );            
>             
>             ParseContext context = new ParseContext();
>             PDFParserConfig config = new PDFParserConfig();
>             config.setExtractInlineImages(true);
>             config.setExtractUniqueInlineImagesOnly(true);
>             context.set(org.apache.tika.parser.pdf.PDFParserConfig.class, config);
>             context.set(org.apache.tika.parser.Parser.class, p);            
>             
>             p.parse(stream, new BodyContentHandler(-1), new Metadata(), context);
>             
>             List<Metadata> metadatas = p.getMetadata();
>                         
>             FileInputStream f = new FileInputStream("/tmp/" + metadatas.get(1).get("File Name"));
>             //FileInputStream f = new FileInputStream(metadatas.get(1).get("File Name"));
>             
>             System.out.println(f.available());
>         }
>     }
> {code}
> I can get the name of the embedded images with get("File Name") but the path 
> seems invalid.
> I need to save all the embedded images (inline images) to another location.
> Thank you in advance!
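
A minimal sketch of the "save to another location" step described above, assuming the bytes of each embedded image are already available as an InputStream (in Tika this is typically arranged by registering a custom EmbeddedDocumentExtractor in the ParseContext, which is not shown here). The class name EmbeddedImageSaver, the method save, and the fallback name embedded.bin are hypothetical and only JDK classes are used; this is an illustration, not Tika's own API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: writes one embedded resource into a target directory.
final class EmbeddedImageSaver {

    private EmbeddedImageSaver() {}

    // Copies the stream to targetDir/<last segment of resourceName> and returns the written path.
    static Path save(InputStream in, String resourceName, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);

        // Keep only the last path segment so a name like "../x.png" cannot escape targetDir.
        String name = (resourceName == null) ? "" : resourceName.replace('\\', '/');
        name = name.substring(name.lastIndexOf('/') + 1);
        if (name.isEmpty() || name.equals(".") || name.equals("..")) {
            name = "embedded.bin"; // assumed fallback when no usable name is reported
        }

        Path target = targetDir.resolve(name);
        // Overwrite an existing file of the same name instead of failing.
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }
}
```

The reported name is treated purely as a label here: the caller chooses the destination directory, which avoids relying on the name being a valid path on disk.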



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
