RE: Extracting vector graphics from pdf

Allison, Timothy B. Tue, 28 Feb 2017 05:34:54 -0800

Thank you, Tilman!

-----Original Message-----
From: Tilman Hausherr [mailto:thaush...@t-online.de] 
Sent: Monday, February 27, 2017 9:38 AM
To: us...@pdfbox.apache.org
Cc: user@tika.apache.org
Subject: Re: Extracting vector graphics from pdf


http://stackoverflow.com/a/38933039/535646

This lets you collect the lines. However, it won't output an image.
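
As a rough illustration of what "collecting the lines" means at the content-stream level, here is a minimal Java sketch. It is not the code from the linked answer and it does not use PDFBox. The class name PathLineCollector and its collect method are made up for this example. It assumes the page content stream has already been decoded to plain text, splits it on whitespace, and records straight segments from the PDF path operators m, l, h and re. It ignores the transformation matrix, curves, strings, inline images, and whether the path is ever painted. For real files, PDFBox's PDFGraphicsStreamEngine handles all of that and exposes callbacks such as moveTo and lineTo.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy example (hypothetical class): collects straight segments from a decoded content stream. */
final class PathLineCollector {

    /** Returns segments as {x1, y1, x2, y2} in untransformed user-space units. */
    static List<double[]> collect(String contentStream) {
        List<double[]> segments = new ArrayList<>();
        List<Double> operands = new ArrayList<>();
        double startX = 0, startY = 0;   // start of the current subpath
        double curX = 0, curY = 0;       // current point
        boolean hasPoint = false;        // true once a subpath has been started

        for (String token : contentStream.trim().split("\\s+")) {
            // Numbers are operands; they are queued until an operator arrives.
            if (token.matches("[+-]?(\\d+\\.?\\d*|\\.\\d+)")) {
                operands.add(Double.parseDouble(token));
                continue;
            }
            int n = operands.size();
            if (token.equals("m") && n >= 2) {            // begin a new subpath
                curX = startX = operands.get(n - 2);
                curY = startY = operands.get(n - 1);
                hasPoint = true;
            } else if (token.equals("l") && n >= 2 && hasPoint) {   // straight line
                double x = operands.get(n - 2), y = operands.get(n - 1);
                segments.add(new double[] {curX, curY, x, y});
                curX = x;
                curY = y;
            } else if (token.equals("h") && hasPoint) {   // close the subpath
                if (curX != startX || curY != startY) {
                    segments.add(new double[] {curX, curY, startX, startY});
                }
                curX = startX;
                curY = startY;
            } else if (token.equals("re") && n >= 4) {    // rectangle: x y width height
                double x = operands.get(n - 4), y = operands.get(n - 3);
                double w = operands.get(n - 2), h = operands.get(n - 1);
                segments.add(new double[] {x, y, x + w, y});
                segments.add(new double[] {x + w, y, x + w, y + h});
                segments.add(new double[] {x + w, y + h, x, y + h});
                segments.add(new double[] {x, y + h, x, y});
                curX = startX = x;
                curY = startY = y;
                hasPoint = true;
            }
            // Every operator, handled or not, consumes the queued operands.
            operands.clear();
        }
        return segments;
    }

    public static void main(String[] args) {
        // A triangle outline followed by a filled rectangle.
        String stream = "10 20 m 110 20 l 110 70 l h S 0 0 50 25 re f";
        for (double[] s : collect(stream)) {
            System.out.println(Arrays.toString(s));
        }
    }
}
```

Turning the collected segments into an image would still need a separate rendering step, which matches the caveat above.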

Tilman

On 27.02.2017 at 13:20, Allison, Timothy B. wrote:
> PDFBox Colleagues,
>    Any recommendations?
>
>            Best,
>
>                   Tim
>
> -----Original Message-----
> From: Andisa Dewi [mailto:theknight...@yahoo.com]
> Sent: Monday, February 27, 2017 5:32 AM
> To: user@tika.apache.org
> Subject: Extracting vector graphics from pdf
>
> Hello guys,
>
> I'm currently extracting images from a large number of PDF files; however,
> some of the images (or figures) are not extracted. I suspect this is
> because those images are vector graphics (as is usually the case in
> scientific papers). My question: is it possible to extract vector
> graphics from PDFs using Tika?
>
> I attached an example PDF (in it, all images are extracted except
> Figure 2).
>
> The way I'm extracting the images is the same as in the example code:
>
> Parser parser = new AutoDetectParser();
> Metadata m = new Metadata();
> ParseContext c = new ParseContext();
> ContentHandler h = new BodyContentHandler(-1);
> PDFParserConfig pdfConfig = new PDFParserConfig();
> pdfConfig.setExtractInlineImages(true);
> c.set(PDFParserConfig.class, pdfConfig);
> c.set(Parser.class, parser);
> EmbeddedDocumentExtractor ex = new MyEmbeddedDocumentExtractor(c);
> c.set(EmbeddedDocumentExtractor.class, ex);
> parser.parse(inputstream, h, m, c);
>
>
> Thanks!
>
> Regards,
>
> Eli
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
> For additional commands, e-mail: users-h...@pdfbox.apache.org
