Damiano created TIKA-1715:
-----------------------------
Summary: Save embedded images into another location
Key: TIKA-1715
URL: https://issues.apache.org/jira/browse/TIKA-1715
Project: Tika
Issue Type: Test
Components: metadata
Affects Versions: 1.10
Reporter: Damiano
Hello,
I am having a strange problem dealing with embedded images.
This is my code:
{code:java}
public void getImages() throws IOException, TikaException, SAXException {
    try (InputStream stream = new FileInputStream(this.fileName)) {
        RecursiveParserWrapper p = new RecursiveParserWrapper(
                new AutoDetectParser(),
                new BasicContentHandlerFactory(BasicContentHandlerFactory.HANDLER_TYPE.IGNORE, -1)
        );

        ParseContext context = new ParseContext();
        PDFParserConfig config = new PDFParserConfig();
        config.setExtractInlineImages(true);
        config.setExtractUniqueInlineImagesOnly(true);
        context.set(org.apache.tika.parser.pdf.PDFParserConfig.class, config);
        context.set(org.apache.tika.parser.Parser.class, p);

        p.parse(stream, new BodyContentHandler(-1), new Metadata(), context);

        List<Metadata> metadatas = p.getMetadata();
        FileInputStream f = new FileInputStream("/tmp/" + metadatas.get(1).get("File Name"));
        //FileInputStream f = new FileInputStream(metadatas.get(1).get("File Name"));
        System.out.println(f.available());
    }
}
{code}
I can get the names of the embedded images with get("File Name"), but the
path seems invalid.
I need to save all the embedded images (inline images) to another location.
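To make the goal concrete, here is a minimal sketch of the saving step alone, using only the JDK. It assumes the bytes of each embedded image are available as an InputStream, for instance from inside a custom EmbeddedDocumentExtractor; that wiring is an assumption and is not shown. The names EmbeddedImageSaver, save, safeName and the fallback name "embedded.bin" are hypothetical and not part of Tika.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not a Tika class: copies one embedded resource to disk.
final class EmbeddedImageSaver {

    // Reduce a reported resource name to a plain file name, so that values
    // such as "../x/image0.png" cannot escape the target directory.
    static String safeName(String name) {
        if (name == null) {
            return "embedded.bin"; // assumed fallback name
        }
        int cut = Math.max(name.lastIndexOf('/'), name.lastIndexOf('\\'));
        String base = name.substring(cut + 1).replaceAll("[^A-Za-z0-9._-]", "_");
        if (base.isEmpty() || base.equals(".") || base.equals("..")) {
            return "embedded.bin";
        }
        return base;
    }

    // Copy the stream into targetDir (created if missing) and return the path
    // written. An existing file with the same name is overwritten.
    static Path save(InputStream in, String resourceName, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve(safeName(resourceName));
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }
}
```

The point of the sketch is that the image bytes are copied from a stream while it is still open, rather than reopened later from a name found in the metadata. Images that report the same name would overwrite each other here, so a real version would probably need a counter or a hash in the file name.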
Thank you in advance!
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)