Kabir Soneja created PDFBOX-6010:
------------------------------------

             Summary: PDF Image Extraction resulting in an infinite recursion
                 Key: PDFBOX-6010
                 URL: https://issues.apache.org/jira/browse/PDFBOX-6010
             Project: PDFBox
          Issue Type: Bug
            Reporter: Kabir Soneja

Hi,

I am extracting images from a PDF using PDFBox version 2.0.34. We use our own recursive logic that walks the PDResources of each page and checks every object in them to filter out the images. This logic has a maximum depth of 25 to avoid infinite recursion.

When I run image extraction on the same PDF using the CLI, the image is extracted within a second. This suggests that the CLI handles extraction differently, using the ImageGraphicsEngine defined in the PDFBox source code.

* Does PDFBox directly provide any API for image extraction?
* Is there any way to reuse the image extraction logic from the source code, i.e. is it exposed as a public API?
* Are there any other suggestions for handling image extraction gracefully, with or without recursion?
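
For reference, a resource walk like the one described above can be made safe against cycles by remembering which objects have already been visited, instead of relying only on a depth limit. The sketch below is illustrative and does not use PDFBox: `XObjectNode` and `ImageCollector` are hypothetical stand-ins for image and form XObjects and for the extraction logic. It also assumes, without confirmation from this report, that the problem comes from forms that are shared between resources or that reference each other.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a PDF XObject: either an image, or a form
// whose own resources contain further XObjects.
final class XObjectNode {
    final String name;
    final boolean image;
    final List<XObjectNode> children = new ArrayList<>();

    XObjectNode(String name, boolean image) {
        this.name = name;
        this.image = image;
    }
}

// Hypothetical helper, not part of PDFBox.
final class ImageCollector {
    // Walks the object graph with an explicit stack instead of recursion.
    // Each node is processed at most once (tracked by identity), so shared
    // or mutually referencing forms cannot cause unbounded work.
    static List<String> collectImageNames(List<XObjectNode> roots) {
        Set<XObjectNode> seen =
                Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<XObjectNode> pending = new ArrayDeque<>(roots);
        List<String> images = new ArrayList<>();

        while (!pending.isEmpty()) {
            XObjectNode node = pending.pop();
            if (!seen.add(node)) {
                continue; // already handled via another path
            }
            if (node.image) {
                images.add(node.name);
            } else {
                for (XObjectNode child : node.children) {
                    pending.push(child);
                }
            }
        }
        return images;
    }
}
```

With real PDFBox objects, the visited set would probably need to be keyed on the underlying COS object rather than on the wrapper instances, since wrappers may be created anew on each lookup. That is an assumption and has not been verified against 2.0.34.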