> Everything should be able to reset + re-read, I mean it's just data on
> disk/memory so I don't see why it wouldn't work (without bugs).
The image is in-line, and Gfx::buildImageStream() creates an EmbedStream with
EmbedStream(parser->getStream(), std::move(dict), false, 0, true);
EmbedStream doesn't implement its own reset().

I tried adding an EmbedStream::reset() that calls str->reset() and clears record and replay. With that change, when DCTStream::DCTStream() eventually gets the stream after the reset, before the second pass for -optimizecolorspace, it starts back at the beginning of the object instead of where the in-line DCT image starts. It eventually finds the image (which is better than before, where the read position remained at the end of the image and the second pass found nothing), but it doesn't seem like a good solution.

EmbedStream::getStart(), setPos(), and moveStart() all print messages like
error(errInternal, -1, "Internal: called getStart() on EmbedStream");

Are you sure that EmbedStream can be reset in this context without messing up the parser?

I have been using -optimizecolorspace for about 5 years, and this is the first time that it didn't work, so if in-line images can't be reset without messing up the parser, I could have PSOutputDev::doImageL1Sep() skip the scan if it can detect that the image stream is an EmbedStream.

William

________________________________
From: poppler <[email protected]> on behalf of Albert Astals Cid <[email protected]>
Sent: Friday, September 11, 2020 5:57 AM
To: [email protected] <[email protected]>
Subject: Re: [poppler] pdftops -optimizecolorspace and ImageStream::reset()

On Thursday, 10 September 2020, at 23:00:57 CEST, William Bader wrote:
> I have a PDF where 'pdftops -level1sep -optimizecolorspace' gets 'Syntax
> Error: Could not find start of jpeg data' and drops part of the image.
> The problem happens in PSOutputDev::doImageL1Sep() where it prescans the
> image by making a new ImageStream(str, width, colorMap->getNumPixelComps(),
> colorMap->getBits()).
> When I made the patch to add -optimizecolorspace, I had first tried scanning
> the original stream and then using imgStr->reset(), but it didn't work for
> some types of streams, so I switched to creating a new stream, which is the
> code currently in poppler.
> But even that doesn't seem to work for DCTStream.
>
> Is the problem that some types of streams can never be reread (which means
> that -optimizecolorspace can't work as written) or that rereading streams
> isn't well tested and I might be able to fix it by reviewing the
> initialization?

Everything should be able to reset + re-read, I mean it's just data on disk/memory so I don't see why it wouldn't work (without bugs).

Cheers,
  Albert

>
> I can post a bug report, including the PDF, which is only 350KB, if it would
> help.
>
> Thanks, William
>
_______________________________________________
poppler mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/poppler
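
For illustration only, the difference discussed in this thread between rewinding to the start of the object and rewinding to where the in-line image data begins can be sketched with a toy model. This is not poppler code: ByteSource, InlineView, and readN are hypothetical names, and the model assumes the embedded view can simply remember the underlying position at construction and seek back to it, which may not hold for the real EmbedStream and parser.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for the parser's underlying stream: one shared read position.
class ByteSource {
public:
    explicit ByteSource(std::string data) : data_(std::move(data)) {}
    int getChar() {
        return pos_ < data_.size() ? static_cast<unsigned char>(data_[pos_++]) : -1;
    }
    std::size_t getPos() const { return pos_; }
    void setPos(std::size_t pos) {
        assert(pos <= data_.size());
        pos_ = pos;
    }
    void reset() { pos_ = 0; } // rewinds to the start of the whole object
private:
    std::string data_;
    std::size_t pos_ = 0;
};

// Hypothetical stand-in for an in-line stream that shares the source's position.
class InlineView {
public:
    // Remember where the in-line data begins: the source position at construction.
    explicit InlineView(ByteSource &src) : src_(src), start_(src.getPos()) {}
    int getChar() { return src_.getChar(); }
    // Rewind to the start of the in-line data, not to the start of the object.
    void reset() { src_.setPos(start_); }
private:
    ByteSource &src_;
    std::size_t start_;
};

// Read up to n characters from the view, stopping early at end of data.
std::string readN(InlineView &view, std::size_t n) {
    std::string out;
    for (std::size_t i = 0; i < n; ++i) {
        int c = view.getChar();
        if (c < 0) {
            break;
        }
        out.push_back(static_cast<char>(c));
    }
    return out;
}
```

In this sketch, a prescan followed by InlineView::reset() lets a second pass read the same bytes again, whereas calling ByteSource::reset() directly would send the second pass back to the start of the object. Note that reset() still moves the shared position, so whether a real parser tolerates it is exactly the open question raised above.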
