On Monday, 2 April 2018, at 10:22:51 CEST, suzuki toshiya wrote:
> Hi,
Hi 6 months later :/

>
> Now I'm thinking about the possibility of adding an "image_list"
> API, similar to the text_list API of the cpp frontend,
> giving a list of structures that include the rectangle
> and a pointer to the image data stream.
>
> The easiest idea would be to incorporate ImageOutputDev
> into the cpp frontend. However, there is a known issue in
> ImageOutputDev: images drawn by tiling operations are
> not counted.
>
> https://bugs.freedesktop.org/show_bug.cgi?id=91734
>
> I should emphasize that this is not such a marginal case. When I
> make a PDF from an HTML page with many small images, via Firefox
> on GNU/Linux, the resulting PDF often draws the images with
> the tiling operation, although the images are never repeated X-o.

You mean the images are in the pdf as a tile repeat of 1?

> I'm not sure whether the fix in the above bugzilla is right or
> not (it seems that nobody has reviewed the quick-fix patch), but
> this fix only makes it possible to list the images (with their
> original metrics) and to extract the image data; the metrics in
> the drawn result are not available. So it is not the perfect
> solution for discussing the "image_list" API.
>
> There would be a rationale for the original author to
> write such a simple patch. The tiling operations are executed
> as:
>
> 1) create a new output (e.g. splash bitmap, cairo surface,
> etc.) to draw a single image as a pattern
>
> 2) transfer the drawn image to the original output
>
> To calculate the positions & metrics in the resulting image,
> the chain of temporary outputs should be kept.
>
> The difficulties in handling images drawn by tiling would
> be:
>
> * it is not easy to count how many times the image is
> repeated.
>
> * to obtain the position & metrics, the chain of tiling
> operations should be preserved. We cannot assume that
> rendering the image for the tile does not invoke yet
> another tiling operation.
>
> Thinking about alternatives, one possibility would be
> parsing the SVG (or XML, or CairoScript) generated by
> CairoOutputDev. It seems that the SVG generated by Cairo has
> a flat structure (no grouped coordinate transforms), so all
> position & metric information could be retrieved from
> the neighboring XML elements.
>
> However, there are 3 concerns.
>
> --
>
> a) nobody guarantees forward compatibility of the
> flat structure of the SVG (or CairoScript, XML surface).
>
> b) poppler has no dependency on an XML parsing library,
> except in the case where fontconfig depends on libexpat.
>
> c) tiling onto an SVG or XML surface can cause some
> rasterization.
>
> When I convert the pattern-tiling example at
>
> https://developer.mozilla.org/en-US/docs/Web/SVG/Tutorial/Patterns
>
> to PDF with librsvg, it includes no raster data
> (pattern.pdf.xz), but if I convert it back from PDF to SVG
> with pdftocairo (pattern.re.svgz), the result includes
> raster data X-o.
>
> Therefore, there is a possibility that nonexistent images
> are counted with this method.
>
> --
>
> So, what is the right way?

I'd say keep ignoring tiles for the time being, and if you find lots of cases where a tile is "wrongly" used, ask the people that generate it to "fix" the pdf, since obviously it's not what they wanted.

> if it is not the time to put "image_list" into cpp frontend

It is ok, actually I know someone else that wanted to do that.

> , is it acceptable to add similar features to pdftimage or pdftocairo?

pdftoppm and pdftocairo have a different purpose: they just render a given page. What would you do with tiled images for them?

Cheers,
  Albert

>
> Regards,
> mpsuzuki
>

_______________________________________________
poppler mailing list
https://lists.freedesktop.org/mailman/listinfo/poppler
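
The point in the quoted message about keeping the chain of temporary outputs can be illustrated with a small geometric sketch. It is written in Python for brevity and is not poppler code: the `TilingLevel` record, the `image_placements` and `bbox` helpers, and the assumption that each level knows its step sizes and tile index ranges are all hypothetical names and simplifications introduced here. The sketch only shows why every level's transform has to be retained to recover where an image finally lands on the page, and why the number of repetitions multiplies across nested tilings.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Affine matrix in PDF order (a, b, c, d, e, f):
#   x' = a*x + c*y + e,   y' = b*x + d*y + f
Matrix = Tuple[float, float, float, float, float, float]


def concat(inner: Matrix, outer: Matrix) -> Matrix:
    """Return the matrix that applies `inner` first, then `outer`."""
    a1, b1, c1, d1, e1, f1 = inner
    a2, b2, c2, d2, e2, f2 = outer
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


def apply(m: Matrix, x: float, y: float) -> Tuple[float, float]:
    """Transform the point (x, y) by m."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


@dataclass
class TilingLevel:
    """One temporary output in the chain (hypothetical, not a poppler type)."""
    to_parent: Matrix  # maps this pattern's space into the parent output
    x_step: float      # horizontal distance between tiles
    y_step: float      # vertical distance between tiles
    x_range: range     # tile indices painted along x
    y_range: range     # tile indices painted along y


def image_placements(image_ctm: Matrix, chain: List[TilingLevel]) -> List[Matrix]:
    """Expand one image drawn inside nested tilings into page-space matrices.

    chain[0] is the innermost tiling level; chain[-1] is painted on the page.
    """
    placements = [image_ctm]
    for level in chain:
        expanded = []
        for m in placements:
            for iy in level.y_range:
                for ix in level.x_range:
                    # Shift to the tile cell, then map into the parent output.
                    shift = (1, 0, 0, 1, ix * level.x_step, iy * level.y_step)
                    expanded.append(concat(concat(m, shift), level.to_parent))
        placements = expanded
    return placements


def bbox(m: Matrix) -> Tuple[float, float, float, float]:
    """Bounding box of the unit square (PDF image space) under m."""
    corners = [apply(m, x, y) for x in (0, 1) for y in (0, 1)]
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    return (min(xs), min(ys), max(xs), max(ys))


if __name__ == "__main__":
    # A 10x10 image drawn through a single tiling level repeated once,
    # which is the "tile repeat of 1" case discussed in the thread.
    once = [TilingLevel((1, 0, 0, 1, 100, 200), 20, 20, range(1), range(1))]
    for m in image_placements((10, 0, 0, 10, 0, 0), once):
        print(bbox(m))
```

With a single level painted once, the list has exactly one entry and the image rectangle is simply the image matrix pushed through that level's transform. With nested levels the list grows as the product of the repeat counts, which is the counting difficulty raised in the quoted message.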
