[
https://issues.apache.org/jira/browse/PDFBOX-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026880#comment-14026880
]
John Hewson edited comment on PDFBOX-2092 at 10/10/14 11:50 PM:
----------------------------------------------------------------
To answer Tilman's question:
{quote}
And would this packed raster method also work with an image that has more than
4 color components? (Don't know if the spec allows such images, but we have a
PDF with a DeviceN colorspace with 6 elements, but no image)
{quote}
No, it wouldn't, as packed rasters have a 32-bit limit per pixel, which is four
8-bit color components. DeviceN often has 6 components and can have up to 32
according to the spec. In addition, the other color spaces expect the raster
passed to their toRGBImage implementation to be banded and not packed.
Petr noticed this also:
{quote}
But it is just a quick and dirty test to see the effect on performance.
SampledImageReader creates a packed raster and fills it in from8bitx(). I have
changed just the color space class PDDeviceRGB to expect the different type of
raster, all other color space classes would need to be adjusted, too.
{quote}
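To make the 32-bit limit concrete, here is a minimal sketch of the packing
arithmetic. It is plain Java and not PDFBox code; the class and method names
(PackedLimitSketch, pack, unpack, fitsInPackedPixel) are made up for
illustration.

```java
// Hypothetical illustration only, not part of PDFBox.
class PackedLimitSketch {
    static final int BITS_PER_PACKED_PIXEL = 32; // one int per pixel

    // Packs four 8-bit samples into a single int, first component in the high byte.
    static int pack(int c0, int c1, int c2, int c3) {
        return ((c0 & 0xFF) << 24) | ((c1 & 0xFF) << 16) | ((c2 & 0xFF) << 8) | (c3 & 0xFF);
    }

    // Extracts component 0..3 from a pixel produced by pack().
    static int unpack(int pixel, int index) {
        return (pixel >>> (24 - 8 * index)) & 0xFF;
    }

    // How many components of the given depth fit into one packed pixel.
    static int maxPackedComponents(int bitsPerComponent) {
        return BITS_PER_PACKED_PIXEL / bitsPerComponent;
    }

    // True if all components of one pixel fit into a single int.
    static boolean fitsInPackedPixel(int components, int bitsPerComponent) {
        return components * bitsPerComponent <= BITS_PER_PACKED_PIXEL;
    }

    public static void main(String[] args) {
        int cmyk = pack(0x12, 0x34, 0x56, 0x78);
        System.out.println(Integer.toHexString(cmyk));
        System.out.println(unpack(cmyk, 2));
        System.out.println(fitsInPackedPixel(4, 8)); // CMYK: 4 * 8 = 32 bits
        System.out.println(fitsInPackedPixel(6, 8)); // 6-component DeviceN: 48 bits
    }
}
```

A 6-component DeviceN pixel needs 48 bits at 8 bits per component, so it cannot
be stored in one int.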
The problem is that color spaces with more than 4 8-bit components cannot use
a packed raster, so a packed-only approach can't handle DeviceN. What we could
do, though, is offer a fast path for images with <= 4 components where we
generate a packed raster. That way only DeviceN needs to handle a banded
raster. So Petr is quite right:
{quote}
a banded raster would still need to come into PDDeviceN#toRGBImage(), but what
comes out can be a packed raster with DirectColorModel.
{quote}
But... a DeviceN color space can output to another DeviceN color space, so it
is in theory possible to have a DeviceN with, say, 10 channels that outputs to
6 channels, which outputs to CMYK, then RGB (phew!). So it's strictly necessary
to support DeviceN outputting to a banded raster if the output color space is
another DeviceN space (as the existing code does). Alternatively, we could
choose not to support DeviceN -> DeviceN because it's probably an extreme
edge-case.
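For the common case where the chain ends in RGB, the "banded in, packed out"
idea could look roughly like the sketch below. This is not the PDDeviceN code:
BandedToPackedSketch and toPackedRgb are hypothetical names, and the
per-pixel conversion is a placeholder lambda standing in for the real tint
transform. It relies only on the fact that a BufferedImage of type
TYPE_INT_RGB is backed by a packed raster with a DirectColorModel.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBuffer;
import java.awt.image.DirectColorModel;
import java.awt.image.Raster;
import java.awt.image.WritableRaster;
import java.util.function.ToIntFunction;

// Hypothetical illustration only, not part of PDFBox.
class BandedToPackedSketch {
    // Reads N samples per pixel from a banded raster and writes packed 0xRRGGBB pixels.
    // toRgb is a stand-in for whatever conversion the color space really performs.
    static BufferedImage toPackedRgb(Raster banded, ToIntFunction<int[]> toRgb) {
        int width = banded.getWidth();
        int height = banded.getHeight();
        BufferedImage out = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        int[] samples = new int[banded.getNumBands()];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                banded.getPixel(x, y, samples);
                out.setRGB(x, y, toRgb.applyAsInt(samples));
            }
        }
        return out;
    }

    // Builds a 2x1, 6-band input and converts it with a placeholder mapping
    // (inverted first component as gray); the mapping is an assumption for the demo.
    static BufferedImage demo() {
        WritableRaster in = Raster.createBandedRaster(DataBuffer.TYPE_BYTE, 2, 1, 6, null);
        in.setPixel(0, 0, new int[] {255, 0, 0, 0, 0, 0});
        return toPackedRgb(in, s -> {
            int gray = 255 - s[0];
            return (gray << 16) | (gray << 8) | gray;
        });
    }

    public static void main(String[] args) {
        BufferedImage out = demo();
        System.out.println(out.getColorModel() instanceof DirectColorModel);
        System.out.println(Integer.toHexString(out.getRGB(1, 0) & 0xFFFFFF));
    }
}
```

The per-pixel getPixel/setRGB calls keep the sketch short; a real
implementation would presumably work on whole rows or on the data buffer
directly.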
In summary, in order to use packed rasters:
- SampledImageReader must read a packed raster for images with <= 4 components,
and it will need to do this in both from8bit and fromAny. Banded rasters will
still be read for DeviceN.
- all color spaces with <= 4 components can switch to expecting a packed
raster, with no need to handle banded rasters any more.
- DeviceN will read from a banded raster, as is currently the case, but must
now output a packed raster (as long as the output is not another DeviceN
space, in which case the current code can be used).
- it's possible to optimise DeviceN spaces with <= 4 components (i.e., most) to
use packed rasters also; it's just a lot of work.
Note that the first three points _must_ be addressed in order for any patch
using packed rasters to be applied without breaking other color spaces.
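As a rough sketch of the first point, the choice between the two raster types
could be made on the component count alone. The class and method names here
(RasterChoiceSketch, createRaster) are hypothetical and this is not the
SampledImageReader code; it assumes 8 bits per component and uses only the
standard java.awt.image.Raster factory methods.

```java
import java.awt.image.DataBuffer;
import java.awt.image.Raster;
import java.awt.image.WritableRaster;

// Hypothetical illustration only, not part of PDFBox.
class RasterChoiceSketch {
    // Fast path: up to four 8-bit components share one int per pixel.
    // Fallback: one byte bank per component, as needed for larger DeviceN spaces.
    static WritableRaster createRaster(int width, int height, int numComponents) {
        if (numComponents <= 4) {
            int[] bandMasks = new int[numComponents];
            for (int i = 0; i < numComponents; i++) {
                // First component occupies the highest byte that is in use.
                bandMasks[i] = 0xFF << (8 * (numComponents - 1 - i));
            }
            return Raster.createPackedRaster(DataBuffer.TYPE_INT, width, height, bandMasks, null);
        }
        return Raster.createBandedRaster(DataBuffer.TYPE_BYTE, width, height, numComponents, null);
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 4, 6}) {
            WritableRaster raster = createRaster(2, 2, n);
            System.out.println(n + " components: "
                    + raster.getSampleModel().getClass().getSimpleName());
        }
    }
}
```

Every color space that receives the packed variant would then have to read its
samples out of the int pixels, which is why the first three points have to
change together.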
> Very slow rendering of scanned document
> ---------------------------------------
>
> Key: PDFBOX-2092
> URL: https://issues.apache.org/jira/browse/PDFBOX-2092
> Project: PDFBox
> Issue Type: Improvement
> Components: Rendering
> Affects Versions: 2.0.0
> Environment: Win7 x64 EN
> JDK6,JDK7,JDK8
> Reporter: Juraj Lonc
> Attachments: PDFBOX-2092.patch, SCAN_20140522_160457490_page2.pdf
>
>
> It takes extremely long to render this file to image.
> Depends on computer but it can take 15s+ to render 1 page.
> When I skip drawing of inserted image /Im0, then rendering is fast. So there
> is something wrong with drawing that image in
> {code}
> PageDrawer.drawImage(Image awtImage, AffineTransform at)
> {code}
> when I comment out line
> {code}
> graphics.drawImage(awtImage, imageTransform, null);
> {code}
> then rendering process takes 6s