Now I feel sad. I never saw the getOpaqueImage() method, which appears to do exactly what I wanted... so sad.
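For anyone landing on this thread later, the alternative suggested further down (copying the mask from the alpha layer of the BufferedImage) can be sketched with plain java.awt.image, no PDFBox required. This is only an illustration under stated assumptions: the class and method names (AlphaMaskSketch, extractAlphaMask, extractOpaque) are made up for this example, and the input is assumed to be a decoded ARGB image such as the one returned by PDImageXObject.getImage(). It does not avoid the size problem described below, because both outputs end up at the resolution of the combined image.

```java
import java.awt.image.BufferedImage;
import java.awt.image.WritableRaster;

// Hypothetical helper, not part of PDFBox: splits a decoded ARGB image
// into an 8-bit grayscale mask (the alpha values) and an opaque RGB copy.
final class AlphaMaskSketch {

    /** Copies the alpha channel of src into a TYPE_BYTE_GRAY image. */
    static BufferedImage extractAlphaMask(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        BufferedImage mask = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY);
        WritableRaster out = mask.getRaster();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // getRGB() returns non-premultiplied ARGB; the top byte is alpha
                // (255 for images that have no alpha channel at all).
                int alpha = src.getRGB(x, y) >>> 24;
                // Write the raw sample so no color conversion is applied.
                out.setSample(x, y, 0, alpha);
            }
        }
        return mask;
    }

    /** Copies the color samples of src into a TYPE_INT_RGB image, dropping alpha. */
    static BufferedImage extractOpaque(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        BufferedImage opaque = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                opaque.setRGB(x, y, src.getRGB(x, y));
            }
        }
        return opaque;
    }

    public static void main(String[] args) {
        // Tiny stand-in for a decoded image: one half-transparent red pixel
        // and one fully opaque green pixel.
        BufferedImage src = new BufferedImage(2, 1, BufferedImage.TYPE_INT_ARGB);
        src.setRGB(0, 0, 0x80FF0000);
        src.setRGB(1, 0, 0xFF00FF00);

        BufferedImage mask = extractAlphaMask(src);
        BufferedImage opaque = extractOpaque(src);
        System.out.println("mask samples: "
                + mask.getRaster().getSample(0, 0, 0) + " "
                + mask.getRaster().getSample(1, 0, 0));
        System.out.println("opaque pixel 0: "
                + Integer.toHexString(opaque.getRGB(0, 0)));
    }
}
```

Where the original base/mask resolutions and encodings matter, reading the two parts separately (for example via getOpaqueImage() for the base) is the better fit; the sketch above only covers the alpha-layer route.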
Constantine
--
There is a computer disease that anybody who works with computers knows
about. It's a very serious disease and it interferes completely with the
work. The trouble with computers is that you 'play' with them!
- Richard P. Feynman


On Fri, Sep 24, 2021 at 11:46 AM Constantine Dokolas <cdoko...@gmail.com>
wrote:

> To keep the implementation hidden is reasonable. The thing is that access
> to the different forms/parts of a standard-conforming image XObject is
> still limited (not to mention creation). That is, you are hiding stuff
> that's *not* in the API, and that's not really fair. I should be able to
> get/set the two images as BufferedImage separately. Anyway, it seems I have
> a working workaround already.
>
> What I meant by avoiding BufferedImage was actually the use of getImage(),
> which combines the base image and mask. My bad. The thing is that splitting
> the two is usually done to downscale the background when it's low on
> "content", like with scanned documents (background is low-fluctuation
> black-grey and mask is just a stencil for that).
>
> For example, I was processing a page with a full-page image with mask
> having the base as a 659x904 grayscale image (JPEG2000 encoded to 906
> bytes) and a mask of 2636x3616 (JBIG2-encoded to 21836 bytes). Using
> getImage() and replacing the original with that resulted in base and SMask
> images (PDFBox changed the pair from Base-Mask to Base-SMask) both at
> 2636x3616, encoded as grayscale JPEGs of 511252 and 691079 bytes
> respectively. So, from ~23KB, it went all the way to ~1.5MB. That's, of
> course, a kind of worst-case scenario (i.e. not trying any compression
> options).
>
> Hope these images pass through the mailer...
> [image: image.png][image: image.png]
>
> So, I'm only saying that the API needs to get a little better at
> supporting PDF 32000-1:2008 section 8.9.6 in particular. Beyond that, I
> must still congratulate the team for what's already done. PDFBox is a great
> tool that's given us the opportunity to do great stuff. Keep up the good
> work. I honestly hope to be able to contribute, but after 3 years of
> working with the standard I still don't consider myself proficient in the
> PDF format.
>
> Thanks,
> Constantine
> --
> There is a computer disease that anybody who works with computers knows
> about. It's a very serious disease and it interferes completely with the
> work. The trouble with computers is that you 'play' with them!
> - Richard P. Feynman
>
>
> On Thu, Sep 23, 2021 at 8:06 PM Tilman Hausherr <thaush...@t-online.de>
> wrote:
>
>> The reason we kept this package local is so that we can make changes
>> without breaking the API. So for you the best would be to copy that file.
>>
>> Alternatively copy the mask from the alpha layer of the BufferedImage.
>> You mention you don't want to use BufferedImage but how else would you
>> process this?
>>
>> Tilman
>>
>> On 23.09.2021 at 12:04, Constantine Dokolas wrote:
>> > It just occurred to me that this should have been posted on "users" not
>> > "dev", so I'm forwarding it here.
>> > Sorry for the confusion.
>> > Constantine
>> >
>> > ---------- Forwarded message ---------
>> > From: Constantine Dokolas <cdoko...@gmail.com>
>> > Date: Wed, Sep 22, 2021 at 7:02 PM
>> > Subject: Getting and setting image and mask separately (PDImageXObject)
>> > To: <d...@pdfbox.apache.org>
>> >
>> >
>> > I'm processing images in PDFs and I sometimes get images with a "Mask".
>> > I want to separately retrieve the image (being the base) and the "Mask"
>> > and also generate (from these) a new base-Mask pair. This is in order to
>> > preserve the original format of the (optimized) image resource and the
>> > size/compression of the individual images (using the BufferedImage can
>> > affect the resource size significantly).
>> >
>> > Unfortunately, SampledImageReader is only package-visible and I can't
>> > use it. What are my options?
>> >
>> > There is also no PDImageXObject.setMask(...), but I guess I can directly
>> > set the "Mask" in the dictionary.
>> >
>> > Thanks in advance,
>> > Constantine
>> >
>> > --
>> > There is a computer disease that anybody who works with computers knows
>> > about. It's a very serious disease and it interferes completely with the
>> > work. The trouble with computers is that you 'play' with them!
>> > - Richard P. Feynman
>> >
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
>> For additional commands, e-mail: users-h...@pdfbox.apache.org