On 07.11.2021 at 16:09, Kevin Day wrote:
ok - I think that maybe there are two things going on, then. The first is
that we are allocating a single buffer that is bigger than the default heap
(700MB > 500MB). I really think that the only two solutions are to move
this off the heap or reduce the size of the allocation.
Then the second issue is maybe a memory leak (I have not been able to see
this because of the first problem unless I set my max heap to a huge number
- and that makes memory leak analysis very difficult). I am going to clone
the PdfBox git repo now and swap that class in then do a little bit of
profiling with jvisualvm to see if I can help find the memory leak.
Question: The Git repo for PdfBox says that it is a mirror. If I do wind
up creating a pull request against this repo, will you be able to accept it?
Not directly but we can create a .diff / .patch file from the PR.
In the meantime I ran my rendering regression checks, which render 1000
files. The C: drive filled up after using 260 GB :-(
Tilman
FYI - the use of sun.* classes is just to work around a long standing
native resource leak bug in the Windows MappedByteBuffer implementation -
On Windows, MappedByteBuffer does not release the allocated native resource
until finalize() is called, and because the MappedByteBuffer itself does
not take up much heap space, it can stay on the heap for quite a long time
even after all references are gone. So the sun.* reference is just to
force the release of those native resources. I used the same technique
when I created the high throughput IO layer for iText. Instead of
referencing the package directly, we can use reflection (and only use it
for winXXX platform) - this shouldn't be a big problem.
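A minimal sketch of the reflection idea described above, not code taken from PDFBox or iText. The class and method names (MappedBufferCleaner, tryUnmap, isWindows) are made up for illustration. It assumes that on Java 9+ sun.misc.Unsafe.invokeCleaner(ByteBuffer) is reachable, and that on Java 8 the direct buffer exposes a cleaner() method; both are unsupported internals, so the method reports failure instead of throwing.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;
import java.util.Locale;

final class MappedBufferCleaner {

    private MappedBufferCleaner() {
    }

    /** True when the os.name system property starts with "win" (hypothetical platform check). */
    static boolean isWindows() {
        return System.getProperty("os.name", "").toLowerCase(Locale.ROOT).startsWith("win");
    }

    /**
     * Tries to release the native resources of a direct or mapped buffer without
     * waiting for garbage collection. Returns false when no cleaner could be invoked.
     * The buffer must not be touched again after a successful call.
     */
    static boolean tryUnmap(ByteBuffer buffer) {
        if (buffer == null || !buffer.isDirect()) {
            return false; // heap buffers hold no native resource
        }
        try {
            // Java 9+ path: sun.misc.Unsafe.invokeCleaner(ByteBuffer), looked up reflectively
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            Field theUnsafe = unsafeClass.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            Object unsafe = theUnsafe.get(null);
            Method invokeCleaner = unsafeClass.getMethod("invokeCleaner", ByteBuffer.class);
            invokeCleaner.invoke(unsafe, buffer);
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            // not available on this JVM; try the Java 8 style below
        }
        try {
            // Java 8 path: call cleaner().clean() on the internal buffer class
            Method cleanerMethod = buffer.getClass().getMethod("cleaner");
            cleanerMethod.setAccessible(true);
            Object cleaner = cleanerMethod.invoke(buffer);
            if (cleaner == null) {
                return false;
            }
            Method clean = cleaner.getClass().getMethod("clean");
            clean.setAccessible(true);
            clean.invoke(cleaner);
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            return false;
        }
    }
}
```

A caller following the suggestion above would guard the call, for example `if (MappedBufferCleaner.isWindows()) { MappedBufferCleaner.tryUnmap(buf); }`, and simply let the garbage collector handle the buffer when tryUnmap returns false.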
K
Kevin Day
*trumpet**p| *480.961.6003 x1002
*e| *ke...@trumpetinc.com
www.trumpetinc.com <http://trumpetinc.com/>
LinkedIn <https://www.linkedin.com/company/trumpet-inc.>| Trumpet Blog
<http://trumpetinc.com/blog/>| Twitter <https://twitter.com/trumpetinc>
Proud to be Great Place To Work
<https://www.greatplacetowork.com/certified-company/7012667> certified
since 2019
On Sun, Nov 7, 2021 at 7:53 AM Tilman Hausherr <thaush...@t-online.de>
wrote:
I tried it and it works pretty fast. The memory leak is still there but
it comes later.
Other drawbacks are that it doesn't clean up after itself fast enough,
and it uses a sun.* class.
Tilman
On 07.11.2021 at 14:31, Tilman Hausherr wrote:
On 05.11.2021 at 22:09, Kevin Day wrote:
ok - so should we be clamping the xstep in some way? Or at this depth of
the algorithm do we not have enough context to actually tell that it will
be outside the page/clipping region?
Yes + yes, that is the problem. The context would have to include the
region but also the current transformation matrix. Which could change
despite using the same pattern. So it's tricky.
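To make the "region plus current transformation matrix" point concrete, here is a rough sketch of what clamping one pattern cell could look like. It is not how PDFBox does it; TileClamp and visibleCellBounds are hypothetical names, and the sketch assumes the caller can supply the combined pattern-to-device transform and the clip bounds in device space. Because the transform can change between uses of the same pattern, the result would have to be recomputed for each paint operation.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

final class TileClamp {

    private TileClamp() {
    }

    /**
     * Maps one pattern cell (0,0)-(xStep,yStep) into device space and intersects
     * its bounding box with the clip bounds. The returned rectangle is empty
     * (isEmpty() is true) when the cell lies completely outside the clip.
     */
    static Rectangle2D visibleCellBounds(double xStep, double yStep,
                                         AffineTransform patternToDevice,
                                         Rectangle2D clipBounds) {
        Rectangle2D cell = new Rectangle2D.Double();
        // setFrameFromDiagonal normalizes negative steps into a positive-size rectangle
        cell.setFrameFromDiagonal(0, 0, xStep, yStep);
        Rectangle2D deviceCell = patternToDevice.createTransformedShape(cell).getBounds2D();
        return deviceCell.createIntersection(clipBounds);
    }
}
```

With the XStep/YStep of 23438 and the matrix scale of 0.0673396 quoted in this thread, and an assumed 612 x 792 clip (a US Letter page at 72 dpi, chosen only for illustration), the visible part of the cell is the 612 x 792 page rather than the full 1578 x 1578 cell.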
I'm beginning to think that the BigBufferedImage might be the right
solution... This is very inefficient, but honestly, the PDF is not exactly
well formed, so if these particular files wind up being slower to render
b/c they have to swap image content to disk, I think that is OK...
Maybe... the good thing is that we know when the image will be "too
big". The license is OK:
https://www.apache.org/legal/resolved.html
But I'm still wondering whether we have a memory leak. If we have,
then it should be fixed.
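A small sketch of how "we know when the image will be too big" could be checked before allocating. RasterBudget and its methods are hypothetical names, and the heap fraction is an arbitrary tuning value, not something PDFBox defines. It relies only on the fact that a TYPE_INT_ARGB raster needs 4 bytes per pixel.

```java
final class RasterBudget {

    private RasterBudget() {
    }

    /** Bytes needed for a TYPE_INT_ARGB raster; long math avoids int overflow. */
    static long argbBytes(int width, int height) {
        return 4L * width * height;
    }

    /**
     * True when the raster would need more than the given fraction of the
     * supplied heap limit, i.e. when a disk-backed image might be preferable.
     */
    static boolean isTooBig(int width, int height, long maxHeapBytes, double fraction) {
        return argbBytes(width, height) > (long) (maxHeapBytes * fraction);
    }
}
```

A caller would typically pass `Runtime.getRuntime().maxMemory()` as the heap limit, for example `RasterBudget.isTooBig(w, h, Runtime.getRuntime().maxMemory(), 0.5)`, and only fall back to a disk-backed image when that returns true.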
Tilman
- K
On Fri, Nov 5, 2021 at 12:47 PM Tilman Hausherr <thaush...@t-online.de>
wrote:
On 05.11.2021 at 20:38, Kevin Day wrote:
I do have a bit of experience with memory leaks (side effect of writing
high performance java code, unfortunately!)
I am pretty sure that this is not a memory leak, though. If this was a
memory leak, we would see calls happening multiple times before failure.
The failure happens on the very first call to create the buffered image:
BufferedImage image = new BufferedImage(rasterWidth, rasterHeight,
        BufferedImage.TYPE_INT_ARGB);
Nothing is even getting put into the weakcache.
That is done in TilingPaintFactory. The cache was needed for some files
that have used the pattern several times, to avoid recreating the image
for the TexturePaint.
I think the core issue is that the raster data for the image is just too
big to store on the heap (without the heap being massive). It's
13152x13152x4 bytes (there is an alpha channel on this as well) - that's
almost 700MB - just for this one pattern.
Tilman wrote: "That
pattern has an XStep and YStep of 23438 although the image is 2148 x
440. Because of a matrix scale of 0.0673396 the image pattern size is
1578 x 1578 at 72 dpi. So at 1200 dpi the size would be about 26300 x
26300"
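The arithmetic in the quoted numbers can be reproduced with a few lines. This is only a worked calculation under the assumption that pattern units are 1/72 inch, so the device size is step x scale x dpi / 72; PatternSize and devicePixels are names made up for this example.

```java
final class PatternSize {

    private PatternSize() {
    }

    /**
     * Size in device pixels of one pattern step, assuming pattern units of
     * 1/72 inch: step * matrixScale * dpi / 72.
     */
    static double devicePixels(double step, double matrixScale, double dpi) {
        return step * matrixScale * (dpi / 72.0);
    }

    public static void main(String[] args) {
        double step = 23438;        // XStep / YStep quoted in the thread
        double scale = 0.0673396;   // matrix scale quoted in the thread
        System.out.printf("72 dpi:   %.1f px%n", devicePixels(step, scale, 72));
        System.out.printf("1200 dpi: %.1f px%n", devicePixels(step, scale, 1200));
    }
}
```

This gives roughly 1578 pixels at 72 dpi and roughly 26305 pixels at 1200 dpi, in line with the "1578 x 1578" and "about 26300 x 26300" figures above. The same formula at 600 dpi yields about 13152.5, close to the 13152 raster size mentioned earlier, which suggests (but does not confirm) that the failing render ran at 600 dpi.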
I am fairly familiar with PDF (I contributed the parsing library to iText
back in the day), but I'm not very familiar with rendering tile patterns,
so the above is not intuitive to me (yet). Is there some chance that
PdfBox is doing a lot more work than is necessary for this particular PDF?
Like is the matrix scale just absurdly bad in this file? For an image that
is originally 2148x440, it seems like requiring a 700MB raster should not
be necessary - but I do NOT have nearly enough experience with this area
of PDF...
The ridiculous thing isn't the matrix, it's the XStep and YStep. This
goes well outside of the page. The 0.067 matrix scaling makes it less bad.
A weakness in PDFBox is that we don't readjust the tile image, or use an
alternative painting method that doesn't use TexturePaint when the image
would be painted only once.
Tilman
Thanks,
- Kevin
On Fri, Nov 5, 2021 at 11:27 AM Tilman Hausherr <thaush...@t-online.de>
wrote:
If you're experienced with memory leaks then it would be nice if you
could search.
Things I tried that didn't help:
- calling graphics.setPaint(null) after operations (to "lose" TilingPaint objects)
- disabling the (weak) cache of TilingPaint objects
- adding finalize to see if the TilingPaint class gets finalized (yes)
- adding finalize to see if the HighResolutionImageIcon class gets finalized (yes)
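As an alternative to adding finalize() methods (deprecated since Java 9), reachability can be probed from outside the class with a WeakReference. This is a generic sketch, not something that was tried in this thread; LeakProbe and becomesUnreachable are hypothetical names, and System.gc() is only a hint, so a false result suggests a leak rather than proving one.

```java
import java.lang.ref.WeakReference;

final class LeakProbe {

    private LeakProbe() {
    }

    /**
     * Requests garbage collection up to the given number of times and returns
     * true as soon as the weakly referenced object has been cleared.
     */
    static boolean becomesUnreachable(WeakReference<?> ref, int attempts) {
        for (int i = 0; i < attempts; i++) {
            if (ref.get() == null) {
                return true;
            }
            System.gc(); // a request only; the JVM may ignore it
            try {
                Thread.sleep(50); // give the collector a moment to run
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }
}
```

In a test one would wrap the suspect object (for example a paint or image instance) in a WeakReference, drop every known strong reference, and then call becomesUnreachable; if it keeps returning false, something else is still holding on to the object.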
Tilman
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org