-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Tilman,
On 8/1/17 4:42 PM, Tilman Hausherr wrote:
> On 01.08.2017 at 22:09, Christopher Schultz wrote:
> Tilman,
>
> On 8/1/17 3:22 PM, Tilman Hausherr wrote:
>>>> The only thing that comes close to what you want is to create
>>>> your PDDocument with MemoryUsageSetting.setupMixed(...) as
>>>> parameter.
> So that we can buffer to disk if the in-memory representation gets
> too big? That sounds like a good approach, and probably the most
> useful to me.
>
> It also appears that I can set a maximum in-memory limit like
> this:
>
> MemoryUsageSetting mus = MemoryUsageSetting.setupMainMemoryOnly(1 * 1024 * 1024);
> PDDocument doc = new PDDocument(mus);
>
>> Yes. Although this would mean you'd get an exception if you use
>> more. That's why I recommend the mixed one. You could use the
>> memory limit for stress tests, i.e. create the "worst" possible
>> file and see what you need.
I think I'm okay with an exception in these cases. As I said, our PDFs
only end up being a few KiB in size, so I've put a 1MiB cap on the
memory-only memory usage strategy for the time being.
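Separately from whatever PDFBox limits internally, a cap can also be enforced on the output side. Below is a minimal sketch using only java.io. CappedOutputStream is a hypothetical helper, not a PDFBox class, and it only bounds the bytes written when saving, not the heap used while the document is being built.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical helper (not part of PDFBox): fails once more than maxBytes are written. */
class CappedOutputStream extends FilterOutputStream {
    private final long maxBytes;
    private long written;

    CappedOutputStream(OutputStream out, long maxBytes) {
        super(out);
        this.maxBytes = maxBytes;
    }

    @Override
    public void write(int b) throws IOException {
        ensureRoom(1);
        out.write(b);
        written++;
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        // Check before writing so nothing past the cap reaches the target.
        ensureRoom(len);
        out.write(buf, off, len);
        written += len;
    }

    private void ensureRoom(int len) throws IOException {
        if (written + len > maxBytes) {
            throw new IOException("output would exceed " + maxBytes + " bytes");
        }
    }

    /** Number of bytes passed through to the underlying stream so far. */
    long getWritten() {
        return written;
    }
}
```

Assuming the document is saved with PDDocument.save(OutputStream), the target stream could be wrapped as new CappedOutputStream(target, 1024 * 1024), so an oversized document fails with an IOException instead of being written out in full.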
I'm curious about what's being constrained here. Does PDFBox estimate
the current memory usage of the various PD* objects and push to disk
when the limit is exceeded, or does it just limit the amount of memory
used when serializing out to a stream?
>> Note that only streams are cached. Ordinary java structures (e.g.
>> maps, numbers, strings) are not.
Can you tell me a little more about that? When you say "streams are
cached", what does that mean exactly?
Or have I essentially already asked that question above?
> ... and then this should enforce a 1MiB size limit, no? I think
> that's all I want... there shouldn't be any reason for me to have
> to touch the disk: my files are really quite small. I just don't
> want something to go wrong with my client code and inadvertently go
> into an infinite loop adding "Hello World" to the document over and
> over until I have 50k pages in the PDF and an OOME on my hands.
>
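One way to guard against that kind of runaway loop, independent of any memory setting, is to count pages as they are added and fail fast at a hard limit. Below is a sketch with the PDFBox calls left out. PageBudget and whatever limit is passed to it are hypothetical, not part of PDFBox.

```java
/** Hypothetical guard (not part of PDFBox): throws once a page limit is exceeded. */
final class PageBudget {
    private final int maxPages;
    private int used;

    PageBudget(int maxPages) {
        if (maxPages < 1) {
            throw new IllegalArgumentException("maxPages must be >= 1");
        }
        this.maxPages = maxPages;
    }

    /** Call once before adding each page to the document. */
    void claimPage() {
        if (used >= maxPages) {
            throw new IllegalStateException("page limit of " + maxPages + " exceeded");
        }
        used++;
    }

    /** Number of pages claimed so far. */
    int pagesUsed() {
        return used;
    }
}
```

The client code would call claimPage() right before each page is added, so a loop that never terminates hits an IllegalStateException after maxPages iterations rather than an OOME much later.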
>>>> What you should do is take care not to duplicate anything.
>>>> So if you have a company logo on every page, create your
>>>> object only once. Same for fonts.
> We have something like:
>
> private Font _theFont;
>
> ...
> contentStream.setFont(_theFont);
> contentStream.newLineAtOffset(x,y);
> contentStream.showText("Hello, world");
> ...
>
>
> Many many times. The Font object reference stays the same, so I'm
> guessing that's okay and the font is used once and referenced many
> times, right?
>
>> Yes!
>
>> To create small PDF files, use PDType0Font.load() instead of
>> PDTrueTypeFont.load(), this will subset the fonts after saving.
We are using PDType1Font.FONTNAME for everything, so we aren't calling
.load for anything at all.
>>>> And try to have only one content stream per page. (We
>>>> recently had a guy who had a huge number of content streams
>>>> and wondered why his PDF was so big).
> Check: we have only one PDPageContentStream per page.
>
> We have a single logo on the first page and nothing repeated.
>
> Our PDFs are almost 100% plain-text with lots of whitespace (which
> doesn't count, I know). When base64 encoded, they are typically
> only a few kb in size.
>
> I'm mostly operating from a position of borderline unhealthy
> paranoia, but I'd rather have a bit of code added to ensure that I
> don't have to get paged in the middle of the night to restart a
> service that has suffered an OOME.
>
>> This all sounds harmless. All the memory problems I can think of
>> were related to rendering, not PDF creation.
Sounds good.
>> We've had at least one speed complaint, but that one is solved in
>> the current version.
I'll make sure we are up-to-date.
Thanks,
- -chris
-----BEGIN PGP SIGNATURE-----
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
iQIcBAEBCAAGBQJZgQwMAAoJEBzwKT+lPKRYws4P/RvvC0+6xp5fMINPAey98Pj6
cxTSeAkm0RsLl9lZrCxBjVRHNGsKBd1G70fgFEp6uB+5tU14Na0m1nZZ2WNGtiko
dwTseWL/m/FiggHDrzsT+RQVlbBoUzhBpyHYmEkRnbfQnS98eE0ZTSlN59IAStzn
yD7jFEds/nJucJZk9O6so9lOa9waGMf+s2MEp1YfMizytuIRK4ch3JG5/cBVQa8S
2W3J/Y/fIQWXOAx433XuVG9rC00RKtaMJahjOwyhmUIznNlR/yGH+0iiqwziUyXX
UtqsPTyFrGHQcHr4gaiewug6V//P5HC+XYhqyU0AR1EJolYSGXPY0UtRuTgCtAQ0
FXFjaYPppumKCjV9QMIfRcps7XclwoV/kiip5H3DIZwIL81PRE3rjthuE75uAjps
OEtGWjte9DDfDkkV6gudp0DmCBWq6oMyw7m4vm7rLACPXt0ziZtEKU698N7m88T6
vFxLtZloUbGVj0UAe4Sr6e31fw+5+dp2gpFNgKSP8FBGWAGLA+6srSA9sucpsqev
yG4QgReFNclDgO7i/6H5W1DcNZeTOwLJ+vT5BJafSvgHBGhLGy3F1uM3IyeFMgf7
XBHr4Em8p41aGS0BCvtGQ+xFMPCPKIHEvZxLZ+1JxboS0g5+KT8LHnCWvXjc6gSa
w9Dyle4TNPUoJHp24k/p
=YM5j
-----END PGP SIGNATURE-----