#34356: Memory leak when generating PDFs
-------------------------------------+-------------------------------------
Reporter: Robin (Robert) Thomas | Owner: nobody
Type: Bug | Status: new
Component: Core (Other) | Version: 4.1
Severity: Normal | Resolution:
Keywords: memory memory-leak pdf weasyprint | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Description changed by Robin (Robert) Thomas:
New description:
== Context
Our app generates a one-page PDF report for users. It contains a few small
SVG and PNG icons and four large text tables. The PDF is generated once,
after which it is put in a storage bucket for subsequent retrieval.
== Problem
The app runs Django 4.1.6 and WeasyPrint 57.2 on Heroku (heroku-22 stack).
We're not having any issues retrieving previously generated PDFs, but each
time it generates a new PDF (file size 38 KB) the app's resident set size
(RSS) increases by 20-40 MB, as reported by Heroku. This memory usage
doesn't go down until the server is restarted.

Unfortunately, Heroku doesn't automatically restart the server until both
RSS and swap exceed the 512 MB limit, so once RSS is used up we start
getting a lot of pings about out-of-memory (OOM) errors and have to restart
it manually.
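To put the numbers above in perspective, here is a rough back-of-the-envelope sketch of how few generations it takes to reach the limit. The 512 MB limit and the 20-40 MB growth per PDF are the figures reported above; the 150 MB baseline is purely an assumed value for illustration, not something we measured, and the function name is made up for this example.

```python
def generations_until_limit(limit_mb, baseline_mb, growth_mb_per_pdf):
    """Number of whole PDF generations that fit before RSS reaches the limit."""
    headroom = limit_mb - baseline_mb
    return max(0, int(headroom // growth_mb_per_pdf))


LIMIT_MB = 512             # Heroku limit mentioned above
ASSUMED_BASELINE_MB = 150  # hypothetical idle RSS, NOT a measured value

print(generations_until_limit(LIMIT_MB, ASSUMED_BASELINE_MB, 20))  # 18
print(generations_until_limit(LIMIT_MB, ASSUMED_BASELINE_MB, 40))  # 9
```

Under that assumed baseline, somewhere between 9 and 18 new PDFs would be enough to use up RSS.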
== What we've tried
Even after removing all images, fonts, and CSS (file size 32 KB), each
generation still increases RSS by about 17 MB.

If we remove everything from the report template, leaving just
<!DOCTYPE html><html lang="en"><head><title>Test</title><body></body></html>
(file size 863 bytes), each generation increases RSS by about 1.3 MB.
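For anyone who wants to collect comparable per-generation figures without relying on Heroku's metrics, the following is a minimal sketch, not the code from our app. It assumes a Linux host (it reads /proc/self/statm), and the helper names `current_rss_mb`, `weasyprint_render`, and `rss_growth_per_render` are made up for this example. The only WeasyPrint call used is `HTML(string=...).write_pdf()`, which returns the PDF as bytes when no target is given.

```python
import os

# The stripped-down template quoted above.
MINIMAL_HTML = (
    '<!DOCTYPE html><html lang="en"><head><title>Test</title>'
    "<body></body></html>"
)


def current_rss_mb():
    """Current resident set size of this process in MB (Linux only)."""
    with open("/proc/self/statm") as statm:
        # Second field of statm is the number of resident pages.
        resident_pages = int(statm.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE") / (1024 * 1024)


def weasyprint_render(html):
    """Render HTML to PDF bytes with WeasyPrint (imported lazily)."""
    from weasyprint import HTML

    return HTML(string=html).write_pdf()


def rss_growth_per_render(render, html=MINIMAL_HTML, iterations=10):
    """Return the RSS change in MB observed across each render call."""
    growth = []
    for _ in range(iterations):
        before = current_rss_mb()
        render(html)  # the returned bytes are discarded immediately
        growth.append(current_rss_mb() - before)
    return growth
```

Calling `rss_growth_per_render(weasyprint_render)` from a plain Python shell and again from inside a Django view should show whether the growth depends on the surrounding process. The renderer is passed in as an argument so that the same loop can be pointed at any other callable for comparison.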
== Related
I opened an issue about this with WeasyPrint
(https://github.com/Kozea/WeasyPrint/issues/1496). They say that because
they cannot reproduce this when running WeasyPrint by itself from the
command line, the memory leak must be elsewhere in the ecosystem.
== Reproduce
I deployed a little test app to show this in action, with a link to the
source code: https://weasyprint-mem.herokuapp.com/
You can see from the attached image that every PDF generation increases
the memory usage, although not always by the same amount. I would expect
the data for each PDF to be garbage-collected once it has rendered:
[[Image(https://code.djangoproject.com/raw-attachment/ticket/34356/219976184-2e826b19-eb1d-40a8-926a-b9751468f0eb.jpg)]]
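One way to check the garbage-collection expectation is to compare `tracemalloc` snapshots taken before and after a batch of generations, forcing a collection first. This is only a sketch using the standard library; `python_heap_growth` is a name made up for this example, and `func` stands in for whatever callable produces one PDF.

```python
import gc
import tracemalloc


def python_heap_growth(func, iterations=10, top=5):
    """Run func repeatedly and report growth of Python-level allocations.

    Returns (net growth in bytes, the `top` largest per-line differences).
    """
    tracemalloc.start()
    try:
        func()  # warm-up, so one-time imports and caches are not counted
        gc.collect()
        before = tracemalloc.take_snapshot()
        for _ in range(iterations):
            func()
        gc.collect()  # anything still alive after this is being retained
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    stats = after.compare_to(before, "lineno")
    total = sum(stat.size_diff for stat in stats)
    return total, stats[:top]
```

A few kilobytes of growth is expected noise from the snapshots themselves. If the traced growth stays near zero while RSS keeps climbing, the memory is probably not held by live Python objects, which would point towards native allocations or allocator fragmentation instead; that is a hypothesis to test, not a conclusion.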
--
Ticket URL: <https://code.djangoproject.com/ticket/34356#comment:2>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.