Jeremy Fleischman created THRIFT-5733:
-----------------------------------------
Summary: Building code with circular `include`s can result in tons
of memory usage and eventual segfault
Key: THRIFT-5733
URL: https://issues.apache.org/jira/browse/THRIFT-5733
Project: Thrift
Issue Type: Bug
Components: Compiler (General)
Affects Versions: 0.18.1
Environment: I'm on Linux, but this also happens to my coworkers on
macOS.
Reporter: Jeremy Fleischman
Attachments: 2023-09-01_16-30-26_pattern.png
If I try to build the following thrift code, it pretty quickly segfaults:
*setup:*
{{$ cat foo.thrift}}
{{include "bar.thrift"}}
{{$ cat bar.thrift}}
{{include "foo.thrift"}}
*build:*
{{$ thrift --allow-64bit-consts --gen py:slots foo.thrift}}
{{[2] 210654 segmentation fault (core dumped) thrift --allow-64bit-consts --gen
py:slots foo.thrift}}
Not the most user-friendly error message I've ever received ;), but pretty much
just a cosmetic issue (maybe there's a buffer overflow somewhere and some
potential security exploit to worry about if you're compiling untrusted thrift
code, but I personally never do that, so it doesn't stress me out).
However, if you add a 3rd file to the mix, things can get {_}really weird{_}.
*setup:*
{{$ cat foo.thrift}}
{{include "bar.thrift"}}
{{$ cat bar.thrift}}
{{include "large-enum.thrift"}}
{{include "foo.thrift"}}
{{$ cat large-enum.thrift}}
{{enum LargeEnum {}}
{{ FOO0 = 0,}}
{{ FOO1 = 1,}}
{{ ... [FOO2 through FOO1998] ...}}
{{ FOO1999 = 1999,}}
{{}}}
If I try to build this, it'll suck up all 32 GiB of RAM on my machine and
render my computer completely unusable. If you reduce the number of entries in
{{LargeEnum}}, you can get the thrift compiler to use a ton of RAM before
it finally segfaults as in the first example. I've attached a screenshot so you
can see how RAM and CPU get used on my machine while attempting to build the
above code.
I've also put together a simple repro at
[https://github.com/jfly/2023-09-01-thrift-circular-import], which can
autogenerate the 3 files described above. (Just be careful when running it that
you kill it before it soaks up all of your RAM!)
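For anyone who'd rather not clone the repo, here's a rough sketch of how the 3 files could be generated. This is not the script from the repo: {{write_repro}} and its {{enum_size}} parameter are names made up for illustration, and the output layout just mirrors the listings above.

```python
from pathlib import Path


def write_repro(directory: Path, enum_size: int = 2000) -> None:
    """Write foo.thrift, bar.thrift and large-enum.thrift into `directory`.

    Hypothetical helper: it only mirrors the three files shown in this report.
    """
    directory.mkdir(parents=True, exist_ok=True)

    # foo.thrift and bar.thrift include each other, forming the cycle.
    (directory / "foo.thrift").write_text('include "bar.thrift"\n')
    (directory / "bar.thrift").write_text(
        'include "large-enum.thrift"\ninclude "foo.thrift"\n'
    )

    # large-enum.thrift holds FOO0 .. FOO<enum_size - 1>.
    members = "".join(f"  FOO{i} = {i},\n" for i in range(enum_size))
    (directory / "large-enum.thrift").write_text(
        "enum LargeEnum {\n" + members + "}\n"
    )
```

Calling {{write_repro(Path("repro"))}} writes the files into a {{repro}} directory; passing a smaller {{enum_size}} should make it easier to reach the segfault before the machine runs out of RAM.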
Yesterday, this explosive use of RAM brought our company's build server (with
128 GiB of RAM!) to its knees. We spent a lot of time flailing around before we
finally tracked it down to one problematic PR that introduced a circular
include.
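Until the compiler handles this itself, something like the following pre-build check could catch a circular include before thrift ever runs. It's only a sketch under a couple of assumptions: it resolves each include relative to the including file (it ignores any {{-I}} search directories), and it finds {{include}} lines with a simple regex that doesn't understand comments. {{find_include_cycle}} is a made-up name, not anything thrift provides.

```python
import re
from pathlib import Path
from typing import List, Optional

# Assumption: one include per line, written as include "file.thrift".
INCLUDE_RE = re.compile(r'^\s*include\s+["\']([^"\']+)["\']', re.MULTILINE)


def find_include_cycle(entry: Path) -> Optional[List[Path]]:
    """Return the first include cycle reachable from `entry`, or None."""
    stack: List[Path] = []  # files on the current include chain
    done = set()            # files already fully explored, known cycle-free

    def visit(path: Path) -> Optional[List[Path]]:
        path = path.resolve()
        if path in stack:
            # We came back to a file that is still being processed: a cycle.
            return stack[stack.index(path):] + [path]
        if path in done or not path.is_file():
            return None
        stack.append(path)
        for name in INCLUDE_RE.findall(path.read_text()):
            # Assumption: includes are relative to the including file.
            cycle = visit(path.parent / name)
            if cycle:
                return cycle
        stack.pop()
        done.add(path)
        return None

    return visit(entry)
```

Running {{find_include_cycle(Path("foo.thrift"))}} against the first example should return the chain foo.thrift, bar.thrift, foo.thrift, which a CI step could turn into a readable error instead of a segfault.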
--
This message was sent by Atlassian Jira
(v8.20.10#820010)