[ https://issues.apache.org/jira/browse/THRIFT-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17761496#comment-17761496 ]
Jeremy Fleischman commented on THRIFT-5733:
-------------------------------------------

[~jensg], I don't know why this is happening; I haven't dug into the Thrift source code to try to understand it. Or maybe you were asking about _what_ exactly is weird? I just reworded my original post to hopefully be clearer about the behavior I'm seeing.

> Building code with circular `include`s can result in tons of memory usage and eventual segfault
> -----------------------------------------------------------------------------------------------
>
> Key: THRIFT-5733
> URL: https://issues.apache.org/jira/browse/THRIFT-5733
> Project: Thrift
> Issue Type: Bug
> Components: Compiler (General)
> Affects Versions: 0.18.1
> Environment: I'm on Linux, but this also happens to my coworkers on macOS.
> Reporter: Jeremy Fleischman
> Priority: Major
> Attachments: 2023-09-01_16-30-26_pattern.png
>
>
> If I try to build the following thrift code, it pretty quickly segfaults:
> *setup:*
> {code:java}
> $ cat foo.thrift
> include "bar.thrift"
> $ cat bar.thrift
> include "foo.thrift"
> {code}
> *build:*
> {code:java}
> $ thrift --allow-64bit-consts --gen py:slots foo.thrift
> [2] 210654 segmentation fault (core dumped) thrift --allow-64bit-consts --gen py:slots foo.thrift
> {code}
> Not the most user-friendly error message I've ever received ;), but pretty much
> just a cosmetic issue (maybe there's a buffer overflow somewhere and some
> potential security exploit to worry about if you're compiling untrusted
> thrift code, but I personally never do that, so it doesn't stress me out).
>
> However, if you add a 3rd file to the mix, things can get _really weird_.
> If I try to build the following code, it'll suck up all 32 GiB of RAM on my
> machine and render my computer completely unusable. If you reduce the number
> of entries in {{LargeEnum}}, you can get the thrift compiler to use a ton
> of RAM before it finally segfaults as in the first example. I've attached a
> screenshot so you can see how RAM and CPU get used on my machine while
> attempting to build this code.
> *problematic code:*
> {code:java}
> $ cat foo.thrift
> include "bar.thrift"
> $ cat bar.thrift
> include "large-enum.thrift"
> include "foo.thrift"
> $ cat large-enum.thrift
> enum LargeEnum {
>   FOO0 = 0,
>   FOO1 = 1,
>   ... [FOO2 through FOO1998] ...
>   FOO1999 = 1999,
> }
> {code}
> I've also put together a simple repro at
> [https://github.com/jfly/2023-09-01-thrift-circular-import], which can
> autogenerate the 3 files described above. (Just be careful when running it
> that you kill it before it soaks up all of your RAM!)
>
> Yesterday, this explosive use of RAM brought our company's build server (with
> 128 GiB of RAM!) to its knees. We spent a lot of time flailing around before
> we finally tracked it down to one problematic PR that introduced a circular
> include.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
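
For anyone who wants to catch a circular include like the one described above before handing files to the compiler, here is a minimal sketch in Python. It is not part of Thrift and is not how the compiler resolves includes. The names `parse_includes`, `load_graph`, and `find_cycle` are hypothetical helpers, and the sketch assumes every include resolves relative to the directory of the root file (no `-I` search paths) and that `include` statements are not hidden inside comments.

```python
import re
from pathlib import Path

# Matches lines such as: include "bar.thrift" (single or double quotes).
# Assumption: include statements start a line and are not inside comments.
INCLUDE_RE = re.compile(r'^\s*include\s+["\']([^"\']+)["\']', re.MULTILINE)


def parse_includes(text):
    """Return the include targets named in one .thrift source string."""
    return INCLUDE_RE.findall(text)


def load_graph(root):
    """Build {file name: [included names]} by following includes from root.

    Assumption: includes resolve relative to the root file's directory only.
    Missing files are recorded with no includes.
    """
    base = Path(root).parent
    graph = {}
    todo = [Path(root).name]
    while todo:
        name = todo.pop()
        if name in graph:
            continue
        path = base / name
        graph[name] = parse_includes(path.read_text()) if path.exists() else []
        todo.extend(graph[name])
    return graph


def find_cycle(graph, start):
    """Depth-first search; return the first include cycle reachable from
    start as a list of names (first == last), or None if there is none."""
    stack = []        # current chain of includes being explored
    on_stack = set()  # same names as `stack`, for fast membership tests
    done = set()      # files already fully explored without finding a cycle

    def visit(node):
        if node in on_stack:
            # Found a back edge: report the chain from the first occurrence.
            return stack[stack.index(node):] + [node]
        if node in done:
            return None
        stack.append(node)
        on_stack.add(node)
        for child in graph.get(node, []):
            cycle = visit(child)
            if cycle:
                return cycle
        stack.pop()
        on_stack.discard(node)
        done.add(node)
        return None

    return visit(start)
```

With the three files from the report in the current directory, `find_cycle(load_graph("foo.thrift"), "foo.thrift")` should return the chain through `foo.thrift` and `bar.thrift`, which a CI check could turn into a build failure instead of letting the compiler run.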