Andrew Dalke added the comment:
I know this issue was closed many years ago, and I don't propose re-opening it.
I write this comment because some of the earlier comments here make it sound
like only a foolish or perverse programmer might be affected by this 'too
aggressive constant folding'. I'll provide a real-world example of how it
affected me. It took me several hours to track it down, and even longer to
decide that the fault shouldn't be solely attributed to poor coding practices
on my side.
I recently updated a code base from Python 2.7 to Python 3.5+. It contains a C
extension which handles 64-bit indexing. One of the test files,
"test_large.py", exercises the API with multi-gigabyte strings. It typically
takes about 10 minutes to run so it isn't part of the normal test suite.
Instead, it's decorated with a @unittest.skipUnless(), and only enabled if the
file is executed directly or if an environment variable is set.
The file creates about 25 multi-GB string constants, like 's = b"\xfe" *
(2**32+1)'. Those alone require a minute to create, but that's acceptable
overhead because these tests are usually skipped, and when not skipped are only
10% of the total run-time. Here is an example extracted from my code; this
tests the population count on a byte string:
RUN_ALL = "LARGE_TESTS" in os.environ
if __name__ == "__main__":
    RUN_ALL = True

@unittest.skipUnless(RUN_ALL, "large tests not enabled; set LARGE_TESTS")
class LargeTests(unittest.TestCase):
    def test_popcount(self):
        s = b"\xfe\xff" * (2**31 + 1)
        self.assertEqual(bitops.byte_popcount(s), 15*(2**31 + 1))

if __name__ == "__main__":
    unittest.main()
As part of the update I did a 'move function' refactoring across the code base
and re-ran the tests. Unit test discovery seemed to hang and ^C didn't work.
Investigation showed it was in __import__("test_large"), which was needed
because I had changed code in test_large.py. I finally figured out it was due
to constant folding, which created the string, found it was too large, and
discarded it. Test discovery took a minute, even though all of the tests were
marked as 'skip' and would not be called.
Once compilation finished, the compiler generated a .pyc file. I hadn't noticed
the problem earlier because the .py file rarely changed, so it rarely needed to
be recompiled. It would have been a bigger issue had I run test_large.py
directly, as that always triggers the one-minute compilation, even when I
specify a test which doesn't use the large constants. (There were no bugs in
the 64-bit handling during the update, so I didn't need to do that.)
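For anyone who wants to see the cost directly, here is a minimal sketch of mine
(it is not part of the test suite above; the size is cut to roughly 1 GB, and
the timing depends on the machine and on the Python version, since newer
releases handle over-large folded results differently). Simply compiling the
source, without running anything, pays for building the constant:

import time

# Source that only *contains* a large constant expression; nothing is executed.
SRC = 's = b"x" * (2**30 + 1)'

start = time.perf_counter()
code = compile(SRC, "<large_const_demo>", "exec")
elapsed = time.perf_counter() - start

# On the interpreters discussed in this issue (3.5/3.6), the optimizer builds
# the ~1 GB value, decides it is too large to keep, and throws it away, so the
# time and memory are spent even though the constant never survives.
print("compile() took %.1f seconds" % elapsed)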
I was going to report the issue, then found that INADA Naoki had reported
almost exactly the same issue here, back in 2014.
I was bewildered by some of the comments here, because they seemed to suggest I
was at fault for writing such poor-quality code in the first place. Others may
be in the same boat as me, so I'll make a few points to add some
counterbalance.
"Are we really supposed to protect programmers from their own folly by
second-guessing when constant folding might be required and when it might not?"
If there is a 'folly', it is shared with the developers of Python's
constant-folding implementation, who thought there wouldn't be a problem and
provided no mechanism (like the switch to disable optimization proposed in
#2506 back in 2008, and again in #26145) which might help people like me
diagnose the problem.
But I don't think there was any folly. There was an engineering decision that
the benefits of constant folding outweighed the negatives. Just as, in my case,
there was an engineering decision that constant expressions which worked in
Python 2.5-2.7 didn't need to be made future-proof against improved constant
folding.
"How is the interpreter supposed to know the function isn't called?"
Perhaps a standard-library decorator which says that a function will be skipped
when run as a unit test? But really, the question should be "how is the
*byte-code compiler* supposed to know". This highlights a shift from the
Python I started with, which left everything up to the run-time virtual machine
interpreter, to the smarter compile-time optimizer we have now. As it gets
smarter, we developers seem to need to know more about how the optimizer works
in order to avoid unwanted side-effects. Currently this knowledge is 'arcane'.
"simply declare a manifest constant and use that instead"
The fundamental problem is that there is no way for a programmer to create a
large constant value which is safe from a sufficiently smart compiler, and
nothing which outlines how smart the compiler will get. Instead, people figure
out what works operationally, but that knowledge is specific to a given CPython
version.
My code ran into problems because Python's constant folding improved out from
under me. Even if I follow that advice, how do I avoid having a future version
of Python reintroduce the problem?
For example, the following function is currently not subject to constant
folding.
def big_x():
    N = 4294967296
    return b'x' * N
A future version of Python (perhaps 3.9 with an AST optimizer) might implement
constant propagation, making this identical to writing "b'x' * 4294967296" now,
including the long compile time needed to create and discard that 4GB string.
Of course it's harder to do constant folding on a value in the module
namespace, so the following is less likely to run into future problems. (A
hypothetical Python could do optimistic constant propagation during module
compilation, then at run-time check if N has changed, and if not, use the
pre-computed value.)
N = 4294967296

def big_x():
    return b'x' * N
This distinction is subtle and, I argue, demonstrates that it isn't all that
simple.
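For what it's worth, a non-core developer can at least find out what a given
interpreter does today by looking at the constants the compiler actually kept.
Here is a small sketch of mine using the standard dis module, with tiny sizes
so it runs instantly; it matches the two examples above on the interpreters I
have tried, but it is exactly the kind of behavior a future optimizer could
change:

import dis

# A literal expression: currently folded, so the pre-computed bytes constant
# shows up directly in the disassembly.
dis.dis(compile('b"x" * 5', "<inline>", "eval"))

# The same value routed through a local name: currently not folded, so the
# multiplication survives and is performed on every call.
def big_x():
    N = 5
    return b"x" * N

dis.dis(big_x)
print(big_x.__code__.co_consts)   # no pre-built b"xxxxx" constant here today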
"It would be ashamed to hack-up constant-folding to handle an arcane case of an
uncalled function that does space intensive constant operation."
I don't think my example test case was arcane. To the contrary, I think many
people would write the test the same way. I agree that few people construct
test cases that large. An alternate series of questions might start with the
Bayesian: of those who need such constants, how many are negatively affected
(e.g., #27695)? How long does it take them to figure out a workaround? And how
many of them think to file a bug, read comments here which accuse them of folly
and arcane practices, and stop? As I almost did.
Yes, it has several 'trivially easy' workarounds - once the developer figures
out the fundamental problem. Which workarounds are guaranteed to continue to
work, and how do non-core developers learn about the possible problem and
workarounds?
People do get confused about this behavior. One comment on
http://stackoverflow.com/questions/34113609/why-does-python-preemptively-hang-when-trying-to-calculate-a-very-large-number/
(which exercises an RLIMIT_CPU limit by computing 10**(10**10)) is "This is
spooky". Nor
is it easy to figure out. Others from the same SO page write: "I'm guessing
that Python is trying to optimize by calculating constants beforehand...which
backfires. But I have no proof of this.", "Do you know of a place in the
documentation or something where this precalculation of constants is discussed?
I tried Googling, but I'm not turning up anything useful", and "It took quite a
bit of searching, but I have discovered evidence that Python 3 does precompute
constant literals".
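The "spooky" part of that question is easy to demonstrate once you know where
to look: the work happens while the source is being compiled, before any
run-time guard such as an RLIMIT_CPU handler can take effect. Here is a sketch
of mine, with a much smaller exponent than the 10**(10**10) in the question so
that it finishes; which of the two timings dominates depends on whether the
interpreter at hand pre-computes the power:

import time

SRC = "x = 10 ** (10 ** 6)"   # roughly a million digits; big enough to measure

start = time.perf_counter()
code = compile(SRC, "<so_demo>", "exec")    # any folding happens here
print("compile: %.2f seconds" % (time.perf_counter() - start))

start = time.perf_counter()
namespace = {}
exec(code, namespace)                       # otherwise the cost is paid here
print("exec:    %.2f seconds" % (time.perf_counter() - start))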
To be clear, I know nothing will change in the current peephole implementation.
I have read most of the issues on bugs.python.org and stackoverflow.com related
to it. The most frequent assessment is that the current peephole optimizer is
good enough as it is, and too fragile to improve, and the hope is that an AST
optimizer -- something first proposed at least 9 years ago (#2506) -- is the
right solution. That work progresses slowly in #11549 (started in 2011), with
the same INADA Naoki recently picking it up again because "it was suspended
(because of lack of reviewers, maybe)".
What I don't want is to see the same useless behavior of creating and
discarding large constant strings replicated in the AST branch and justified by
an argument that current experience with the peephole optimizer shows that it's
not a problem in real-world code.
At the very least, I think there should be some documentation about this arcane
corner of the CPython implementation.
"Special cases aren't special enough to break the rules."
The peephole optimizer is nothing but support for the special case of
evaluating dynamic code at compile-time. In an aphorism challenge, the peephole
optimizer is almost exactly the '*right* now' in "Although never is often
better than *right* now." I am not saying the peephole optimizer should not
have been added; rather, the Zen of Python gives different answers depending on
where one draws the line.
----------
nosy: +dalke
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue21074>
_______________________________________