New submission from Greg Price <[email protected]>:
Currently `_PyObject_VAR_SIZE` effectively has a precondition that it must not
be passed arguments which would overflow in its arithmetic. If that's violated,
it overflows... which is undefined behavior already, and in fact the likely
next thing that happens is we allocate a short buffer for a giant array of
items, then overflow the buffer, bam!
(The arithmetic is on `Py_ssize_t`, a signed integer type; that's why overflow
is UB.)
It's quite difficult to maintain that precondition at call sites, because it
effectively means writing a check that duplicates the content of
`_PyObject_VAR_SIZE` modulo some algebraic rearrangement. Empirically we don't
consistently manage that: we just fixed an 11-year-old bug where such a check
was wrong (luckily in the safe/over-conservative direction), and some other
call paths don't immediately appear to be checking at all.
So, I think it would be helpful to move that check into `_PyObject_VAR_SIZE`
itself.
Cc'ing the three folks on the interest list "coverity scan", as that one seems
the closest match for who'd likely be interested in commenting on this.
Details below, largely from GH-14838 which got me looking at this.
---
In GH-14838, @sir-sigurd fixed an overflow check on a call to
PyObject_GC_NewVar. The call site looked like this:
if ((size_t)size > ((size_t)PY_SSIZE_T_MAX - sizeof(PyTupleObject) -
sizeof(PyObject *)) / sizeof(PyObject *)) {
return (PyTupleObject *)PyErr_NoMemory();
}
op = PyObject_GC_NewVar(PyTupleObject, &PyTuple_Type, size);
but the `- sizeof(PyObject *)` had the wrong sign. Happily the bug made the
check too conservative, but that's basically a lucky coin-flip we had -- a
similar bug could just as easily have gone the other direction, making the code
vulnerable to arithmetic overflow, then memory corruption from overflowing a
too-small buffer.
The bug was present for 11 years.
Sergey, I am still very curious how you discovered the bug. :-)
I feel like there's an opportunity for a more complete fix as a followup to
this; that's this thread.
---
My next question on reading that fix was, OK, how can we write this so it's
hard to get wrong?
I think the point is that `PyObject_GC_NewVar` really has a precondition, that
the `nitems` it's passed (`size` in this caller) not be so big that the
allocation will overflow. Specifically `_PyObject_VAR_SIZE(tp, nitems)` needs
to not overflow. All the trickiness in this overflow check is basically an
algebraic rearrangement of what `_PyObject_VAR_SIZE(&PyTuple_Type, size)` would
say, plus substituting in the values actually found at `&PyTuple_Type`.
So, a sort of minimal fix would be to make something with a name like
`PyObject_GC_NewVar_WouldOverflow` that encodes that. Maybe that in turn would
just be a macro to delegate to a `_PyObject_VAR_SIZE_WOULD_OVERFLOW`, so that
each layer only has to know what it actually knows.
I think that immediately raises a next question, though, which is: are other
call sites of `PyObject_GC_NewVar` checking properly for overflow? At a quick
grep, I see some that don't appear to be checking.
And: why should they have to? It seems like the best place for this check is
immediately before the possibly-overflowing computation; or if not there, then
the closer to it the better.
So, the ideal solution: `_PyObject_VAR_SIZE` itself would be augmented to first
do this check, and only if it won't overflow then go and actually multiply.
There are only a handful of call sites of `_PyObject_VAR_SIZE`, so it wouldn't
even be much of a migration to then take care of each of them.
Then we wouldn't need these prophylactic checks at callers' callers -- which
(a) this bug demonstrates are hard to consistently get right, and (b) the grep
mentioned above suggests we may not be consistently doing in the first place.
----------
components: Interpreter Core
messages: 351573
nosy: Greg Price, brett.cannon, christian.heimes, sir-sigurd, twouters
priority: normal
severity: normal
status: open
title: _PyObject_VAR_SIZE should avoid arithmetic overflow
versions: Python 3.9
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue38079>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com