On Fri, Sep 06, 2019 at 12:51:56PM -0400, Derrick Stolee wrote:

> > This one in theory benefits lots of other callsites, too, since it means
> > we'll actually return NULL instead of nonsense like "8". But grepping
> > around for calls to this function, I found literally zero of them
> > actually bother checking for a NULL result. So there are probably dozens
> > of similar segfaults waiting to happen in other code paths.
> > Discouraging.
> >
> > This is sort-of attributable to my 834876630b (get_commit_tree(): return
> > NULL for broken tree, 2019-04-09). Before then it was a BUG(). However,
> > that state was relatively short-lived. Before 7b8a21dba1 (commit-graph:
> > lazy-load trees for commits, 2018-04-06), we'd have similarly returned
> > NULL (and anyway, BUG() is clearly wrong since it's a data error).
> > 
> > None of which argues against your patches, but it's kind of sad that the
> > issue is present in so many code paths. I wonder if we could be handling
> > this in a more central way, but I don't see how short of dying.
>
> This is due to the mechanical conversion from using commit->tree->oid to
> get_commit_tree_oid(commit). Those consumers were not checking if the
> tree pointer was NULL, either, but they probably assumed that the
> parse_commit() call would have failed earlier. Now that we are using this
> method (for performance reasons to avoid creating too many 'struct tree's)
> it makes sense to convert some of them to checking the return value more
> carefully.

Right, none of this is new at all. We have historically been very loose
about assuming that things like commit->tree were valid. And they
_usually_ are. Even if we're missing the object on disk, lookup_tree()
is happy to assign it a struct (unless the object was already seen as
another type!).  I think turning that case into an error from
parse_commit() would cover a lot of cases easily, without forcing each
caller to check for NULL.
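
To make that concrete, here is a toy sketch of the idea. It is not git's
actual code: every toy_* name below is made up for illustration, and the
object table is a stand-in for the real object store. It only shows the
shape of the check: the lookup hands out a struct for an unseen id,
returns NULL when the id was already seen as another type, and the
parse step turns that NULL into an error once, so callers don't each
have to check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only; none of these names exist in git. */
enum toy_type { TOY_NONE, TOY_COMMIT, TOY_TREE, TOY_BLOB };

struct toy_object {
	char id[8];
	enum toy_type type;
};

#define TOY_MAX 16
static struct toy_object toy_table[TOY_MAX];
static int toy_nr;

/* Find an object by id, or create an untyped entry if it is unseen. */
static struct toy_object *toy_lookup(const char *id)
{
	int i;

	for (i = 0; i < toy_nr; i++)
		if (!strcmp(toy_table[i].id, id))
			return &toy_table[i];

	assert(toy_nr < TOY_MAX); /* the toy table has a fixed size */
	/* static storage is zeroed, so the copy stays NUL-terminated */
	strncpy(toy_table[toy_nr].id, id, sizeof(toy_table[toy_nr].id) - 1);
	toy_table[toy_nr].type = TOY_NONE;
	return &toy_table[toy_nr++];
}

/*
 * Hand out a struct for "id" as the requested type. An unseen object
 * simply gets the type assigned; one already seen as another type
 * yields NULL.
 */
static struct toy_object *toy_lookup_as(const char *id, enum toy_type type)
{
	struct toy_object *obj = toy_lookup(id);

	if (obj->type == TOY_NONE)
		obj->type = type;
	return obj->type == type ? obj : NULL;
}

struct toy_commit {
	struct toy_object *tree;
};

/*
 * Returns 0 on success, -1 if the tree id cannot be a tree. Failing
 * here means callers that checked the parse result never see a NULL
 * tree pointer.
 */
static int toy_parse_commit(struct toy_commit *c, const char *tree_id)
{
	c->tree = toy_lookup_as(tree_id, TOY_TREE);
	return c->tree ? 0 : -1;
}
```

With that shape, a caller that already bails out on a failed parse is
covered without any extra NULL check of its own. It does nothing for
callers that ignore the parse result, of course.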

