On 01/01/2017 06:59 AM, Jeff King wrote:
> On Sat, Dec 31, 2016 at 06:30:01PM -0800, Junio C Hamano wrote:
>
>> Michael Haggerty <[email protected]> writes:
>>
>>> "refname" has already been checked by check_refname_format(), so it
>>> cannot have consecutive slashes.
>>
>> In the endgame state, this has two callers. Both use what came in
>> the transaction->updates[] array. Presumably "has already been
>> checked by check_refname_format()" says that whoever created entries
>> in that array must have called the function, but it would be helpful
>> to be more explicit here.
>
> Hmm, yeah. This is called when we are deleting a ref, and I thought we
> explicitly _didn't_ do check_refname_format() when deleting, so that
> funny-named refs could be deleted. It's only is_refname_safe() that we
> must pass.
>
> So I have no idea if that's a problem in the code or not, but it is at
> least not immediately obvious who is responsible for calling
> check_refname_format() here.
My assumption was that only valid reference names should ever be allowed
to be inserted into a `ref_transaction` entry. But Peff is right that
sometimes the reference name is checked by `refname_is_safe()` rather
than `check_refname_format()`. Let's audit this more carefully...
* `ref_transaction_add_update()` relies on its caller doing the check
(this fact is documented). Its callers are:
* `ref_transaction_update()` (the usual codepath), which checks the
reference itself using either check_refname_format() or
refname_is_safe() depending on what kind of update it is.
* `split_head_update()` passes the literal string "HEAD".
* `split_symref_update()` picks apart reference updates that go
through existing symbolic references. As such I don't think they
are an attack surface. It doesn't do any checking itself (here
we're talking about its `referent` argument). It has only one
caller:
* `lock_ref_for_update()`, which gets `referent` from:
* `files_read_raw_ref()`, which gets the value either:
* by reading a filesystem-level symlink's contents and
checking it with `check_refname_format()`, or
* reading a symref from the filesystem. In this case, *the
value is not checked*.
Obviously this chain of custody is disconcertingly long and complicated.
And the gap for symrefs should probably be fixed, even though they are
hopefully trustworthy.
`refname_is_safe()` checks that its argument is either UPPER_CASE or
looks like a plausible filename under "refs/". Contrary to its
docstring, it *does not* accept strings that contain successive slashes
or "." or ".." components. It was made stricter in
e40f355 "refname_is_safe(): insist that the refname already be
normalized", 2016-04-27
, but the docstring wasn't updated at that time. I'll fix it.
I think the best thing to do is to drop this patch from the patch
series, and address these checks in a separate series. (There are more
known problems in this area, for example that the checks in
`check_refname_format()` are not a strict superset of the checks in
`refname_is_safe()`.)
Michael