DxeImageVerificationLib:Enhanced verification of Offset

Laszlo Ersek Wed, 19 Aug 2020 02:33:54 -0700

On 08/18/20 19:29, Bret Barkelew wrote:
> Jiewen,
> 
> I don’t completely agree with your second point. If the underlying issue is 
> already out of embargo, and we’ve just failed to fix it, that’s not a new 
> issue. I would want to see fixes to the fixes fast-tracked (or at least 
> heavily prioritized), rather than re-entering a full embargo period.


I had the same thought at first -- in the end I guess "it depends". If
we realize that the patch failed to fix the original issue (or a less
obvious issue of the same issue), then I agree the "cat is out of the
bag"; the patch already exposed the vulnerability, so waiting more would
be counter-productive.

OTOH if the fix ends up helping humans recognize a *different*
vulnerability, or even *introduces* a new vulnerability, then that does
seem to deserve an embargo.

I think it depends on the extent of the information that the first patch
exposes. Unfortunately, that "extent" may be vague.

Laszlo

> From: Yao, Jiewen via groups.io<mailto:jiewen.yao=intel....@groups.io>
> Sent: Tuesday, August 18, 2020 6:12 AM
> To: devel@edk2.groups.io<mailto:devel@edk2.groups.io>; 
> ler...@redhat.com<mailto:ler...@redhat.com>
> Cc: xiewen...@huawei.com<mailto:xiewen...@huawei.com>; Wang, Jian 
> J<mailto:jian.j.w...@intel.com>; 
> huangmin...@huawei.com<mailto:huangmin...@huawei.com>; 
> songdongku...@huawei.com<mailto:songdongku...@huawei.com>; Marvin 
> Häuser<mailto:mhaeu...@outlook.de>; Vitaly 
> Cheptsov<mailto:vit9...@protonmail.com>
> Subject: [EXTERNAL] Re: [edk2-devel] [PATCH EDK2 v2 1/1] 
> SecurityPkg/DxeImageVerificationLib:Enhanced verification of Offset
> 
> Hi Laszlo
> I agree with on your most points.
> It is all about *return of investment* or *risk control*. Like we cannot 
> pursue 100% security because we will bankrupt ourselves if so.
> 
> Here I just raise my concern.
> 1) If we alway allow developers’ low quality patch without test, the overall 
> quality will become lower and lower. Personally I have no confidence to help 
> catch all those issues. You are the role model on code review. But not all 
> people review the code like you. We need both expertise and time for code 
> review.
> 
> 2) I purposely separate security fix from non security one, because the 
> process is different. The embargo might be 6 months. What if we found the 
> security patch does not fix the problem after embargo? Unlike we send one 
> more patch tomorrow, we need wait for another 6 months...
> 
> thank you!
> Yao, Jiewen
> 
> 
>> 在 2020年8月18日，下午6:17，Laszlo Ersek <ler...@redhat.com> 写道：
>>
>> Hi Jiewen,
>>
>> (+Marvin, +Vitaly)
>>
>> On 08/18/20 01:23, Yao, Jiewen wrote:
>>>> -----Original Message-----
>>>> From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Laszlo
>>>> Ersek
>>>> Sent: Tuesday, August 18, 2020 12:53 AM
>>>> To: Yao, Jiewen <jiewen....@intel.com>; devel@edk2.groups.io;
>>>> xiewen...@huawei.com; Wang, Jian J <jian.j.w...@intel.com>
>>>> Cc: huangmin...@huawei.com; songdongku...@huawei.com
>>>> Subject: Re: [edk2-devel] [PATCH EDK2 v2 1/1]
>>>> SecurityPkg/DxeImageVerificationLib:Enhanced verification of Offset
>>
>> [...]
>>
>>> However, I do think the producer is mandatory for a fix or at least a
>>> security fix.
>>> The owner to fix the issue should guarantee the patch is good.
>>> The owner shall never rely on the code reviewer to figure out if the
>>> patch is good and complete.
>>>
>>> I have some bad experience that bug owner just wrote a patch and tried
>>> to fix a problem, without any test.
>>> And it happened passed code review from someone who does not well
>>> understand the problem, but give rb based upon the time pressure.
>>> Later, the fix was approved to be useless.
>>>
>>> In my memory, at least 3 cases were security fix. They are found, just
>>> because they are sensitive, more people took a look later.
>>>    It was simple. It was one-line change.
>>>   But it has not test, and it was wrong.
>>> "It was ridiculous" -- commented by the people who find the so-called
>>> security fix does not fix the issue.
>>
>> Just because sloppy/rushed reviews exist, and just because reviewers
>> operate under time pressure, we should not automatically reject security
>> fixes that come without a reproducer.
>>
>> Some organizations do develop reproducers, but they never share them
>> publicly (for fear of abuse by others).
>>
>> But more importantly, in an open development project, a developer could
>> have time and expertise to contribute a fix, but not to create a
>> reproducer.
>>
>> - If we make contributing harder, fewer people will upstream their
>>  fixes.
>>
>> - If we make contributing harder, then contributions that do make it to
>>  the tree will be of higher quality.
>>
>> Both statements ring true to me -- so it's a tradeoff.
>>
>> (By "we", I mean the edk2 community.)
>>
>>>> Additionally, the exact statement that the bug report does make,
>>>> namely
>>>>
>>>>  it's possible to overflow Offset back to 0 causing an endless loop
>>>>
>>>> is wrong (as far as I can tell anyway). It is not "OffSet" that can
>>>> be overflowed to zero, but the *addend* that is added to OffSet can
>>>> be overflowed to zero. Therefore the infinite loop will arise because
>>>> OffSet remains stuck at its present value, and not because OffSet
>>>> will be re-set to zero.
>>>>
>>>> For the reasons, we can only speculate as to what the actual problem
>>>> is, unless Jian decides to join the discussion and clarifies what he
>>>> had in mind originally.
>>>
>>> [Jiewen] Would you please clarify what do you mean "we" here?
>>> If "we" means the bug dispatcher, it is totally OK. The dispatcher
>>> just assign the bug.
>>> If "we" means the developer assigned to fix the bug, it is NOT OK. The
>>> developer should take the responsibility to understand the problem.
>>
>> By "we", I mean the edk2 community.
>>
>>>> We can write a patch based on code analysis. It's possible to
>>>> identify integer overflows based on code analysis, and it's possible
>>>> to verify the correctness of fixes by code review. Obviously testing
>>>> is always good, but many times, constructing reproducers for such
>>>> issues that were found by code review, is difficult and time
>>>> consuming. We can say that we don't fix vulnerabilities without
>>>> reproducers, or we can say that we make an effort to fix them even if
>>>> all we have is code analysis (and not a reproducer).
>>>
>>> [Jiewen] I would say: yes and no.
>>> Yes, I agree with you that it might be difficult and time consuming to
>>> construct the reproducer.
>>> However, "obviously" is a subject term. Someone may think something is
>>> obvious, but other people does not.
>>> We should be clear the responsibility of the patch provider is to
>>> provide high quality patch.
>>> Having basic unit test is the best way to prove that the fix is good.
>>>
>>> I have seen bad cases when I ask for the test for patch, then the
>>> answer I got is: "I test the windows boot".
>>> But the test - windows boot - has nothing related to the patch. It
>>> only proves no regression, but cannot prove the issue described is
>>> resolved.
>>
>> Right. It would be ideal if every patch came with a unit test. But that
>> also means some folks will contribute less.
>>
>> Consider normal (not security) patches. We require that all function
>> return values be checked (unless it really doesn't matter if a function
>> call fails). If a function call fails, we need to roll back the actions
>> taken thus far. Release resources and so on. This is why we have the
>> "cascade of error handling labels" pattern.
>>
>> But, of course, we don't test every possible error path in the code. So
>> what's the solution there:
>>
>> - reject such patches that carefully construct the error paths, but do
>>  not provide unit tests with complete error path coverage?
>>
>> - say that we don't care about thorough error paths, so let's just hang,
>>  or leak resources, whenever something fails?
>>
>> Personally I prefer the detailed error paths. They need to be written
>> and reviewed carefully. And they can be accepted even if they are not
>> tested with complete coverage.
>>
>> Some people think otherwise; they say no untested (untestable) code
>> should ever be merged.
>>
>> Back to security patches -- creating reproducers usually requires a
>> setup (tools, expertise, time allocation etc) that is different from a
>> "normal" setup. It may require specialized binary format editors,
>> expertise in "penetration testing", and so on.
>>
>> I mostly know the C language rules that pertain to integer and buffer
>> overflows, so I can usually spot their violations in C code, and propose
>> fixes for them too. But I'm not a security researcher, so I don't write
>> exploits as a norm -- I don't even investigate, generally speaking, the
>> potential practical impact of "undefined behavior". When there's a
>> buffer overflow or integer overflow in the code, that's the *end* of the
>> story for me, while it's the *start* of the work for a security
>> researcher.
>>
>> When you require reproducers for all security patches, you restrict the
>> potential contributor pool to security researchers.
>>
>>> Let's think again in this case, if the patch provider does some basic
>>> unit test, he/she may find out the problem by himself/herself.
>>> That can save other people's time to review.
>>>
>>> I don't prefer to move the responsibility from patch provider to the
>>> code reviewer to check if the fix is good.
>>> Otherwise, the code reviewer may be overwhelmed.
>>>
>>> We may clarify and document the role and responsibility in EDKII
>>> clearly. Once that is ready, we can follow the rule.
>>> Before that is ready, in this particular case, I still prefer we have
>>> producer to prove the patch is good.
>>
>> OK, thanks for explaining.
>>
>> Given that I'm unable to create such a PE file (from scratch or by
>> modifying another one), I won't post the patches stand-alone.
>>
>>>> So the above paragraph concerns "correctness". Regarding
>>>> "completeness", I guarantee you that this patch does not fix *all*
>>>> problems related to PE parsing. (See the other BZ tickets.) It does
>>>> fix *one* issue with PE parsing. We can say that we try to fix such
>>>> issues gradually (give different CVE numbers to different issues, and
>>>> address them one at a time), or we can say that we rewrite PE parsing
>>>> from the ground up. (BTW: I have seriously attempted that in the
>>>> past, and I gave up, because the PE format is FUBAR.)
>>>
>>> [Jiewen] Maybe there is misunderstanding.
>>> I do not mean to let patch provider to fix all issue in PE parsing.
>>> Just like we cannot file one Bugzilla to fix all issue in EDKII - it
>>> is unfair.
>>>
>>> What I mean is that the patch provider should guarantee the
>>> correctness and completeness of the issue in the bug report.
>>>
>>> One faked bad example of correctness is:
>>>    A bug report is file to say: the code has overflow class A.
>>>    The factor is: the code has overflow class A at line X and line Y.
>>>    The patch only modified some code at line X, but the overflow
>>>    class A at line X still exists.
>>>
>>> One faked bad example of completeness is:
>>>    A bug report is file to say: the code has overflow class A.
>>>    The factor is: the code has overflow class A at line X and line Y.
>>>    The patch only fixed the overflow class A at line X but not line
>>>    Y.
>>>
>>> The patch provider should take responsibility to do that work
>>> seriously to find out issue in line X and line Y and fix them.
>>> He/she may choose to just fix line X and line Y. Rewrite is whole
>>> module is NOT required.
>>
>> I agree completely.
>>
>> My point was that we need the bug report to be precise, in the first
>> place. If the bug report doesn't clearly identify lines X and Y, we will
>> likely not get the completeness part right.
>>
>> "Clearly identify" may mean spelling out lines X and Y specifically. Or
>> it may mean defining "class A" sufficiently clearly that someone else
>> reading the affected function can find X and Y themselves.
>>
>>> If I can give some comment, I would think about the provide the fix in
>>> BasePeCoffLib.
>>
>> From a software design perspective, you are 100% right.
>>
>> Unfortunately, I can't do it.
>>
>> That's what I mentioned before -- I had tried rewriting BasePeCoffLib,
>> because in my opinion, BasePeCoffLib is unsalvageable in its current
>> form. And I gave up on the rewrite.
>>
>> Please see the following email. I sent it to some people off-list, on
>> 2020-Feb-14:
>>
>>> There are currently four (4) TianoCore security BZs (1957, 1990, 1993,
>>> 2215), embargoed, that describe various ways in which cunningly
>>> crafted PE images can evade Secure Boot verification.
>>>
>>> [...]
>>>
>>> Primarily, I just couldn't find my peace with the idea that fixing
>>> such PE/COFF parsing mistakes (integer overflows, buffer overflows)
>>> *must* be a one-by-one fixing game. I wanted an approach that would
>>> fix these *classes* of vulnerabilities, in PE/COFF parsing.
>>>
>>> So, beginnning of this February I returned to this topic, and spent
>>> two days on prototyping / researching a container / interval based
>>> approach. Here's one of the commit messages, as a way of explaining:
>>>
>>>    OvmfPkg/DxePeCoffValidatorLib: introduce CONTAINER type and helper funcs
>>>
>>>    For validating the well-formedness of a PE/COFF file, introduce the
>>>    CONTAINER type, and some workhorse functions. (The functions added in 
>>> this
>>>    patch will not be called directly from the code that will process PE/COFF
>>>    structures.)
>>>
>>>    The CONTAINER type describes a contiguous non-empty interval in a PE/COFF
>>>    file (on-disk representation, or in-memory representation). Containers 
>>> can
>>>    be nested. The data from scalar-sized containers can be read out, as part
>>>    of their creation. For on-disk representations of PE/COFF files, scalar
>>>    reads are permitted; for in-memory representations, no data access is
>>>    permitted (only CONTAINER tracking / nesting).
>>>
>>>    The goals of CONTAINER are the following:
>>>
>>>    - enforce the proper nesting of PE/COFF structures (structure boundaries
>>>      must not be crossed by runs of data);
>>>
>>>    - prevent integer overflows and buffer overflows;
>>>
>>>    - prevent zero-size structures;
>>>
>>>    - prevent infinite nesting by requiring proper sub-intervals;
>>>
>>>    - prevent internal PE/COFF pointers from aliasing each other (unless they
>>>      point at container and containee structures);
>>>
>>>    - terminate nesting at scalar-sized containers;
>>>
>>>    - assuming an array of pointers is processed in increasing element order,
>>>      enforce that the pointed-to objects are located at increasing offsets
>>>      too;
>>>
>>>    - assign human-readable names to PE/COFF structures and fields, for
>>>      debugging PE/COFF malformations.
>>>
>>> Because, several of the vulnerabilities exploited cross-directed and
>>> aliased internal pointers in PE/COFF files.
>>>
>>> Two days of delirious spec reading and coding later, and 2000+ lines
>>> later, I decided that my idea was unviable. The PE/COFF spec was so
>>> incredibly mis-designed and crufty that enforcing the *internal*
>>> consistency of all the size fields and the internal pointers would
>>> inevitably fall into one of the following categories:
>>>
>>> - the checks wouldn't be strict enough, and some nasty images would
>>>  slip through,
>>>
>>> - the checks would be too strict, and some quirky, but valid, images
>>>  would be unjustifiedly caught.
>>>
>>> So I gave up and I've accepted that it remains a whack-a-mole game.
>>> [...]
>>>
>>> (NB: I don't claim that ELF is not similarly brain-damaged.)
>>
>> So now, I've only considered contributing patches for bug#2215 because
>> the code in question resides in DxeImageVerificationLib, and *not* in
>> BasePeCoffLib. I'm not going to touch BasePeCoffLib -- in my opinion,
>> BasePeCoffLib is unfixable without a complete rewrite.
>>
>> I would *like* if BasePeCoffLib were fixable incrementally, but I just
>> don't see how that's possible.
>>
>> In support of my opinion, please open the following bugzilla ticket:
>>
>>  
>> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.tianocore.org%2Fshow_bug.cgi%3Fid%3D2643&amp;data=02%7C01%7Cbret.barkelew%40microsoft.com%7Cada52c41a3db41968b7408d843785e58%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637333531565237004&amp;sdata=C59ZmrHEONxeui4lkIo1RFTTLUOEuU8OQXUbWoS9JWc%3D&amp;reserved=0
>>
>> and search the comments (with the browser's in-page search feature, such
>> as Ctrl+F) for the following expression:
>>
>>  new PE loader
>>
>> I understand exactly what Vitaly and Marvin meant in those comments. :(
>>
>> Thanks,
>> Laszlo
>>
>>
>>
>>
> 
> 
> 


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#64422): https://edk2.groups.io/g/devel/message/64422
Mute This Topic: https://groups.io/mt/76165658/21656
Group Owner: devel+ow...@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [PATCH EDK2 v2 1/1] SecurityPkg/DxeImageVerificationLib:Enhanced verification of Offset

Reply via email to