Good day everyone,

First off, for your information, I'm sending from my new e-mail address from now on.

Please excuse me, I cannot read your entire thread right now, I will definitely make sure to catch up as soon as time permits, but I just wanted to confirm we are indeed working on a reimplementation of the PE loader. It involves correcting several security issues (which will be detailed as the patches are sent as anything else would be too much work for us right now), reducing code duplication (how often is the hashing algorithm duplicated across edk2? :) ) and a more or less experimental approach to formal verification. We plan to submit it this year, however please note that this is a low priority project and is not being worked on on a full-time basis.

Please let us know about your own plans so we do not end up duplicating work.

Best regards,
Marvin

Am 18.08.2020 um 12:17 schrieb Laszlo Ersek:
Hi Jiewen,

(+Marvin, +Vitaly)

On 08/18/20 01:23, Yao, Jiewen wrote:
-----Original Message-----
From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Laszlo
Ersek
Sent: Tuesday, August 18, 2020 12:53 AM
To: Yao, Jiewen <jiewen....@intel.com>; devel@edk2.groups.io;
xiewen...@huawei.com; Wang, Jian J <jian.j.w...@intel.com>
Cc: huangmin...@huawei.com; songdongku...@huawei.com
Subject: Re: [edk2-devel] [PATCH EDK2 v2 1/1]
SecurityPkg/DxeImageVerificationLib:Enhanced verification of Offset

[...]

However, I do think the producer is mandatory for a fix or at least a
security fix.
The owner to fix the issue should guarantee the patch is good.
The owner shall never rely on the code reviewer to figure out if the
patch is good and complete.

I have some bad experience that bug owner just wrote a patch and tried
to fix a problem, without any test.
And it happened passed code review from someone who does not well
understand the problem, but give rb based upon the time pressure.
Later, the fix was approved to be useless.

In my memory, at least 3 cases were security fix. They are found, just
because they are sensitive, more people took a look later.
     It was simple. It was one-line change.
    But it has not test, and it was wrong.
"It was ridiculous" -- commented by the people who find the so-called
security fix does not fix the issue.

Just because sloppy/rushed reviews exist, and just because reviewers
operate under time pressure, we should not automatically reject security
fixes that come without a reproducer.

Some organizations do develop reproducers, but they never share them
publicly (for fear of abuse by others).

But more importantly, in an open development project, a developer could
have time and expertise to contribute a fix, but not to create a
reproducer.

- If we make contributing harder, fewer people will upstream their
   fixes.

- If we make contributing harder, then contributions that do make it to
   the tree will be of higher quality.

Both statements ring true to me -- so it's a tradeoff.

(By "we", I mean the edk2 community.)

Additionally, the exact statement that the bug report does make,
namely

   it's possible to overflow Offset back to 0 causing an endless loop

is wrong (as far as I can tell anyway). It is not "OffSet" that can
be overflowed to zero, but the *addend* that is added to OffSet can
be overflowed to zero. Therefore the infinite loop will arise because
OffSet remains stuck at its present value, and not because OffSet
will be re-set to zero.

For the reasons, we can only speculate as to what the actual problem
is, unless Jian decides to join the discussion and clarifies what he
had in mind originally.

[Jiewen] Would you please clarify what do you mean "we" here?
If "we" means the bug dispatcher, it is totally OK. The dispatcher
just assign the bug.
If "we" means the developer assigned to fix the bug, it is NOT OK. The
developer should take the responsibility to understand the problem.

By "we", I mean the edk2 community.

We can write a patch based on code analysis. It's possible to
identify integer overflows based on code analysis, and it's possible
to verify the correctness of fixes by code review. Obviously testing
is always good, but many times, constructing reproducers for such
issues that were found by code review, is difficult and time
consuming. We can say that we don't fix vulnerabilities without
reproducers, or we can say that we make an effort to fix them even if
all we have is code analysis (and not a reproducer).

[Jiewen] I would say: yes and no.
Yes, I agree with you that it might be difficult and time consuming to
construct the reproducer.
However, "obviously" is a subject term. Someone may think something is
obvious, but other people does not.
We should be clear the responsibility of the patch provider is to
provide high quality patch.
Having basic unit test is the best way to prove that the fix is good.

I have seen bad cases when I ask for the test for patch, then the
answer I got is: "I test the windows boot".
But the test - windows boot - has nothing related to the patch. It
only proves no regression, but cannot prove the issue described is
resolved.

Right. It would be ideal if every patch came with a unit test. But that
also means some folks will contribute less.

Consider normal (not security) patches. We require that all function
return values be checked (unless it really doesn't matter if a function
call fails). If a function call fails, we need to roll back the actions
taken thus far. Release resources and so on. This is why we have the
"cascade of error handling labels" pattern.

But, of course, we don't test every possible error path in the code. So
what's the solution there:

- reject such patches that carefully construct the error paths, but do
   not provide unit tests with complete error path coverage?

- say that we don't care about thorough error paths, so let's just hang,
   or leak resources, whenever something fails?

Personally I prefer the detailed error paths. They need to be written
and reviewed carefully. And they can be accepted even if they are not
tested with complete coverage.

Some people think otherwise; they say no untested (untestable) code
should ever be merged.

Back to security patches -- creating reproducers usually requires a
setup (tools, expertise, time allocation etc) that is different from a
"normal" setup. It may require specialized binary format editors,
expertise in "penetration testing", and so on.

I mostly know the C language rules that pertain to integer and buffer
overflows, so I can usually spot their violations in C code, and propose
fixes for them too. But I'm not a security researcher, so I don't write
exploits as a norm -- I don't even investigate, generally speaking, the
potential practical impact of "undefined behavior". When there's a
buffer overflow or integer overflow in the code, that's the *end* of the
story for me, while it's the *start* of the work for a security
researcher.

When you require reproducers for all security patches, you restrict the
potential contributor pool to security researchers.

Let's think again in this case, if the patch provider does some basic
unit test, he/she may find out the problem by himself/herself.
That can save other people's time to review.

I don't prefer to move the responsibility from patch provider to the
code reviewer to check if the fix is good.
Otherwise, the code reviewer may be overwhelmed.

We may clarify and document the role and responsibility in EDKII
clearly. Once that is ready, we can follow the rule.
Before that is ready, in this particular case, I still prefer we have
producer to prove the patch is good.

OK, thanks for explaining.

Given that I'm unable to create such a PE file (from scratch or by
modifying another one), I won't post the patches stand-alone.

So the above paragraph concerns "correctness". Regarding
"completeness", I guarantee you that this patch does not fix *all*
problems related to PE parsing. (See the other BZ tickets.) It does
fix *one* issue with PE parsing. We can say that we try to fix such
issues gradually (give different CVE numbers to different issues, and
address them one at a time), or we can say that we rewrite PE parsing
from the ground up. (BTW: I have seriously attempted that in the
past, and I gave up, because the PE format is FUBAR.)

[Jiewen] Maybe there is misunderstanding.
I do not mean to let patch provider to fix all issue in PE parsing.
Just like we cannot file one Bugzilla to fix all issue in EDKII - it
is unfair.

What I mean is that the patch provider should guarantee the
correctness and completeness of the issue in the bug report.

One faked bad example of correctness is:
     A bug report is file to say: the code has overflow class A.
     The factor is: the code has overflow class A at line X and line Y.
     The patch only modified some code at line X, but the overflow
     class A at line X still exists.

One faked bad example of completeness is:
     A bug report is file to say: the code has overflow class A.
     The factor is: the code has overflow class A at line X and line Y.
     The patch only fixed the overflow class A at line X but not line
     Y.

The patch provider should take responsibility to do that work
seriously to find out issue in line X and line Y and fix them.
He/she may choose to just fix line X and line Y. Rewrite is whole
module is NOT required.

I agree completely.

My point was that we need the bug report to be precise, in the first
place. If the bug report doesn't clearly identify lines X and Y, we will
likely not get the completeness part right.

"Clearly identify" may mean spelling out lines X and Y specifically. Or
it may mean defining "class A" sufficiently clearly that someone else
reading the affected function can find X and Y themselves.

If I can give some comment, I would think about the provide the fix in
BasePeCoffLib.

 From a software design perspective, you are 100% right.

Unfortunately, I can't do it.

That's what I mentioned before -- I had tried rewriting BasePeCoffLib,
because in my opinion, BasePeCoffLib is unsalvageable in its current
form. And I gave up on the rewrite.

Please see the following email. I sent it to some people off-list, on
2020-Feb-14:

There are currently four (4) TianoCore security BZs (1957, 1990, 1993,
2215), embargoed, that describe various ways in which cunningly
crafted PE images can evade Secure Boot verification.

[...]

Primarily, I just couldn't find my peace with the idea that fixing
such PE/COFF parsing mistakes (integer overflows, buffer overflows)
*must* be a one-by-one fixing game. I wanted an approach that would
fix these *classes* of vulnerabilities, in PE/COFF parsing.

So, beginnning of this February I returned to this topic, and spent
two days on prototyping / researching a container / interval based
approach. Here's one of the commit messages, as a way of explaining:

     OvmfPkg/DxePeCoffValidatorLib: introduce CONTAINER type and helper funcs

     For validating the well-formedness of a PE/COFF file, introduce the
     CONTAINER type, and some workhorse functions. (The functions added in this
     patch will not be called directly from the code that will process PE/COFF
     structures.)

     The CONTAINER type describes a contiguous non-empty interval in a PE/COFF
     file (on-disk representation, or in-memory representation). Containers can
     be nested. The data from scalar-sized containers can be read out, as part
     of their creation. For on-disk representations of PE/COFF files, scalar
     reads are permitted; for in-memory representations, no data access is
     permitted (only CONTAINER tracking / nesting).

     The goals of CONTAINER are the following:

     - enforce the proper nesting of PE/COFF structures (structure boundaries
       must not be crossed by runs of data);

     - prevent integer overflows and buffer overflows;

     - prevent zero-size structures;

     - prevent infinite nesting by requiring proper sub-intervals;

     - prevent internal PE/COFF pointers from aliasing each other (unless they
       point at container and containee structures);

     - terminate nesting at scalar-sized containers;

     - assuming an array of pointers is processed in increasing element order,
       enforce that the pointed-to objects are located at increasing offsets
       too;

     - assign human-readable names to PE/COFF structures and fields, for
       debugging PE/COFF malformations.

Because, several of the vulnerabilities exploited cross-directed and
aliased internal pointers in PE/COFF files.

Two days of delirious spec reading and coding later, and 2000+ lines
later, I decided that my idea was unviable. The PE/COFF spec was so
incredibly mis-designed and crufty that enforcing the *internal*
consistency of all the size fields and the internal pointers would
inevitably fall into one of the following categories:

- the checks wouldn't be strict enough, and some nasty images would
   slip through,

- the checks would be too strict, and some quirky, but valid, images
   would be unjustifiedly caught.

So I gave up and I've accepted that it remains a whack-a-mole game.
[...]

(NB: I don't claim that ELF is not similarly brain-damaged.)

So now, I've only considered contributing patches for bug#2215 because
the code in question resides in DxeImageVerificationLib, and *not* in
BasePeCoffLib. I'm not going to touch BasePeCoffLib -- in my opinion,
BasePeCoffLib is unfixable without a complete rewrite.

I would *like* if BasePeCoffLib were fixable incrementally, but I just
don't see how that's possible.

In support of my opinion, please open the following bugzilla ticket:

   https://bugzilla.tianocore.org/show_bug.cgi?id=2643

and search the comments (with the browser's in-page search feature, such
as Ctrl+F) for the following expression:

   new PE loader

I understand exactly what Vitaly and Marvin meant in those comments. :(

Thanks,
Laszlo


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#64372): https://edk2.groups.io/g/devel/message/64372
Mute This Topic: https://groups.io/mt/76165658/21656
Group Owner: devel+ow...@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to