It seems likely then that verification only works on the subset of expressiveness that can be cowritten in the free and the constrained language. i.e. we aren't verifying c at all.
On Mon, Jul 7, 2014 at 2:17 PM, Thomas Dullien <thomasdull...@google.com> wrote: > My 2c, but my view at the moment is that none of our static analyzers / > theorem provers can > be usefully applied to legacy code. > > The successes of formal verification are usually when the checkers & the > code are co-developed; > e.g. the code is checked on each build while it is written. > > I haven't had time to look at the LZ bug in detail yet, realistically I > may get around to it in August; > once I get around I'll write down how hard I think it would be to detect > this with an automated checker. > > Cheers, > Halvar > > > On Mon, Jul 7, 2014 at 2:04 PM, Meredith L. Patterson <clonea...@gmail.com > > wrote: > >> No promises, but the samples look like ones that Escher ( >> http://www.eschertech.com/index.php) could find. I'm evaluating Escher >> for verifying parts of Hammer (since there are sound principles for what >> "correct" means in regard to the parsing algorithms it implements), and can >> tell you more after I've had some more experience with it. >> >> Cheers, >> --mlp >> >> >> On Mon, Jul 7, 2014 at 1:56 PM, Dan Kaminsky <d...@doxpara.com> wrote: >> >>> Do we have a theorem prover / static analyzer that finds this bug? >>> >>> We sure have a lot that miss it. >>> >>> >>> On Monday, July 7, 2014, <d...@geer.org> wrote: >>> >>>> >>>> This is a verbatim transcript of what appeared on Peter Neumann's >>>> RISKS Digest; note Henry Baker's leadoff comment. >>>> >>>> -----------------8<------------cut-here------------8<----------------- >>>> >>>> Date: Fri, 27 Jun 2014 06:11:41 -0700 >>>> From: Henry Baker <hbak...@pipeline.com> >>>> Subject: Buffer overflows in 20-year-old LZ decompression code >>>> (Don A. Bailey) >>>> >>>> FYI -- It's time to start using theorem provers on *all* code; if you >>>> can't >>>> convince the theorem prover re buffer overflows, you'll have to insert >>>> executable code to explicitly check. HB >>>> >>>> Thursday, June 26, 2014 >>>> Raising Lazarus - The 20 Year Old Bug that Went to Mars >>>> >>>> http://blog.securitymouse.com/2014/06/raising-lazarus-20-year-old-bug-that.html >>>> >>>> It's rare that you come across a bug so subtle that it can last for two >>>> decades. But, that's exactly what has happened with the >>>> Lempel-Ziv-Oberhumer (LZO) algorithm. Initially written in 1994, Markus >>>> Oberhumer designed a sophisticated and extremely efficient compression >>>> algorithm so elegant and well architected that it outperforms zlib and >>>> bzip >>>> by four or five times their decompression speed. >>>> >>>> As a result, Markus has made a successful and well deserved career out >>>> of >>>> optimizing code for various platforms. I was impressed to find out >>>> that his >>>> LZO algorithm has gone to the planet Mars on NASA devices multiple >>>> times! >>>> Most recently, LZO has touched down on the red planet within the Mars >>>> Curiosity Rover, which just celebrated its first martian anniversary on >>>> Tuesday. >>>> >>>> Because of the speed and efficiency of the algorithm, LZO has made its >>>> way >>>> into both proprietary and open source projects world-wide. It's has >>>> lived >>>> in automotive systems, airplanes, and other embedded systems for over a >>>> decade. The algorithm has even made its way into projects we use on a >>>> daily >>>> basis, such as OpenVPN, MPlayer2, Libav, FFmpeg, the Linux kernel, >>>> Juniper >>>> Junos, and much, much, more. >>>> >>>> In the past few years, LZO has gained traction in file systems as well. >>>> LZO >>>> can be used in the Linux kernel within btrfs, squashfs, jffs2, and >>>> ubifs. A >>>> recent variant of the algorithm, LZ4, is used for compression in ZFS for >>>> Solaris, Illumos, and FreeBSD. >>>> >>>> LZO is even enabled in kernels for Samsung Android devices to increase >>>> kernel loading speed and improve the user experience, as noted in the >>>> Android Hacker's Handbook. >>>> >>>> With its popularity increasing, Lempel-Ziv-Oberhumer has been rewritten >>>> by >>>> many engineering firms for both closed and open systems. These >>>> rewrites, >>>> however, have always been based on Oberhumer's core open source >>>> implementation. As a result, they all inherited a subtle integer >>>> overflow. >>>> Even LZ4 has the same exact bug, but changed very slightly. >>>> >>>> Engineered Genetics >>>> >>>> Code reuse is a normal part of engineering, and is something we do every >>>> day. But, it can be dangerous. By reusing code that is known to work >>>> well, >>>> especially in highly optimized algorithms, projects can become subject >>>> to >>>> vulnerabilities in what is perceived as trusted code. Auditing highly >>>> optimized algorithms is a fragile endeavor. It is very easy to break >>>> these >>>> types of algorithms. Therefore, reused code that is highly specialized >>>> is >>>> often presumed safe because of its age, its proven efficiency, and its >>>> fragility. >>>> >>>> This creates a sort of digital DNA, a digital genetic footprint that >>>> can be >>>> traced over time. Though there are certainly many instances of >>>> proprietary >>>> variants of LZO and LZ4, the following six implementations are >>>> available in >>>> open source software >>>> >>>> Oberhumer LZO (core/reference open source implementation) >>>> Linux kernel's LZO implementation >>>> Libav's LZO implementation >>>> FFmpeg's LZO implementation >>>> Linux kernel's LZ4 implementation >>>> LZ4 core/reference implementation >>>> >>>> Despite each implementation of the algorithm being noticeably different, >>>> each variant is vulnerable in the exact same way. Let's take a look at >>>> a >>>> version of the algorithm that is easy to read online, the Linux kernel >>>> implementation found here. >>>> >>>> In all variants of LZ[O4], the vulnerability occurs when processing a >>>> Literal Run. This is a chunk of compressed data that isn't compressed >>>> at >>>> all. Literals are uncompressed bytes that the user decided, for >>>> whatever >>>> reason, should not be compressed. A Literal Run is signaled by a state >>>> machine in LZO, and by a Mask in LZ4. >>>> >>>> 56 if (likely(state == 0)) { >>>> 57 if (unlikely(t == 0)) { >>>> 58 while (unlikely(*ip == 0)) { >>>> 59 t += 255; >>>> 60 ip++; >>>> 61 NEED_IP(1); >>>> 62 } >>>> 63 t += 15 + *ip++; >>>> 64 } >>>> 65 t += 3; >>>> >>>> In the above sample, the integer overflow is evident. The variable 't' >>>> is >>>> incremented by 255 every time the compression payload contains a nil >>>> byte >>>> (0x00) when a Literal Run is detected. Regardless of whether the >>>> variable >>>> 't' is signed or unsigned, 255 will be added to it. The only check is >>>> to >>>> ensure that the input buffer contains another byte. This means that >>>> 't' can >>>> accumulate until it is a very large unsigned integer. If 't' is a 32bit >>>> integer, it only takes approximately sixteen (16) megabytes of zeroes to >>>> generate a sufficiently large value for 't'. Though 't' can overflow >>>> here, >>>> this is not where the attack occurs. There is another more important >>>> overflow just below this chunk of code. >>>> >>>> 66 copy_literal_run: >>>> 67 #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) >>>> 68 if (likely(HAVE_IP(t + 15) && >>>> HAVE_OP(t + 15))) { >>>> 69 const unsigned char *ie = ip >>>> + t; >>>> 70 unsigned char *oe = op + t; >>>> 71 do { >>>> 72 COPY8(op, ip); >>>> 73 op += 8; >>>> 74 ip += 8; >>>> 75 COPY8(op, ip); >>>> 76 op += 8; >>>> 77 ip += 8; >>>> 78 } while (ip < ie); >>>> 79 ip = ie; >>>> 80 op = oe; >>>> 81 } else >>>> 82 #endif >>>> >>>> Above, we see the "copy_literal_run" chunk of code. This is the >>>> section of >>>> the LZO algorithm that uses the variable 't' as a size parameter. On >>>> line >>>> 68, the code ensures that the input buffer (IP) and output buffer (OP) >>>> are >>>> large enough to contain 't' bytes. However, in the Linux kernel >>>> implementation, they pad by 15 bytes to ensure the 16 byte copy does not >>>> overflow either buffer. This is where things fail. >>>> >>>> The macros HAVE_IP and HAVE_OP validate that 't' bytes are available in >>>> the >>>> respective buffer. But, before the macro is called, the expression (t >>>> + 15) >>>> is evaluated. If the value of 't' is large enough, this expression will >>>> cause an integer overflow. The attacker can make this expression >>>> result in >>>> a value of zero (0) through fourteen (14) by forcing 't' to equal the >>>> values >>>> -15 to -1, respectively. This means that the HAVE macros will always >>>> believe that enough space is available in both input and output buffers. >>>> >>>> On line 70, the pointer 'oe' will now point to before the 'op' buffer, >>>> potentially pointing to memory prior to the start of the output buffer. >>>> The >>>> subsequent code will copy sixteen (16) bytes from the input pointer to >>>> the >>>> output pointer, which does nothing as these pointers should point to a >>>> "safe" location in memory. However, there are two side effects here >>>> that >>>> the attacker must abuse: lines 78 and 80. >>>> >>>> Because 'ie' will always have an address lower in memory than 'ip', the >>>> loop >>>> is immediately broken after the first sixteen (16) byte copy. This >>>> means >>>> that the value 't' did not cause a crash in the copy loop, making this >>>> copy >>>> essentially a no-op from the attacker's point of view. Most >>>> importantly, on >>>> line 80 (and 79), the buffer pointer is set to the overflown pointer. >>>> This >>>> means that now, the output pointer points to memory outside of the >>>> bounds of >>>> the output buffer. The attacker now has the capability to corrupt >>>> memory, >>>> or at least cause a Denial of Service (DoS) by writing to an invalid >>>> memory >>>> page. >>>> >>>> The Impact of Raising Dead Code >>>> >>>> Each variant of the LZO and LZ4 implementation is vulnerable in slightly >>>> different ways. The attacker must construct a malicious payload to fit >>>> each >>>> particular implementation. One payload cannot be used to trigger more >>>> than >>>> a DoS on each implementation. Because of the slightly different >>>> overflow >>>> requirements, state machine subtleties, and overflow checks that must be >>>> bypassed, even a worldwide DoS is not a simple task. >>>> >>>> This results in completely different threats depending on the >>>> implementation >>>> of the algorithm, the underlying architecture, and the memory layout of >>>> the >>>> target application. Remote Code Execution (RCE) is possible on multiple >>>> architectures and platforms, but absolutely not all. Denial of Service >>>> is >>>> possible on most implementations, but not all. Adjacent Object >>>> Over-Write >>>> (OOW) is possible on many architectures. >>>> >>>> Lazarus raised from the dead >>>> >>>> Because the LZO algorithm is considered a library function, each >>>> specific >>>> implementation must be evaluated for risk, regardless of whether the >>>> algorithm used has been patched. Why? We are talking about code that >>>> has >>>> existed in the wild for two decades. The scope of this algorithm >>>> touches >>>> everything from embedded microcontrollers on the Mars Rover, mainframe >>>> operating systems, modern day desktops, and mobile phones. Engineers >>>> that >>>> have used LZO must evaluate the use case to identify whether or not the >>>> implementation is vulnerable, and in what format. >>>> >>>> Here is a list of impact based on each library. Implementations, or use >>>> cases of each library may change the threat model enough to warrant >>>> reclassification. So, please have a variant audited by a skilled third >>>> party, such as <shameless plug>. >>>> >>>> Oberhumer LZO >>>> RCE: Impractical >>>> DoS: Practical >>>> OOW: Practical >>>> NOTE: 64bit platforms are impractical for all attacks >>>> Linux kernel LZO >>>> RCE: Impractical >>>> DoS: Practical >>>> OOW: Practical >>>> NOTE: Only i386/PowerPC are impacted at this time >>>> Libav LZO >>>> RCE: Practical >>>> DoS: Practical >>>> OOW: Practical >>>> NOTE: ALL ARCHITECTURES/PLATFORMS are RCE capable >>>> FFmpeg LZO >>>> RCE: Practical >>>> DoS: Practical >>>> OOW: Practical >>>> NOTE: ALL ARCHITECTURES/PLATFORMS are RCE capable >>>> Linux kernel LZ4 >>>> RCE: Practical >>>> DoS: Practical >>>> OOW: Practical >>>> NOTE: 64bit architectures are NOT considered practical >>>> LZ4 >>>> RCE: Practical >>>> DoS: Practical >>>> OOW: Practical >>>> NOTE: 64bit architectures are NOT considered practical >>>> >>>> For a bug report on each implementation, please visit the Lab Mouse >>>> Security's vulnerability site. >>>> >>>> How Do You Know If You're Vulnerable >>>> >>>> Projects Using LZO/LZ4 >>>> >>>> The easiest way to identify whether your specific implementation is >>>> vulnerable is to determine the maximum chunk size that is passed to the >>>> decompress routine. If buffers of sixteen (16) megabytes or more can be >>>> passed to the LZO or LZ4 decompress routine in one call, then >>>> exploitation >>>> of the integer overflow is possible. For example, ZFS constrains buffer >>>> sizes to 128k. So, even though they use a vulnerable implementation of >>>> LZ4, >>>> an attack is not possible without a second bug to bypass the buffer size >>>> constraint. >>>> >>>> The second easiest way is to identify the bit size of the count >>>> variable. >>>> If the count variable (for example, named 't' in the Linux kernel code >>>> shown >>>> above) is 64bit, it would take such a massive amount of data to trigger >>>> the >>>> overflow that the attack would likely be infeasible, regardless of how >>>> much >>>> data can be passed to the vulnerable function in one call. This is due >>>> to >>>> the fact that even modern computers do not have enough RAM available to >>>> store the data required to implement such an attack. >>>> >>>> However, there is a specific issue with the previous check. Validate >>>> that >>>> even if the count variable is 64bit in size, the value used is still >>>> 64bit >>>> when a length value is checked. If the actual length value is >>>> truncated to >>>> 32bits, the attack will still work with only sixteen (16) megabytes of >>>> data. >>>> >>>> Users >>>> >>>> All users of FFmpeg, Libav, and projects that depend on them, should >>>> consider themselves at risk to remote code execution. Period. Please >>>> update your software from the FFmpeg and Libav websites, or refrain from >>>> using these applications until your distribution has an adequate patch. >>>> >>>> It should be noted that certain Linux distributions package Mplayer2 >>>> with >>>> the base system by default. MPlayer2 is vulnerable to RCE "out of the >>>> box". >>>> If your distribution packages MPlayer2 by default, be sure to disable >>>> the >>>> embedded media player plugin (gecko-mediaplayer) for your browser. >>>> Firefox/Iceweasel, Chromium, Opera, Konqueror, and other Linux-based >>>> browsers are vulnerable to RCE regardless of the platform/architecture >>>> when >>>> an MPlayer2 plugin is enabled. >>>> >>>> Vendor Status >>>> >>>> Lab Mouse has reached out to and worked with each vendor of the >>>> vulnerable >>>> algorithm. As of today, June 26th, 2014, all LZO vendors have patches >>>> either available online, or will later today. Please update as soon as >>>> possible to minimize the existing threat surface. >>>> >>>> In the near future, Lab Mouse will publish a more technical blog on why >>>> and >>>> how RCE is possible using this bug. We consider that information to be >>>> imperative for both auditors and engineers, as it assists in >>>> identifying, >>>> classifying, and prioritizing a threat. However, that report will be >>>> released once the patches have been widely distributed for a sufficient >>>> amount of time. >>>> >>>> For more information, please visit our contact page. We are more than >>>> happy >>>> to help your team with their use case, or implementation of these >>>> algorithms. >>>> >>>> Summary >>>> >>>> Overall, this is how this bug release breaks down. >>>> >>>> Vendors have patches ready or released >>>> Distributions have been notified >>>> Vendors of proprietary variants have been notified (where they could >>>> be found) >>>> All bug reports can be found here >>>> RCE is not only possible but practical on all Libav/FFmpeg based >>>> projects >>>> All others are likely impractical to RCE, but still possible given a >>>> sufficiently skilled attacker >>>> >>>> It is always exciting to uncover a vulnerability as subtle as this >>>> issue, >>>> especially one that has persisted and propagated for two decades. But, >>>> it >>>> makes me pause and consider the way we look at engineering as a model. >>>> >>>> Speed and efficiency are imperatives for modern projects. We're >>>> building >>>> technology that touches our lives like never before. I know that most >>>> engineers strive to build not only elegant, but safe code. But, we >>>> still >>>> see security as a disparate discipline from engineering. Security and >>>> engineering could not be more tightly bound. Without engineering, you >>>> can't >>>> provide security to users. Without security, engineering cannot >>>> provide a >>>> stable and provable platform. >>>> >>>> Neil deGrasse Tyson famously claimed, God is in the gaps. There is a >>>> similar issue in engineering. The individual often sees stability >>>> where the >>>> individual doesn't have expertise. Our God is the algorithm. We >>>> "bless" >>>> certain pieces of code because we don't have the time or knowledge to >>>> evaluate it. When we, as engineers and analysts, take that >>>> perspective, we >>>> are doing a disservice to the people that use our projects and services. >>>> >>>> Often the best eyes are fresh or untrained eyes. The more we stop >>>> telling >>>> ourselves to step over the gaps in our code bases, the more holes we'll >>>> be >>>> able to fill. All it takes is one set of eyes to find a vulnerability, >>>> there is no level of expertise required to look and ask questions. Just >>>> look. Maybe you'll find the next 20 year old vulnerability. >>>> >>>> Thanks >>>> >>>> I'd like to thank the following people for their great assistance >>>> patching, >>>> coordinating, and advising on this issue: >>>> >>>> Greg Kroah-Hartman (Linux) >>>> Linus Torvalds (Linux) >>>> Kees Cook (Google) >>>> Xin LI (FreeBSD) >>>> Michael Niedermayer (FFmpeg) >>>> Luca Barbato (Libav/Gentoo) >>>> Markus Oberhumer >>>> Christopher J. Dorros (NASA MSL) >>>> Dan McDonald (Omniti) >>>> Yves-Alexis Perez (Debian) >>>> Kurt Seifried (Red Hat) >>>> Willy Tarreau (Linux) >>>> Solar Designer (Openwall) >>>> The US-CERT team >>>> The Oracle security team >>>> The GE security team >>>> Kelly Jackson Higgins (UBM) >>>> Steve Ragan (IDG Enterprise) >>>> Elinor Mills >>>> >>>> Feeling Guilty? >>>> >>>> Are you reading this post, thinking about all the administrators and >>>> engineers that are going to have to patch the LZO/LZ4 issue in your >>>> team's >>>> systems? Take some time to tell them how you feel with our hand >>>> crafted Lab >>>> Mouse Security custom Sympathy Card! >>>> >>>> Hand crafted with the finest bits and bytes, our Sympathy Card shows >>>> your >>>> engineer what they mean to you and your team. This is a limited run of >>>> cards, and will proudly display the Linux kernel LZO exploit written by >>>> Lab >>>> Mouse on the card. >>>> >>>> Best wishes, >>>> Don A. Bailey, Founder / CEO, @InfoSecMouse, Lab Mouse Security, 26 Jun >>>> 2014 >>>> >>>> _______________________________________________ >>>> langsec-discuss mailing list >>>> langsec-discuss@mail.langsec.org >>>> https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss >>>> >>> >>> _______________________________________________ >>> langsec-discuss mailing list >>> langsec-discuss@mail.langsec.org >>> https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss >>> >>> >> >> _______________________________________________ >> langsec-discuss mailing list >> langsec-discuss@mail.langsec.org >> https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss >> >> >
_______________________________________________ langsec-discuss mailing list langsec-discuss@mail.langsec.org https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss