I agree, actually. AEG is the counterexample to "there are no exploits". Though merely proving memory corruption tends to be enough.
On Mon, Jul 7, 2014 at 8:35 PM, Sergey Bratus <ser...@cs.dartmouth.edu> wrote: > > It seems likely then that verification only works on the subset of >> expressiveness that can be cowritten in the free and the constrained >> language. i.e. we aren't verifying c at all. >> > > A related thought: > > Julien Vanegue at the LangSec workshop (http://www.ieee-security.org/ > TC/SPW2014/papers/5103a209.PDF) made the point that proofs applied to any > piece of code may verify its correctness when inputs are within > specification, but tell us very little about what the code may compute when > what hits it is _outside_ specification. This style of verification is hard > -- but it also seems very necessary for security. > > For that reason, I also thought it significant when Avgerinos et al in > http://cacm.acm.org/magazines/2014/2/171687-automatic-exploit-generation/ > called automatic exploit generation a "verification task". > > Thanks, > > --Sergey > > > >> >> On Mon, Jul 7, 2014 at 2:17 PM, Thomas Dullien <thomasdull...@google.com> >> wrote: >> >> My 2c, but my view at the moment is that none of our static analyzers / >>> theorem provers can >>> be usefully applied to legacy code. >>> >>> The successes of formal verification are usually when the checkers & the >>> code are co-developed; >>> e.g. the code is checked on each build while it is written. >>> >>> I haven't had time to look at the LZ bug in detail yet, realistically I >>> may get around to it in August; >>> once I get around I'll write down how hard I think it would be to detect >>> this with an automated checker. >>> >>> Cheers, >>> Halvar >>> >>> >>> On Mon, Jul 7, 2014 at 2:04 PM, Meredith L. Patterson < >>> clonea...@gmail.com >>> >>>> wrote: >>>> >>> >>> No promises, but the samples look like ones that Escher ( >>>> http://www.eschertech.com/index.php) could find. I'm evaluating Escher >>>> for verifying parts of Hammer (since there are sound principles for what >>>> "correct" means in regard to the parsing algorithms it implements), and >>>> can >>>> tell you more after I've had some more experience with it. >>>> >>>> Cheers, >>>> --mlp >>>> >>>> >>>> On Mon, Jul 7, 2014 at 1:56 PM, Dan Kaminsky <d...@doxpara.com> wrote: >>>> >>>> Do we have a theorem prover / static analyzer that finds this bug? >>>>> >>>>> We sure have a lot that miss it. >>>>> >>>>> >>>>> On Monday, July 7, 2014, <d...@geer.org> wrote: >>>>> >>>>> >>>>>> This is a verbatim transcript of what appeared on Peter Neumann's >>>>>> RISKS Digest; note Henry Baker's leadoff comment. >>>>>> >>>>>> -----------------8<------------cut-here------------8<------- >>>>>> ---------- >>>>>> >>>>>> Date: Fri, 27 Jun 2014 06:11:41 -0700 >>>>>> From: Henry Baker <hbak...@pipeline.com> >>>>>> Subject: Buffer overflows in 20-year-old LZ decompression code >>>>>> (Don A. Bailey) >>>>>> >>>>>> FYI -- It's time to start using theorem provers on *all* code; if you >>>>>> can't >>>>>> convince the theorem prover re buffer overflows, you'll have to insert >>>>>> executable code to explicitly check. HB >>>>>> >>>>>> Thursday, June 26, 2014 >>>>>> Raising Lazarus - The 20 Year Old Bug that Went to Mars >>>>>> >>>>>> http://blog.securitymouse.com/2014/06/raising-lazarus-20- >>>>>> year-old-bug-that.html >>>>>> >>>>>> It's rare that you come across a bug so subtle that it can last for >>>>>> two >>>>>> decades. But, that's exactly what has happened with the >>>>>> Lempel-Ziv-Oberhumer (LZO) algorithm. Initially written in 1994, >>>>>> Markus >>>>>> Oberhumer designed a sophisticated and extremely efficient compression >>>>>> algorithm so elegant and well architected that it outperforms zlib and >>>>>> bzip >>>>>> by four or five times their decompression speed. >>>>>> >>>>>> As a result, Markus has made a successful and well deserved career out >>>>>> of >>>>>> optimizing code for various platforms. I was impressed to find out >>>>>> that his >>>>>> LZO algorithm has gone to the planet Mars on NASA devices multiple >>>>>> times! >>>>>> Most recently, LZO has touched down on the red planet within the Mars >>>>>> Curiosity Rover, which just celebrated its first martian anniversary >>>>>> on >>>>>> Tuesday. >>>>>> >>>>>> Because of the speed and efficiency of the algorithm, LZO has made its >>>>>> way >>>>>> into both proprietary and open source projects world-wide. It's has >>>>>> lived >>>>>> in automotive systems, airplanes, and other embedded systems for over >>>>>> a >>>>>> decade. The algorithm has even made its way into projects we use on a >>>>>> daily >>>>>> basis, such as OpenVPN, MPlayer2, Libav, FFmpeg, the Linux kernel, >>>>>> Juniper >>>>>> Junos, and much, much, more. >>>>>> >>>>>> In the past few years, LZO has gained traction in file systems as >>>>>> well. >>>>>> LZO >>>>>> can be used in the Linux kernel within btrfs, squashfs, jffs2, and >>>>>> ubifs. A >>>>>> recent variant of the algorithm, LZ4, is used for compression in ZFS >>>>>> for >>>>>> Solaris, Illumos, and FreeBSD. >>>>>> >>>>>> LZO is even enabled in kernels for Samsung Android devices to increase >>>>>> kernel loading speed and improve the user experience, as noted in the >>>>>> Android Hacker's Handbook. >>>>>> >>>>>> With its popularity increasing, Lempel-Ziv-Oberhumer has been >>>>>> rewritten >>>>>> by >>>>>> many engineering firms for both closed and open systems. These >>>>>> rewrites, >>>>>> however, have always been based on Oberhumer's core open source >>>>>> implementation. As a result, they all inherited a subtle integer >>>>>> overflow. >>>>>> Even LZ4 has the same exact bug, but changed very slightly. >>>>>> >>>>>> Engineered Genetics >>>>>> >>>>>> Code reuse is a normal part of engineering, and is something we do >>>>>> every >>>>>> day. But, it can be dangerous. By reusing code that is known to work >>>>>> well, >>>>>> especially in highly optimized algorithms, projects can become subject >>>>>> to >>>>>> vulnerabilities in what is perceived as trusted code. Auditing highly >>>>>> optimized algorithms is a fragile endeavor. It is very easy to break >>>>>> these >>>>>> types of algorithms. Therefore, reused code that is highly >>>>>> specialized >>>>>> is >>>>>> often presumed safe because of its age, its proven efficiency, and its >>>>>> fragility. >>>>>> >>>>>> This creates a sort of digital DNA, a digital genetic footprint that >>>>>> can be >>>>>> traced over time. Though there are certainly many instances of >>>>>> proprietary >>>>>> variants of LZO and LZ4, the following six implementations are >>>>>> available in >>>>>> open source software >>>>>> >>>>>> Oberhumer LZO (core/reference open source implementation) >>>>>> Linux kernel's LZO implementation >>>>>> Libav's LZO implementation >>>>>> FFmpeg's LZO implementation >>>>>> Linux kernel's LZ4 implementation >>>>>> LZ4 core/reference implementation >>>>>> >>>>>> Despite each implementation of the algorithm being noticeably >>>>>> different, >>>>>> each variant is vulnerable in the exact same way. Let's take a look >>>>>> at >>>>>> a >>>>>> version of the algorithm that is easy to read online, the Linux kernel >>>>>> implementation found here. >>>>>> >>>>>> In all variants of LZ[O4], the vulnerability occurs when processing a >>>>>> Literal Run. This is a chunk of compressed data that isn't compressed >>>>>> at >>>>>> all. Literals are uncompressed bytes that the user decided, for >>>>>> whatever >>>>>> reason, should not be compressed. A Literal Run is signaled by a >>>>>> state >>>>>> machine in LZO, and by a Mask in LZ4. >>>>>> >>>>>> 56 if (likely(state == 0)) { >>>>>> 57 if (unlikely(t == 0)) { >>>>>> 58 while (unlikely(*ip == >>>>>> 0)) { >>>>>> 59 t += 255; >>>>>> 60 ip++; >>>>>> 61 NEED_IP(1); >>>>>> 62 } >>>>>> 63 t += 15 + *ip++; >>>>>> 64 } >>>>>> 65 t += 3; >>>>>> >>>>>> In the above sample, the integer overflow is evident. The variable >>>>>> 't' >>>>>> is >>>>>> incremented by 255 every time the compression payload contains a nil >>>>>> byte >>>>>> (0x00) when a Literal Run is detected. Regardless of whether the >>>>>> variable >>>>>> 't' is signed or unsigned, 255 will be added to it. The only check is >>>>>> to >>>>>> ensure that the input buffer contains another byte. This means that >>>>>> 't' can >>>>>> accumulate until it is a very large unsigned integer. If 't' is a >>>>>> 32bit >>>>>> integer, it only takes approximately sixteen (16) megabytes of zeroes >>>>>> to >>>>>> generate a sufficiently large value for 't'. Though 't' can overflow >>>>>> here, >>>>>> this is not where the attack occurs. There is another more important >>>>>> overflow just below this chunk of code. >>>>>> >>>>>> 66 copy_literal_run: >>>>>> 67 #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) >>>>>> 68 if (likely(HAVE_IP(t + 15) && >>>>>> HAVE_OP(t + 15))) { >>>>>> 69 const unsigned char *ie = >>>>>> ip >>>>>> + t; >>>>>> 70 unsigned char *oe = op + t; >>>>>> 71 do { >>>>>> 72 COPY8(op, ip); >>>>>> 73 op += 8; >>>>>> 74 ip += 8; >>>>>> 75 COPY8(op, ip); >>>>>> 76 op += 8; >>>>>> 77 ip += 8; >>>>>> 78 } while (ip < ie); >>>>>> 79 ip = ie; >>>>>> 80 op = oe; >>>>>> 81 } else >>>>>> 82 #endif >>>>>> >>>>>> Above, we see the "copy_literal_run" chunk of code. This is the >>>>>> section of >>>>>> the LZO algorithm that uses the variable 't' as a size parameter. On >>>>>> line >>>>>> 68, the code ensures that the input buffer (IP) and output buffer (OP) >>>>>> are >>>>>> large enough to contain 't' bytes. However, in the Linux kernel >>>>>> implementation, they pad by 15 bytes to ensure the 16 byte copy does >>>>>> not >>>>>> overflow either buffer. This is where things fail. >>>>>> >>>>>> The macros HAVE_IP and HAVE_OP validate that 't' bytes are available >>>>>> in >>>>>> the >>>>>> respective buffer. But, before the macro is called, the expression (t >>>>>> + 15) >>>>>> is evaluated. If the value of 't' is large enough, this expression >>>>>> will >>>>>> cause an integer overflow. The attacker can make this expression >>>>>> result in >>>>>> a value of zero (0) through fourteen (14) by forcing 't' to equal the >>>>>> values >>>>>> -15 to -1, respectively. This means that the HAVE macros will always >>>>>> believe that enough space is available in both input and output >>>>>> buffers. >>>>>> >>>>>> On line 70, the pointer 'oe' will now point to before the 'op' buffer, >>>>>> potentially pointing to memory prior to the start of the output >>>>>> buffer. >>>>>> The >>>>>> subsequent code will copy sixteen (16) bytes from the input pointer to >>>>>> the >>>>>> output pointer, which does nothing as these pointers should point to a >>>>>> "safe" location in memory. However, there are two side effects here >>>>>> that >>>>>> the attacker must abuse: lines 78 and 80. >>>>>> >>>>>> Because 'ie' will always have an address lower in memory than 'ip', >>>>>> the >>>>>> loop >>>>>> is immediately broken after the first sixteen (16) byte copy. This >>>>>> means >>>>>> that the value 't' did not cause a crash in the copy loop, making this >>>>>> copy >>>>>> essentially a no-op from the attacker's point of view. Most >>>>>> importantly, on >>>>>> line 80 (and 79), the buffer pointer is set to the overflown pointer. >>>>>> This >>>>>> means that now, the output pointer points to memory outside of the >>>>>> bounds of >>>>>> the output buffer. The attacker now has the capability to corrupt >>>>>> memory, >>>>>> or at least cause a Denial of Service (DoS) by writing to an invalid >>>>>> memory >>>>>> page. >>>>>> >>>>>> The Impact of Raising Dead Code >>>>>> >>>>>> Each variant of the LZO and LZ4 implementation is vulnerable in >>>>>> slightly >>>>>> different ways. The attacker must construct a malicious payload to >>>>>> fit >>>>>> each >>>>>> particular implementation. One payload cannot be used to trigger more >>>>>> than >>>>>> a DoS on each implementation. Because of the slightly different >>>>>> overflow >>>>>> requirements, state machine subtleties, and overflow checks that must >>>>>> be >>>>>> bypassed, even a worldwide DoS is not a simple task. >>>>>> >>>>>> This results in completely different threats depending on the >>>>>> implementation >>>>>> of the algorithm, the underlying architecture, and the memory layout >>>>>> of >>>>>> the >>>>>> target application. Remote Code Execution (RCE) is possible on >>>>>> multiple >>>>>> architectures and platforms, but absolutely not all. Denial of >>>>>> Service >>>>>> is >>>>>> possible on most implementations, but not all. Adjacent Object >>>>>> Over-Write >>>>>> (OOW) is possible on many architectures. >>>>>> >>>>>> Lazarus raised from the dead >>>>>> >>>>>> Because the LZO algorithm is considered a library function, each >>>>>> specific >>>>>> implementation must be evaluated for risk, regardless of whether the >>>>>> algorithm used has been patched. Why? We are talking about code that >>>>>> has >>>>>> existed in the wild for two decades. The scope of this algorithm >>>>>> touches >>>>>> everything from embedded microcontrollers on the Mars Rover, mainframe >>>>>> operating systems, modern day desktops, and mobile phones. Engineers >>>>>> that >>>>>> have used LZO must evaluate the use case to identify whether or not >>>>>> the >>>>>> implementation is vulnerable, and in what format. >>>>>> >>>>>> Here is a list of impact based on each library. Implementations, or >>>>>> use >>>>>> cases of each library may change the threat model enough to warrant >>>>>> reclassification. So, please have a variant audited by a skilled >>>>>> third >>>>>> party, such as <shameless plug>. >>>>>> >>>>>> Oberhumer LZO >>>>>> RCE: Impractical >>>>>> DoS: Practical >>>>>> OOW: Practical >>>>>> NOTE: 64bit platforms are impractical for all attacks >>>>>> Linux kernel LZO >>>>>> RCE: Impractical >>>>>> DoS: Practical >>>>>> OOW: Practical >>>>>> NOTE: Only i386/PowerPC are impacted at this time >>>>>> Libav LZO >>>>>> RCE: Practical >>>>>> DoS: Practical >>>>>> OOW: Practical >>>>>> NOTE: ALL ARCHITECTURES/PLATFORMS are RCE capable >>>>>> FFmpeg LZO >>>>>> RCE: Practical >>>>>> DoS: Practical >>>>>> OOW: Practical >>>>>> NOTE: ALL ARCHITECTURES/PLATFORMS are RCE capable >>>>>> Linux kernel LZ4 >>>>>> RCE: Practical >>>>>> DoS: Practical >>>>>> OOW: Practical >>>>>> NOTE: 64bit architectures are NOT considered practical >>>>>> LZ4 >>>>>> RCE: Practical >>>>>> DoS: Practical >>>>>> OOW: Practical >>>>>> NOTE: 64bit architectures are NOT considered practical >>>>>> >>>>>> For a bug report on each implementation, please visit the Lab Mouse >>>>>> Security's vulnerability site. >>>>>> >>>>>> How Do You Know If You're Vulnerable >>>>>> >>>>>> Projects Using LZO/LZ4 >>>>>> >>>>>> The easiest way to identify whether your specific implementation is >>>>>> vulnerable is to determine the maximum chunk size that is passed to >>>>>> the >>>>>> decompress routine. If buffers of sixteen (16) megabytes or more can >>>>>> be >>>>>> passed to the LZO or LZ4 decompress routine in one call, then >>>>>> exploitation >>>>>> of the integer overflow is possible. For example, ZFS constrains >>>>>> buffer >>>>>> sizes to 128k. So, even though they use a vulnerable implementation >>>>>> of >>>>>> LZ4, >>>>>> an attack is not possible without a second bug to bypass the buffer >>>>>> size >>>>>> constraint. >>>>>> >>>>>> The second easiest way is to identify the bit size of the count >>>>>> variable. >>>>>> If the count variable (for example, named 't' in the Linux kernel code >>>>>> shown >>>>>> above) is 64bit, it would take such a massive amount of data to >>>>>> trigger >>>>>> the >>>>>> overflow that the attack would likely be infeasible, regardless of how >>>>>> much >>>>>> data can be passed to the vulnerable function in one call. This is >>>>>> due >>>>>> to >>>>>> the fact that even modern computers do not have enough RAM available >>>>>> to >>>>>> store the data required to implement such an attack. >>>>>> >>>>>> However, there is a specific issue with the previous check. Validate >>>>>> that >>>>>> even if the count variable is 64bit in size, the value used is still >>>>>> 64bit >>>>>> when a length value is checked. If the actual length value is >>>>>> truncated to >>>>>> 32bits, the attack will still work with only sixteen (16) megabytes of >>>>>> data. >>>>>> >>>>>> Users >>>>>> >>>>>> All users of FFmpeg, Libav, and projects that depend on them, should >>>>>> consider themselves at risk to remote code execution. Period. Please >>>>>> update your software from the FFmpeg and Libav websites, or refrain >>>>>> from >>>>>> using these applications until your distribution has an adequate >>>>>> patch. >>>>>> >>>>>> It should be noted that certain Linux distributions package Mplayer2 >>>>>> with >>>>>> the base system by default. MPlayer2 is vulnerable to RCE "out of the >>>>>> box". >>>>>> If your distribution packages MPlayer2 by default, be sure to disable >>>>>> the >>>>>> embedded media player plugin (gecko-mediaplayer) for your browser. >>>>>> Firefox/Iceweasel, Chromium, Opera, Konqueror, and other Linux-based >>>>>> browsers are vulnerable to RCE regardless of the platform/architecture >>>>>> when >>>>>> an MPlayer2 plugin is enabled. >>>>>> >>>>>> Vendor Status >>>>>> >>>>>> Lab Mouse has reached out to and worked with each vendor of the >>>>>> vulnerable >>>>>> algorithm. As of today, June 26th, 2014, all LZO vendors have patches >>>>>> either available online, or will later today. Please update as soon >>>>>> as >>>>>> possible to minimize the existing threat surface. >>>>>> >>>>>> In the near future, Lab Mouse will publish a more technical blog on >>>>>> why >>>>>> and >>>>>> how RCE is possible using this bug. We consider that information to >>>>>> be >>>>>> imperative for both auditors and engineers, as it assists in >>>>>> identifying, >>>>>> classifying, and prioritizing a threat. However, that report will be >>>>>> released once the patches have been widely distributed for a >>>>>> sufficient >>>>>> amount of time. >>>>>> >>>>>> For more information, please visit our contact page. We are more than >>>>>> happy >>>>>> to help your team with their use case, or implementation of these >>>>>> algorithms. >>>>>> >>>>>> Summary >>>>>> >>>>>> Overall, this is how this bug release breaks down. >>>>>> >>>>>> Vendors have patches ready or released >>>>>> Distributions have been notified >>>>>> Vendors of proprietary variants have been notified (where they could >>>>>> be found) >>>>>> All bug reports can be found here >>>>>> RCE is not only possible but practical on all Libav/FFmpeg based >>>>>> projects >>>>>> All others are likely impractical to RCE, but still possible given a >>>>>> sufficiently skilled attacker >>>>>> >>>>>> It is always exciting to uncover a vulnerability as subtle as this >>>>>> issue, >>>>>> especially one that has persisted and propagated for two decades. >>>>>> But, >>>>>> it >>>>>> makes me pause and consider the way we look at engineering as a model. >>>>>> >>>>>> Speed and efficiency are imperatives for modern projects. We're >>>>>> building >>>>>> technology that touches our lives like never before. I know that most >>>>>> engineers strive to build not only elegant, but safe code. But, we >>>>>> still >>>>>> see security as a disparate discipline from engineering. Security and >>>>>> engineering could not be more tightly bound. Without engineering, you >>>>>> can't >>>>>> provide security to users. Without security, engineering cannot >>>>>> provide a >>>>>> stable and provable platform. >>>>>> >>>>>> Neil deGrasse Tyson famously claimed, God is in the gaps. There is a >>>>>> similar issue in engineering. The individual often sees stability >>>>>> where the >>>>>> individual doesn't have expertise. Our God is the algorithm. We >>>>>> "bless" >>>>>> certain pieces of code because we don't have the time or knowledge to >>>>>> evaluate it. When we, as engineers and analysts, take that >>>>>> perspective, we >>>>>> are doing a disservice to the people that use our projects and >>>>>> services. >>>>>> >>>>>> Often the best eyes are fresh or untrained eyes. The more we stop >>>>>> telling >>>>>> ourselves to step over the gaps in our code bases, the more holes >>>>>> we'll >>>>>> be >>>>>> able to fill. All it takes is one set of eyes to find a >>>>>> vulnerability, >>>>>> there is no level of expertise required to look and ask questions. >>>>>> Just >>>>>> look. Maybe you'll find the next 20 year old vulnerability. >>>>>> >>>>>> Thanks >>>>>> >>>>>> I'd like to thank the following people for their great assistance >>>>>> patching, >>>>>> coordinating, and advising on this issue: >>>>>> >>>>>> Greg Kroah-Hartman (Linux) >>>>>> Linus Torvalds (Linux) >>>>>> Kees Cook (Google) >>>>>> Xin LI (FreeBSD) >>>>>> Michael Niedermayer (FFmpeg) >>>>>> Luca Barbato (Libav/Gentoo) >>>>>> Markus Oberhumer >>>>>> Christopher J. Dorros (NASA MSL) >>>>>> Dan McDonald (Omniti) >>>>>> Yves-Alexis Perez (Debian) >>>>>> Kurt Seifried (Red Hat) >>>>>> Willy Tarreau (Linux) >>>>>> Solar Designer (Openwall) >>>>>> The US-CERT team >>>>>> The Oracle security team >>>>>> The GE security team >>>>>> Kelly Jackson Higgins (UBM) >>>>>> Steve Ragan (IDG Enterprise) >>>>>> Elinor Mills >>>>>> >>>>>> Feeling Guilty? >>>>>> >>>>>> Are you reading this post, thinking about all the administrators and >>>>>> engineers that are going to have to patch the LZO/LZ4 issue in your >>>>>> team's >>>>>> systems? Take some time to tell them how you feel with our hand >>>>>> crafted Lab >>>>>> Mouse Security custom Sympathy Card! >>>>>> >>>>>> Hand crafted with the finest bits and bytes, our Sympathy Card shows >>>>>> your >>>>>> engineer what they mean to you and your team. This is a limited run >>>>>> of >>>>>> cards, and will proudly display the Linux kernel LZO exploit written >>>>>> by >>>>>> Lab >>>>>> Mouse on the card. >>>>>> >>>>>> Best wishes, >>>>>> Don A. Bailey, Founder / CEO, @InfoSecMouse, Lab Mouse Security, 26 >>>>>> Jun >>>>>> 2014 >>>>>> >>>>>> _______________________________________________ >>>>>> langsec-discuss mailing list >>>>>> langsec-discuss@mail.langsec.org >>>>>> https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss >>>>>> >>>>>> >>>>> _______________________________________________ >>>>> langsec-discuss mailing list >>>>> langsec-discuss@mail.langsec.org >>>>> https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss >>>>> >>>>> >>>>> >>>> _______________________________________________ >>>> langsec-discuss mailing list >>>> langsec-discuss@mail.langsec.org >>>> https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss >>>> >>>> >>>> >>> >>
_______________________________________________ langsec-discuss mailing list langsec-discuss@mail.langsec.org https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss