Re: having source code for your CPU chip -- NOT
At 09:02 23/09/1999 -0400, Steven M. Bellovin wrote: By example, I could verify the machine code for IDEA, but not PGP and certainly not your favorite version of UNIX. Actually, while there are bugs and security holes, it's pretty certain that V6 Unix didn't have any crypto trapdoors ... and you can now own your very own source code license for early Unix including C compiler, complete with source for a PDP-11 emulator to run it on... this might come in handy one day as a stable, recreatable base. See http://minnie.cs.adfa.edu.au/PUPS , the PDP and Unix Preservation Society. [Of course, what guarantees does one have about the provenance of the code? --Perry] regards, Greg. Greg Rose INTERNET: [EMAIL PROTECTED] Qualcomm Australia VOICE: +61-2-9181-4851 FAX: +61-2-9181-5470 Suite 410, Birkenhead Point, http://people.qualcomm.com/ggr/ Drummoyne NSW 2047 232B EC8F 44C6 C853 D68F E107 E6BF CD2F 1081 A37C
Re: having source code for your CPU chip -- NOT
Date: Thu, 23 Sep 1999 18:38:57 -0400 (EDT) From: Eli Brandt [EMAIL PROTECTED] Arnold Reinhold wrote: Perry, if you really believe that the question of whether a given lump of object code contains a Thompson Trap is formally undecidable I'd be interested in seeing a proof. Otherwise Herr Goedel has nothing to do with this. That sure smells undecidable to me. Any non-trivial predicate P on Turing machines (non-trivial meaning that both P and not-P are non-empty) is undecidable by Rice's Theorem. There are technical issues in encoding onto the tape all possible interactions with the world -- the theorem doesn't apply if some inputs are deemed illegal -- but, hey, it's all countably infinite; re-encode with the naturals. I don't think Rice's Theorem applies in this case because it's not a language property. The non-trivial predicate should be on r.e. languages, not on Turing machines. For example, determining whether a Turing machine accepts a string consisting of 3 symbols is undecidable by Rice's Theorem (since some r.e. languages contain strings of 3 symbols and others don't), but determining whether a Turing machine has 3 states is not. That's not to say that it's not undecideable! Just that Rice's Theorem is insufficient to prove it (unless you can somehow reformulate it as a language property). Instead you might try directly reducing some undecideable problem to it. (For those who don't know the terminology, r.e. = recursively enumerable = accepted by a Turing machine.)
Re: having source code for your CPU chip -- NOT
At 6:38 PM -0400 9/23/99, Eli Brandt wrote: Arnold Reinhold wrote: Perry, if you really believe that the question of whether a given lump of object code contains a Thompson Trap is formally undecidable I'd be interested in seeing a proof. Otherwise Herr Goedel has nothing to do with this. That sure smells undecidable to me. Any non-trivial predicate P on Turing machines (non-trivial meaning that both P and not-P are non-empty) is undecidable by Rice's Theorem. There are technical issues in encoding onto the tape all possible interactions with the world -- the theorem doesn't apply if some inputs are deemed illegal -- but, hey, it's all countably infinite; re-encode with the naturals. I am not asking about the class of all Turing machines, just one particular lump of object code OC that happens to be a compiler--say the C++ compiler on the Red Hat 6.0 distribution. And the question is whether OC, which was compiled from a particular source file SF, only contains code attributable to SF or contains some additional self-propagating code inserted by the compiler OCprev that produced OC. SF is trusted by assumption. Only OCprev is suspect. You can even modify SF to some extent, say to turn off optimization or to produce an auxiliary file that matches object code to source statements. If you can answer this question just one time then Thompson's attack can be defeated. This problem is not remotely undecidable. One way to solve it is to write a new compiler from scratch OCalt. Then see if (OCalt(SF))(SF) = OC. As others have pointed out, you can also attempt to ascribe each byte of object code to the source that generated it. The practical impact of this is not immediately apparent. All non-trivial theorem-proving about programs is futile in this same sense, but people do it, or try. They have a lot of difficulties less arcane than running into the pathological cases constructed to prove these undecidability theorems. Your argument reminds me of claims I always hear that nothing can be measured because of the Heisenberg Uncertainty principle. I do feel your pain. More like disgust, actually. Too many computer scientists engage in this sort of name dropping, citing theorems that generally apply only in some infinite limit and specifically do not apply to the case at hand. Most people in the industry, who think an uncountable cardinal has something to do with electing the Pope, take their pronouncements on faith and just give up. The fact is almost nothing in crypto is on a rigorous mathematical foundation. No one has proved that factoring or discrete logs are hard problems (at least not publicly). Yet almost all the difficulties in building trustable systems do not lie in undecidablility results but in human failings. The Open Source model, in my opinion, is the best hope of creating trusted systems we have. Designing a process that can demonstrate that a given system is totally derived from a body of published code is an important goal. The Thompson paper is a often cited as proof that this cannot be done. It deserves to be debunked. Arnold Reinhold
Re: having source code for your CPU chip -- NOT
Perry, if you really believe that the question of whether a given lump of object code contains a Thompson Trap is formally undecidable I'd be interested in seeing a proof. That sure smells undecidable to me. Any non-trivial predicate P on Turing machines (non-trivial meaning that both P and not-P are non-empty) is undecidable by Rice's Theorem. [...] There are no Turing machines. Real computers are finite, and real source codes are finite. I'm sure that if you set a limit on the length of the source code which is recognized by the supposed trap, a sufficiently large FSM can decide in a finite time whether there's a trap. __ Matt Crawford[EMAIL PROTECTED] Fermilab "A5.1.5.2.7.1. Remove all classified and CCI boards from the COMSEC equipment, thoroughly smash them with a hammer or an ax, and scatter the pieces."
Re: having source code for your CPU chip -- NOT
There are no Turing machines. Real computers are finite, and real source codes are finite. I'm sure that if you set a limit on the length of the source code which is recognized by the supposed trap, a sufficiently large FSM can decide in a finite time whether there's a trap. mere finiteness doesn't help much in practice if you're up against algorithms which take time exponential in some parameter (like the size of the 'trap' region) which is likely to get even moderately sized.. - Bill
Re: having source code for your CPU chip -- NOT
Ray Hirschfeld wrote: That sure smells undecidable to me. Any non-trivial predicate P on Turing machines (non-trivial meaning that both P and not-P are non-empty) is undecidable by Rice's Theorem. There are technical issues in encoding onto the tape all possible interactions with the world -- the theorem doesn't apply if some inputs are deemed illegal -- but, hey, it's all countably infinite; re-encode with the naturals. I don't think Rice's Theorem applies in this case because it's not a language property. The non-trivial predicate should be on r.e. languages, not on Turing machines. Yeah, I was trying to avoid bringing in the term "recursively enumerable", and I tripped myself up. The "technical issue" I mumbled about was supposed to address this, but that reduction is in the wrong direction: I need to show that if you can answer Thompson(M) you can answer Thompson'(L), not the other way around. Which, when one stops waving one's hands and thinks about it, is not going to happen, not without drawing on some information about Thompson(). -- Eli Brandt | [EMAIL PROTECTED] | http://www.cs.cmu.edu/~eli/
Re: having source code for your CPU chip -- NOT
Arnold Reinhold wrote: Arnold Reinhold wrote: Perry, if you really believe that the question of whether a given lump of object code contains a Thompson Trap is formally undecidable [...] I am not asking about the class of all Turing machines, just one particular lump of object code OC that happens to be a compiler--say the C++ compiler on the Red Hat 6.0 distribution. I must admit, even though you said "whether a given lump" I assumed the decision procedure was supposed to exist uniformly over lumps. If you swap the quantifiers, sure, for any program there exists a decision procedure -- either the decider that always says "yes" or the one that always says "no". (Okay, I see why you don't find theory relevant.) -- Eli Brandt | [EMAIL PROTECTED] | http://www.cs.cmu.edu/~eli/
Re: having source code for your CPU chip -- NOT
At 06:38 PM 9/23/99 -0400, Eli Brandt wrote: Arnold Reinhold wrote: Perry, if you really believe that the question of whether a given lump of object code contains a Thompson Trap is formally undecidable I'd be interested in seeing a proof. Otherwise Herr Goedel has nothing to do with this. That sure smells undecidable to me. Any non-trivial predicate P on Folks, its not that complex. Thompson suggested that the trojan compiler subvert only login and compiler code. The trojan would not subvert generic code; no funky Godelian self-referential stuff is necessary. Of course, this was before the 90s: the net, downloadable code, chronically lame OSes made other approaches much easier.
Re: having source code for your CPU chip -- NOT
For the truly paranoid: it is perfectly possible to boostrap a working Forth environment *by hand*, whether by hand assembly and flashing the resulting image, or by porting eForth (or any Forths written in C) to an embedded target. You simply can't fit any Trojan in there: a minimal Forth OS can fit into just 2 k, typical environments take 12..16 kBytes. Of course you're abandoning any GNU/Unix compatibility, but the intrinsic rewards of a Forth environment can be considerable -- I don't know of any more productive system. Is there a crypto code library out there? Greg Rose writes: At 09:02 23/09/1999 -0400, Steven M. Bellovin wrote: By example, I could verify the machine code for IDEA, but not PGP and certainly not your favorite version of UNIX. Actually, while there are bugs and security holes, it's pretty certain that V6 Unix didn't have any crypto trapdoors ... and you can now own your very own source code license for early Unix including C compiler, complete with source for a PDP-11 emulator to run it on... this might come in handy one day as a stable, recreatable base. See http://minnie.cs.adfa.edu.au/PUPS , the PDP and Unix Preservation Society. [Of course, what guarantees does one have about the provenance of the code? --Perry]
Re: having source code for your CPU chip -- NOT
At 9:26 AM -0700 9/22/99, Bill Frantz wrote: My own approach would be to audit the generated code. In KeyKOS/370, we "solved" the problem by using an assembler which was written before KeyKOS was designed. (N.B. KeyKOS/370 was written in 370 Assembler H). Yeah, but 370 Assembler H had a very extensive macro facility and you could hide all kinds of wierd stuff in 370 code. Not too many folk left around who can read it. I have a copy of Decus C (Open Source PDP-11 C) lying around and wrote enough of its compiler and code generator to know what it can and cannot do, in case anyone is interested. The entire source code of the C compiler is small enough to sight-verify in about a man-month. A "Small C" compiler (see early issues of Dr. Dobbs) can be implemented in about 3 man months and ought to be good enough for crypto work. Martin Minow [EMAIL PROTECTED] [And then how do you trust your assembler? Or the compiler and assembler you compiled the C compiler on? And the linker? If you really try hard enough on all this, you find your self smack dab in front of Kurt Goedel's door, and he tends to have unpleasant news for visitors who come to him looking for solace. And of course, once you've done all this lovely work, the NSA comes in and puts a microscopic bug into your keyboard cable in the night, or replaces your hand verified assembler executables, or... I suggest that in practical terms, one has to set some reasonable limits on what one is willing to do to overcome risk. Paranoia is a potential source of infinite work, but there is only a finite amount of work one can do in a given lifetime. That is not to say that *some* paranoia isn't of value, but perfect paranoia results in a perfect absence of progress on one's projects. --Perry]
Re: having source code for your CPU chip -- NOT
At 10:26 PM -0700 9/22/99, Martin Minow wrote: At 9:26 AM -0700 9/22/99, Bill Frantz wrote: My own approach would be to audit the generated code. In KeyKOS/370, we "solved" the problem by using an assembler which was written before KeyKOS was designed. (N.B. KeyKOS/370 was written in 370 Assembler H). Yeah, but 370 Assembler H had a very extensive macro facility and you could hide all kinds of wierd stuff in 370 code. Not too many folk left around who can read it. The big advantage of Assembler for detecting compiler Trojans is that the object code has a simple relation to the source code. It is quite straight forward to see if the assembler has inserted/deleted code ala Thompson. It is certainly not impossible to do this for a higher level language such as C. If the compiler allows you to associate source line numbers with offsets in the object code it is even easier. You "merely" have to look at the object code line-for-line and see if it is a reasonable result of the source code. I have occasionally had to do almost exactly this when line-by-line stepping of a C program was too course grained for the bug I was seeking. - Bill Frantz | The availability and use of secure encryption may | Periwinkle | offer an opportunity to reclaim some portion of | Consulting | the privacy we have lost. - B. FLETCHER, Circuit Judge|
Re: having source code for your CPU chip -- NOT
Arnold Reinhold wrote: Perry, if you really believe that the question of whether a given lump of object code contains a Thompson Trap is formally undecidable I'd be interested in seeing a proof. Otherwise Herr Goedel has nothing to do with this. That sure smells undecidable to me. Any non-trivial predicate P on Turing machines (non-trivial meaning that both P and not-P are non-empty) is undecidable by Rice's Theorem. There are technical issues in encoding onto the tape all possible interactions with the world -- the theorem doesn't apply if some inputs are deemed illegal -- but, hey, it's all countably infinite; re-encode with the naturals. The practical impact of this is not immediately apparent. All non-trivial theorem-proving about programs is futile in this same sense, but people do it, or try. They have a lot of difficulties less arcane than running into the pathological cases constructed to prove these undecidability theorems. Your argument reminds me of claims I always hear that nothing can be measured because of the Heisenberg Uncertainty principle. I do feel your pain. -- Eli Brandt | [EMAIL PROTECTED] | http://www.cs.cmu.edu/~eli/
Re: having source code for your CPU chip -- NOT
First of all, I've always thought that Thompson's paper was excessively defeatist. It should be possible to bootstrap an open source C++ compiler from a simple C subset and then have several grouts independently produce subset C compilers in assembler code. Or compile it on ancient machines, e.g. PDP-11 using original compilers. The bad guys can't have anticipated that far ahead. [You sound like Hilbert, hoping that the consistency and completeness of the Principia could be proven using "only" finitistic methods. --Perry] Having said that, perhaps the Open CPU requires open CAD software as well. It should be possible to verify masks against the design and physical chips against the mask. Finally, I wonder if it might be possible to implement a RISC CPU on an FPGA. It might even be possible to write a CPU generator that made layout choices at random so that it would be unlikely that any two FPGA CPU would have the same layout. This would make it hard to build a trap door into the FPGA itself. Arnold Reinhold At 6:58 PM -0700 9/19/99, John Gilmore wrote: On the other hand, having the actual CPU source, we could stop worrying about Intel's ID gaffs, and RNG support, and "know" it is built correctly. Even if you designed the chip and contracted out the fabrication, you will not know that it is built correctly. Even if you ran the fab and shuttled the wafers from machine to machine yourself. I have done design verification for complex chips (in the SPARCstation-1 and -2). You can certainly test that it does everything you designed it to do. You can't test for the *absence* of backdoors or trojan horses. If someone jiggered your CAD software to insert circuitry that turns on the supervisor bit for one instruction if you execute seventeen ADDs in a row, you'll never find it unless someone points you at it. (And it won't be in your "source code", only in your physical circuitry. You could find it in the photographic masks, or in a chip, nowhere else.) Remember Ken Thompson's _Reflections on Trusting Trust_: http://www.acm.org/classics/sep95/ It's a very short paper, readable by everyone on the list. Read it now! You'll be shocked. John
Re: having source code for your CPU chip -- NOT
On the other hand, having the actual CPU source, we could stop worrying about Intel's ID gaffs, and RNG support, and "know" it is built correctly. Even if you designed the chip and contracted out the fabrication, you will not know that it is built correctly. Even if you ran the fab and shuttled the wafers from machine to machine yourself. I have done design verification for complex chips (in the SPARCstation-1 and -2). You can certainly test that it does everything you designed it to do. You can't test for the *absence* of backdoors or trojan horses. If someone jiggered your CAD software to insert circuitry that turns on the supervisor bit for one instruction if you execute seventeen ADDs in a row, you'll never find it unless someone points you at it. (And it won't be in your "source code", only in your physical circuitry. You could find it in the photographic masks, or in a chip, nowhere else.) Remember Ken Thompson's _Reflections on Trusting Trust_: http://www.acm.org/classics/sep95/ It's a very short paper, readable by everyone on the list. Read it now! You'll be shocked. John