Re: having source code for your CPU chip -- NOT

1999-09-24 Thread Greg Rose

At 09:02 23/09/1999 -0400, Steven M. Bellovin wrote:
By example, I 
could verify the machine code for IDEA, but not PGP and certainly not your 
favorite version of UNIX.

Actually, while there are bugs and security holes, it's pretty certain that
V6 Unix didn't have any crypto trapdoors ... and you can now own your very
own source code license for early Unix including C compiler, complete with
source for a PDP-11 emulator to run it on... this might come in handy one
day as a stable, recreatable base.

See http://minnie.cs.adfa.edu.au/PUPS , the PDP and Unix Preservation Society.

[Of course, what guarantees does one have about the provenance of the
code? --Perry]

regards,
Greg.

Greg Rose   INTERNET: [EMAIL PROTECTED]
Qualcomm Australia  VOICE:  +61-2-9181-4851   FAX: +61-2-9181-5470
Suite 410, Birkenhead Point,   http://people.qualcomm.com/ggr/ 
Drummoyne NSW 2047  232B EC8F 44C6 C853 D68F  E107 E6BF CD2F 1081 A37C



Re: having source code for your CPU chip -- NOT

1999-09-24 Thread Ray Hirschfeld

 Date: Thu, 23 Sep 1999 18:38:57 -0400 (EDT)
 From: Eli Brandt [EMAIL PROTECTED]
 
 Arnold Reinhold wrote:
  Perry, if you really believe that the question of whether a given 
  lump of object code contains a Thompson Trap is formally undecidable 
  I'd be interested in seeing a proof. Otherwise Herr Goedel has 
  nothing to do with this.
 
 That sure smells undecidable to me.  Any non-trivial predicate P on
 Turing machines (non-trivial meaning that both P and not-P are
 non-empty) is undecidable by Rice's Theorem.  There are technical
 issues in encoding onto the tape all possible interactions with the
 world -- the theorem doesn't apply if some inputs are deemed illegal
 -- but, hey, it's all countably infinite; re-encode with the naturals.

I don't think Rice's Theorem applies in this case because it's not a
language property.  The non-trivial predicate should be on
r.e. languages, not on Turing machines.  For example, determining
whether a Turing machine accepts a string consisting of 3 symbols is
undecidable by Rice's Theorem (since some r.e. languages contain
strings of 3 symbols and others don't), but determining whether a
Turing machine has 3 states is not.

That's not to say that it's not undecideable!  Just that Rice's
Theorem is insufficient to prove it (unless you can somehow
reformulate it as a language property).  Instead you might try
directly reducing some undecideable problem to it.

(For those who don't know the terminology, r.e. = recursively
enumerable = accepted by a Turing machine.)



Re: having source code for your CPU chip -- NOT

1999-09-24 Thread Arnold Reinhold

At 6:38 PM -0400 9/23/99, Eli Brandt wrote:
Arnold Reinhold wrote:
  Perry, if you really believe that the question of whether a given
  lump of object code contains a Thompson Trap is formally undecidable
  I'd be interested in seeing a proof. Otherwise Herr Goedel has
  nothing to do with this.

That sure smells undecidable to me.  Any non-trivial predicate P on
Turing machines (non-trivial meaning that both P and not-P are
non-empty) is undecidable by Rice's Theorem.  There are technical
issues in encoding onto the tape all possible interactions with the
world -- the theorem doesn't apply if some inputs are deemed illegal
-- but, hey, it's all countably infinite; re-encode with the naturals.


I am not asking about the class of all Turing machines, just one 
particular lump of object code OC that happens to be a compiler--say 
the C++ compiler on the Red Hat 6.0 distribution. And the question is 
whether OC, which was compiled from a particular source file SF, only 
contains code attributable to SF or contains some additional 
self-propagating code inserted by the compiler OCprev that produced 
OC. SF is trusted by assumption.  Only OCprev is suspect. You can 
even modify SF to some extent, say to turn off optimization or to 
produce an auxiliary file that matches object code to source 
statements.

If you can answer this question just one time then Thompson's attack 
can be defeated. This problem is not remotely undecidable. One way to 
solve it is to write a new compiler from scratch OCalt. Then see if 
(OCalt(SF))(SF) = OC.  As others have pointed out, you can also 
attempt to ascribe each byte of object code to the source that 
generated it.


The practical impact of this is not immediately apparent.  All
non-trivial theorem-proving about programs is futile in this same
sense, but people do it, or try.  They have a lot of difficulties less
arcane than running into the pathological cases constructed to prove
these undecidability theorems.

  Your argument reminds me of claims I always
  hear that nothing can be measured because of the Heisenberg
  Uncertainty principle.

I do feel your pain.

More like disgust, actually. Too many computer scientists engage in 
this sort of name dropping, citing theorems that generally apply only 
in some infinite limit and specifically do not apply to the case at 
hand. Most people in the industry, who think an uncountable cardinal 
has something to do with electing the Pope, take their pronouncements 
on faith and just give up.

The fact is almost nothing in crypto is on a rigorous mathematical 
foundation. No one has proved that factoring or discrete logs are 
hard problems (at least not publicly). Yet almost all the 
difficulties in building trustable systems do not lie in 
undecidablility results but in human failings.

The Open Source model, in my opinion, is the best hope of creating 
trusted systems we have. Designing a process that can demonstrate 
that a given system is totally derived from a body of published code 
is an important goal. The Thompson paper is a often cited as proof 
that this cannot be done. It deserves to be debunked.

Arnold Reinhold




Re: having source code for your CPU chip -- NOT

1999-09-24 Thread Matt Crawford

  Perry, if you really believe that the question of whether a given 
  lump of object code contains a Thompson Trap is formally undecidable 
  I'd be interested in seeing a proof.
 
 That sure smells undecidable to me.  Any non-trivial predicate P on
 Turing machines (non-trivial meaning that both P and not-P are
 non-empty) is undecidable by Rice's Theorem.   [...]

There are no Turing machines.  Real computers are finite, and real
source codes are finite.  I'm sure that if you set a limit on the
length of the source code which is recognized by the supposed trap, a
sufficiently large FSM can decide in a finite time whether there's a
trap.
__
Matt Crawford[EMAIL PROTECTED] Fermilab
"A5.1.5.2.7.1. Remove all classified and CCI boards from the COMSEC equipment,
thoroughly smash them with a hammer or an ax, and scatter the pieces."





Re: having source code for your CPU chip -- NOT

1999-09-24 Thread Bill Sommerfeld

 There are no Turing machines.  Real computers are finite, and real
 source codes are finite.  I'm sure that if you set a limit on the
 length of the source code which is recognized by the supposed trap, a
 sufficiently large FSM can decide in a finite time whether there's a
 trap.

mere finiteness doesn't help much in practice if you're up against
algorithms which take time exponential in some parameter (like the
size of the 'trap' region) which is likely to get even moderately
sized..

- Bill



Re: having source code for your CPU chip -- NOT

1999-09-24 Thread Eli Brandt

Ray Hirschfeld wrote:
  That sure smells undecidable to me.  Any non-trivial predicate P on
  Turing machines (non-trivial meaning that both P and not-P are
  non-empty) is undecidable by Rice's Theorem.  There are technical
  issues in encoding onto the tape all possible interactions with the
  world -- the theorem doesn't apply if some inputs are deemed illegal
  -- but, hey, it's all countably infinite; re-encode with the naturals.
 
 I don't think Rice's Theorem applies in this case because it's not a
 language property.  The non-trivial predicate should be on
 r.e. languages, not on Turing machines.

Yeah, I was trying to avoid bringing in the term "recursively
enumerable", and I tripped myself up.  The "technical issue" I mumbled
about was supposed to address this, but that reduction is in the wrong
direction: I need to show that if you can answer Thompson(M) you can
answer Thompson'(L), not the other way around.  Which, when one stops
waving one's hands and thinks about it, is not going to happen, not
without drawing on some information about Thompson().

-- 
 Eli Brandt  |  [EMAIL PROTECTED]  |  http://www.cs.cmu.edu/~eli/



Re: having source code for your CPU chip -- NOT

1999-09-24 Thread Eli Brandt

Arnold Reinhold wrote:
 Arnold Reinhold wrote:
   Perry, if you really believe that the question of whether a given
   lump of object code contains a Thompson Trap is formally undecidable
[...]
 I am not asking about the class of all Turing machines, just one 
 particular lump of object code OC that happens to be a compiler--say 
 the C++ compiler on the Red Hat 6.0 distribution.

I must admit, even though you said "whether a given lump" I assumed
the decision procedure was supposed to exist uniformly over lumps.  
If you swap the quantifiers, sure, for any program there exists a
decision procedure -- either the decider that always says "yes" or the
one that always says "no".  (Okay, I see why you don't find theory
relevant.)

-- 
 Eli Brandt  |  [EMAIL PROTECTED]  |  http://www.cs.cmu.edu/~eli/



Re: having source code for your CPU chip -- NOT

1999-09-24 Thread David Honig

At 06:38 PM 9/23/99 -0400, Eli Brandt wrote:
Arnold Reinhold wrote:
 Perry, if you really believe that the question of whether a given 
 lump of object code contains a Thompson Trap is formally undecidable 
 I'd be interested in seeing a proof. Otherwise Herr Goedel has 
 nothing to do with this.

That sure smells undecidable to me.  Any non-trivial predicate P on

Folks, its not that complex.  Thompson suggested that
the trojan compiler subvert only login and compiler code.   The trojan
would not subvert generic
code; no funky Godelian self-referential stuff is necessary.  Of course,
this was before the 90s: the net, downloadable
code, chronically lame OSes made other approaches much easier.













  







Re: having source code for your CPU chip -- NOT

1999-09-24 Thread Eugene Leitl


For the truly paranoid: it is perfectly possible to boostrap a working
Forth environment *by hand*, whether by hand assembly and flashing the
resulting image, or by porting eForth (or any Forths written in C) to
an embedded target.

You simply can't fit any Trojan in there: a minimal Forth OS can fit
into just 2 k, typical environments take 12..16 kBytes. Of course
you're abandoning any GNU/Unix compatibility, but the intrinsic
rewards of a Forth environment can be considerable -- I don't know of
any more productive system.

Is there a crypto code library out there?

Greg Rose writes:
  At 09:02 23/09/1999 -0400, Steven M. Bellovin wrote:
  By example, I 
  could verify the machine code for IDEA, but not PGP and certainly not your 
  favorite version of UNIX.
  
  Actually, while there are bugs and security holes, it's pretty certain that
  V6 Unix didn't have any crypto trapdoors ... and you can now own your very
  own source code license for early Unix including C compiler, complete with
  source for a PDP-11 emulator to run it on... this might come in handy one
  day as a stable, recreatable base.
  
  See http://minnie.cs.adfa.edu.au/PUPS , the PDP and Unix Preservation Society.
  
  [Of course, what guarantees does one have about the provenance of the
  code? --Perry]



Re: having source code for your CPU chip -- NOT

1999-09-23 Thread Martin Minow

At 9:26 AM -0700 9/22/99, Bill Frantz wrote:

My own approach would be to audit the generated code.  In KeyKOS/370, we
"solved" the problem by using an assembler which was written before KeyKOS
was designed.  (N.B. KeyKOS/370 was written in 370 Assembler H).


Yeah, but 370 Assembler H had a very extensive macro facility and
you could hide all kinds of wierd stuff in 370 code. Not too many
folk left around who can read it.

I have a copy of Decus C (Open Source PDP-11 C) lying around and
wrote enough of its compiler and code generator to know what it can
and cannot do, in case anyone is interested. The entire source code
of the C compiler is small enough to sight-verify in about a man-month.
A "Small C" compiler (see early issues of Dr. Dobbs) can be implemented
in about 3 man months and ought to be good enough for crypto work.

Martin Minow
[EMAIL PROTECTED]


[And then how do you trust your assembler? Or the compiler and
assembler you compiled the C compiler on? And the linker? If you
really try hard enough on all this, you find your self smack dab in
front of Kurt Goedel's door, and he tends to have unpleasant news for
visitors who come to him looking for solace.

And of course, once you've done all this lovely work, the NSA comes in
and puts a microscopic bug into your keyboard cable in the night, or
replaces your hand verified assembler executables, or...

I suggest that in practical terms, one has to set some reasonable
limits on what one is willing to do to overcome risk. Paranoia is a
potential source of infinite work, but there is only a finite amount
of work one can do in a given lifetime. That is not to say that *some*
paranoia isn't of value, but perfect paranoia results in a perfect
absence of progress on one's projects.

   --Perry]



Re: having source code for your CPU chip -- NOT

1999-09-23 Thread Bill Frantz

At 10:26 PM -0700 9/22/99, Martin Minow wrote:
At 9:26 AM -0700 9/22/99, Bill Frantz wrote:

My own approach would be to audit the generated code.  In KeyKOS/370, we
"solved" the problem by using an assembler which was written before KeyKOS
was designed.  (N.B. KeyKOS/370 was written in 370 Assembler H).


Yeah, but 370 Assembler H had a very extensive macro facility and
you could hide all kinds of wierd stuff in 370 code. Not too many
folk left around who can read it.

The big advantage of Assembler for detecting compiler Trojans is that the
object code has a simple relation to the source code.  It is quite straight
forward to see if the assembler has inserted/deleted code ala Thompson.

It is certainly not impossible to do this for a higher level language such
as C.  If the compiler allows you to associate source line numbers with
offsets in the object code it is even easier.  You "merely" have to look at
the object code line-for-line and see if it is a reasonable result of the
source code.  I have occasionally had to do almost exactly this when
line-by-line stepping of a C program was too course grained for the bug I
was seeking.


-
Bill Frantz | The availability and use of secure encryption may |
Periwinkle  | offer an opportunity to reclaim some portion of   |
Consulting  | the privacy we have lost. - B. FLETCHER, Circuit Judge|





Re: having source code for your CPU chip -- NOT

1999-09-23 Thread Eli Brandt

Arnold Reinhold wrote:
 Perry, if you really believe that the question of whether a given 
 lump of object code contains a Thompson Trap is formally undecidable 
 I'd be interested in seeing a proof. Otherwise Herr Goedel has 
 nothing to do with this.

That sure smells undecidable to me.  Any non-trivial predicate P on
Turing machines (non-trivial meaning that both P and not-P are
non-empty) is undecidable by Rice's Theorem.  There are technical
issues in encoding onto the tape all possible interactions with the
world -- the theorem doesn't apply if some inputs are deemed illegal
-- but, hey, it's all countably infinite; re-encode with the naturals.

The practical impact of this is not immediately apparent.  All
non-trivial theorem-proving about programs is futile in this same
sense, but people do it, or try.  They have a lot of difficulties less
arcane than running into the pathological cases constructed to prove
these undecidability theorems.

 Your argument reminds me of claims I always 
 hear that nothing can be measured because of the Heisenberg 
 Uncertainty principle.

I do feel your pain.

-- 
 Eli Brandt  |  [EMAIL PROTECTED]  |  http://www.cs.cmu.edu/~eli/



Re: having source code for your CPU chip -- NOT

1999-09-22 Thread Arnold Reinhold

First of all, I've always thought that Thompson's paper was 
excessively defeatist. It should be possible to bootstrap an open 
source C++ compiler from a simple C subset and then have several 
grouts independently produce subset C compilers in assembler code. Or 
compile it on ancient machines, e.g. PDP-11 using original compilers. 
The bad guys can't have anticipated that far ahead.

[You sound like Hilbert, hoping that the consistency and completeness
of the Principia could be proven using "only" finitistic
methods. --Perry]

Having said that, perhaps the Open CPU requires open CAD software as 
well. It should be possible to verify masks against the design and 
physical chips against the mask.

Finally, I wonder if it might be possible to implement a RISC CPU on 
an FPGA. It might even be possible to write a CPU generator that made 
layout choices at random so that it would be unlikely that any two 
FPGA CPU would have the same layout. This would make it hard to build 
a trap door into the FPGA itself.

Arnold Reinhold

At 6:58 PM -0700 9/19/99, John Gilmore wrote:
  On the other hand, having the actual CPU source, we could stop worrying
  about Intel's ID gaffs, and RNG support, and "know" it is built correctly.

Even if you designed the chip and contracted out the fabrication,
you will not know that it is built correctly.  Even if you ran the fab
and shuttled the wafers from machine to machine yourself.

I have done design verification for complex chips (in the SPARCstation-1
and -2).  You can certainly test that it does everything you designed it
to do.  You can't test for the *absence* of backdoors or trojan horses.
If someone jiggered your CAD software to insert circuitry that turns on
the supervisor bit for one instruction if you execute seventeen ADDs in
a row, you'll never find it unless someone points you at it.  (And it
won't be in your "source code", only in your physical circuitry.  You
could find it in the photographic masks, or in a chip, nowhere else.)

Remember Ken Thompson's _Reflections on Trusting Trust_:

   http://www.acm.org/classics/sep95/

It's a very short paper, readable by everyone on the list.  Read it now!
You'll be shocked.

   John




Re: having source code for your CPU chip -- NOT

1999-09-19 Thread John Gilmore

 On the other hand, having the actual CPU source, we could stop worrying
 about Intel's ID gaffs, and RNG support, and "know" it is built correctly.

Even if you designed the chip and contracted out the fabrication,
you will not know that it is built correctly.  Even if you ran the fab
and shuttled the wafers from machine to machine yourself.

I have done design verification for complex chips (in the SPARCstation-1
and -2).  You can certainly test that it does everything you designed it
to do.  You can't test for the *absence* of backdoors or trojan horses.
If someone jiggered your CAD software to insert circuitry that turns on
the supervisor bit for one instruction if you execute seventeen ADDs in
a row, you'll never find it unless someone points you at it.  (And it
won't be in your "source code", only in your physical circuitry.  You
could find it in the photographic masks, or in a chip, nowhere else.)

Remember Ken Thompson's _Reflections on Trusting Trust_:

http://www.acm.org/classics/sep95/

It's a very short paper, readable by everyone on the list.  Read it now!
You'll be shocked.

John