Re: [Cryptography] prism-proof email in the degenerate case

2013-10-10 Thread John Denker
On 10/10/2013 02:20 PM, Ray Dillinger wrote:

 split the message stream
 into channels when it gets to be more than, say, 2GB per day.

That's fine, in the case where the traffic is heavy.

We should also discuss the opposite case:

*) If the traffic is light, the servers should generate cover traffic.

*) Each server should publish a public key for /dev/null so that
 users can send cover traffic upstream to the server, without
 worrying that it might waste downstream bandwidth.

 This is crucial for deniabililty:  If the rubber-hose guy accuses
 me of replying to ABC during the XYZ crisis, I can just shrug and 
 say it was cover traffic.


*) Messages should be sent in standard-sized packets, so that the
 message-length doesn't give away the game.

*) If large messages are common, it might help to have two streams:
 -- the pointer stream, and
 -- the bulk stream.

It would be necessary to do a trial-decode on every message in the
pointer stream, but when that succeeds, it yields a pilot message
containing the fingerprints of the packets that should be pulled 
out of the bulk stream.  The first few bytes of the packet should 
be a sufficient fingerprint.  This reduces the number of trial-
decryptions by a factor of roughly sizeof(message) / sizeof(packet).

From the keen-grasp-of-the-obvious department:

*) Forward Secrecy is important here.

The cryptography mailing list

[Cryptography] heterotic authority + web-of-trust + pinning

2013-09-28 Thread John Denker
Hash: SHA1

On 09/25/2013 04:59 AM, Peter Gutmann wrote:

 Something that can sign a new RSA-2048 sub-certificate is called a CA.  For 
 a browser, it'll have to be a trusted CA.  What I was asking you to explain 
 how the browsers are going to deal with over half a billion (source: Netcraft 
 web server survey) new CAs in the ecosystem when websites sign a new 

There are other ways of thinking about it that makes it seem not 
quite so bad.

There are many approaches to establishing trust.  Familiar examples
 *) The top-down authoritarian X.509 approach, such as we see in SSL.
 *) The pinning-only approach, such as we see in SSH.
 *) The web-of-trust approach, such as we see in PGP.

Each of these has some security advantages and disadvantages.  Each 
has some convenience advantages and disadvantages.

My point for today is that one can combine these in ways that are 
heterotic, i.e. that show hybrid vigor.

 -- The example of combining the CA approach with pinning has already
  been mentioned.

 -- Let's now discuss how one might combine the CA approach with the 
  web-of-trust approach.  Here's one possible use-case:

  Suppose you have a HTTPS web site using a certificate that you bought
  from some godlike CA.  When it expires, you buy another to replace it.
  So far so good.

  However, it would be even better if you could use the old certificate
  to sign the new one.  This certifies that you /intend/ for there to be
  a continuation of the same security relationship.  [As a detail, in
  this approach, you want a certificate to have three stages of life:
  (1) active, in normal use, (2) retired, not in active use, but still
  valid for signing its successor, and (3) expired, not used at all.]

  This is like PGP in the sense that the new certificate has multiple
  signatures, one from the top-down CA and one from the predecessor.
  The idea of having multiple signatures is foreign to the heirarchical
  authoritarian X.509 way of thinking, but I don't see any reason why
  this would be hard to do.

 -- Similar heterotic thinking applies to SSH.  Suppose I want to replace 
  my old host key with another.  It would be nice to use the old one to
  /sign/ the new one, so that a legitimate replacement doesn't look like
  a MITM attack.  (In theory, you could validate a new SSH keypair by
  distributing the fingerprint via SSL or PGP, which reduces it to a
  problem previously solved ... but that's labor-intensive, and AFAICT
  hardly anybody but me ever bothers to do it.)

 Something that can sign a new RSA-2048 sub-certificate is called a CA. 

You could call it that, but you could just call it a /signer/.  PGP 
has already demonstrated that you can have millions upon millions of 
signers. In the use-case sketched above, we don't even need a keyserver.
The web site just offers its public key, plus a certifiate signed by 
the CA, plus another certificate signed by the predecessor key.

   For end-to-end security of email, where it may be that neither end
   is a server, some sort of keyserver is probably necessary.  This
   seems like a manageable problem.

We agree that half a billion CAs would be too many, if they all had the
power to sign anything and everything.  Forsooth, my system already has 
321 certificates in /etc/ssl/certs, and that seems like waaay too many, 
IMHO.  That's because the adversay needs to subvert only one of them, 
and the adversary gets to pick and choose.

On the other hand, if we think in terms of a /signer/ with much more
limited power, perhaps only the power to countersign a successor cert
that has already been signed by a CA, that sounds to me like a good 
thing, not a bad thing.

Version: GnuPG v1.4.12 (GNU/Linux)
Comment: Using GnuPG with Thunderbird -

The cryptography mailing list

Re: [Cryptography] real random numbers

2013-09-15 Thread John Denker
Previously I said we need to speak more carefully about these
things.  Let me start by taking my own advice:

Alas on 09/14/2013 12:29 PM, I wrote:
 a) In the linux random device, /any/ user can mix stuff into the
 driver's pool.  This is a non-privileged operation.  The idea is that
 it can't hurt and it might help.  So far so good. b) Contributions of
 the type just mentioned do *not* increase the driver's estimate of
 the entropy in the pool.  If you want to increase the
 entropy-estimate, you need to issue a privileged ioctl.

 ... step (a) cannot get anybody into trouble.  Step (b) gets you into
 trouble if you claim credit for more entropy than was actually

Actually it's one step more complicated than that.  Step (a) 
causes problems if you /underestimate/ the entropy content of
what you contributed.  The problem is that the end-user
application will try to read from the RNG and will stall
due to insufficient entropy available.

Step (b) has the opposite problem: You get into trouble if 
you /overestimate/ the entropy of what you have contributed.
This causes insidious security problems, because your allegedly 
random numbers are not as random as you think.

On 09/14/2013 03:12 PM, John Kelsey wrote:

 Your first two categories are talking about the distribution of 
 entropy--we assume some unpredictability exists, and we want to 
 quantify it in terms of bits of entropy per bit of output.  That's a 
 useful distinction to make, and as you said, if you can get even a 
 little entropy per bit and know how much you're getting, you can get 
 something very close to ideal random bits out.
 Your second two categories are talking about different kinds of 
 sources--completely deterministic, or things that can have
 randomness but don't always.  That leaves out sources that always
 have a particular amount of entropy (or at least are always expected

That very much depends on what you mean by expected.
 -- An ill-founded expectation is little more than a wild guess,
  and it is not useful for critical applications.
 ++ OTOH a well-founded statistical expectation value is just
  what we need, and it moves the source firmly out of the
  squish category.

I say again, a squish is not reliably predictable /and/ not
reliably unpredictable.  If you have *any* trustworthy nonzero
lower bound on the entropy content, it's not a squish.

On the other hand, again and again people latch onto something
that is not reliably predictable, call it random, and try
to do something with it without establishing any such lower
bound.  This has led to disaster again and again.

There is a ocean of difference between not reliably predictable
and reliably unpredictable.

 I'd say even the squish category can be useful in two way
 a.  If you have sensible mechanisms for collecting entropy, they 
 can't hurt and sometimes help.  For example, if you sample an 
 external clock, most of the time, the answer may be deterministic, 
 but once in awhile, you may get some actual entropy, in the sense 
 that the clock drift is sufficient that the sampled value could have 
 one of two values, and an attacker can't know which.

However, alas, the good guys don't know how much either, so they 
don't know much to take credit for.  An underestimate causes the 
RNG to stall, and an overestimate means the output is not as random
as it should be.  I vehemently recommend against risking either of
these failures.

I emphasize that there are two operations that must be considered
 1) Mixing stuff into the driver's pool, and
 2) taking credit for it ... the right amount of credit.

One without the other is strictly amateur hour.

 b.  If you sample enough squishes, you may accumulate a lot of 

You might, or you might not.  In an adversarial situation, this
is begging for trouble.  I vehemently recommend against this.

 Some ring oscillator designs are built like this, hoping to 
 occasionally sample the transition in value on one of the 

Hope is not an algorithm.

 The idea is that the rest of the behavior of the oscillators might
 possibly be predicted by an attacker, but what value gets read when
 you sample a value that's transitioning between a 0 and a 1 is really
 random, changed by thermal noise.

So quantify the thermal noise already.  It sounds like you are
using the oscillator as a crude digitizer, digitizing the thermal
noise, which is the first step in the right direction.  The next 
step to come up with a hard lower bound on the entropy density.

OTOH when you plug in the actual numbers, you will probably find 
that the oscillator is incredibly inefficient compared to a 

My main point is, there is a perfectly reasonable formalism for 
analyzing these things, so that hope is not required.

Secondarily, there is a huge industry mass-producing soundcards
at a very low price.  Very often, a soundcard is build into the
mainboard, whether you ask for it or not.  So in 

Re: [Cryptography] real random numbers

2013-09-13 Thread John Denker
Executive summary:

The soundcard on one of my machines runs at 192000 Hz.  My beat-up 
old laptop runs at 96000.  An antique server runs at only 48000. 
There are two channels and several bits of entropy per sample.
That's /at least/ a hundred thousand bits per second of real 
industrial-strength entropy -- the kind that cannot be cracked, 
not by the NSA, not by anybody, ever.

Because of the recent surge in interest, I started working on a 
new version of turbid, the software than manages the soundcard 
and collects the entropy.  Please give me another week or so.

The interesting point is that you rally want to rely on the
laws of physics.  Testing the output of a RNG can give an upper 
bound on the amount of entropy, but what we need is a lower bound, 
and only physics can provide that.  The physics only works if 
you /calibrate/ the noise source.  A major selling point of turbid
is the calibration procedure.  I'm working to make that easier for 
non-experts to use.

Concerning radioactive sources:

My friend Simplicio is an armchair cryptographer.  He has a proposal 
to replace triple-DES with quadruple-rot13.  He figures that since it
is more complicated and more esoteric, it must be better.

Simplicio uses physics ideas in the same way.  He thinks radioactivity 
is the One True Source of randomness.  He figures that since it is
more complicated and more esoteric, it must be better.

In fact, anybody who knows the first thing about the physics involved
knows that quantum noise and thermal noise are two parts of the same
elephant.  Specifically, there is only one physical process, as shown
by figure 1 here:
Quantum noise is the low-temperature asymptote, and thermal noise is
the high-temperature asymptote of the /same/ physical process.

So ... could we please stop talking about radioactive random number
generators and quantum random number generators?  It's embarrassing.

It is true but irrelevant that somebody could attempt a denial-of-service
attack against a thermal-noise generator by pouring liquid nitrogen
over it.  This is irrelevant several times over because:
 a) Any decrease in temperature would be readily detectable, and the 
  RNG could continue to function.  Its productivity would go down by
  a factor of 4, but that's all.
 b) It would be far more effective to pour liquid nitrogen over other
  parts of the computer, leading to complete failure.
 c) It would be even more effective (and more permanent) to pour sulfuric 
  acid over the computer.
 d) Et cetera.

The point is, if the attacker can get that close to your computer, you 
have far more things to worry about than the temperature of your noise 
source.  Mathematical cryptographers should keep in mind the proverb 
that says: If you don't have physical security, you don't have security.

To say the same thing in more positive terms:  If you have any halfway-
reasonable physical security, a thermal noise source is just fine, 
guaranteed by the laws of physics.

In practice, the nonidealities associated with radioactive noise are 
far greater than with thermal noise sources ... not to mention the cost 
and convenience issues.

As I have been saying for more than 10 years, several hundred thousand 
bits per second of industrial-strength entropy is plenty for a wide
range of practical applications.  If anybody needs more than that, we
can discuss it ... but in any case, there are a *lot* of services out 
there that would overnight become much more secure if they started 
using a good source of truly random bits.

The main tricky case is a virtual private server hosted in the cloud.
You can't add a real soundcard to a virtual machine.  My recommendation 
for such a machine is to use a high-quality PRNG and re-seed it at 
frequent intervals.  This is a chicken-and-egg situation:
 a) If you have /enough/ randomness stored onboard the VPS, you can 
  set up a secure pipe to a trusted randomness server somewhere else,
  and get more randomness that way.
 b) OTOH if the VPS gets pwned once, it might be pwned forever, because 
  the bad guys can watch the new random bits coming in, at which point
  the bits are no longer random.
 c) On the third hand, if the bad guys drop even one packet, ever,
  you can recover at that point.
 d) I reckon none of this is worth worrying about too much, because
  at some point the bad guys just strong-arm the hosting provider
  and capture your entire virtual machine.
The cryptography mailing list

[Cryptography] auditing a hardware RNG

2013-09-09 Thread John Denker
On 09/05/2013 05:11 PM, Perry E. Metzger wrote:

  A hardware generator can have
 horrible flaws that are hard to detect without a lot of data from many

Can you be more specific?  What flaws?

On 09/08/2013 08:42 PM, James A. Donald wrote:

 It is hard, perhaps impossible, to have test suite that makes sure
 that your entropy collection works.

Yes, it's impossible, but that's the answer to the wrong
question.  See below.

On 09/08/2013 01:51 PM, Perry E. Metzger wrote:

 I'll repeat the same observation I've made a lot: Dorothy Denning's
 description of the Clipper chip key insertion ceremony described the
 keys as being generated deterministically using an iterated block
 cipher. I can't find the reference, but I'm pretty sure that when she
 was asked why, the rationale was that an iterated block cipher can be
 audited, and a hardware randomness source cannot.

Let's assume she actually said that.

-- The fact that she said it does not make it true.  That is,
 the fact that she didn't know how to do the audit does not 
 mean it cannot be done.
-- We agree that her claim has been repeated a lot.  However,
 repetition does not make it true.

So, if anybody still wants to claim a HRNG cannot be audited,
we have to ask:
 *) How do you know?
 *) How sure are you?
 *) Have you tried?
 *) The last time you tried, what went wrong?


Just to remind everybody where I'm coming from, I have been saying
for many many years that mere /testing/ is nowhere near sufficient 
to validate a RNG (hardware or otherwise).  You are welcome to do 
as much testing as you like, provided you keep in mind Dykstra's 
   Testing can show the presence of bugs;
   testing can never show the absence of bugs.

As applied to the RNG problem:
   Testing can provide an upper bound on the entropy.
   What we need is a lower bound, which testing cannot provide.

If you want to know how much entropy there is a given source, we
agree it would be hard to measure the entropy /directly/.  So, as
Henny Youngman would say:  Don't do that.  Instead, measure three
physical properties that are easy to measure to high accuracy, and 
then calculate the entropy via the second law of thermodynamics.

You can build a fine hardware RNG using
 a) A physical source such as a resistor.
 b) A hash function.
 c) Some software to glue it all together.

I rank these components according to likelihood of failure
(under attack or otherwise) as follows:
  (c)  (b)  (a).
That is to say, the hardware part of the hardware RNG is the
/last/ thing I would expect to exhibit an undetectable failure.
If you want the next level of detail:
 a1) Electronic components can fail, but this is very unlikely
  and an undetectable failure is even more unlikely.  The
  computer has billions of components, only a handful of which
  are in the entropy-collecting circuit.  Failures can be
 a2) The correctness of the second law of thermodynamics is 
  very much better established than the correctness of any
  cryptologic hash.
 b) The hash in a HRNG is less likely to fail than the hash
  in a PRNG, because we are placing milder demands on it.
 c) The glue in the HRNG can be audited in the same way as 
  in any other random number generator.

Furthermore, every PRNG will fail miserably if you fail to
seed it properly.  This is a verrry common failure mode.
You /need/ some sort of HRNG for seeding.  Anybody who uses 
deterministic means to obtain random numbers is, of course, 
living in sin.  (John von Neumann)

  Tangential remark:  As for the Clipper key ceremony,
  that doesn't increase the credibility of anybody 
  involved.  I can think of vastly better ways of
  generating trusted, bias-proof, tamper-proof keys.

Bottom line: As H.E. Fosdick would say: 
  Just because you find somebody who doesn't know 
  how to do the audit doesn't mean it cannot be done.

The cryptography mailing list

Re: [Cryptography] tamper-evident crypto?

2013-09-06 Thread John Denker
Hash: SHA1

On 09/05/2013 06:48 PM, Richard Clayton wrote:
 so you'd probably fail to observe any background activity that tested
 whether this information was plausible or not  and then some chance
 event would occur that caused someone from Law Enforcement (or even a
 furnace maintenance technician) to have to look in the basement.

Well, I'm sure /somebody/ on this list is clever enough to 
arrange countersurveillance and counterintrusion measures...
  a) especially given that detecting surveillance and/or
   intrusion is the whole point of the exercise;
  b) especially given that we have all the time in the world 
   to arrange boatloads of nanny-cams and silent alarms etc.,
   arranging everything in advance, before provoking the 
  c) especially given that we know it's a trap, and the
   opponent probably isn't expecting a trap;
  d) especially given that the opponent has a track record
   of being sometimes lazy ... for instance by swearing that 
   the fruits of illegal wiretaps came from a confidential
   informant who has been reliable in the past and using that
   as the basis for a search warrant, at which point you've
   got them for perjury as well as illegal wiretapping,
   *and* you know your information security is broken;
  e) especially given that we get to run this operation
   more than once.

 (assuming that the NSA considered this [kiddie porn]
  important enough to pursue)
  *) If they don't like that flavor of bait, we can give
   them something else.  For example, it is known that 
   there is a large-diameter pipeline from the NSA to the
  *) Again:  We get to run this operation more than once.  

I repeat the question from the very beginning of this thread:
Shouldn't this be part of the /ongoing/ validation of any 
data security scheme?

There's a rule that says that you shouldn't claim a crypto
system is secure unless it has been subjected to serious
cryptanalysis.  I'm just taking the next step in this
direction.  If you want to know whether or not the system
is broken, /measure/ whether or not it is broken.

One of the rules in science, business, military planning,
et cetera is to consider /all/ the plausible hypotheses.
Once you consider the possibility that your data security
is broken, the obvious next step is to design an experiment
to /measure/ how much breakage there is.

Version: GnuPG v1.4.12 (GNU/Linux)
Comment: Using GnuPG with Thunderbird -

The cryptography mailing list

Re: [Cryptography] tamper-evident crypto? (was: BULLRUN)

2013-09-05 Thread John Denker
I don't have any hard information or even any speculation about
BULLRUN, but I have an observation and a question:

Traditionally it has been very hard to exploit a break without 
giving away the fact that you've broken in.  So there are two 
fairly impressive parts to the recent reports:  (a) Breaking 
some modern, widely-used crypto, and (b) not getting caught 
for a rather long time.

To say the same thing the other way, I was always amazed that the
Nazis were unable to figure out that their crypto was broken during 
WWII.  There were experiments they could have done, such as sending
out a few U-boats under strict radio silence and comparing their 
longevity to others.

So my question is:  What would we have to do to produce /tamper-evident/
data security?

As a preliminary outline of the sort of thing I'm talking about, you
could send an encrypted message that says 
  The people at 1313 Mockingbird Lane have an 
   enormous kiddie porn studio in their basement.
and then watch closely.  See how long it takes until they get raided.

Obviously I'm leaving out a lot of details here, but I hope the idea
is clear:  It's a type of honeypot, adapted to detecting whether the
crypto is broken.

Shouldn't something like this be part of the ongoing validation of 
any data security system?

Also . on 09/05/2013 04:35 PM, Perry E. Metzger wrote:

 A d20 has a bit more than 4 bits of entropy. I can get 256 bits with
 64 die rolls, or, if I have eight dice, 16 rolls of the group.

You can get a lot more entropy than that from your sound card, a
lot more conveniently.

  If I mistype when entering the info, no harm is caused. 

I'm not so sure about that.  Typos are not random, and history proves 
that seemingly minor mistakes can be exploited.

 The generator can
 be easily tested for correct behavior if it is simply a block cipher.

I wouldn't have said that.

As Dykstra was fond of saying:
   Testing can show the presence of bugs;
   testing can never show the absence of bugs.

The cryptography mailing list

Re: [Cryptography] Snowden fabricated digital keys to get access to NSA servers?

2013-06-29 Thread John Denker
On 06/28/2013 04:00 PM, John Gilmore wrote:

 Let's try some speculation about what this phrase,
 fabricating digital keys, might mean.

Here's one hypothesis to consider.
 a) The so-called digital key was not any sort of decryption key.
 b) The files were available on the NSA machines in the clear.
 c) The files were protected only by something like the Unix file
  protection mechanism ... or the SELinux Mandatory Access Controls.
 d) The digital key might have been not much more than a userID
  and password, plus maybe a dongle, allowing him to log in as a 
  shadow member of some group that was supposed to have access to 
  the files.


Crypto is great for protecting stuff while it is being transmitted
or being stored offline ... but when the stuff is in active use, 
the temptation is to make a cleartext working copy.  Then anybody
who can attach a thumb drive and can get past the access controls
can grab whatever he wants.

It is against NSA policy to attach a thumb drive.  I betcha some
folks really want to know how he did that without getting caught.

The cryptography mailing list

Re: Disk encryption advice...

2010-10-08 Thread John Denker
On 10/08/2010 04:27 PM, Perry E. Metzger wrote:
 I have a client with the following problem. They would like to
 encrypt all of their Windows workstation drives, but if they do that,
 the machines require manual intervention to enter a key on every
 reboot. Why is this a problem? Because installations and upgrades of
 many kinds of Windows software require multiple reboots, and they
 don't want to have to manually intervene on every machine in their
 buildings in order to push out software and patches.
 (The general threat model in question is reasonably sane -- they
 would like drives to be harmless when machines are disposed of or if
 they're stolen by ordinary thieves, but on the network and available
 for administration the rest of the time.)
 Does anyone have a reasonable solution for this?

1) Thanks for being explicit about the threat model and objectives.
 As is so often the case, what the client wants is probably not
 exactly what the client is asking for.

2) In this case, I reckon the client would be content to encrypt
 _everything of value_ on the drive ... even if this is not quite
 the entire drive.

 In particular: On every normal boot, the machine boots into a
 preliminary kernel that uses key A to request key B over the
 network.  Key B is the disk key, which then gets stored in some 
 guaranteed-volatile place that will survive a chain-boot but
 not survive anything else.  Then the pre-kernel chain-boots

 To be clear:  The entire Windows partition is encrypted, but 
 the pre-kernel lives in a tiny partition of its own that is not

 If the machine is stolen, it is immediately harmless because it 
 is no longer connected to the network.  If the machine is to be
 disposed of *or* if theft is detected or suspected, then the 
 keyserver that hands out disk-keys will stop serving the key 
 for that machine, so even if it is reconnected to the network 
 somehow it is still harmless.  For icing on the cake, the keyserver
 can check IP addresses, virtual circuits, etc. ... to check that 
 a machine that is supposed to be on the secure wired network has 
 not suddenly relocated to the insecure wireless network that serves
 the lobby and leaks out into the parking lot.

 If the network is down and somebody wants to boot his machine
 anyway, he can type in the key by hand.

3) The same effect can be achieved using a hypervisor / VM approach,
 rather than chain-booting.  The same effect can be achieved by
 messing with grub, although that would be more work.  The same
 effect can be achieved by messing with coreboot, but that would
 be even more work.

4) If the customer absolutely insists that the entire Windows drive
 be encrypted, just add another drive, perhaps a very small flash drive.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to

Re: customizing Live CD images

2010-09-10 Thread John Denker
On 08/02/2010 10:47 PM, I wrote:
 We have been discussing the importance of a unique random-seed
 file each system.  This is important even for systems that boot 
 from read-only media such as CD.
 To make this somewhat more practical, I have written a script
 to remix a .iso image so as to add one or more last-minute files.
 The leading application (but probably not the only application)
 is adding random-seed files.

I just now put up new-and-improved versions of these
software tools, and a web page summarizing the situation.


 *) Worst-case customization time:  about a minute on my laptop.
 *) Best case:  Less than 1 second.
 *) The best case is the typical case, if you are making many disks.
 *) The checksums in md5sum.txt are properly recomputed.
 *) Better usage messages, better comments.
 *) The supporting tools are now more robust and more generally useful:
-- Hacking ISO 9660 images
-- Hacking md5sum lists.

Comments and suggestions are welcome.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to

Re: Randomness, Quantum Mechanics - and Cryptography

2010-09-07 Thread John Denker
On 09/07/2010 10:21 AM, Marsh Ray wrote:

 If anybody can think of a practical attack against the randomness
 of a thermal noise source, please let us know.  By practical I
 mean to exclude attacks that use such stupendous resources that
 it would be far easier to attack other elements of the system.
 Blast it with RF for one.

1) This is not an argument in favor of quantum noise over
thermal noise, because the same attack would be at least
as effective against quantum noise.

2) You can shield things so as to make this attack very,
very difficult.

3) The attack is detectable long before it is effective,
whereupon you can shut down the RNG, so it is at best a
DoS attack.  And then you have to compare it against
other brute-force DoS attacks, such as shooting the
computer with an AK-47.

 Typically the natural thermal noise amounts to just a few millivolts,
 and so requires a relatively sensitive A/D converter. This makes it
 susceptible to injected unnatural noise overloading the conversion and
 changing most of the output bits to predictable values.

Even the cheapest of consumer-grade converters has 16 bits of
resolution, which is enough to resolve the thermal noise and
still have _two or three orders of magnitude_ of headroom.  If
you are really worried about this, studio-grade stuff is still
quite affordable, and has even more headroom and better shielding.

How much RF are we talking about here?  At some point you can
undoubtedly DoS the RNG ... but I suspect the same amount of
RF would fry most of the computers, phones, and ipods in the 

Is the RF attack in any way preferable to the AK-47 attack?

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to

Re: Randomness, Quantum Mechanics - and Cryptography

2010-09-07 Thread John Denker
On 09/07/2010 11:19 AM, Perry E. Metzger wrote:
  2) You can shield things so as to make this attack very,
  very difficult.
 I suspect that for some apps like smart cards that might be hard.
 OTOH, it might be straightforward to detect the attempt.

We should take the belt-and-suspenders approach:
 a) Do some reasonable amount of shielding, and
 b) detect the attack.

Detecting the attack is utterly straightforward.  

The primary defense is to close the loop around 
the noise-generating element.  That is, we inject 
a known calibration signal on top of the noise ... 
and use that to constantly check that the input
channel gain and bandwidth are correct.

The true noise level depends only on gain, bandwidth,
temperature, and resistance.  Blasting the system
with RF will not lower the temperature, so that's
not a threat.  So unless you have a scenario where
the RF lowers the resistance, lowers the gain,
and/or lowers the bandwidth 
_in a way that the calibrator cannot detect_
then this attack does not rise above the level of
a brute-force DoS attack, in the same category as 
the AK-47 attack or the stomp-the-smart-card-to-dust

The calibrator idea relies on the fact that the
computer's i/o system has an o as well as an i.

Note that this defense is equally effective against
 *) Continuing attacks, where a continuing RF blast
  drives the first stage amplifier into saturation, 
  without necessarily doing irreversible damage, and
 *) One-shot attacks, where a super-large blast does
  irreversible damage to the amplifier.

Secondary defenses, if you want to go to the trouble,
include putting a canary in the coal mine, i.e.
implementing a second sensor with a different gain,
bandwidth, and resistance.  I reckon that attacking
one sensor and getting away with it is only possible
on a set of measure zero, but the chance of attacking
two non-identical sensors without either one of them 
noticing is a set of measure zero squared.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to

customizing Live CD images (was: urandom etc.)

2010-08-03 Thread John Denker
We have been discussing the importance of a unique random-seed
file each system.  This is important even forsystems that boot 
from read-only media such as CD.

To make this somewhat more practical, I have written a script
to remix a .iso image so as to add one or more last-minute files.
The leading application (but probably not the only application)
is adding random-seed files.

The script can be found at

This version is literally two orders of magnitude more 
efficient than the rough pre-alpha version that I put up
yesterday ... and it solves a more general problem, insofar
as random-seed files are not the only things it can handle.

Early-boot software is outside my zone of comfort, let
alone expertise, so I reckon somebody who is friends with
Casper could make further improvements ... but at least 
for now this script serves as an existence proof to show 
 a) the PRNG situation is not hopeless, even for read-only
   media; and
 b) it is possible to remix Live CD images automatically
   and somewhat efficiently.

I think by taking two steps we can achieve a worthwhile
improvement in security:
 -- each system should have its own unique random-seed
  file, with contents not known to the attackers; and
 -- the init.d/urandom script should seed the PRNG 
  using date +%s.%N  (as well as the random-seed file).

Neither step is worth nearly as much without the other,
but the two of them together seem quite worthwhile.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to

Re: init.d/urandom : saving random-seed

2010-08-02 Thread John Denker
On 07/31/2010 09:00 PM, Jerry Leichter wrote:

 I wouldn't recommend this for high-value security, but then if you're
 dealing with high-value information, there's really no excuse for not
 having and using a source of true random bits.

Yes indeed!

 On the question of what to do if we can't be sure the saved seed file
 might be reused:  Stir in the date and time and anything else that might
 vary - even if it's readily guessable/detectable - along with the seed
 file.  [1]

  This adds minimal entropy, but detecting that a seed file has
 been re-used will be quite challenging.  A directed attack can probably
 succeed, but if you consider the case of a large number of nodes that
 reboot here and there and that, at random and not too often, re-use a
 seed file, then detecting those reboots with stale seed files seems like
 a rather hard problem.  (Detecting them *quickly* will be even harder,
 so active attacks - as opposed to passive attacks that can be made on
 recorded data - will probably be out of the question.)

I've been thinking about that.  That approach might be even *better*
than it first appears.

By way of background, recall that a good design for the central part
of a PRNG is:
   output = hash(key, counter) [2]
  -- the hash is your favorite cryptographically strong hash function;
  -- the counter is just a plain old counter, with enough bits
   to ensure that it will never wrap around; and
  -- the key is unique to this instance of the PRNG, is unknown to
   the attackers, and has enough bits to rule out dictionary attacks.
  -- There should be some scheme for reseeding the key every so 
   often, using real entropy from somewhere. This is outside of what
   I call the central part of the PRNG, so let's defer discussion
   of this point for a few moments.

Note that this works even though the counter has no entropy at all.
It works even if the attacker knows the counter values exactly.

This is crucial to an analysis of idea [1], because I am not sure
that the date/time string has any useful amount of entropy.  Let's
be clear: if the attacker knows what time it is, the data/time
string contains no entropy at all.

Now, if all we need is a /counter/ then the merit of idea [1] goes
up dramatically.

I reckon date +%s.%N makes a fine counter.

Note that date is /bin/date (not /usr/bin/date) so it is usable
very early in the boot process, as desired.

This requires that all boxes have a working Real Time Clock, which
seems like a reasonable requirement.

Security demands that the key in equation [2] be unique on a
machine-by-machine  basis.  This means that if I want my live CD 
to be secure, I cannot simply download a standard .iso image and 
burn it to CD.  I need to
 -- download the .iso image
 -- give it a random-seed file with something unique, preferably 
  from the output of a good TRNG, and
 -- then burn it to disk.

I have very preliminary draft of a script to install a random-seed
file into an Ubunto live-CD image.
Suggestions for improvement would be welcome.


So, let's summarize the recommended procedure as I understand it.
There are two modes, which I call Mode A and Mode B.

In both modes:

 *) there needs to be a random-seed file.  The contents must be
  unique and unknown to the attackers.
 *) /dev/urandom should block or throw an error if it used before
  it is seeded
 *) early in the boot process, the PRNG should be seeded using
  the random-seed file _and_ date +%s.%N.  This should happen
   -- after the random-seed file becomes readable
   -- after the Real Time Clock is available
   -- as soon thereafter as convenient, and
   -- before there is any need to use the output of /dev/urandom
 *) This is all that is necessary for Mode B, which provides a
  /modest/ level of security for a /modest/ level of exposure.  
  As a first rough guess I suggest limiting exposure to 1000 
  hours of operation or 1000 reboots, whichever comes first.
 *) Mode A is the same as Mode B, but has no exposure limits
  because the random-seed is replaced before the Mode-B limits
  are reached.  
 *) It is nice to update the  random-seed on every reboot.  
  This should happen
   -- after the random-seed file becomes writable
   -- as soon thereafter as convenient, to minimize the chance
that the system will crash before the update occurs.
 *) The random-seed file should be updated again during shutdown.
  This allows recovery from a situation where the random-seed
  file might have been compromised.
 *) Updating fails if some wiseguy mounts a filesystem in such a
  way that the random-seed file that gets updated is not the one
  that will be used for seeding the PRNG.  AFAICT Mode A depends
  on having the random-seed file in local writeable persistent
  storage, not on (say) a networked remote file system.  In some
  cases the init.d/urandom script would have to be customized to 
  locate the random-seed file on a 

Re: init.d/urandom : saving random-seed

2010-07-31 Thread John Denker
Hi Henrique --

This is to answer the excellent questions you asked at

Since that bug is now closed (as it should be), and since these
questions are only tangentially related to that bug anyway, I am 
emailing you directly.  Feel free to forward this as appropriate.

 1. How much data of unknown quality can we feed the random pool at boot,
before it causes damage (i.e. what is the threshold where we violate the
you are not goint to be any worse than you were before rule) ?

There is no possibility of making things worse.  It is like shuffling
a deck of cards:  If it is already shuffled, shuffling it some more
is provably harmless.

This property is a core design requirement of the PRNG, and has been
for ages.

Note that writing to /dev/random requires no privileges, which makes
sense in light of this property.

 2. How dangerous it is to feed the pool with stale seed data in the next
boot (i.e. in a failure mode where we do not regenerate the seed file) ?

As is so often the case in the security / crypto business, the answer
depends on your threat model.  The demands placed on a PRNG vary wildly
from one application to another.  Interesting use cases include:
 a) low-grade randomness: For non-adversarial applications such as Monte
  Carlo integration of a physics problem, almost any PRNG will do.  Even
  a LFSR would do, even though a LFSR can easily be cryptanalyzed.  The
  point is that nobody is going to bother attempting the cryptanalysis.
 b) current /dev/urandom: The consensus among experts is that /dev/urandom
  is routinely used in ways for which it is not suited.  See my previous
  email, or refer to
 c) high-grade randomness: For high-stakes adversarial applications, 
  including crypto and gaming, you really ought to use a TRNG not a
  PRNG.  In this case, no state is required and no seed is required, 
  so the question of how to preserve the state across reboots does not
  arise.  Constructive suggestion:  for high-grade applications, use

To repeat:  For serious applications, I wouldn't trust /dev/urandom at all, 
and details of how it gets seeded are mostly just re-arranging the deck 
chairs on the Titanic.  The question is not whether this-or-that seed
preservation policy is safe.  The most you can ask of a seed preservation
policy is that the PRNG after reboot will be _not worse_ than it was before.

Now, to answer the question:  A random-seed file should never be reused.
Never ever.

Reusing the random-seed file makes the PRNG very much worse than it would
otherwise be.  By way of illustration, suppose you are using the computer
to help you play battleship or go fish against a ten-year-old opponent.
If you use the same 'random' numbers after every reboot, the opponent is
going to notice.  You are going to lose.  In more-demanding situations,
against an opponent with more skill and more motivation, you are going to
lose even more miserably.

 3. What is the optimal size of the seed data based on the pool size ?

While we are on the subject, let me point out a bug in all recent versions
of init.d/urandom (including the current sid version as included in
initscripts_2.88dsf-11_amd64.deb) : 
  The poolsize as reported by /proc/sys/kernel/random/poolsize has units 
  of _bits_ whereas the random-seed filesize has units of _bytes_.  It is 
  a bug to directly compare these numbers, or to set one of them based on 
  the other.  There needs to be a conversion factor, perhaps something like
  (( DD_BYTES = ($POOLSIZE + 7)/8 ))

Now, to answer the question:  It suffices to make the random-seed file 
contain the same number of bits as the PRNG's internal state vector 
(poolsize).  Call this the BBJR size (baby-bear-just-right).

On the other hand, it is harmless to make the random-seed file larger than
it needs to be.

In contrast, using the size of the random-seed file to reset the PRNG's 
poolsize is a bad idea, especially if the random-seed file is (intentionally 
or otherwise) bigger or smaller than the BBJR size.

Semi-constructive pseudo-suggestion:  *IF* we want to keep track of the 
poolsize, it might make more sense to store it separately and explicitly, 
in its own file.  This would make the code simpler and more rational.

On the other hand, I'm not sure why there is *any* code in init.d/urandom
for saving or setting the poolsize.  Chez moi /proc/sys/kernel/random/poolsize
is read-only.  Indeed I would expect it to be read-only, since changing 
it would have drastic consequences for the internal operation of the PRNG, 
and looking at random.c I don't see any code to handle such a change.

So the real suggestion is to eliminate from the Linux init.d/urandom 
all of the code that tries to ascertain the size of the random-seed 
file and/or tries to set the poolsize.  (For non-Linux systems, the
situation may or may not be 

Re: Persisting /dev/random state across reboots

2010-07-29 Thread John Denker
On 07/29/2010 12:47 PM, Richard Salz wrote:

 At shutdown, a process copies /dev/random to /var/random-seed which is 
 used on reboots.  [1]

Actually it typically copies from /dev/urandom not /dev/random,
but we agree, the basic idea is to save a seed for use at the
next boot-up.

 Is this a good, bad, or shrug, whatever idea?

Before we can answer that, we must have a brief what's your 
threat model discussion.  As always, I define random to 
mean random enough for the application.  The demands vary 
wildly from application to application.  Interesting use cases
 a) low-grade randomness: For non-adversarial applications
  such as Monte Carlo integration of a physics problem, 
  almost any RNG will do.
 b) high-grade randomness: For high-stakes adversarial
  applications, including crypto and gaming, I wouldn't trust 
  /dev/urandom at all, and details of how it gets seeded are 
  just re-arranging the deck chairs on the Titanic.


A) For low-grade applications, procedure [1] is well suited.
It is clearly better than nothing, although it could be

A conspicuous weakness of procedure [1] is that gets skipped
if the machine goes down due to a software fault, or power
failure, or really anything other than an orderly shutdown.

Rather than writing the file at the last possible opportunity,
it would make at least as much sense to write it much earlier, 
perhaps immediately after boot-up.

B)  At the other extreme, for high-grade applications, /dev/random
is not (by itself) good enough, and asking how it gets seeded is 
just re-arranging deck chairs on the Titanic.

Tangential remark:  It would be nice if the random-seed file 
could be written in a way that did not deplete the amount of
entropy stored in /dev/random ... but given the existing linkage 
between /dev/random and /dev/urandom it cannot.  On the other 
hand, if you have reason to worry about this issue, you shouldn't 
be using /dev/urandom at all anyway.  Remember what von Neuman 
said about living in sin.

Tangential remark:  You could worry about how carefully we
need to read-protect the random-seed file (and all backups
thereof).  But again, if you are worried at that level of
detail, you shouldn't be using a PRNG anyway.  If it needs 
a seed, you are living in sin.

Constructive suggestion:  Use something like Turbid:
i.e. something that generates a steady stream of honest-to-
goodness entropy.

If you are not sure whether you need Turbid, you ought to use 
it.  It's cheap insurance.  The cost of implementing Turbid is 
very small compared to the cost of proving you don't need it.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to

Re: unattended reboot (was: clouds ...)

2009-08-03 Thread John Denker
On 08/01/2009 02:06 PM, Jerry Leichter wrote:

  A while back, I evaluated a
 technology that did it best to solve a basically insoluble problem:  How
 does a server, built on stock technology, keep secrets that it can use
 to authenticate with other servers after an unattended reboot?  

This problem is routinely solved in practice.

 tamper-resistant hardware that controls access to keys, anything the
 software can get at at boot, an attacker who steals a copy of a backup,
 say - can also get at.  

1a) Don't put the keys on the routine backups, and/or
1b) secure the backed-up keys as carefully as you secure the machine 

2) If the machine itself is not secure, you have already lost the
game and there's no hope of securing any keys or certificates on
that machine.

 So, the trick is to use a variety of
 measurements of the hardware - amount of memory, disk sizes, disk serial
 numbers, whatever you can think of that varies from machine to machine

I see no advantage in that.  The only halfway-useful property that 
such data has is that it is not backed up by ordinary off-the-shelf 
backup routines.  That's not an important advantage because it is 
easy to arrange for *any* data of your choosing to be not backed up.
 -- If you routinely back up files, put keys in a special file.  
 -- If you routinely back up entire partition, put keys in a special 
  partition (or outside any partition).  
 -- If you routinely mirror entire drives, put keys on a special drive.  
This is all stock technology.

Let's be clear:  If the attackers have penetrated the machine to the 
point where they can read the keys from a special file/partition/drive, 
they can read the hardware serial numbers etc. as well.

 .  Since hardware does need to be fixed or
 upgraded at times, a good implementation will use some kind of m
 unchanged out of n measurements algorithm.  

That makes life harder for the good guys, and makes life easier for
the bad guys.  Just putting the keys on disk is far more reliable
and practical, especially during hardware maintenance (scheduled or 

On top of all that, there is the very serious risk of a dictionary
attack against the hardware serial numbers.  There's nowhere near
enough entropy in the hardware serial numbers.  There is incomparably
more entropy in a special file/partition/drive.

 Virtualization changes all of this.

That's yet another reason for not taking the hardware serial number
approach.  In contrast, a special file/partition/drive can be virtualized 
in a direct and natural way.

Bottom line:  Relying on hardware serial numbers etc. to defend keys 
is not recommended.  Vastly more practical approaches are available.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to

Re: combining entropy

2008-10-28 Thread John Denker
On 10/28/2008 09:43 AM, Leichter, Jerry wrote:

 | We start with a group comprising N members (machines or 
 | persons).  Each of them, on demand, puts out a 160 bit 
 | word, called a member word.  We wish to combine these 
 | to form a single word, the group word, also 160 bits 
 | in length.
 This isn't enough.  Somehow, you have to state that the values emitted
 on demand in any given round i (where a round consists of exactly one
 demand on all N member and produces a single output result) cannot
 receive any input from any other members.  Otherwise, if N=2 and member
 0 produces true random values that member 1 can see before it responds
 to the demand it received, then member 1 can cause the final result to
 be anything it likes.

Perhaps an example will make it clear where I am coming
from.  Suppose I start with a deck of cards that has been 
randomly shuffled.  It can provide log2(52!) bits of 
entropy.  That's a little more than 225 bits.  Now suppose 
I have ten decks of cards all arranged alike.  You could 
set this up by shuffling one of them and then stacking 
the others to match ... or by some more symmetric process. 
In any case the result is symmetric w.r.t interchange of 
decks.  In this situation, I can choose any one of the 
decks and obtain 225 bits of entropy.  The funny thing 
is that if I choose N of the decks, I still get only 225 
bits of entropy, not N*225.

This can be summarized by saying that entropy is not an
extensive quantity in this situation.  The graph of
entropy versus N goes like this:

 225*   *   *   *   *

0   1   2   3   4   5  (# of decks)

The spooky aspect of this situation is the whack-a-mole 
aspect:  You cannot decide in advance which one of the 
decks has entropy and which N-1 of them do not.  That's 
the wrong question.  The first deck we choose to look 
at has 225 bits of entropy, and only then can we say 
that the other N-1 decks have zero additional entropy.

The original question spoke of trusted sources of
entropy, and I answered accordingly.  To the extent
that the sources are correlated, they were never eligible 
to be considered trusted sources of entropy.  To say 
the same thing the other way around, to the extent
that each source can be trusted to provide a certain
amount of entropy, it must be to that extent independent 
of the others.

It is possible for a source to be partially dependent
and partially independent.  For example, if you take
each of the ten aforementioned decks and cut the deck
randomly and independently, that means the first deck
we look at will provide 225 bits of entropy, and each
one thereafter will provide 5.7 bits of additional
entropy, since log2(52)=5.7.  So in this situation,
each deck can be /trusted/ to provide 5.7 bits of

In this situation, requiring each deck to have no
input from the other decks would be an overly strict
requirement.  We do not need full independence;  we
just need some independence, as quantified by the
provable lower bound on the entropy.

If you wanted, you could do a deeper analysis of this 
example, taking into account the fact that 5.7 is not 
the whole story.  It is easy to use 5.7 bits as a valid 
and trustworthy lower bound, but under some conditions 
more entropy is available, and can be quantified by
considering the _joint_ probability distribution and
computing the entropy of that distribution.  Meanwhile
the fact remains that under a wide range of practical 
conditions, it makes sense to engineer a randomness 
generator based on provable lower bounds, since that 
is good enough to get the job done, and a deeper
analysis would not be worth the trouble.

 If the issue is how to make sure you get out at least all the randomness
 that was there,

I'm going to ignore the At least.  It is very hard to 
get out more than you put in.

On a less trivial note:  The original question did not
require getting out every last bit of available randomness.
In situations where the sources might be partially
independent and partially dependent, that would be a 
very hard challenge, and I do not wish to accept that 

Dealing with provable lower bounds on the entropy is
more tractable, and sufficient for a wide range of
practical purposes.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: combining entropy

2008-10-27 Thread John Denker
On 10/25/2008 04:40 AM, IanG gave us some additional information.

Even so, it appears there is still some uncertainty as to
interpretation, i.e. some uncertainty as to the requirements
and objectives.

I hereby propose a new scenario.  It is detailed enough to
be amenable to formal analysis.  The hope is that it will
satisfy the requirements and objectives ... or at least
promote a more precise discussion thereof.

We start with a group comprising N members (machines or 
persons).  Each of them, on demand, puts out a 160 bit 
word, called a member word.  We wish to combine these 
to form a single word, the group word, also 160 bits 
in length.

We must find a combining function.  The primary objective
is to maximize the entropy of the group word.  A secondary
objective is computational simplicity.

The members can be categorized as follows:
 a) Some of them are abjectly broken.  Their outputs have
  zero entropy density.  Constant outputs and/or predictable
  outputs fall into this category.
 b) Some of them are malicious.  Their outputs may appear
  random, but are in fact predictable by our adversary.
 c) M of them have an entropy density greater than XX.
  As a concrete example, we consider the case where XX=50%, 
  i.e. 80 bits of entropy in a 160 bit word.
 d) Some of them could contain a high entropy density,
  very close to 100%.  For our example, we assume there
  are none of these;  otherwise the problem would be
  too easy.

If we do things properly, case (b) is no worse than case
(a), for reasons that will become apparent shortly, so
we can lump these cases together.

We don't know which generator falls into which category.
All we need to know is that M of the generators are
putting out useful entropy.

I recommend the following combining function:  concatenate
all N of the member words, and then feed that through a
hash function to produce the group word.  Since SHA-1 is
efficient and has a 160 bit output word, it will serve 
nicely in our example.

In the sub-case where M=1, the recommended hash-based
procedure produces a group word with 80 bits of entropy, 
i.e. a 50% entropy density, which is the best we can 
do.  In this sub-case, SHA-1 is no better than XOR.

As M increases, the entropy density of the output word
converges rather rapidly to 100%.  This is subject to
mild assumptions about the hash function actually working
as a hash function, i.e. not being grossly broken.

When M is greater than 1, the hash function approach
is much better than the XOR approach.  Here is an
easy proof:  Consider the case where each member in
category (c) puts out a 160 bit word consisting of 80 
totally random bits in the left half and 80 constant
bits in the right half.  XORing these together only
gets you to 80 bits of entropy in the group word,
whereas hashing is better by a factor of 2.  Actually
(2 minus epsilon) if you want to be fussy about it.

In the case where the entropy is evenly distributed
within the member word, i.e. 160 bits each with a 50%
entropy density, the result is more subtle:  The group
word will converge to 100% entropy density, but the
hash version converges _faster_ than the XOR version.
Here faster means you can get by with a smaller M.
Considerably smaller.  Also (!) beware that to get XOR 
to converge at all, this paragraph depends on some 
properties of the members that may be hard to realize 
in practice ... whereas the hash approach has no
such dependence.


To summarize:  In the special sub-case where M=1, XOR
is as good as it gets.  In all other cases I can think
of, the hash approach is much better.

My analysis applies to a specific set of requirements.
If somebody wants to discuss other requirements, please
be specific about what the requirements are.  There
are innumerable creeping features that we could discuss.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: combining entropy

2008-10-27 Thread John Denker
Alas on 10/25/2008 01:40 PM, I wrote:

 To summarize:  In the special sub-case where M=1, XOR
 is as good as it gets.  In all other cases I can think
 of, the hash approach is much better.

I should have said that in the special sub-case where 
the member word has entropy density XX=100% _or_ in 
the special sub-case where M=1, XOR is as good as it
gets.  In all other cases I can think of, the hash 
approach is much better.

(I excluded the XX=100% case earlier in the note, but
I should have included it in the summary.  Sorry.)

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: combining entropy

2008-10-25 Thread John Denker
On 10/24/2008 03:40 PM, Jack Lloyd wrote:

 Perhaps our seeming disagreement is due to a differing interpretation
 of 'trusted'. I took it to mean that at least one pool had a
 min-entropy above some security bound. You appear to have taken it to
 mean that it will be uniform random?

Thanks, that question advances the discussion.

The answer, however, is no, I did not assume 100% entropy
density.  Here is the critical assumption that I did make:

We consider the scenario where we started with N randomness
generators, but N-1 of them have failed.  One of them is
still working, but we don't know which one.

To say the same thing in more detail:  Suppose we start
with N generators, each of which puts out a 160 bit word
containing 80 bits of _trusted_ entropy.  That's a 50%
entropy density.

Here _trusted_ means we have a provable lower bound on the
entropy.  I assume this is the same as the aforementioned
min-entropy above some security bound.

We next consider the case where N-1 of the generators have 
failed, or can no longer be trusted, which is essentially the
same thing for present purposes.  Now we have N-1 generators 
putting out zero bits of trusted entropy, plus one generator 
putting out 80 bits of trusted entropy.  I emphasize that
these 80 bits of trusted entropy are necessarily uncorrelated
with anything happening on the other N-1 machines, for the
simple reason that they are uncorrelated with anything 
happening anywhere else in the universe ... otherwise they
would not qualify as trusted entropy.

XORing together all N of the 160 bit output words produces
a single 160 bit word containing 80 bits of trusted entropy.
Therefore, unless there is some requirement or objective
that I don't know about, the previously-stated conclusion

 XOR is a good-enough combining function,
 and nothing else would be any better.

XOR is provably correct because it is _reversible_ in the 
thermodynamic sense.  That means it cannot increase or 
decrease the entropy.


Obviously this numerical example generalizes to any entropy
density from zero to 100% inclusive.

To summarize:  The key assumptions are that we have N-1
broken generators and one working generator.  We don't
know which one is working, but we know that it is working 

For more about the theory and practice of high-entropy
randomness generators, see

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: combining entropy

2008-10-24 Thread John Denker
On 09/29/2008 05:13 AM, IanG wrote:
 My assumptions are:
  * I trust no single source of Random Numbers.
  * I trust at least one source of all the sources.
  * no particular difficulty with lossy combination.

 If I have N pools of entropy (all same size X) and I pool them
 together with XOR, is that as good as it gets?


The second assumption suffices to prove the result,
since (random bit) XOR (anything) is random.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: combining entropy

2008-10-24 Thread John Denker
On 10/24/2008 01:12 PM, Jack Lloyd wrote:

  is a very different statement from saying that
 lacking such an attacker, you can safely assume your 'pools of
 entropy' (to quote the original question) are independent in the
 information-theoretic sense.

The question, according to the original poster, is not 
whether it is safe to assume that one of the entropy
sources can be trusted.  Safe or not, the question explicitly 
assumed that one of the sources was trusted ... and asked 
what the consequences of that assumption would be.

In particular, evidently the scenario was that we started
with N high-entropy randomness generators, but N-1 of
them have failed.  One of them is still working, but we
don't know which one.

In that scenario, XOR is a good-enough combining function,
and nothing else would be any better.

If somebody wants to discuss a different scenario, please
clarify what the new scenario is.

Suggesting that the trusted source is correlated with one
of the other sources is quite contrary to the requirements
expressed in the original question.

That is to say, if the source is not independent, it was
never eligible to be a trusted entropy source.

If you want to quantify this, write down the _joint_ probability
distribution for all the sources, and calculate the entropy
of that distribution in the usual way.

1) There is _one_ very precise meaning for entropy that is 
well-established and conventional across a wide range of 
fields ... everything from kitchen appliances to cosmology.

2) Authors are allowed to define and redefine terms however
they please ... _provided_ they define any nonstandard terms
that they use.  Anybody who takes a well-established standard
term and uses it in a nonstandard way has a double-extra-special
duty to explain what he's doing.

I assume the original poster was using the term entropy
in the conventional, precise sense ... and until I hear
otherwise I will continue to do so.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Lava lamp random number generator made useful?

2008-09-21 Thread John Denker
On 09/20/2008 12:09 AM, IanG wrote:

 Does anyone know of a cheap USB random number source?

Is $7.59 cheap enough?

For that you get a USB audio adapter with mike jack, and
then you can run turbid(tm) to produce high-quality randomness.

Reference, including analytical paper plus code:

 As a meandering comment, it would be extremely good for us if we had
 cheap pocket random number sources of arguable quality [1].

If the above is not good enough, please explain.

 I've often thought that if we had an open source hardware design of
 a USB random number generator ... that cost a few pennies to add
 onto any other USB toy ... then we could ask the manufacturers to
 throw it in for laughs.  Something like a small mountable disk that
 returns randoms on every block read, so the interface is trivial.

I think the turbid solution is much better than a disk.
 -- Unlimited long-term capacity.
 -- Perfect forward secrecy, unlike a disk, unless you do a 
  really good job of erasing each block after use.
 -- Perfect secrecy in the other direction, period.
 Then, when it comes time to generate those special keys, we could
 simply plug it in, run it, clean up the output in software and use
 it.  Hey presto, all those nasty software and theoretical
 difficulties evaporate.

If the above is not good enough, please explain.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

On randomness

2008-07-31 Thread John Denker
In 1951, John von Neumann wrote:

 Any one who considers arithmetical methods of producing random digits
 is, of course, in a state of sin.

That may or may not be an overstatement.

IMHO it all depends on what is meant by random.  The only notion 
of randomness that I have found worthwhile is the notion of being 
_random enough for the purpose_.

  *) For some benign purposes, such as a Monte Carlo integration
   routine, a simple linear congruence generator might suffice.
   Such a generator would not withstand cryptanalysis for even
   a moment, but it will not be subjected to cryptanalysis, so
   we don't care.

  *) At the other extreme, there are many high-stakes business,
   military, and gambling applications where I would agree with 
   von Neumann, and would shun absolutely all PRNGs.  I would 
   rely exclusively on _hardware_ randomness generators, as
   detailed at:

   The /seeding/ of PRNGs is a notorious problem;  the idea of
   seeding one PRNG with another often reduces to the problem
   previously _not_ solved.  Sooner or later you need a source
   of high-grade randomness, not pseudo randomness, and sooner
   is better than later.

   For this reason, most so-called PRNGs are not really PRNGs
   after all, since their foundations are seen to rest on a 
   hardware randomness generator.  They are more usefully
   considered schemes for _stretching_ the randomness of the 
   underlying hardware.  I call them SRNGs, for stretched 
   randomness generators.

On 07/30/2008 12:22 PM, Pierre-Evariste Dagand wrote:

 I fear I was not clear: I don't see what is wrong in evaluating the
 quality of a random number generator with (an extensive set of)
 statistical tests.

How extensive?

To paraphrase Dykstra:  Testing may prove the absence of randomness,
but it cannot possibly prove the presence of randomness.

Testing for high-grade randomness is not just a hard problem; it is 
formally undecidable, in the same category as the halting problem.  
Reference: Chaitin.  
See also:

On 07/30/2008 01:33 PM, Ben Laurie replied:
 SHA-1(1), SHA-1(2), SHA-1(3), ... SHA-1(N) will look random, but clearly
 is not.

Quite so.

 But, then, there is a the chicken or the egg problem: how would you
 ensure that a *new* RNG is a good source of randomness ? (it's not a
 rhetorical questions, I'm curious about other approaches).
 By reviewing the algorithm and thinking hard.

Sometimes that's good enough, but sometimes it's not.  Or more to the
point, often the cost of thinking hard enough exceeds the cost of 
implementing a _hardware_ randomness generator that has a _provable_
_lower bound_ on its entropy(*).

To paraphrase the previous paraphrase:  Testing may provide an upper 
bound on the randomness, but it cannot possibly provide a useful lower 
bound.  In contrast, physics can provide a useful lower bound.

Saying that this-or-that test measures the randomness is highly 
misleading if you don't distinguish measuring an upper bound from 
measuring a lower bound.  The test that judged a DNS server to be
GREAT was making precisely this mistake.

  *) NB: Whereas I mean something rather vague by randomness (i.e. 
  random enough for the application) I mean something very specific 
  by entropy.

For details on all this, see
and in particular

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: how to check if your ISP's DNS servers are safe

2008-07-23 Thread John Denker
On 07/23/2008 12:44 AM, Steven M. Bellovin wrote:
 Niels Provos has a web page up with some javascript that automatically
 checks if your DNS caching server has been properly patched or not.

 It is worth telling people to try.

 Those who prefer command lines can try 
   dig +short TXT

Thanks, that's helpful.

Note that the command-line version accepts the @server option,
which is useful if you have to deal with a mess of primaries, 
secondaries, forwarders, et cetera:

   dig @NS1 +short TXT
   dig @NS2 +short TXT
   dig @NS3 +short TXT

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

defending against evil in all layers of hardware and software

2008-04-28 Thread John Denker
This is an important discussion  

The threats are real, and we need to defend against them.

We need to consider the _whole_ problem, top to bottom.  The
layers that could be subverted include, at a minimum:
 -- The cpu chip itself (which set off the current flurry of
 -- The boot rom.
 -- The BIOS code that lives on practically every card plugged
  into the PCI bus.
 -- Board-level stuff like memory controllers and I/O bridges.
 -- The operating system.
 -- Compilers, as Uncle Ken pointed out.
 -- Your secure application.
 -- Users.

As a particular example where PCs that we might wish to be secure 
are known to be under attack, consider electronic voting machines.  
In most cases there's a PC in there, running Windows CE.  Some
application software was provided by people with felony fraud
convictions.  Means, motive, and opportunity are all present.
There is ample evidence that problems have occurred.  These
are not confined to the Florida fiasco in 2000.  An example from 
2004 is the voting machine in Franklin County, Ohio that recorded 
4,258 votes for Bush when only 638 voters showed up.

This should not be an occasion for idly wringing our hands, nor 
sticking our head in the sand, nor looking under the lamp-post 
where the looking is easy.  We need to look at all of this stuff.  
And we can.  We are not defenseless.

As in all security, we need not aim for absolute security.  An 
often-useful approach is to do things that raise the cost to 
the attacker out of proportion to the cost to the defender.

For software, for firmware, and to some extent even for silicon
masks, SCM (source code management) systems, if used wisely, can
help a lot.  Consider for example a policy every delta to the 
software to be submitted by one person and tested by another
before being committed to the main branch of the project.  Both
the submitter and the tester would digitally sign the delta.  
This creates a long-tailed liability for anybody who tries to
sneak in a trojan  This is AFAIK the simplest defense against
high-grade attacks such as Ken's, which leave no long-term trace
in the source code (because the trojan is self-replicating).  
The point is that there is a long-term trace in the SCM logs.
We can make the logs effectively permanent and immutable.

Of course we should insist on an open-source boot ROM code:
The boot ROM should check the pgp signature of each PCI card's
BIOS code before letting it get control.  And then it should
check the pgp signature of the operating system before booting 
it.  I don't know of any machine that actually does this, but
it all seems perfectly doable.

Another line of defense involves closing the loop.  For example,
one could in principle find Ken's trojan by disassembling the
compiler and looking for code that doesn't seem to belong.
I have personally disassembled a couple of operating systems
(although this was back when operating systems were smaller
than they are now).

We can similarly close the loop on chips.  As others have pointed
out, silicon has no secrets.  A cost-effective way to check for 
trojans would be to buy more voting machines than necessary, and 
choose /at random/ a few of them to be torn down for testing.
(This has obvious analogies to sampling methods used in many
crypto algorithms.)  For starters, we grind open the CPU chips
and check that they are all the same.  That's much easier than
checking the detailed functionality of each one.  And we check
that the CPUs in the voting machines are the same as CPUs from 
another source, perhaps WAL*MART, on the theory that the attacker
finds it harder to subvert all of WAL*MART than to subvert just
a truckload of voting machines.

Checksumming the boot ROM in the torn-down machine is easy.  I
emphasize that we should *not* rely on asking a running machine
to checksum its own ROMs, because it is just too easy to subvert
the program that calculates the checksum.  To defend against
this, we tear down multiple machines, and give one randomly
selected ROM to the Democratic pollwatchers, one to the Republican 
pollwatchers, et cetera.  This way nobody needs to trusty anybody
else;  each guy is responsible for making sure _his_ checksummer
is OK.

All of this works hand-in-glove with old-fashioned procedural
security and physical security.  As the saying goes, if you
don't have physical security, you don't have security.  But
the converse is true, too:  Placing armed guards around a vault 
full of voting machines doesn't make the machines any less buggy 
than they were when they went into the vault. That's why we need 
a balanced approach that gets all the layers to work together.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL 

customs searching laptops, demanding passwords

2008-02-09 Thread John Denker
I quote from
  By Ellen Nakashima
  Washington Post Staff Writer  
  Thursday, February 7, 2008; A01

 The seizure of electronics at U.S. borders has prompted protests from
 travelers who say they now weigh the risk of traveling with sensitive
 or personal information on their laptops, cameras or cellphones. In
 some cases, companies have altered their policies to require
 employees to safeguard corporate secrets by clearing laptop hard
 drives before international travel.
 Today, the Electronic Frontier Foundation and Asian Law Caucus, two
 civil liberties groups in San Francisco, plan to file a lawsuit to
 force the government to disclose its policies on border searches,
 including which rules govern the seizing and copying of the contents
 of electronic devices.


Most of the underlying issue is not new;  a Joe Sharkey article
about customs seizures of laptops appeared in the NY Times back 
on October 24, 2006.  And it has been discussed on this list.
(The news hook here is the filing of the lawsuit.)

One wrinkle that was not previously reported is the bit about
customs officers demanding passwords.  That is something I
have thought about, off and on, and the more I think about it 
the more worrisome it seems.

A) Here's one particularly nasty scenario:  Long ago, the traveler
experimented with using an encrypting filesystem, perhaps the 
dm-crypt feature of Linux.  However, he decided it wasn't worth 
the trouble and forgot about it.  This includes forgetting the 
passphrase.  Now he's at the border, and customs is demanding 
the passphrase.   
 -- Just tell us the password.
 -- I forgot.
 -- No you didn't.
 -- Yes I did.
 -- You're lying.
 -- No I'm not.
 -- Yes you are.
 -- No I'm not.
 -- Just tell us the password.
 -- et cetera.

B) Another scenario:  Your employer adopts a policy requiring
you to use a blank laptop when traveling, as mentioned in 
the news article.  They also require you to use an encrypting
filesystem, even when not traveling.  They discover that the
easiest way to blankify your laptop is to overwrite the IVs
of the encrypting filesystem.  Now any and all passphrases
will fail in the same way:  they all look like wrong pass-
phrases.  Now are back to scenario (A), because customs might 
assume you're just lying about the passphrase.

C) Another scenario:  Customs confiscates the laptop.  They 
say that you won't get it back unless/until you give up the 

D) Tangential observation:  If they were being reasonable, they 
would confiscate at most the disk drive, and let you keep the 
rest of the hardware.  But they're under no obligation to be 

E) Remark:  The fundamental problem underlying this whole 
discussion is that the traveler is in a position where he has 
to prove his innocence ... which may not be possible, even if 
he is innocent.

The doctrine of innocent-until-proven-guilty does *not* apply
to customs searches.  Ditto for the doctrine of requiring
probable cause, search warrants, et cetera.

F) A good way (not the easiest way) to blankify a laptop
is to remove the hard disk and replace it with a brand-new
obviously-innocuous disk.  (Small, slow disks are very cheap.)
When you get home from your travels, you can undo the switch.

G) It is fun to think about a steganographic filesystem, with
the property that if you mount it with one passphrase you see
one set of files, while if you mount it with another passphrase
you see another set of files.  

The point here is that you give up one passphrase, they never
know if there is a second;  if you give up two passphrases,
they never know if there is a third, et cetera.

Note that we are talking about cryptologically-strong stego
here (as opposed to weak stego which falls into the category
of security-by-obscurity).

From an information-theory point of view this is perfectly 
straightforward;  solutions have been worked out in connection 
with code division multiplexing.  However, I reckon it would
have serious performance problems when applied to a hard disk.  
If anybody knows how to do this in practice, please speak up!

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

two-person login?

2008-01-29 Thread John Denker
Hi Folks --

I have been asked to opine on a system that requires a 
two-person login.  Some AIX documents refer to this as
a common method of increasing login security

However, I don't think it is very common;  I get only five hits

By way of not-very-apt analogy: 
 -- I am aware of business checks that require two signatures;
 -- I am aware that purchase orders are commonly signed by 
   two persons (the initiator and the approver); 
 -- I am aware of missile-launch procedures that require two
  persons using two keys;
 -- I am aware of software development procedures where a patch
  is submitted (and signed) by one person, tested (and signed
  off) by another, and committed to the official repository by 
  a third person.
 -- et cetera.

However, it is important to note that the aforementioned examples
all share an important property, as we see from the following:
  *) The parties give their approval _after_ the semantics has
   been fully determined.  It would defeat all security if two 
   signatures were attached to a blank check or a blank PO.
  *) As a related point, the approval is attached to a particular
   transaction.  The approver is not merely certifying that Joe
   is really Joe, and is generally allowed to write checks 
   (mere identification);  the approver is certifying that this 
   particular check is OK.
  *) To say the same thing another way, there is no pretexting.
   There is no pretext for turning the keys on the missile launcher
   unless you intend to launch a missile.  The semantics of the
   keys is clear.

We need to talk about threat models:
  a) The purveyors of the system in question don't have any clue
   as to what their threat model is.  I conjecture that they might
   be motivated by the non-apt analogies itemized above.
  b) In the system in question, there are myriad reasons why Joe
   would need to log in.  If Joe wanted to do something nefarious,
   it would take him less than a second to come up with a seemingly
   non-nefarious pretext.  When the approver approves Joe's login,
   the approver has no idea what the consequences of that approval
   will be.  The two-person login requires the approver to be
   present at login time, but does not require the approver to
   remain present, let alone take responsibility what Joe does 
   after login.
  c) The only threat model I can come up with is the case where
   Joe's password has been compromised, and nobody else's has.
   Two-person login would provide an extra layer of security
   in this case.  This threat is real, but there are other ways
   of dealing with this threat (e.g. two-factor authentication)
   ... so this seems like quite a lame justification for the
   two-person login.
  d) Or have I overlooked something?

From the foregoing, you might conclude that the two-person login
system is worthless security theater ... but harmless security
theater, and therefore not worth worrying about either way.

But the plot thickens.  The purveyors have implemented two-person
login in a way that manifestly /reduces/ security.  Details 
available on request.

So now I throw it open for discussion.  Is there any significant
value in two-person login?  That is, can you identify any threat 
that is alleviated by two-person login, that is not more wisely 
alleviated in some other way?

If so, is there any advice you can give on how to do this right?  
Any DOs and DONTs?

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: two-person login?

2008-01-29 Thread John Denker
On 01/29/2008 11:34 AM, The Fungi wrote:

 I don't think it's security theater at all, as long as established
 procedure backs up this implementation in a sane way. For example,
 in my professional life, we use this technique for commiting changes
 to high-priority systems. Procedure is that two security admins
 (each with half of a cryptographic key) collaborate on updates. Sure
 there's still the risk that one is nefarious and will socially
 engineer a back door in while his/her counterpart is watching, but
 that is not so much the risk we are attempting to thwart. The real
 goal is to reinforce policy which requires collaboration between
 administrators for major changes to these important systems.
 Technology can't effectively *force* procedure (ingenious people
 will always find a way around the better mousetrap), but it can help
 remind them how they are expected to interact with systems.

OK, that's clear and helpful.  Thanks.

The point I take away from this is that _procedure_ is primary
and fundamental.  Technology is secondary.  The two-person login 
is technology, and it is only icing on the procedural cake.
 -- If you have a good procedure, the two-person login might help 
  remind people to follow the procedure.
 -- If you don't have a good procedure, having a two-person login 
  will not magically create a good procedure.

This also gets back to what I said previously about semantics.  In
the situation The Fungi described, the semantics is clear.  A login
is tantamount to a commit, and the semantics of commit is clear.
Both parties make sure they understand the commit before they log
in.  Both parties log in, both parties hang around until the commit 
is complete, then both parties log out and leave.

  One question: If the goal is to have a two-person commit, why not
  implement a two-person commit, rather than using two-person login
  as a proxy?  For years there have been paperless office workflow
  systems that require two digital signatures on a purchase order.
  If you're going to use technology to support procedure, IMHO the
  technology should be tied as tightly as possible to the procedure.
  The semantics of login is IMHO very unclear, very open-ended, and
  it takes a lot of hand-waving to tie the login technology to the 
  commit semantics.

The foregoing makes sense, and is in extreme contrast to the situation
I am faced with, where Joe logs in with the help of Jane, and then
Jane leaves.  Jane has not the slightest control over what Joe does
while logged in.  I don't see a sane procedure here.  It seems Jane 
is signing a blank check.

It wouldn't be so bad if there were a development system separate
from the production system, but there isn't, so Joe spends all day
every day logged into the high security production system.  Joe
can commit anything he wishes.  There is no two-party review of the
commit, just two-party review of the login.

Just to rub salt in the wound, they've got it set up so that everybody
uses the Admin account.  There are N people who know the first half
of the Admin password, and M people who know the second half.  Besides
being an incredibly lame form of secret-splitting, this has the nasty
property that when Admin logs in, you don't even know who was involved.  
There are M*N/2 possibilities.  There is no accountability anywhere.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: 2008: The year of hack the vote?

2007-12-26 Thread John Denker
On 12/23/2007 08:24 PM, ' =JeffH ' wrote:

 2008: The year of hack the vote?

Shouldn't that be:
  2008: Another year of hack the vote yet again?

There is every reason to believe that the 2000 presidential
election was stolen.  A fair/honest/lawful election would
have made Al Gore the 43rd president.

There is every reason to believe the situation was even
worse in 2004.  If the election had been fair/honest/lawful
Kerry would have won be a wide margin.

Flipping Ohio's 20 electoral votes would have been sufficient
all by itself to flip the election from Kerry to Bush ... and 
there is plenty of evidence of widespread fraud in Ohio.  See 
e.g. the Conyers report,

And Ohio was only the tip of the iceberg;  there was large-
scale hanky-panky in Florida and many other states.

I like the book by  Prof. Steven F. Freeman  Joel Bleifuss,
  _Was the 2004 Presidential Election Stolen_?
Most of the crucial information can also be found on Freeman's 
web site
but the book is much better organized and easier to read.  The
book is dispassionate, scrupulous, and scientific ... which is
something you don't often see, especially in the political sphere.

Another book is by Mark Crispin Miller,
  _Fooled Again_
which is more passionate and less technical.  It takes a broader
view of the subject, and is far easier to read, especially for
readers who are not well-versed in statistics.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: PunchScan voting protocol

2007-12-15 Thread John Denker
On 12/13/2007 08:23 PM, Taral wrote:
 On 12/12/07, John Denker [EMAIL PROTECTED] wrote:
 Several important steps in the process must be carried out in
 secret, and if there is any leakage, there is unbounded potential
 for vote-buying and voter coercion.
 I've done quite a bit of work with this protocol. The protocol assumes
 the existence of an Election Authority. The Authority has the master
 keys required to generate certain data sets, and these keys give the
 Authority the ability to associate ballot numbers with votes. Note
 that this doesn't necessarily give the Authority the ability to
 associate people with votes.
 There are no per-ballot keys, so there is no partial exposure risk.
 It's all-or-nothing.
 1) It would be nice to see some serious cryptological protection
 of election processes and results.
 2b) In particular I don't think PunchScan really solves the
 whole problem.
 What is the whole problem? Please provide an attack model.

Well, that's the right question.  That's the sort of question
the punchscan team should be asking themselves, and answering
in more detail that I have heretofore seen.  What threats does
punchscan claim to defend against?  What threats does it leave
to be mitigated by other (non-punchscan) means?

As an example: Let's look at the plant where the ballots are
printed.  Suppose somebody attaches a tiny spy camera to
the frame of one of the printing presses, so as to obtain an
image of both parts of the two-part ballot (for some subset
of the ballots).

Obviously anybody who gets this information can defeat all the
cryptologic protections that the protocol is supposed to provide
(for that subset of the ballots).

  Note that the spy camera can be hiding in plain sight, in
  the guise of a security camera.  Many election-related
  facilities are /required/ to have security cameras.

  There's a difference between mathematical cryptology and real-
  world security.

 There are no per-ballot keys, so there is no partial exposure risk.
 It's all-or-nothing.

It's bad luck to prove things that aren't true.  I just gave an
example of a partial exposure risk, since some of the ballots
were seen by the spy camera and some weren't.

 The protocol assumes
 the existence of an Election Authority. 

Ah yes, but what is being assumed about the /properties/ of
this Election Authority?  Is the EA omnipresent and omnipotent,
like the FSM, or does it have boundaries and limitations?
For example, does it ever need to rely on employees or

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

PunchScan voting protocol

2007-12-13 Thread John Denker
Hi Folks --

I was wondering to what extent the folks on this list have taken
a look the PunchScan voting scheme:

The site makes the following claims:

 End-to-end cryptographic independent verification, or E2E, is a
  mechanism built into an election that allows voters to take a 
 piece of the ballot home with them as a receipt. This receipt 
 does not allow voters to prove to others how they voted, but it
  does permit them to:

 * Verify that they have properly indicated their votes to 
 election officials (cast-as-intended).
 * Verify with extremely high assurance that all votes were
 counted properly (counted-as-cast).

 Voters can check that their vote actually made it to the tally,
 and that the election was conducted fairly.

Those seem at first glance to be a decent set of claims, from
a public-policy point of view.  If somebody would prefer a
different set of claims, please explain.

PunchScan contains some nifty crypto, but IMHO this looks like
a classic case of too much crypto and not enough real security.

I am particularly skeptical of one of the FAQ-answers

Several important steps in the process must be carried out in
secret, and if there is any leakage, there is unbounded potential
for vote-buying and voter coercion.
  The Boss can go to each voter and make the usual silver-or-lead
  proposition:  Vote as I say, and then show me your voting receipt.
  I'll give you ten dollars.  But if I find out you voted against
  me, I'll kill you.

The voter cannot afford to take the chance that even a small
percentage of the ballot-keys leak out.

1) It would be nice to see some serious cryptological protection
of election processes and results.

2a) I don't think we're there yet.

2b) In particular I don't think PunchScan really solves the
whole problem.

3) I'd love to be wrong about item (2).  Does anybody see a way
to close the gaps?

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

electoral security by obscurity on trial

2007-12-06 Thread John Denker
For years, the Election Integrity Committee of the Pima County
Democratic Party has been trying to improve the security of the 
elections systems used in local elections. The results include:
 -- a dozen or so suggestions that they made were actually accepted
  and implemented by the county.
 -- there was a criminal investigation by the Attorney General's 
  office into the actions of Pima County Division of Elections 
  personnel, but no charges were brought.
 -- last but not least, they wanted access (after each election) to 
  the record of votes cast, but the county refused, leading to a
  lawsuit that has just now come to trial.

Background info on the trial:

Overview with links:

Report on day one, with links to statements and testimony:
  I note that the plaintiffs' opening statement actually
  used the term security through obscurity.

Report on day two, with links to testimony:

Election Security Report
  From the County Administrator to the Board of Supervisors

  The last 20 pages reproduce an article from 
The Information Technology  Innovation Foundation
Stop the Presses:  How Paper Trails Fail to Secure e-Voting

  which quite one-sidedly favors cryptologic solutions to all
  problems.  It suggests things like cut-and-choose and zero-
  knowledge proofs ... which makes for a dramatic contrast with
  the appalling unsophistication of the Diebold machines that
  are actually being used.


  Disclaimer:  One of the attorneys in the case is T. A. Denker.
  Yes, he is my brother.  No, I have not learned _anything_
  about the case from him.  I am not involved in this case ...
  except to the extent that as a voter I have a stake in the


This is not an issue for the trial, but I can't help noting
that for years the Australians have been using a Linux-based 
e-voting system, with all code open to public review:

which makes another dramatic contrast with Diebold's stated
need for secrecy.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: How the Greek cellphone network was tapped.

2007-07-16 Thread John Denker

On 07/10/2007 01:59 AM, Florian Weimer wrote:

It's also an open question whether network operators subject to
interception requirements can legally offer built-in E2E encryption
capabilities without backdoors.

I agree.  It's a tricky question;  see below

JI responded:

You probably meant device vendors, not network operators. 

We all agree we can make a distinction between telcos and phone HW
manufacturers.  But that may not be the relevant distinction.

I know in the US, and I imagine elsewhere, telcos buy phones from
the OEMs and then retail them to customers.  That makes them, in
the eyes of the law, both telecommunication carriers *and* device
vendors, even if they are not device OEMs.

The whole
*point* of E2E security is that network operators are not involved. If
they were, it wouldn't be end-to-end!

Well, that's logical, but who said the law has to be logical?

IANAL but AFAICT the most sweeping parts of the CALEA law apply
to telecommunication carriers as defined in section 1001:

Customer encryption is explicitly not included by the terms of
section 1002:
... unless the encryption was provided by the carrier and
the carrier possesses the information necessary to decrypt
the communication.

I repeat: ... unless the encryption was provided by the carrier
and the carrier possesses the information necessary to decrypt
the communication.

Following this line of thought leads to all sorts of illogical
conclusions, including:
 a) Arguably it might be OK to buy a backdoor-free crypto phone
  from the grocery store, but not OK to buy or lease it from
  the phone company.
 b) Arguably you could buy a phone from the telco with no
  crypto at all, and then take it to Orange County Choppers
  and have them install backdoor-free crypto.
 c) Arguably the OEM could have two product lines, one without
  backdoors, to be sold via telcos, and one without backdoors,
  to be sold otherwise.
 d) Arguably everybody is OK provided the telco doesn't have
  the keys.  Maybe you can use a crypto phone provided by a
  US telco if you have a high-assurance way of changing the
  keys to the back door as well as the front door.
 e) We all know the laws differ wildly from one jurisdiction
  to another ... and the laws can be changed at any time.

The cost of the second product line (item b) might not be too
much higher than the first product line (item a), since it
could be considered a /byproduct/, such that all the big
development costs are attributed to line (a) ... assuming
there is a market for crypto phones of any kind.

As to whether any such market will develop in the near future
is another interesting question.  The fact that only a tiny
fraction of present-day email is E2E encrypted is not an
encouraging sign.  (Email is easier to encrypt than voice.)

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Quantum Cryptography

2007-07-03 Thread John Denker

On 07/01/2007 05:55 AM, Peter Gutmann wrote:

One threat model (or at least failure mode) that's always concerned me deeply
about QC is that you have absolutely no way of checking whether it's working
as required.  With any other mechanism you can run test vectors through it,
run ongoing/continuous self-checks, and (in the case of some Type I crypto)
run dual units in parallel with one checking the other.  With QC you've just
got to hope that everything's working as intended.  That alone would be enough
to rule out its use as far as I'm concerned, I can't trust something that I
can't verify.

That's partly true, but there's more to the story.

Let's start by looking at the simple case, and then proceed to a more
sophisticated analysis:

By analogy:
 -- baseball pitchers should be evaluated on things like ERA, while
 -- football halfbacks should be evaluated on things like yard per carry,
 ... and not vice versa.

By that I mean:
 -- the integrity of DH depends fundamentally on the algorithm, so you
  should verify the algorithmic theory, and then verify that the box
  implements the algorithm correctly; while
 -- in the simple case, the integrity of quantum cryptography depends
  fundamentally on the physics, so you should verify the physics
  theoretically and then verify that the box implements the physics
 ... and not vice versa.

Don't complain that you cannot verify the physics the same way you
would verify the algorithm;  it's not a relevant complaint.

There are some beautiful operational checks that *can* be made on
a simple quantum crypto system.  For starters, you can insert a
smallish amount of attenuation in the link, as a model of attempted
eavesdropping.  The system should detect this, shut down, and raise
the red flag;  if it doesn't, you know it's broken.


A more sophisticated analysis takes into account the fact that in the
real world (as opposed to the ultra-specialized laboratory bench),
there is always some dissipation.  Therefore any attempt to do anything
resembling quantum crypto (or even quantum computing) in the real world
uses some sort of error correction.  (These error correction schemes are
some of the niftiest results in the whole quantum computation literature,
because they involve /analog/ error correction, whereas most previous
modern error-correcting codes had been very, very digital.)  So there is
some interesting genuine originality there, from a theory-of-computation

From a security standpoint though, this raises all sorts of messy issues.
We now have a box that is neither a pitcher nor a fullback, but some
weird chimera.  To validate it you would need to verify the physics *and*
verify the algorithms *and* verify the interaction between the two.

Needless to say, an algorithm intended for crypto requires much stricter
scrutiny than the same algorithm intended for ordinary computation.

In particular, the oft-repeated claim that quantum cryptography detects
eavesdropping may be true on the lab bench, but it does _not_ follow in
any simple way that a usable long-haul system will have the same property.


I agree with Steve that there is a difference between bona-fide early-stage
research and snake oil.

I did research in neural networks at a time when 90% of the published
papers in the field were absolute garbage, such as claims of solving
NP-hard problems in P time.
 -- When there are people who respect the difference between garbage and
  non-garbage, and are doing serious research, we should support that.
 -- When people try to publish garbage, and/or package garbage in shiny
  boxes and sell it to the government, we should call it for what it is.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Quantum Cryptography

2007-06-26 Thread John Denker

On 06/25/2007 08:23 PM, Greg Troxel wrote:

  1) Do you believe the physics?  (Most people who know physics seem to.)

Well, I do happen to know a thing or two about physics.  I know
 -- there is quite a lot you can do with quantum physics, and
 -- there is quite a lot you cannot do with quantum physics.

I also know that snake-oil salesmen can lie about the physics
just as easily as they lie about anything else.

Since it's not clear what is meant by THE physics, it would
be more meaningful to ask more-specific questions, namely:
 -- Do I believe in real physics?  Yes.
 -- Do I believe in what Dr. Duck says about physics?  Usually not.


One commonly-made claim about quantum cryptography is that
it can detect eavesdropping.  I reckon that's narrowly
true as stated.  The problem is, I don't know why I should
care.  The history of cryptography for most of the last 2000
years has been a cat and mouse game between the code makers
and the code breakers.  The consensus is that right now the
code makers have the upper hand.  As a result, Eve can eavesdrop
all she wants, and it won't do her a bit of good.

To say the same thing:  It appears that in this respect, quantum
cryptography takes a well-solved problem and solves it another
way at higher cost and lower throughput.  The cost/benefit ratio
is exceedingly unfavorable, and seems likely to remain so.

Meanwhile, it takes some less-well-solved problems and makes
them worse.  Consider for example traffic analysis.  Since
quantum encryption requires a dedicated hardware link from end
to end, there is no hope of disguising who is communicating
with whom.

I am reminded of a slide that Whit Diffie used in one of his
talks.  It showed a house that was supposed to be protected
by a picket fence.  The problem was that the so-called fence
consisted of a single picket, 4 inches wide and a mile high,
while the other 99.9% of the perimeter was unprotected.  Yes
sirree, no eavesdropper is going to hop over that picket!

One sometimes hears even stronger claims, but they are even
more easily refuted.  I've reviewed papers that claim quantum
mechanics solves the key distribution problem but in fact
they were using classical techniques to deal with all the
hard parts of the problem.  It reminds me of stone soup: if
the ingredients include broth, meat, vegetables, seasoning,
and a stone, I don't see why the stone should get credit for
the resulting soup.  Likewise, since a quantum key distribution
system is in no ways better and in some ways worse than a
classical system, I don't see why quantum cryptography
should get credit for solving the problem.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: SC-based link encryption

2007-01-05 Thread John Denker
On 01/05/2007 10:53 AM, Paul Hoffman wrote:

 You could take an IPsec stack and repurpose it down one layer in the
 stack. At least that way you'll know the security properties of what you

That is a Good Idea that can be used in a wide range of
situations.  Here is some additional detail:

This can be understood as follows:  Half of IPsec tunnel
mode can be described as IPIP encapsulation layered on
top of transport mode which does the encryption and
arranges for transport of the encrypted packets.

   The other half of IPsec is the SPDB, which is an
   important part of IPsec but is often underappreciated
   by non-experts.

So ... one obvious way forward is to do what might be
called L2sec (layer 2 security) in analogy to IPsec.
That is, do layer-2-in-IP encapsulation using GRE or
the like, and then layer that on top of IPsec transport

  Then you make some straightforward tweaks to the
  SPDB and you've something pretty nice.  As PH
  said, the security properties will be well known.

This may sound like overkill, but it is likely to be
/easier/ than anything else you can think of (not to
mention more secure and more richly featured).

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: gang uses crypto to hide identity theft databases

2006-12-24 Thread John Denker
On 12/22/2006 01:57 PM, Alex Alten wrote:

 I'm curious as to why the cops didn't just pull the plugs right away. 

Because that would be a Bad Idea.  In a halfway-well-designed
system, cutting the power would just do the secret-keepers' job
for them.

 It would probably
 take a while (minutes, hours?) to encrypt any significant amount of

That's why you don't do it that way.  If you want it to work, you
use an encrypting disk system so that everything on disk (including
swap) is encrypted all the time, and gets decrypted as needed when
it is read.

 Not to
 mention, where is the master key? 

It should be in volatile unswappable RAM.  Cutting the power is one
way (among many) to obliterate it.  Overwriting it with randomness
suffices if there is any chance that the RAM might be non-volatile.
The time and cost of obliterating a key are negligible.

 The guy couldn't have jumped up and typed
 in a pass phrase to generate it in handcuffs? 

That's another reason why you don't do it that way.

 Even if it got erased,
 it's image could
 be recovered from a disk or RAM.  My understanding is that even
 tamperproof cards
 one can get keys from them with the right equipment from the right folks.

Once something is gone from RAM, it's really, really gone.  The circuit
structure and the laws of thermodynamics ensure it.  No power on earth
can do anything about that.

There are, however, some things the cats can do to improve their chance of
success in this cat-and-mouse game.

  *) For starters, the cats must anticipate the possibility that the
   mice might try to secure their data.  The early-adopter mice benefit
   from a certain amount of security-through-obscurity, insofar as the
   cats have not heretofore fully appreciated the possibilities.

 *) The mice have a dilemma:  If they do not cache the passphrase somewhere,
  they will need to constantly re-enter it, which makes them vulnerable to
  shoulder-surfing, sophisticated key-loggers, unsophisticated rubber-hose
  methods, et cetera.  Conversely, if the mice do cache the passphrase for
  long periods of time, there is the possibility that the cats will capture
  the whole system intact, passphrase and all, and will be able to make a
  permanent copy of the passphrase before the system realizes that a compromise
  has occurred.  The cats can improve their chances by causing 
  power failures and seeing how the mice handle the ensuing passphrase issues.
  The mice can improve their odds by ensuring good physical security, ensuring
  personnel reliability, providing easy-to-use panic buttons, rotating their
  passphrases, and so forth.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Impossible compression still not possible. [was RE: Debunking the PGP backdoor myth for good. [was RE: Hypothesis: PGP backdoor (was: A security bug in PGP products?)]]

2006-09-03 Thread John Denker
Dave Korn asked:

   Is it *necessarily* the case that /any/
 polynomial of log N /necessarily/ grows slower than N?


Hint:  L'Hôpital's rule.

 if P(x)==e^(2x)

That's not a polynomial.

x^Q is a polynomial.  Q^x is not.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Quantum RNG

2006-07-04 Thread John Denker
Andrea Pasquinucci wrote:
 Quantis is a physical random number generator exploiting an elementary 
 quantum optics process. Photons - light particles - are sent one by one 
 onto a semi-transparent mirror and detected. The exclusive events 
 (reflection - transmission) are associated to 0 - 1 bit values.
 Just curious of your opinion.

This is discussed at

Quantum processes are in some very narrow theoretical sense more
fundamentally random than other sources of randomness, such as
thermal noise ... but they are not better in any practical sense.

The basic quantum process is less sensitive to temperature than a purely
thermal process ... but temperature dependence is easily accounted for
in any practical situation, and -- more importantly -- there are all
sorts of other practical considerations (such as detector dead-time
issues) that make real quantum detectors far from ideal.

The devil is in the details, and obtaining the raw data from a quantum
process is nowhere near necessary and nowhere near sufficient to make
a good randomness generator.

I have no idea whether the quantis generator got the devilish details right
... but in any case, there are easier ways to make a generator that is just
as good, or better.

For details, see

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Unforgeable Blinded Credentials

2006-04-05 Thread John Denker

Hal Finney wrote in part:

... Attempts to embed sensitive secrets in credentials don't work because there 
are no sensitive
secrets today.  You could use credit card numbers or government ID numbers (like US SSN) but in 
practice such numbers are widely available to the black hat community.

The phrase there are no sensitive secrets today sounds very strange
by itself, and doesn't sound much better in context.

I assume the intended meaning was more along the lines of:
==   The set of things you want to keep secret has zero overlap with
==   the set of things you might want to use as an identifier.

Let me just remark that there's nothing new about this.  The notion of
a secret identifier is very widespread, but if you think about it, it
is completely absurd, and always has been.  For a fuller discussion, see:

which begins as follows:
]] I am reminded of a passage from /Buffy the Vampire Slayer/, in the episode Lie 
to Me:
]]  BILLY FORDHAM:  I know who you are!
]]  SPIKE:  I know who I am, too.  So what?
]] My point here is that it shouldn’t matter if somebody knows who I am. 
Suppose somebody can
]] describe me -- so what? Suppose somebody knows my date of birth, social 
security number, and
]] great-great-grandmother’s maiden name -- so what?
]] It’s only a problem if somebody uses that identifying information to spoof 
the _authorization_
]] for some transaction.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-26 Thread John Denker

In the context of

 0 occurs with probability 1/2
 each other number from 1 to 2^{160}+1 happens with
 probability 2^{-161}.

I wrote:

 This ... serves to illustrate, in an exaggerated way, the necessity
 of not assuming that the raw data words are IID (independent and identically

Correction:  IID isn't the issue here.  The raw data words described
above are IID.  That's not the problem.  The problem is that the
distribution is highly skewed.

Executive summary:  Small samples do not always exhibit average behavior.

Let me explain.  There is a very simple way to look at this.

Consider the expression for the entropy of the source,
   S := Sum_i P_i log(1/P_i)  [1]

where i runs over all symbols in the alphabet.

One may, with a little care, attribute log(1/P_i) bits of unpredictability
to the (i)th symbol in the alphabet.  Then S can be interpreted as the
appropriately weighted _average_ unpredictability per symbol.

In the example quoted above, the minimum log(1/P_i) is vastly smaller than
the average log(1/P_i) -- namely 1 versus 161.  So focussing on the average
is unlikely to solve all the world's problems.

In crypto applications (including RNG construction), a crude yet provably
correct approach is to rely on the minimum (not the average) per-symbol
unpredictability.  Using this approach, it would require 80 samples of the
given distribution to produce an output with 80 bits of entropy.

Things can only get better from here:
 -- With full knowledge of the source statistics, one can distinguish the large
  log(1/Pi) words from the small log(1/Pi) words.  This would allow the number
  of required samples to be closer to the typical value (2) than to the 
  value (80).  An example of this is the Huffman coding I discussed in my 
 -- With even modest knowledge of the given source, one can (by histogramming
  if nothing else) estimate the probabilities well enough to yield worthwhile
  improvements in efficiency.
 -- If you want to produce a long sequence of output words, as you often do, a
  reasonable-sized buffer is all you really need to come fairly close to optimal
  efficiency, namely only about 1 sample of the distribution, on average, per
  80-bit output word.  The chance of getting fooled can be made verrry small,
  at a modest cost in terms of buffer-size and extra samples of the 
  That is, the law of large numbers comes to your rescue sooner or later, 
  rather soon.
 -- It may be possible to engineer a different source with larger minimum

Bottom line:  Setting up a highly-skewed distribution and then drawing from
it only a single sample is not guaranteed to produce average behavior.  Duh,
no surprise there!

The source entropy S in equation [1] is a single number that characterizes the
average behavior.  For a small sample from a highly-skewed distribution, S
doesn't tell you everything you need to know.  This has no bearing on the
definition of entropy;  entropy is defined by equation [1].  It just means that
equation [1] by itself doesn't solve all the world's problems.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-24 Thread John Denker

Hal Finney wrote:

This is true, in fact it is sometimes called the universal distribution
or universal measure.  In more detail, it is a distribution over all
finite-length strings.  The measure for a particular string X is defined
as the sum over all programs that output X of 1/2^L_i, where L_i is the
length of each such program.

There is no such that as the universal measure;  rather
there are lots of universal measures.  Universality is a
rather arcane property of the measure, the term doesn't mean
what most people think it means.
  -- Universal does NOT mean all-purpose.
  -- Universal does NOT mean general-purpose.
  -- There are many inequivalent universal distributions, most
   of which are not what you want for any given application.
  -- Certain things that are true asymptotically are not true
   for _any_ practical application.
  -- Ratio converging to 1 does not imply difference converging to 0.

This is probably not the right forum for cleaning out this Augean stable.


The universal measure for a string can also be thought of as the
probability that, given a fixed Universal Turing Machine (UTM), a randomly
chosen input program will output that string.

So this is clearly a probability distribution (with some technicalities
regarding issues of program lengths being glossed over here) as John
Denker says.  However to go from this to a notion of entropy is more

Not really questionable.  If you have a probability, you have an

Shannon entropy is defined over a probability distribution.  That is,
given a probability distribution we can find its Shannon entropy by
summing -p_i / log2(p_i) over all members of the distribution.  If we
approximate the universal distribution by 1/2^n for strings of length
n, this sum is clearly divergent.  So there is not really a meaningful
Shannon entropy for this probability distribution, or in other words
the entropy is infinite.

The entropy _per string_ is unbounded, as it jolly well should be for
random strings of unbounded length.  That's true but uninteresting.
A more interesting quantity is the _entropy density_ aka the entropy
_per symbol_ which for typical long random bit-strings is one bit of
entropy per bit of string-length.  Similarly if you have a string of
symbols over a 32-symbol alphabet, you would expect to see five bits
of entropy per symbol for a typical long random string.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-24 Thread John Denker

Ed Gerck wrote:

In Physics, Thermodynamics, entropy is a potential [1].

That's true in classical (19th-century) thermodynamics, but not
true in modern physics, including statistical mechanics.  The
existence of superconductors and superfluids removes all doubt
about the absolute zero of entropy.  For details, see

As is usual for a potential, only *differences* in entropy
between different states can be measured. 

Not true.

These are quite general properties. 

They are neither general nor relevant to crypto.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-23 Thread John Denker

Aram Perez wrote:

* How do you measure entropy? I was under the (false) impression that  
Shannon gave a formula that measured the entropy of a message (or  
information stream).

Entropy is defined in terms of probability.  It is a measure of
how much you don't know about the situation.  If by message
you mean a particular message that you are looking at, it has
zero entropy.  It is what it is;  no probability required.

If by stream you mean a somewhat abstract notion of an ensemble
of messages or symbols that you can only imperfectly predict, then
it has some nonzero entropy.

* Can you measure the entropy of a random oracle? Or is that what  both 
Victor and Perry are saying is intractable?

That is a tricky question for a couple of reasons.
 a) It will never be completely answerable because the question hinges
  on the word random, which means different things to different people.
  Thoughtful experts use the word in multiple inconsistent ways.
 b) It also depends on just what you mean by measure.  Often it
  is possible to _ascertain_ the entropy of a source ... but direct
  empirical measurements of the output are usually not the best way.

* Are there units of entropy?

Yes.  It is naturally _dimensionless_, but dimensionless quantities
often have nontrivial units.  Commonly used units for entropy
 ++ bits
 ++ joules per kelvin.  One J/K equals 1.04×10^23 bits
For more on this, see

* What is the relationship between randomness and entropy?

They are neither exactly the same nor completely unrelated.
A pseudorandom sequence may be random enough for many
applications, yet has asymptotically zero entropy density.
A sequence of _odd_ bytes is obviously not entirely random,
yet may have considerable entropy density.

* (Apologies to the original poster) When the original poster  requested 
passphrases with more than 160 bits of entropy, what was he requesting?

When you apply a mathematical function to an ensemble of inputs, it
is common to find that the ensemble of outputs has less entropy than
the ensemble of inputs. A simple pigeonhole argument suffices to show
that a function whose output is represented in 160 bits cannot possibly
represent more than 160 bits of entropy per output.  So if you want
the ensemble of outputs to have more than 160 bits of entropy, it is
necessary to do something fancier than a single instance of SHA-1.

* Does processing an 8 character password with a process similar to  
PKCS#5 increase the entropy of the password?


* Can you add or increase entropy?

Shuffling a deck of cards increases the entropy of the deck.

For more on this, see

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-23 Thread John Denker

John Kelsey wrote:

As an aside, this whole discussion is confused by the fact that there
are a bunch of different domains in which entropy is defined.  The
algorithmic information theory sense of entropy (how long is the
shortest program that produces this sequence?) is miles away from the
information theory sense of entropy, and even that has a bunch of
different flavors.  

I would have said almost the opposite.  The great power and beauty
of the entropy idea is that it has the *same* meaning across many
domains.  Entropy is defined in terms of probability, period.  Any
other definition is either
  a) exactly equivalent,
  b) an approximation, valid under this-or-that restrictive conditions, or
  c) wrong.

With some slight fiddling to get the normalization right, 1/2
raised to the power of (program length) defines a probability
measure.  This may not be the probability you want, but it
is a probability, and you can plug it into the entropy definition.

The problem is almost never understanding the definition of
entropy.  The problem is almost always ascertaining what is
the correct probability measure applicable to the problem
at hand.

Don't get me started about universal probability measures.
That has an arcane narrow technical meaning that is verrry widely

0 occurs with probability 1/2
each other number from 1 to 2^{160}+1 happens with probability

The Shannon entropy on this distribution is 81.5 bits. 

I think it is very close to 81 bits, not 81.5, but that is a minor
point that doesn't change the conclusion:

 But if you
tried to sample it once to generate an 80-bit Skipjack key, half the
time, I'd guess your key on my first try.  

Yeah, but if I sample it N times, with high probability I can generate
a large number of very good keys.  This problem is faced by (and solved
by) any decent TRNG.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-23 Thread John Denker

I wrote:

With some slight fiddling to get the normalization right, 1/2
raised to the power of (program length) defines a probability
measure.  This may not be the probability you want, but it
is a probability, and you can plug it into the entropy definition.

John Kelsey wrote:

No, this isn't right for the algorithmic information theory meaning at

OK, in a moment we will have gone through four plies of no-it-isn't
yes-it-is no-it-isn't yes-it-is.  Let's get serious.  The axiomatic
definition of a measure is
  -- a mapping from sets to numbers
  -- positive
  -- additive on the countable union of disjoint sets

And a probability measure has the further property of being
  -- bounded above

I have checked that -- with due attention to trivial details --
.5 ^ (program length) satisfies this definition.  If you wish to
renew the assertion that there is no such probability measure, please
explain which of the axiomatic requirements is not met.  Please be

 For that measure, we can intelligently discuss the entropy of a
 specific random string, without reference to a probability model.

That's like saying we can talk about three-dimensional force, velocity,
and acceleration without reference to vectors.

Measure theory is the tried-and-true formalism for dealing with random
strings.  It would be spectacularly contrary to ignore the formalism,
and just plain wrong to say the formalism is inapplicable.

Indeed, what's the probability distribution of the sequence of bits
defined by Chaitin's Omega?  

Huh?  Omega is so obviously a probability that usually the word probability
is used in its definition.  See e.g.
I suppose a masochistic nitpicker could demand to see a proof that this word
is justified, but I'm not going to bother, for reasons given above.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: passphrases with more than 160 bits of entropy

2006-03-22 Thread John Denker

Matt Crawford wrote:

I so often get irritated when non-physicists discuss entropy.  The  word 
is almost always misused. 

Yes, the term entropy is often misused ... and we have seen some
remarkably wacky misusage in this thread already.  However, physicists
do not have a monopoly on correct usage.  Claude S was not a physicist,
yet he definitely knew what he was talking about.  Conversely, I know
more than a few card-carrying physicists who have no real feel for what
entropy is.

I looked at Shannon's definition and  it is 
fine, from a physics point of view.  


But if you apply  thoughtfully to a 
single fixed sequence, you correctly get the answer  zero.

I agree with all that, except for the But.  Shannon well knew that
the entropy was zero in such a situation.

If your sequence is defined to be { 0, 1, 2, ..., 255 }, the  
probability of getting that sequence is 1 and of any other sequence,  
0.  Plug it in.


If you have a generator of 8-bit random numbers and every sample is  
independent and uniformly distributed, and you ran this for a  gazillion 
iterations and wrote to the list one day saying the special  sequence { 
0, 1, 2, ..., 255 } had appeared in the output, that's a  different 
story.  But still, we would talk about the entropy of your  generator, 
not of one particular sequence of outputs.

Yes.  Shannon called it the source entropy, i.e. the entropy of
the source, i.e. the entropy of the generator.

Perry Metzger wrote:

Usually, the best you can do is produce (bad) bounds, and sometimes
not even that.

Huh?  What's your metric for usually?  I'll agree as a matter of
principle that whatever you're doing, you can always do it wrong.
But that doesn't prevent me from doing it right.  I can use physics
to produce good bounds, that is,


The problem posed by the OP is trivial, and good solutions have already
been posted.  To recap: SHA-512 exists, and if that isn't good enough,
you can concatenate the output of several different one-way functions.
You can create new hash functions at the drop of a hat by prepending
something (a counter suffices) to the input to the hash.

Example:  result = hash(1 || pw)  ||  hash(2 || pw)  ||  hash(3 || pw)

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: GnuTLS (libgrypt really) and Postfix

2006-02-15 Thread John Denker

James A. Donald wrote:

The correct mechanism is exception handling.

Yes, I reckon there is a pretty wide consensus that exceptions
provide a satisfactory solution to the sort of problems being
discussed in this thread.

If caller has provided a mechanism to handle the failure, that
mechanism should catch the library generated exception.  If the caller
has provided no such mechanism, his program should terminate

OK ... although I wouldn't make a virtue of doing things ungracefully.

Unfortunately, there is no very portable support for exception
handling in C.   There is however support in C++, Corn, D, Delphi,
Objective-C, Java, Eiffel, Ocaml, Python, Common Lisp, SML, PHP and
all .NET CLS-compliant languages.

That raises the question of whether mission-critical applications
should be written in C.

It is straightforward but laborious to simulate exception-throwing
in C:
extern int errno;
/* try some stuff */
if (errno) return; /* return immediately on any error */
/* try some more stuff */
if (errno) return; /* return immediately on any error */
et cetera.
This is laborious and inelegant, but no more so than the other gajillion
things you need to do to provide any semblance of security in C, such as
computing the (N) argument to strncat(,,N).

In particular, if-errno-return is often much preferable to some other
tricks that have recently been suggested, such as raising SIGABRT or
passing a pointer to a function to be called when exceptional conditions
are detected.  The reason is that throwing an exception _pops_ the
stack until a catcher if found, whereas signals and function-calls just
push deeper into the stack ... they allow you to regain some control,
but they don't make it easy to regain control _at the right place_.

For completeness, let me say again that _exit() is just an exception
that is caught by the parent process.  If you are ever forced to deal
with a package that exits when it shouldn't, it is straightforward to
create a wrapper that does a fork, calls the package, and collects
the exit status.  That is, rather than:
rslt = nasty(a, b);   /* might trash the whole process */
we have:
rslt = forku(nasty, a, b);/* better */
if (errno) return;
But of course I hope you never have to face such a problem.  If at
all possible, use a language that supports exceptions.

Absent exception handling, mission critical tasks should have no
exceptions, which is best accomplished by the die-on-error standard.

No, that is not the best way nor even a good way, let alone a standard.

Halt on error is a tool for achieving error free code.  Error free
code is in fact achievable for really crucial applications.  The more
crucial the application, the more reason to write code that halts on

Halting on every exceptional condition is like amputating to cure
every headache.

Keep in mind Dykstra's dictum:  testing can perhaps show the presence
of bugs, but testing can never show the absence of bugs.

Exit-on-error is a _problem amplifier_.   It guarantees that small
problems detected during testing will not go unnoticed.  But this is
a very long way from being a guarantee of error-free code.
  a) Remember Dykstra's dictum.
  b) If you knew the code was error free, then by the Red Queen's logic
   you wouldn't even need to check for errors, and there would be no
   need for exit statements.  The contrapositive is that if you put
   checks in the code, you are implicitly admitting that your code
   might not be error free.  And there's the rub:  you cannot possibly
   prove that during deployment (as opposed to during testing) using
   a _problem amplifier_ won't make things worse rather than better.

Whatever happened to doing what's best for the customer?  Doing what's
most convenient for the programmer during testing, while making things
worse for the customer during deployment ... that seems remarkably

Last but not least, I object (again!) to the false dichotomy, i.e.
the allegation that exceptional conditions must either
  a) result in an abort, or
  b) go undetected.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: GnuTLS (libgrypt really) and Postfix

2006-02-13 Thread John Denker

David Wagner wrote:

This just shows the dangers of over-generalization.

One could make an even stronger statement about the dangers of
making assumptions that are not provably correct.

Of course, we have to decide which is more important: integrity,
or availability.  

That is a false dichotomy.

I suspect that in the overwhelming majority (perhaps
all) of the cases where libgcrypt is used, integrity is more important
than availability.  If that is true, well, if in doubt, it's better to
fail closed than to fail open.

Again:  False dichotomy is a fallacy, and has been recognized for
2000 years as such.  Showing that one extreme is bad does not
prove that the opposite extreme is good.

The whole point of my previous note was to argue for more nuanced
handling.  Progressing from one knee-jerk handing to a choice
between two knee-jerk handlings is not much progress.

I think the attitude that it's better to die than to risk letting an
attacker take control of the crypto library is defensible, in many cases.

Again, that's the wrong question;  it's not an either/or proposition.
We can agree that letting the attacker take control of the situation
is a Bad Thing, but it is preposterous to think that exiting is
provably correct in all situations where the library might be put
to use.

It is just plain arrogant for low-level code to arrogate to itself
a decision that rightfully belongs to higher-level code.

Werner Koch wrote in part:

Sure, for many APIs it is posssible to return an error code but this
requires that the caller properly checks error codes.  We have all
seen too many cases were return values are not checked and the
process goes ahead assuming that everything went well 

That is narrowly true as stated, but it does not prove that exiting
is the correct thing to do.

That might lead to an argument in favor of exceptions instead of error
codes, along the following lines:
 -- Naive code doesn't catch the exception.  However (unlike returned
  error codes) this does not cause the exception to be lost.
 -- The exception percolates up the call-tree until it is caught by
  some non-naive code (if any).
 -- If all the code is naive, then the uncaught exception terminates
  the process ... to the delight of the exit on error faction.
  However (!!!) unlike a plain old exit, throwing an exception leaves
  the door open for non-naive code to implement a nuanced response to
  the exceptional condition.

Again, enough false dichotomies already!  Just because error codes are
open to abuse doesn't mean exiting is the correct thing to do.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: thoughts on one time pads

2006-01-31 Thread John Denker

Anne  Lynn Wheeler wrote:

is there any more reason to destroy a daily key after it as been used
than before it has been used?

That's quite an amusing turn of phrase.  There are two ways to
interpret it:

*) If taken literally, the idea of destroying a key _before_ it is
 used is truly an ingenious way to ensure security.  Alas there is
 some degradation of functionality, but isn't that always the case?
 Also the cost of key distribution goes way down once you decide you
 will only distribute already-destroyed keys.

*) Perhaps the intent was to speak about _protecting_ keys before and
 after use.  That's somewhat trickier to do securely, and is more
 dependent on the threat model ... but offers vastly greater functionality.

 -- The best way to _protect_ a key after it has been used is to destroy

 -- For keys that have yet been used, a sufficient scheme (not the only
  scheme) for many purposes is to package the keys in a way that is
  tamper-resistant and verrry tamper-evident.

  The package must be tamper-evident in order to be secure. If there are
  signs of tampering, don't use the keys.

  The package must be at least somewhat tamper-resistant in order to
  protect the functionality against a too-easy DoS attack, i.e.
  superficial tampering.

one of the attacks on the stored-value gift cards has been to skim the
cards in the racks (before they've been activated) ... and check back
later to see which cards are gone.

That indicates a gross lack of tamper-evident packaging, as discussed
above.  The store should never have activated a card that came from a
package that had been tampered with.

Travis H. wrote:

What about degaussing?

That's even funnier.  Most CDs and DVDs are totally non-magnetic to begin
with.  Degaussing them is not going to have much effect.

There are, of course, NSA-approved degaussers for magnetic media, but
heretofore this thread hasn't been about magnetic media.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: thoughts on one time pads

2006-01-31 Thread John Denker

I forgot to mention in my previous message:

It is worth your time to read _Between Silk and Cyanide_.
That contains an example of somebody who thought really
hard about what his threat was, and came up with a system
to deal with the threat ... a system that ran counter to
the previous conventional wisdom.  It involved protecting
keys before use and destroying them after use.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: thoughts on one time pads

2006-01-27 Thread John Denker

Dave Howe wrote:

Hmm. can you selectively blank areas of CD-RW?

Sure, you can.  It isn't s much different from rewriting any
other type of disk.

There are various versions of getting rid of a disk file.
 1) Deletion:  Throwing away the pointer and putting the blocks back
  on the free list.  This is well known to be grossly insecure.
 2) Zeroizing the blocks in place (followed by deletion).  This
  is vastly better, but still not entirely secure, because there
  are typically stray remnants of the pattern sitting beside
  the nominal track, and a sufficiently-determined adversary
  may be able to recover them.
 3) Trashing the blocks, i.e. overwriting them in place with
  crypto-grade random numbers (followed by optional zeroizing,
  followed by deletion).  This makes it harder for anyone to
  recover strays.
 4) Half-track trashing.  This requires wizardly disk hardware,
  which shifts the head half a track either side of nominal,
  and *then* writes random numbers.  I might be persuaded that
  this really gets rid of strays.
 5) Grinding the disk to dust.  AFAIK this is the only NSA-approved
  method.  A suitable grinder costs about $1400.00.

  One drawback with this is that you have to destroy a whole
  disk at a time.  That's a problem, because if you have a
  whole disk full of daily keys, you want to destroy each
  day's key as soon as you are through using it.  There
  are ways around this, such as reading the disk into volatile
  RAM and then grinding the disk ... then you just have to make
  sure the RAM is neither more volatile nor less volatile than
  you wanted it to be.  That is, you use the disk for *distribution*
  but not necessarily for intermediate-term storage.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: permutations +- groups

2005-12-22 Thread John Denker

Ben Laurie wrote:

Good ciphers aren't permutations, though, are they? Because if they
were, they'd be groups, and that would be bad.

There are multiple misconceptions rolled together there.

1) All of the common block ciphers (good and otherwise) are permutations.
 To prove this, it suffices to require that ciphertext blocks be the
 same length as plaintext blocks, and that arbitrary text can be enciphered
 and (given the proper key) uniquely deciphered.  It's an elementary
 pigeon-hole argument:  each plaintext block must map to one and only one
 ciphertext block.

2) If you consider the set of all imaginable permutations, there will be
 ((2^B) factorial) elements, where B is the blocksize of the cipher.  Call
 this set U.  The set U is closed under composition, so in fact we necessarily
 have a group.  This is neither a good thing nor a bad thing;  it just is.

3) However, the group mentioned in item (2) is not what people mean when
 they talk about a cipher having the group property.  Let me illustrate
 using plain old DES.  There are at most 2^56 keys.  Therefore let us
 consider the set of all 2^56 /keyable/ permutations;  call this set K.
 This is a verrry small subset of the ((2^64) factorial) imaginable

4) The interesting question is whether K is closed under composition.  This
 is of particular interest if you are trying to cobble up an improved cipher
 with an N-times longer key, by layering N copies of the original cipher.
 This is guaranteed to be a waste of effort if K is closed under composition,
 i.e. if K is in fact a group, i.e. a subgroup of U.

 The converse does not hold;  showing that K is not closed does not suffice
 to show that layering is a good idea.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

packet traffic analysis

2005-10-31 Thread John Denker

Travis H. wrote:

Part of the problem is using a packet-switched network; if we had
circuit-based, then thwarting traffic analysis is easy; you just fill
the link with random garbage when not transmitting packets.  

OK so far ...

There are two problems with this; one, getting
enough random data, and two, distinguishing the padding from the real
data in a computationally efficient manner on the remote side without
giving away anything to someone analyzing your traffic.  I guess both
problems could be solved
by using synchronized PRNGs on both ends to generate the chaff. 

This is a poor statement of the problem(s), followed by a solution that
is neither necessary nor sufficient.

1) Let's assume we are encrypting the messages.  If not, the adversary
can read the messages without bothering with traffic analysis, so the
whole discussion of traffic analysis is moot.

2) Let's assume enough randomness is available to permit encryption
of the traffic ... in particular, enough randomness is available
_steady-state_ (without stockpiling) to meet even the _peak_ demand.
This is readily achievable with available technology.

3) As a consequence of (1) and (2), we can perfectly well use _nonrandom_
chaff.  If the encryption (item 1) is working, the adversary cannot tell
constants from anything else.  If we use chaff so that the steady-state
traffic is indistinguishable from the peak traffic, then (item 2) we
have enough randomness available;  TA-thwarting doesn't require anything

4) Let's consider -- temporarily -- the scenario where the encryption is
being done using IPsec.  This will serve to establish terminology and
expose some problems heretofore not mentioned.

4a) IPsec tunnel mode has inner headers that are more than sufficient
to distinguish chaff from other traffic.  (Addressing the chaff to UDP
port 9 will do nicely.)

4b) What is not so good is that IPsec is notorious for leaking information
about packet-length.  Trying to make chaff with a distribution of packet
sizes indistinguishable from your regular traffic is rarely feasible, so
we must consider other scenarios, somewhat like IPsec but with improved

5) Recall that IPsec tunnel mode can be approximately described as IPIP
encapsulation carried by IPsec transport mode.  If we abstract away the
details, we are left with a packet (called an envelope) that looks like

| outer header | inner header | payload |  [1]

where the inner header and payload (together called the contents of
the envelope) are encrypted.  (The + signs are meant to be opaque
to prying eyes.) The same picture can be used to describe not just
IPsec tunnel mode (i.e. IPIP over IPsec transport) but also GRE over
IPsec transport, and even PPPoE over IPsec transport.

Note:  All the following statements apply *after* any necessary
fragmentation has taken place.

The problem is that the size of the envelope (as described by the length
field in the outer header) is conventionally chosen to be /just/ big
enough to hold the contents.  This problem is quite fixable ... we just
need constant-sized envelopes!  The resulting picture is:

| outer header | inner header | payload | padding |[2]

where padding is conceptually different from chaff:  chaff means packets
inserted where there would have been no packet, while padding adjusts the
length of a packet that would have been sent anyway.

The padding is not considered part of the contents.  The decoding is
unambiguous, because the size of the contents is specified by the length
field in the inner header, which is unaffected by the padding.

This is a really, really tiny hack on top of existing protocols.

If your plaintext consists primarily of small packets, you should set the MTU
of the transporter to be small.   This will cause fragmentation of the
large packets, which is the price you have to pay.  Conversely, if your
plaintext consists primarily of large packets, you should make the MTU large.
This means that a lot of bandwidth will be wasted on padding if/when there
are small packets (e.g. keystrokes, TCP acks, and voice cells) but that's
the price you have to pay to thwart traffic analysis.  (Sometimes you can
have two virtual circuits, one for big packets and one for small packets.
This degrades the max performance in both cases, but raises the minimum
performance in both cases.)
  Remark: FWIW, the MTU (max transmission unit) should just be called
  the TU in this case, because all transmissions have the same size now!

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: packet traffic analysis

2005-10-31 Thread John Denker

In the context of:

If your plaintext consists primarily of small packets, you should set the MTU
of the transporter to be small.   This will cause fragmentation of the
large packets, which is the price you have to pay.  Conversely, if your
plaintext consists primarily of large packets, you should make the MTU large.
This means that a lot of bandwidth will be wasted on padding if/when there
are small packets (e.g. keystrokes, TCP acks, and voice cells) but that's
the price you have to pay to thwart traffic analysis.

Travis H. wrote:

I'm not so sure.  If we're talking about thwarting traffic on the link
level (real circuit) or on the virtual-circuit level, then you're
adding, on average, a half-packet latency whenever you want to send a
real packet. 

I very much doubt it.  Where did that factor of half come frome.

I don't see any reason why it's necessary to pay these costs if you
abandon the idea of generating only equal-length packets 

Ah, but if you generate unequal-length packets then they are
vulnerable to length-analysis, which is a form of traffic analysis.
I've seen analysis systems that do exactly this.  So the question is,
are you trying to thwart traffic analysis, or not?

I should point out that encrypting PRNG output may be pointless, 

*is* pointless, as previously discussed.

perhaps one optimization is to stop encrypting when switching on the

A better solution would be to leave the encryption on and use constants
(not PRNG output) for the chaff, as previously discussed.

Some minor details
involving resynchronizing when the PRNG happens to

The notion of synchronized PRNGs is IMHO crazy -- complicated as well as
utterly unnecessary.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: EDP (entropy distribution protocol), userland PRNG design

2005-10-24 Thread John Denker

I've been following this thread for a couple of weeks now, and so
far virtually none of it makes any sense to me.

Back on 10/12/2005 Travis H. wrote:

I am thinking of making a userland entropy distribution system, so
that expensive HWRNGs may be shared securely amongst several machines.

What evidence is there that HRNGs are expensive?  How many machines do
you have?  How many of them already have soundcards?  How much entropy
do they need (bits per second)?

The obvious solution is to put a high-performance low-cost HRNG in each machine.
Is there some reason why this cannot be done?  If so, please explain.

Otherwise, this whole discussion seems like a futile exercise, i.e. trying
to find the optimal way of doing the wrong thing.

]] ABSTRACT: We discuss the principles of a High-Entropy Randomness Generator 
(also called a True
]] Random Number Generator) that is suitable for a wide range of applications, 
]] cryptography, high-stakes gaming, and other highly adversarial applications. 
It harvests entropy
]] from physical processes, and uses that entropy efficiently. The hash 
saturation principle is used
]] to distill the data, resulting in virtually 100% entropy density. This is 
calculated, not
]] statistically estimated, and is provably correct under mild assumptions. In 
contrast to a
]] Pseudo-Random Number Generator, it has no internal state to worry about, and 
does not depend on
]] unprovable assumptions about “one-way functions”. We also describe a 
low-cost high-performance
]] implementation, using the computer’s audio I/O system.

For details, see

 Here's the algorithm from generation to use:

1) Entropy harvested from HWRNG.

OK so far.

2) Entropy mixed with PRNG output to disguise any biases present in
source.  ...   (Is XOR sufficent and desirable?)

If it were a decent HRNG it would have this built in.  XOR is not even
remotely sufficient.

3) Entropy used as truly random input in an extractor to map
somewhat random input (interrupt timing, memory contents, disk head
settling times) into strongly random output.

What's an extractor?  What is needed is a compressor.

4) Entropy passed through OWF to occlude state of previous systems in
this chain.

A decent HRNG is stateless and does not need any one-way functions.

5?) Entropy ciphered with a randomly-generated key (taken from the
previous step), rotated periodically.

A decent HRNG does not need any such encipherment.

Similarly, I also would like to use ID Quantique's HWRNG based on


but their modules are sealed and opaque.  What I want to do is
explore what kind of assurances I can make about the output, based on
assumptions about the attacker's ability to control, predict, or
observe one of the sources.

Such assurances are discussed at:

5) Do it in a language not as prone to security-relevant errors as C
and containing support for large numbers and bitstrings as first-class

turbid is already written in C++ for this reason.  Strings and suchlike
are part of the language, defined in the Standard Template Library.

1) Lack of standardization in the naming or semantics of kernel
facilities, such as the names of devices in /dev.

The semantics is just broken ... which is why turbid defines and
implements /dev/hrandom with its own semantics.  Optionally it can
feed entropy to /dev/[u]random for the benefit of legacy applications
under certain limited conditions.

2) Lack of support for sockets in the target language.

Really not a problem with turbid.

3) The use of ioctls for interfacing to sources of entropy in the kernel.

Really not a problem with turbid.

4) The use of tty devices to interface to HWRNGs

Really not a problem with turbid.

5) Multiple clients petitioning the daemon for random bits at once. 

Really not a problem with turbid.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

continuity of identity

2005-09-27 Thread John Denker

Jerrold Leichter mentioned that:

 a self-
signed cert is better than no cert at all:  At least it can be used in an 
SSH-like continuity of identity scheme.

I agree there is considerable merit to a continuity of identity

But there are ways the idea can be improved.  So let's discuss it.

For starters, let me suggest that rather than having a self-signed
certificate of the type created more-or-less automatically when
you set up your Apache server or set up your SSH daemon, it makes
more sense to set up your own CA and issue your own certs from
there.  In some sense this is just a different type of self-signing,
but it adds a possibly useful layer of indirection.

One advantage is that one CA can issue lots of certs, so if you
are (say) a hosting service provider running N hosts, you only
need to have one CA.  The idea is that your users can have
a notion of continuity of your CA.  This notion is more powerful
than having N separate lines of continuity, one for each of your
hosts.  That's because each line of continuity is vulerable at
its beginning.  Having one CA reduces the vulnerabilty by a factor
of N, or at least a factor of M, where M tells us how many of
your hosts are likely to be visited by a typical visitor over
the course of time.

Another advantage is that you can keep your CA on a separate
machine, with a security policy radically stricter than the
policy on the N hosts.  That means if one of your hosts is
compromised, they get only the host key, not the CA private
key.  New host keys can be generates as soon as the compromise
is remedied, and assuming the old host keys have a reasonably
short expiration time, then your total vulnerability is
bounded, much more tightly bounded than in any commonly-used
scheme.  (I don't know of anybody who renews his Verisign-issued
certs every day, or even every week.)

In a similar vein, rapid expiration of host certs limits how
careful you have to be with backup takes, et cetera.

Another advantage is that it might shrink your known_hosts
file ... reducing it by a factor of M or thereabouts, since it
now becomes, in effect, a known_ca file.

This scheme is in some sense doable already, since modern browsers
have a way of importing new CA keys under user control.  If you
can persuade a user to go through those steps you're all set.
However, this process is sufficiently arcane and cumbersome that
it must be considered
 -- in the short term, a serious drawback to my continuity-of-CA
  scheme, or
 -- in the medium term, a tremendous opportunity for improvement.

In particular, it would be nice if both SSH and the web protocols
had a way of piggybacking a CA public key along with each cert
signed by that CA, so that SSH clients and web browsers could
give the users a relatively Muggle-friendly way of deciding
 *) to trust the CA henceforth
 *) to trust only this particular host henceforth
 *) neither.

By default, such a CA should be limited to signing only one
domain and its subdomains, such as * (as opposed
to signing *.com).

Setting up a CA is no big deal.  There is a HOWTO by Ng Pheng Siong:


I realize that such a scheme is open to abuse ... but on the whole,
I expect it would solve more problems than it creates.

Comments, anyone?

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Clearing sensitive in-memory data in perl

2005-09-17 Thread John Denker

Victor Duchovni wrote:

So wouldn't the world be a better place if we could all agree on a 
single such library? Or at least, a single API. Like the STL is for C++.

Yes, absolutely, but who is going to do it?

One could argue it has already been done.

There exists a widely available, freely available, well-implemented
example of something just like the STL for C++.  It is called
the STL for C++.

a) Writing in C++ is easier than writing in C.

b) Porting legacy C to C++ isn't rocket surgery.  It can be done

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: solving the wrong problem

2005-08-07 Thread John Denker

Adam Shostack wrote:

Here's a thought:

Putting up a beware of dog sign, instead of getting a dog.

That's an interesting topic for discussion, but I don't think
it answers Perry's original question, because there are plenty
of situations where the semblence of protection is actually a
cost-effective form of security.  It's an example of statistical

Look at it from the attacker's point of view:  If a fraction X
of the beware-of-dog signs really are associated with fierce dogs,
while (1-X) are not, *and* the attacker cannot tell which are which,
and there are plenty of softer targets available, the attacker
won't risk messing with places that have signs, because the
downside is just too large.  The fraction X doesn't need to be
100%;  even a smallish percentage may be a sufficient deterrent.
OTOH of course if the sign-trick catches on to the point where
everybody has a sign, the sign loses all value.

We can agree that the dog-sign is not a particularly good
application of the idea of statistical enforcement, because
there are too many ways for the attacker to detect the
absence of a real dog.

A better example of statistical deterrence is traffic law
enforcement.  The cops don't need to catch every speeder every
day;  they just need to catch enough speeders often enough,
and impose sufficiently unpleasant penalties.  The enforcement
needs to be random enough that would-be violators cannot
reliably identify times and places where there will be no

Statistical enforcement (if done right) is *not* the same as
security by obscurity.

This is relevant to cryptography in the following sense:  I doubt
cryptological techniques alone will ever fully solve the phishing
problem.  A more well-rounded approach IMHO would include sting
operations against the phishers.  Even a smallish percentage
chance that using phished information would lead to being arrested
would reduce the prevalence of the problem by orders of magnitude.


Let me propose another answer to Perry's question:
  Wearing a millstone around your neck to ward off vampires.

This expresses both ends of a lose/lose proposition:
  -- a burdensome solution
  -- to a fantastically unimportant problem.

This is related to the anklets on the White Knight's horse,
to guard against the bites of sharks ... with added emphasis
on the burdensomeness of the solution.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: solving the wrong problem

2005-08-06 Thread John Denker

Perry E. Metzger wrote:

We need a term for this sort of thing -- the steel tamper
resistant lock added to the tissue paper door on the wrong vault
entirely, at great expense, by a brilliant mind that does not
understand the underlying threat model at all.

Anyone have a good phrase in mind that has the right sort of flavor
for describing this sort of thing?

In a similar context, Whit Diffie once put up a nice
graphic:  A cozy little home protected by a picket fence.
he fence consisted of a single picket that was a mile
high ... while the rest of the perimeter went totally

So, unless/until somebody comes up with a better metaphor,
I'd vote for one-picket fence.

I recognize that this metaphor is not sufficiently
pejorative, because a single picket is at least
arguably a step in the right direction, potentially
a small part of a real solution.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: ID theft -- so what?

2005-07-13 Thread John Denker

On 07/13/05 12:15, Perry E. Metzger wrote:

However, I would like to make one small subtle point. 
... the use of widely known pieces of information about
someone to identify them. 

Yes, there are annoying terminology issues here.

In the _Handbook of Applied Cryptography_ (_HAC_)
 -- on page 386 they say The terms _identification_
  and _entity authentication_ are used synonymously
  throughout this book  (which in fact they are :-)
 -- on page 24 they say the term _authentication_ is
  often abused.

It seems to me that the term _identification_ is even more
ambiguous and open to misunderstanding than authentication
is.  Overall, it's a quagmire.

In some circles, the notion of _identifying information_
is quite a weak notion, meaning sufficient information
to pick the desired entity out of a crowd.  More-or-less
synonymous notions include
 *) characterization
 *) description (sufficiently detailed to be unique)
 *) unverified _claim_ of identity
 *) pointer, indicator, index
 *) name, call-sign, handle
 *) record-locator, database-key
-- used as an index into the database,
-- *not* a key in the sense of lock-and-key
-- example: LOGNAME i.e. the first field in each
   record in the /etc/passwd database

In other circles, _identification_ refers to a much stronger
notion.  When the military speaks of IFF (identification,
friend or foe) they don't consider it sufficient for you
to be able to _describe_ a friend, they actually want you
to _be_ a friend.

As to whether the word _identity_ should refer to full-blown
entity authentication, or to a mere characterization or
call-sign ... that seems like an unanswerable question.

== My recommendation:  Avoid using terms like identity
and identification entirely.
-- If you mean entity authentication, use that term
 in preference to indentification.
-- If you mean a mere description, handle, record-locator,
 etc. use those terms.

It would be nice to have a convenient catch-all term for
the whole category of notions like description, handle,
record-locator, et cetera.  I don't recommend calling them
weak identification, because the term weak authentication
has already been taken, and means something else, namely
passwords and the like (_HAC_ page 388).

The only time that you can even dream of using a detailed
description as a means of entity authentication is when
meeting face to face.  Somebody who fits my description
in full detail probably is me, although even that isn't
entirely certain.

On the other side of that coin, in a typical e-commerce
situation, allowing a description or a call-sign to serve
in place of entity authentication is ludicrous.  It means
that anybody who can describe me can impersonate me.  The
vulerability to replay attacks and MITM attacks is unlimited.

Typically a full-blown entity authentication starts with one
party making a _claim_ of identity which the other party
then _verifies_.   Unix login is a familiar example:  first
I give my LOGNAME and then I give the corresponding password.

The notion of theft of my LOGNAME is vacuuous.  My LOGNAME
(jsd) is known to everybody, good guys and bad guys alike.
As Spike said, so what?  My LOGNAME is nothing more than a
handle, a call-sign, a record-locator, used to look up the
appropriate record in places like /etc/passwd.

If you want to impersonate me, my computer requires you to
know not just my LOGNAME but also my password.  The way we
should treat passwords is verrry different from the way we
should treat call-signs.

Using this refined terminology, I can clarify what I was
trying to say yesterday:
 1) Being able to _describe_ me (height, weight, date of birth,
SSN, LOGNAME, and great-great-grandmother's maiden name) does
not mean you _are_ me.
 2) Even fully identifying and authenticating me as me doesn't
suffice to prove that wish to authorize this-or-that financial
transaction.  Who I *am* and what I wish to *happen* are
dramatically different notions.  Authenticating me as an entity
is not a proper substitute for authenticating the transaction
itself.  These notions are not unrelated, but they are not
identical, either.

In present-day practice, SSNs, credit card numbers, and
various bits of personal description are suspended in some
weird limbo: they are not nearly as secret as passwords
should be, yet they are still treated by some parties as
if they had magical entity-authenticating power and even
transaction-authenticating power.

So where do we go from here?  In general:
 -- When we think and when we speak, always distinguish
  handle versus entity authentication versus transaction
 -- Don't entrust our money to institutions that can't
  reliably make that distinction.
 -- Obtain legislation so that Muggles are protected, not
  just us.

Also:  A critical step that phishers must take in order to
exploit phished information is to check its validity.  Therefore
banks *must* be required to perform entity-authentication on

ID theft -- so what?

2005-07-12 Thread John Denker

I am reminded of a passage from Buffy the Vampire Slayer.
In the episode Lie to Me:

  BILLY FORDHAM:  I know who you are.
  SPIKE:  I know who I am, too.  So what?

My point here is that knowing who I am shouldn't be a
crime, nor should it contribute to enabling any crime.
Suppose you know who I am.  Suppose you know my date of
birth, social security number, and great-great-grandmother's
maiden name.  As Spike said, so what?

It's only a problem if somebody uses that _identifying_
information to spoof the _authorization_ for some

And that is precisely where the problem lies.  Any
system that lets _identification_ serve as _authorization_
is so incredibly broken that it is hard to even discuss
it.  I don't know whether to laugh or cry.

Identifying information cannot be kept secret.  There's
no point in trying to keep it secret.  Getting a new
SSN because the old one is no longer secret is like
bleeding with leeches to cure scurvy ... it's completely
the wrong approach.  The only thing that makes any sense
is to make sure that all relevant systems recognize the
difference between identification and authorization.

Repeat after me:  identification is not authorization.
Identification is not authorization.

When people talk about authentication factors such as
  a) something I know
  b) something I have
  c) something I know
it is crucial to keep in mind that item (a) must be something
I know _that other people don't know_.  Identifying information
doesn't qualify, and cannot possibly qualify.  My SSN is not
a password.  It lacks many of the properties that a password
should have.

Credit-card numbers, in practice, do little more than
identify me and my account.  They are not handled the way
passwords should be handled.

Eliminating ludicrously broken authentication schemes is
something we should work on.  Password theft is something
we should try to prevent.  But when it comes to ID theft,
we should say: So what?

I've been saying this for years, but it seems timely to say
it again.

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: WYTM - but what if it was true?

2005-06-27 Thread John Denker

On 06/27/05 00:28, Dan Kaminsky wrote:

... there exists an acceptable solution that
keeps PC's with persistent stores secure.  A bootable CD from a bank is
an unexpectedly compelling option

Even more compelling is:
 -- obtain laptop hardware from a trusted source
 -- obtain software from a trusted source
 -- throw the entire laptop into a GSA-approved safe when
  not being used.

This is a widely-used procedure for dealing with classified

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Entropy and PRNGs

2005-01-26 Thread John Denker
Ed Gerck wrote:
Let me comment, John, that thermal noise is not random
When did you figure that out?  If you'd been paying attention,
you'd know that I figured that out a long time ago.
First of all, the phrase not random is ambiguous.  I said
Some people think random should denote 100% entropy density,
and anything less than that is non-random even if it
contains a great deal of unpredictability. Other folks think
that random refers to anything with an entropy density
greater than zero, and non-random means completely
Thermal noise, as it comes off the hardware, has an entropy
density greater than zero and less than 100%, as I said at
and elsewhere.
There are several quantities that can be estimated in thermal
noise, reducing its entropy according to what you seem to expect
today. See photon bunching, as an example that is usually ignored.
Another, even though trivial, example is due to the observation that
thermal noise is not white noise. Yet another observation is that no
noise is really white, because of causality (in other words, it's
duration must be finite). The noise that is due to photon fluctuations
in thermal background radiation, for another example, depends
also on the number of detectors used to measure it, as well as
single- or multiple-mode illumination, and both internal and external
noise sources.
Stop wasting our time.  All that doesn't change the fact that
thermal noise, as it comes off the hardware, has a nonzero
entropy density.  And it is easy to arrange situations where
I can calculate a very useful lower bound on the entropy density.
Yes, it's entirely possible that someone in the future will know
more about your entropy source than you do today! Even thermal
That's tantamount to saying the second law of thermodynamics will
be repealed.  By that standard, it's entirely possible that the
sun will rise in the west tomorrow morning.  But I wouldn't bet
on it.
OTOH, why are nuclear decay processes considered safe as a source
of entropy? Because the range of energies preclude knowing or
tampering with the internal state. These processes are, however,
not free from correlations either.
Nuclear decay processes are not in any practical sense safer than
thermal noise.  As I discuss at
nuclear is in the same category as thermal:  entropy density
greater than zero and less than 100%.
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Entropy and PRNGs

2005-01-10 Thread John Denker
Referring to
I wrote:
I just took a look at the first couple of pages.
IMHO it has much room for improvement.
David Wagner responded:
I guess I have to take exception.  I disagree.  I think Ben Laurie's
paper is quite good.  I thought your criticisms missed some of the points
he was trying to make (these points are subtle, so this is completely
understandable).  Presumably his paper could be criticized as not clear
enough, since it seems it did not convey those points adequately, but
I don't think his paper is inaccurate. 
Or perhaps it was my critique that was not clear enough.
I am not motivated to look for subtle shades of meaning in a
paper that claims the system clock is a source of entropy.
First of all, you can't throw around the term conditional
entropy without saying what it's conditioned on.

 Conditioned on everything known to the attacker, of course.
Well, of course indeed!  That notion of entropy -- the entropy
in the adversary's frame of reference -- is precisely the
notion that is appropriate to any adversarial situation, as I
have consistently and clearly stated in my writings;  see
e.g. the end of the definitions section, i.e.
If it has no unpredictability, it has no entropy,
according to any reasonable definition of entropy, including the
somewhat loose definition on page 1.
Actually, I think Ben got it right.  Entropy depends on context.
The attacker might have extra context that allows him to narrow down
the possible values of the randomness samples.
Heed your own of course statement.  It is hard to imagine a
situation where my adversary has more context about my
generator than I have myself.
For instance, imagine if we use packet inter-arrival times (measured down
to the nanosecond) as our randomness source. 
I am unable to imagine myself being so silly.
This is the difference between unconditional and conditional entropy that
Ben was trying to introduce.  In information-theoretic notation, H(X)
vs H(X|Y).  Let X = packet inter-arrival time, and Y = everything seen by
a local eavesdropper, and you will see that H(X|Y) can be much smaller
than H(X).  Indeed, we can have H(X|Y) = 0 even if H(X) is very large.
This is Ben's point, and it is a good one.
No.  There is only one entropy that matters in an adversarial
situation.  The so-called unconditional entropy H(X) is
merely a wild overestimate of the only thing that matters.
There is no glory in outstripping donkeys.  (Martial)
There is no honor in introducing H(X) since it is irrelevant
in any adversarial situation.
A counter is fine as long as there is only one machine in the universe
that will ever assign UUIDs.  However, if you want to do distributed
generation of UUIDs, then counters are insufficient because there is no
way to prevent overlap of two machine's counter spaces.
I imagine a smart person such as DAW should be able to come
up with five schemes in five minutes whereby UUID generation
can be delegated to virtually any machine that wants it.
MAC(eth0) /concat/ local counter will do for scheme #1.
Perhaps what Ben should have said is that:
* Unconditional entropy is sufficient for UUIDs;
  conditional entropy is not needed.
* For centrally-assigned UUIDs, even unconditional entropy is unnecessary;
  a centrally-managed counter is fine.
* For distributed, unsynchronized assignment of UUIDs, unconditional
  entropy appears to be necessary and sufficient.
Horsefeathers.  For generating  UUIDs,  _zero_ entropy is
sufficient, and no positive amount of entropy (unconditional
or otherwise) can be called necessary.
I am not interested in ways of obtaining wild overestimates
of zero.
If you want people to believe that unconditional entropy is a
worthwhile concept, you'll have to come up with a nontrivial
application for it.
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Entropy and PRNGs

2005-01-10 Thread John Denker
Ben Laurie wrote:
The point I am trying to make is that predictability is in the eye of 
the beholder. I think it is unpredictable, my attacker does not.
I still cannot see how that can happen to anyone unless
they're being willfully stupid.  It's like something out
of Mad Magazine: White Spy accepts a cigar from Black Spy,
lights it, and is surprised when it explodes.  That's
funny when it happens to somebody else, but as for me,
I'm not going to accept alleged entropy from any source
such that my adversary might know more about it than I do.
I'm just not.
By your argument, no PRNG ever has any entropy, since the inputs are 
clearly known (at least to the PRNG).
I *almost* agree with that, but my real argument is somewhat
more nuanced:
a) Certainly there is a very wide class of PRNGs that
have no entropy whatsoever, including many that Mr. Laurie
seems willing to attribute entropy to.
b) It is also possible, as I have repeatedly explained,
for an ordinary PRNG to have a modest amount of entropy
residing in its internal state.  This entropy must
have abeen obtained from elsewhere, from something
other than a PRNG, not produced _de novo_ by any PRNG.
Categories (a) and (b) share the property of having
no nonzero lower bound on the entropy _density_ of
the output stream;  the entropy density is either
strictly zero (case a) or asymptotically zero (case b).
c) At the opposite extreme, there exist things that
produce 100% entropy density.  These must not be
called PRNGs.  I like the name HESG -- High Entropy
Symbol Generator.
d) Also as I have repeatedly explained, there exist
intermediate cases, where something that works like
a PRNG is coupled to something else that provides
real entropy.  I recommend calling this a SRSG,
i.e. Stretched Random Symbol Generator, since it
isn't just a PRNG and it isn't just a HESG either.
Linux /dev/urandom was an early and unsatisfactory
attempt at an SRSG.  Yarrow, coupled to a good HESG,
is vastly better, and that's what I implemented.
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Entropy and PRNGs

2005-01-10 Thread John Denker
John Kelsey wrote:
If your attacker (who lives sometime in the future, and may
have a large budget besides) comes up with a better model to
describe the process you're using as a source of noise, you
could be out of luck.   The thing that matters is H(X| all
information available to the attacker), which is based on P(X
| all information available to the attacker), which includes a
model that may be better than yours.
I disagree.  For the sources of entropy that I consider
real entropy, such as thermal noise, for a modest payoff I'd
be willing to bet my life -- and also the lives of millions
of innocent people -- on the proposition that no adversary,
no matter how far in the future and no matter how resourceful,
will ever find in my data less entropy than I say there is.
(For instance, under suitable conditions I might say that
there's 159.98 bits of entropy in a 160-bit word.)
Of course I'd want to make approriate checks for implementation
errors, but in principle the entropy is there and the adversary
can't make it go away.  Some guy named Shannon had something to
say about this.
But I think it's of practical value to consider the different
attackers whose information might not include some information
you use for seeding a PRNG.  Some sources of entropy, such as
packet arrival times, are not worth much for attackers on your
local network who are attacking you in real time, but are
quite valuable against attackers who attack you later.  Other
sources of entropy, such as the hash of the contents of your
Windows registry, or a full directory tree from your hard
drive, are worthwhile against real-time attackers without
access to your machine, but worthless against  attackers with
your machine in their hands.  
Can you please exhibit a nonzero lower bound on the entropy
content of the windows registry?  If not, please don't call
it entropy.
In particular, I ask you to consider the case of mass-produced
network appliances as mentioned at
and consider whether, registry or no registry, the boxes will
be waaay too much alike.
In ordinary situations, the windows registry is constructed
by strictly deterministic processes.  No entropy.  If your
adversaries are so lame that they cannot figure out how the
registry is constructed, you're living in some kind of paradise.
Differentiate between measures of entropy.  Collision entropy ...
Let's stay on-topic.
There is only one measure of entropy appropriate to a random
symbol generator.  If I am unable in principle to predict
the output of my HESG, then under mild assumptions that's
what I call entropy.
  By mild assumptions I mean things like assuming my
  machine has not been taken over by the attacker.  This
  assumption is of course common to *all* discussions
  of security algorithms and principles.  I mention it
  only to fend off nit-picks.  There's not much point in
  discussing algorithm A if you're going to turn around
  and tell me your box might be implementing some other
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: entropy depletion

2005-01-08 Thread John Denker
Zooko O'Whielacronx wrote:
I would love to have an information-theoretic argument for the security 
of my PRNG, but that's not what we have,
Yes, and I'd like my goldfish to ride a bicycle, but he can't.
The P in PRNG is for Pseudo, and means the PRNG is relying
on computational intractability, not entropy.
 and I don't think reducing the 
entropy_count by one bit per output bit gets us any closer to such an 
For PRNGs that's true ... which is why I have introduced the
terminology of SRNG, meaning Stretched RNG, which is a
hermaphrodite, being partly PRNG and partly HESG.  Linux
/dev/urandom is an early and less-than-satisfactory attempt
at a SRNG.  Yarrow is vastly better.
For starters, the entropy_count value before you output the bit is 
obviously not a measure of the real information-theoretic entropy in the 
PRNG's state.  It is a guess implemented by a simple algorithm (the 
entropy estimator).  So if we set the resulting entropy_count after 
outputting one bit to be equal to the previous entropy_count - 1, we 
really have no justification for thinking that the resulting 
entropy_count is any closer to the true information-theoretic entropy 
than if you had set it equal to the previous entropy_count - 2.

On the other hand, I've haven't heard an information-theoretic argument 
that the output bit contains a whole bit of entropy. 
Well, that depends on details.  Suppose you seed your PRNG with
1000 bits of entropy, and then put out 50,000 bits of output.
For one typical class of PRNGs, which includes linux /dev/urandom
and excludes yarrow, the first 1000 bits of output will use up
all the entropy, in the sense that the mapping from seeds to
output-strings will be very nearly one-to-one.  To say the
same thing another way, the adversaries are trying to find the
seed by brute-force search of seed-space, if they've guessed
the wrong seed they'll know it with very high probability after
seeing the first 1000 bits of output.  (They could also wait
and know the same thing from the second 1000 bits of output,
so we can have metaphysical arguments about just where the
entropy resides ... but since the adversaries get to see the
output stream in order, our minimax strategy is to assume
they are not stupid and have used the first 1000 bits to
reduce *their* entropy.  The number that I'm decrementing by
one bit per bit is my minimax estimate of the adversaries'
entropy.  If you have some other number that you would like
to decrement by some other rule, that's fine ... but I think
I'm doing the right thing with my number.)
Yarrow, in contrast, might well be configured to use only
100 bits of entropy for the first 50,000 bits of output,
and then reseed itself with another 100 bits for the
second 50,000 bits of output.  This leads to a notion of
average entropy density of the output, which is greater than
zero but less than 100%.
There is a nice 
practical-cryptography argument that an observer gains a whole bit of 
information from seeing that output bit, but in pure 
information-theoretic terms an observer gains less than one bit of 
information from seeing that output.  So perhaps when you output a bit 
from /dev/random you ought to decrement entropy_count by 0.9 instead?
That is not minimax, as explained above.
In general, I've heard no persuasive information-theoretic argument to 
justify the practice of decrementing the entropy_count by 1 bit per 
See above.
 If that practice does indeed ever protect someone from a
cryptanalytic attack on their PRNG, it will not be because the practice 
is Information Theoretically Correct, but because the 
entropy_count-bits-added-per-input-bit minus the 
entropy_count-bits-subtracted-per-output-bit were an engineering fudge 
factor that was turned up high enough to drown out the cryptanalytic 
weakness in the PRNG.
I'm not sure I would have said it that way, but I might agree
with the sentiment.  If the PRNG were secure against
cryptanalysis, including practical cryptanalysis such as
peeking at the state vector, then you wouldn't need more
than an infinitesimal entropy density.
Of course using such a fudge factor has some other costs, including the 
cost of introducing new security risks.  I estimate that the chance of a 
successful attack due to timing attacks, induced failure, taking 
advantage of accidental failure, social engineering, etc. outweighs the 
chance of a successful attack due to cryptanalysis of the PRNG, which is 
why I use /dev/urandom exclusively [*, **].  You may weigh those 
trade-offs differently, but you shouldn't think that by decrementing 
entropy_count you are achieving information-theoretic security.

[*] Of course I have to be aware of the regrettable possibility that 
/dev/urandom has *never* been properly seeded and protect against that 
in user space.
[**] ... and the possibility that the operating system is re-using 
stored random state which it already used just before an unclean shutdown.
Meanwhile, implementing a High 

Re: entropy depletion

2005-01-07 Thread John Denker
Jerrold Leichter asked:
 random number generator this way.   Just what *is*
good enough?
That's a good question.  I think there is a good answer.  It
sheds light on the distinction of pseudorandomness versus
  A long string produced by a good PRNG is conditionally
  compressible in the sense that we know there exists a shorter
  representation, but at the same time we believe it to be
  conditionally incompressible in the sense that the adversaries
  have no feasible way of finding a shorter representation.
In contrast,
  A long string produced by a HESG is unconditionally, absolutely
  incompressible. There does not exist a shorter representation.
  There cannot possibly exist a shorter representation.
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: entropy depletion

2005-01-07 Thread John Denker
I wrote:

A long string produced by a good PRNG is conditionally 
compressible in the sense that we know there exists a
shorter representation, but at the same time we believe it
to be conditionally incompressible in the sense that the
adversaries have no feasible way of finding a shorter

In contrast,
A long string produced by a HESG is unconditionally,
absolutely incompressible. There does not exist a shorter
representation. There cannot possibly exist a shorter
Jerrold Leichter wrote:
You're using Kolgomorv/Chaitin complexity here, but unfortunately that's a
bit hard to apply.  
It works for me.  It is in principle hard to apply exactly, but
in practice it's easy to apply to a good-enough approximation.
In particular, for any given PRNG I can readily exhibit the
compressed representation of its output, namely the PRNG
itself along with the initial internal state.
 K/C complexity is robust
when you talk about sets of possible strings, because playing around with
the machine can only shorten a fixed number of them.  
Yes, I stand corrected.  I sloppily wrote string when I should
have written strings.  Specifically, the set of strings
collectively cannot be compressed at all. As for each string
individually, it is not compressible either, except possibly for
trivially rare cases and/or trivial amounts of compression.
*Any* string has a shorter representation - if I get to
specify the model *after* you choose the string.
Yes, for one particular string, or very small set of strings.
You get to choose the model, but you only get to choose once.
So you can only compress a trivially small subset of all the
output strings, unless we're talking about a trivial degree
of compression.  In any case, you derive no cryptanalytic
advantage from compression based on an ad-hoc model, so you
might as well not bother, and just stick to some convenient
standard model.
 Turning that into a
notion useful for cryptography is a bit tougher - and in any case, if you
want to get at PRNG's, you need to get at K/C complexity with trapdoor
information:  The bit stream may *look* uncompressible, but given the
internal state, it is easily produced.
Isn't that what I said, as quoted above?  I said it was
conditionally compressible.  It should go without saying
that knowing the trapdoor information, i.e. the internal
state, is the sufficient condition for feasible compressibility.
More generally:  We're talking about stretching entropy here. 
Well, I can agree that that's what we _should_ be talking about.
I like to speak of an SRNG (stretched random number generator)
as having a nonzero lower bound on the entropy density of the
output, as opposed to a traditional PRNG where the entropy
density is asymptotically zero.
 There are
certainly theoretical results in that direction, the one usually mentioned
being the BBS bit generator, which takes k bits of entropy and gives you
p(k) (for some polynomial p) bits that are polynomially-indistinguishable
from random bits.  But you (a) need some significant work to set this up and
prove it; (b) BBS generators are slow.
A simpler approach that does work (in some sense) is:  Choose a random
starting value R, and a number k.  Compute SHA^i(R) for i from 0 to k.  Emit
these values *backwards*.  Then given the first k-1 outputs, an attacker
cannot determine the next one (under the standard hypotheses about SHA).
Unfortunately, this is useless as, say, a key generator:  If you send me
the k-1'st output for use as a key, I can't determine what your *next* key
will be - but I can trivially read your preceding k-2 sessions.
When I wanted to stretch entropy, I implemented the Yarrow
approach, i.e. encrypted counter plus systematic reseeding:
It's not slow, and appears incomparably stronger than the
SHA^i example.  Indeed it does not have any serious weaknesses 
that I know of.  If anybody knows of any flaws, please explain!

The idea that revealing just the hash of random bits doesn't reduce the
effective entropy sounds great, but it's naive. 
We agree it's naive.  Indeed it's just wrong, as I've said N
times over the last couple of days.  And please let's not talk
about effective entropy.  I have no idea what ineffective
entropy is supposed to be, and I can't imagine any way that makes
sense to distinguish effective entropy from plain old entropy.
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: entropy depletion

2005-01-06 Thread John Denker
I wrote:
  Taking bits out of the PRNG *does* reduce its entropy.
Enzo Michelangeli wrote:
By how much exactly?
By one bit per bit.
I'd say, _under the hypothesis that the one-way
function can't be broken and other attacks fail_, exactly zero; in the
real world, maybe a little more. 
If you said that, you'd be wrong.
This is getting repetitious.  As I said before, this is an abuse of the
terminology.  If you want to quantify the goodness of your PRNG, go right
ahead, but please don't apply the word entropy to it.  The word is already
taken.  It means something very specific.
But in
/usr/src/linux/drivers/char/random.c I see that the extract_entropy()
function, directly called by the exported kernel interface
get_random_bytes(), states:
if (r-entropy_count / 8 = nbytes)
r-entropy_count -= nbytes*8;
r-entropy_count = 0;
...which appears to assume that the pool's entropy (the upper bound of
which is POOLBITS, defined equal to 4096) drops by a figure equal to the
number of bits that are extracted (nbytes*8). This would only make sense
if those bits weren't passed through one-way hashing. 
The linux /dev/random driver has lots of problems, but that's not one of 
That makes perfect sense.  Anything else would not make sense.  That's what
entropy is.  If you're not interested in entropy, then go measure whatever
you are interested in, but don't call it entropy.
 Perhaps, a great
 deal of blockage problems when using /dev/random would go away with a more
 realistic estimate.
100% of the blocking problems go away if you use /dev/urandom to the exclusion
of /dev/random.
For the Nth time:
  a) Most of modern cryptography is based on notions of computational 
I'm not saying that's good or bad;  it is what it is.
  b) My point is that there is an entirely different set of notions, including 
notion of entropy and the related of unicity distance, which have got *nothing* 
do with computational intractability.
You can study (a) if you like, or (b) if you like, or both.  Maybe (a) is best
suited to your application, or maybe (b).  But whatever you do, please don't
mistake one for the other.
Lots of things have large amounts of usable randomness, with little or no
entropy.  Please don't mistake one for the other.
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: SSL/TLS passive sniffing

2005-01-04 Thread John Denker
I wrote:
If the problem is a shortage of random bits, get more random bits!
Florian Weimer responded:
We are talking about a stream of several kilobits per second on a busy
server (with suitable mailing lists, of course).  This is impossible
to obtain without special hardware.
Not very special, as I explained:
Almost every computer sold on the mass market these days has a sound
system built in. That can be used to generate industrial-strength
randomness at rates more than sufficient for the applications we're
talking about.  
How many bits per second can you produce using an off-the-shelf sound
card?  Your paper gives a number in excess of 14 kbps, if I read it
correctly, which is surprisingly high.
1) You read it correctly.
2) The exact number depends on details of your soundcard.  14kbits/sec
was obtained from a plain-vanilla commercial-off-the-shelf desktop
system with AC'97 audio.  You can of course do worse if you try (e.g.
Creative Labs products) but it is easy to do quite a bit better.
I obtained in excess of 70kbits/sec using an IBM laptop mgfd in
3) Why should this be surprising?
It's an interesting approach, but for a mail server which mainly sends
to servers with self-signed certificates, it's overkill.  
Let's see
 -- Cost = zero.
 -- Quality = more than enough.
 -- Throughput = more than enough.
I see no reason why I should apologize for that.
Debian also
supports a few architectures for which sound cards are hard to obtain.
And we would separate desktop and server implementations because the
sound card is used on desktops.  I'd rather sacrifice forward secrecy
than to add such complexity.
As the proverb says, no matter what you're trying to do, you can always
do it wrong.  If you go looking for potholes, you can always find a
pothole to fall into if you want.
But if you're serious about solving the problem, just go solve the
problem.  It is eminently solvable;  no sacrifices required.
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: SSL/TLS passive sniffing

2004-12-22 Thread John Denker
Florian Weimer wrote:
Would you recommend to switch to /dev/urandom (which doesn't block if
the entropy estimate for the in-kernel pool reaches 0), and stick to
generating new DH parameters for each connection, 
No, I wouldn't.
 or ...
generate them once per day and use it for several connections?
I wouldn't do that, either.

If the problem is a shortage of random bits, get more random bits!
Almost every computer sold on the mass market these days has a sound
system built in. That can be used to generate industrial-strength
randomness at rates more than sufficient for the applications we're
talking about.  (And if you can afford to buy a non-mass-market
machine, you can afford to plug a sound-card into it.)
For a discussion of the principles of how to get arbitrarily close
to 100% entropy density, plus working code, see:
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: IPsec +- Perfect Forward Secrecy

2004-12-05 Thread John Denker
OK, let me ask a more specific question.  Actually, let me
put forth some hypotheses about how I think it works, and
see if anyone has corrections or comments.
0) I'm not sure the words Perfect Forward Secrecy convey what
we mean when we talk about PFS.  Definition 12.16 in HAC suggests
_break-backward protection_ as an alternative, and I prefer that. 
Perhaps the complementary concept of break-back _exposure_ would
be even more useful.

I think for today we don't have a simple yes/no question as
to whether the secrecy is perfect;  instead we have multiple
quantitative questions as to which connections have how much
break-back exposure.
1) First an ISAKMP SA is set up, then it is used to negotiate
one or more IPsec SAs, which carry the traffic.
2) Ephmeral DH is always used on the ISAKMP SA, so the ISAKMP
session has no more than one ISAKMP session's worth of break-back
exposure.  That is, the attacker who steals an ISAKMP session
key can read that session, but (so far as we know :-) does not
thereby gain any head-start toward reading earlier ISAKMP sessions.
3) Each IPsec SA has its own session key.  The stated purpose of
Quick Mode is to provide fresh keying material.  Nonces are
used.  As I understand it, that means the IPsec session keys are
sufficiently ephemeral that each IPsec session has no more than
one IPsec session's worth of break-back exposure.  That is, the
attacker who steals an IPsec session key can read that session,
but does not (sfawk :-) gain any head-start toward reading
earlier IPsec sessions.
4) As far as I can tell, the only interesting question is whether
a break of the ISAKMP session is _inherited_ by the IPsec sessions
set up using that ISAKMP session.  The break of an IPsec session
will not spread at all.  The break of an ISAKMP session will not
spread beyond that ISAKMP session ... but what happens within that
ISAKMP session?  The answer, as I understand it, depends on the
setting of the misleadingly-named IPsec PFS option.  If the
option is set, there is an additional layer of opacity on a
per-IPsec-SA basis, so that a break of the ISAKMP session is not
inherited by its IPsec SAs.
Bottom line:
As I understand it, IPsec always has reasonably tight limit on
the amount of break-back exposure, but setting the so-called
PFS option reduces the exposure further ... roughly speaking,
by a factor of the number of IPsec SAs per ISAKMP SA.
Comments, anyone?
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Compression theory reference?

2004-09-06 Thread John Denker
Matt Crawford wrote:
Plus a string of log(N) bits telling you how many times to apply the
decompression function!
Uh-oh, now goes over the judge's head ...
Hadmut Danisch wrote:
The problem is that if you ask for a string of log(N) bits, then 
someone else could take this as a proof that this actually works, 
because a string of log(N) bits is obviously shorter than the 
message of N bits, thus the compression scheme is working. Hooray!
That misses the point of the construction that was the subject of
Matt's remark.  The point was (and remains) that the compressed
output (whether it be 1 bit, or 1+log(N) bits, or 1+log^*(N) bits)
is ridiculous because it is manifestly undecodeable.  It is far, far
too good to be true.
The only question is whether the construction is simple enough
for the judge to understand.
There is no question whether the construction is a valid _reductio
ad absurdum_.
  While we are on the subject, I recommend the clean and elegant
  argument submitted by Victor Duchovni (08/31/2004 03:50 PM) and
  also in more detail by Matt Crawford (08/31/2004 06:04 PM).  It
  uses mathematical induction rather than proof-by-contradiction.
  It is a less showy argument, but probably it is understandable
  by a wider audience.  It proves a less-spectacular point, but it
  is quite sufficient to show that the we-can-compress-anything
  claim is false.  (Although with either approach, at least *some*
  mathematical sophistication is required.  Neither PbC nor MI will
  give you any traction with ten-year-olds.)
  So it appears we have many different ways of approaching things:
   1) The pigeon-hole argument.  (This disproves the claim that all
N-bit strings are compressible ... even if the claim is restricted
to a particular fixed N.)
   2) The mathematical induction argument.  (Requires the claimed
algorithm to be non-expansive for a range of N.)
   3) The proof-by-contradiction.  (Requires the claimed algorithm
to be compressive -- not just non-expansive -- for a range of N.)
   4) Readily-demonstrable failure of *any* particular claimed example,
including Lempel-Ziv and all the rest.
   *) Others?
  Harumph.  That really ought to be enough.  Indeed *one* disproof
  should have been enough.
The problem is, that the number of iterations is not in the order of 
N, but in the order of 2^N, so it takes log2(around 2^N) = around N bits to
store the number of iterations.
I don't see why the number of iterations should be exponential in
the length (N) of the input string.  A compression function is
supposed to decrease N.  It is not supposed to decrement the
number represented by some N-bit numeral  after all, the string
might not represent a number at all.
Also I repeat that there exist special cases (e.g. inputs of
known fixed length) for which no extra bits need be represented,
as I explained previously.
The recursion convertes a message of 
N bit recursively into a message of 1 or zero bit length (to your
taste), *and* a number which takes around N bit to be stored. 
Nothing is won. But proof that. 
I disagree, for the reasons given above.
In the worst case, you need log^*(N) extra bits, not N bits.  In
special cases, you don't need any extra bits at all.  The win
is very substantial.  The win is extreme.
This recursion game is far more complicated than it appears to be. 
Maybe.  But there's no need to make it more complicated than it
really is.
Note also that storing a number takes in reality more than log(N)
bits. Why? Because you don't know N in advance. We don't have any
limit for the message length. 
For general N, that's true.
 So you'r counting register needs
theoretically inifinte many bits. 
Maybe.  For many practical purposes, the required number of bits
is considerably less than infinity.
 When you're finished you know
how many bits your number took. But storing your number needs an 
end symbol or a tristate-bit (0,1,void). That's a common mistake. 
We agree that there are many common mistakes.  We agree that
it is a mistake to have undelimited strings.  But it is also a
mistake to think that you need to reserve a special symbol to
mark the end of the string.  Yes, that is one option, but from a
data-compression point of view it is an inefficient option.
Anybody who is interested in this stuff reeeally ought to read
Chaitin's work.  He makes a big fuss about the existence of
self-delimiting strings and self-delimiting programs.  There
are many examples of such:
 -- The codewords of many commonly-used compression algorithms
  are self-delimiting.  This is related to the property of being
  instantaneously decodable.
 -- As Chaitin points out, you can set up a dialect of Lisp such
  that Lisp programs are self-delimiting.
 -- If you want to be able to represent M, where M is *any* N-bit
  number, you need more than log(M) bits (i.e. more than N bits).
  That's because you need to specify how many bits are used to
  represent log(M), which adds another log(log(M)) bits.  

Re: Compression theory reference?

2004-09-01 Thread John Denker
I wrote:
 4) Don't forget the _recursion_ argument.  Take their favorite
algorithm (call it XX).  If their claims are correct, XX should
be able to compress _anything_.   That is, the output of XX
should _always_ be at least one bit shorter than the input.
Then the compound operation XX(XX(...)) should produce something
two bits shorter than the original input.  If you start with a
N-bit message and apply the XX function N-1 times, you should be
able to compress each and every message down to a single bit.
Matt Crawford wrote:
Plus a string of log(N) bits telling you how many times to apply the 
decompression function!
Uh-oh, now goes over the judge's head ...
Actually you don't need to adjoin log(N) bits.  But perhaps my
assertion would benefit from some clarification.
I emphasize that I am only discussing messages of length N,
where N is some pre-chosen number.  For concreteness, let's
choose N=10.
I repeat my assertion that _if_ XX can compress any string,
shortening it by one bit, and _if_ you know that the original
messages each have exactly 10 bits, _then_ any 10-bit message
can be compressed down to a single bit.
I have proved that XX is ridiculous in this one case.
My function YY := XX^9 is less general than XX.  XX works
on any input, whereas YY by its definition only applies to
10-bit messages.
The fact remains that we have a proof by contradiction.  We
assume by way of hypothesis that the bad-guys are right, namely
that XX exists and has the properties they assign to it.  Then
I can construct YY.  But YY is ridiculous, through no fault of
mine.  Ergo the bad guys are wrong, i.e. no such XX can exist.
With a little more work I could construct a more powerful and/or
more general version of YY ... but that would be doing more work
than is required.  Their XX stuck its neck out;  it is not
required for my YY to stick its neck out in the same way.  The
requirement, as I understood it, was to prove the bad guys
wrong.  Well, the bad guys have been proved wrong.  If something
more is required, please explain the requirements in more detail.
  (BTW I suppose it would be better to call this the 'iterated
  composition' argument rather than the recursion argument.)
Hadmut wrote:
I found some outside Germany. But they didn't give me a paper with 
signature, just e-mails. Will see whether the court will accept that. 
Ask your legal advisor.
In the US, getting such emails admitted as evidence would be
problematic.  Each jurisdiction (I assume) has its own standards
for how affidavits should be prepared.  Figure out the rules,
and play by the rules.
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: ?splints for broken hash functions

2004-09-01 Thread John Denker
I wrote
 the Bi are the input blocks:
 (IV) - B1 - B2 - B3 - ... Bk - H1
 (IV) - B2 - B3 - ... Bk - B1 - H2
then we combine H1 and H2 nonlinearly.
  (Note that I have since proposed a couple of improvements,
   but I don't think they are relevant to the present remarks.)

David Wagner wrote:
This does not add any strength against Joux's attack.  One can find
collisions for this in 80*2^80 time with Joux's attack.
First, generate 2^80 collisions for the top line.  Find B1,B1* that
produce a collision, i.e., C(IV,B1)=C(IV,B1*)=V2.  Then, find B2,B2*
that produce a collision, i.e., C(V2,B2)=C(V2,B2*)=V3.  Continue to
find B3,B3*, ..., Bk,Bk*.  Note that we can combine this in any way
we like (e.g., B1, B2*, B3*, B4, .., Bk) to get 2^80 different messages
that all produce the same output in the top line (same H1).
OK so far.  I think this assumes a sufficiently long input string,
but I'm willing to make that assumption.
Next, look at the bottom line.  For each of the 2^80 ways to combine
the above blocks, compute what output you get in the bottom line.
OK so far.
By the birthday paradox,
Whoa, lost me there.
I thought H1 was fixed, namely the ordinarly has of the original
message.  Two questions:
  1) If H1 is fixed, I don't understand how birthday arguments apply.
   Why will trying the bottom line 2^80 times find me H@ equal to a
   particular fixed H1?  There are a whole lot more that 2^80
  2) If H1 is not fixed, then this seems to be an unnecessarily
   difficult way of saying that a 160-bit hash can be broken using
   2^80 work.  But we knew that already.  We didn't need Joux or
   anybody else to tell us that.
   A proposal that uses 80*2^80 work is particularly confusing, if
   H1 is not fixed.
you will find some pair that produce a
collision in the bottom line (same H2).  But that pair also produces
a collision in the top line (since all pairs collide in the top line),
so you have a collision for the whole hash (same H1,H2).
Still lost, for the same reasons.  Please explain.  Also if possible
please address the improved version, with different IVs and long-range
permutation of a partial block.
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: ?splints for broken hash functions

2004-08-31 Thread John Denker
Jerry Leichter writes:
 However ... *any* on-line algorithm falls to a Joux-style attack.
Hal Finney wrote:
... hashes that support incremental hashing, as any useful hash surely must,
If you insist on being able to hash exceedingly long strings
(i.e. longer than your storage capacity) here is a modification
of the splint I proposed earlier.
Run two hash-machines in parallel.  The first one operates
normally.  The second one puts the first block of the string
into a buffer (bounded size!) and then proceeds to hash the
rest of the string, starting with the second block.  At the
end of the message, it appends the saved block to the tail of
the message and hashes it.
The two results are combined in some nonlinear way, perhaps
by appending the other machine's hash onto this machine's
input string.
The strength here comes from the idea the that you cannot
engineer a collision in the saved block in the second
machine until you have engineered the collision in the
first machine, and then if you change the saved block to
engineer a collision in the second machine you have to
redo everything you did in the first machine.
Of course it would be nice to have hash functions that were
strong on a block-by-block basis ... but still I think there
is value in destroying the prefix property without destroying
the online property, i.e. not destroying the ability to hash
strings that are too long to store.
  Small point: I'm not entirely convinced that *any* useful
  hash must have the online property.  That's not a defining
  property of hash functions according to most authorities.
  I've done lots of hashing in situations where I would have
  been happy to store the entire input-string.
  Still we can agree that the online property is *sometimes*
  nice, and I'm happy to be able to provide it.
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: ?splints for broken hash functions

2004-08-31 Thread John Denker
I agree with 99% of what Hal Finney wrote.  I won't repeat
the parts we agree on.
But let's discuss these parts:
how much harder?
Well, probably a lot.  Finding a regular B2 collision in a perfect
160 bit hash compression function takes 2^80 work.  Finding a double
collision like this is essentially the same as finding a collision on
a double-width hash function and takes 2^160 work, enormously more.
I'd say the answer is at most 2^160, possibly 2^160, but not provably
2^160.  Every amateur cryptographer who combines two methods expects
to get a multiplicative increase in strength ... but cryptanalysts
earn their living by finding ways to kill multiple birds with one
Of course we don't know for sure that there is no short-cut attack
that exploits the weakness of the compression function to allow double
collisions to be attacked, 
how about this simpler construction?
  (IV1) - B1 - B2 - B3 - ... Bk - H1
  (IV2) - B1 - B2 - B3 - ... Bk - H2
and again combine H1 and H2.  Here we run a hash function twice in
parallel but with two different IV values.  Superficially it looks weaker
than John Denker's construction, because the blocks line up nicely, but
as I have rearranged John's construction above they may not really be that
different.  Is there an attack against this version that John's resists?
I don't think that last question is the right question.  Journeyman
cryptographers check that their work is resistant to *known* attacks.
Master cryptographers design stuff that is resistant to attacks that
are not known, and not really even conceived of.
I don't claim to be much of a cryptographer, so I suppose I'd better
answer the question :-).
*Ideally* just changing the IV should be sufficient.  But *ideally*
the basic hash function wouldn't be broken and wouldn't need splinting.
I strongly prefer my long-range permutation trick, for the simple(?)
reason that I just don't like the prefix property.  It's a gut reaction.
My intuition is screaming at me:  bad design, bad design, asking for
trouble, asking for trouble.
Here is a barely-conceivable attack, offered as an a-posteriori
parable in support of the just-mentioned intuition.  Suppose we
have a long input string.  And suppose there are some initialization
vectors that are somehow weak.  Further suppose somebody jiggers
the first block to make a weak IV for the second block, then jiggers
the second block to make an even weaker IV for the third block, and
so on.  Finally the IV for the last block is so weak that any desired
final result can be achieved.  (I'm not saying that there's the
slightest reason to believe such an attack is possible ... remember
this is just a parable.)  The point of the story is that maybe long
messages are more vulnerable to attack than short messages.  They
offer a bigger target.  [End of parable.]
In any case, I reeeally don't like the prefix property.  And the
cost of eliminating it is incredibly small ... so why not?  Just
apply the long-range permutation, i.e. cut a few bytes off the
front of the second string and paste them onto the end
So I still prefer the scheme in my previous email.  In its
simplest version, the diagram is:
   IV1 (h) B1  (h) B2  (h) B3  (h) ... Bk  - H1
   IV2 (h) B1' (h) B2' (h) B3' (h) ... Bk' (h) H1 - H2 = final
where processing proceeds left-to-right, both rows can be
computed in parallel, (h) denotes the elementary hash-a-block
operation, and the primes on the second stream indicate that
it has been subjected to the long-range permutation operation.
For that matter, it wouldn't hurt my feelings if you applied
long-range permutations to *both* strings.  (Not the same
permutation, of course.)  Did I mention that I really don't
like the prefix property?
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Compression theory reference?

2004-08-31 Thread John Denker
Hadmut Danisch wrote:
It can be easily shown that there is no lossless 
compression method which can effectively compress every possible
... I need a book about computer science or encoding theory,
which explicitely says that this is impossible, in a way that a person
unexperienced in computer science (judges at a court) can read and 
at least see that this is not just my personal phantasy and at least
somehow reasonable. 
There are two conficting requirements there.
 -- authoritative and correct, and
 -- understandable to a judge with no technical expertise.
As you know, it is easy to satisfy either requirement separately.
My suggestions:
 1) Get a few expert witnesses to certify that your position is
certainly not a personal fantasy.  (I assume German jurisprudence
has something akin to the US notion of expert witnesses, right?)
 2) Try to get the burden-of-proof reversed.  The opposition
are claiming that known algorithms do the job.  Get them to be
specific about which algorithms they are referring to.  Then
file with the court some disks, each containing 1.44 megabytes
of data.  If the opposition are telling the truth, they should
be able to compress these using one of their chosen algorithms.
Offer to concede the entire case if they can shorten the text
by more than, say, 0.1 percent (i.e. 1.44 kilobytes shorter).
Of course you fill your disks using a good hardware random
number generator, e.g.
Be sure to arrange suitable precautions so that the opposition
doesn't get a chance to peek at your disks and build a
special-purpose rumplestiltskin encoder/decoder, i.e. one
that contains quotations of your data.  One precaution would
be to require that they use an open-source implementation, or
at least an impartial implementation, of a published algorithm.
 3) Diagram the pigeon-hole argument for the judge.  See
diagrams below.
 4) Don't forget the _recursion_ argument.  Take their favorite
algorithm (call it XX).  If their claims are correct, XX should
be able to compress _anything_.   That is, the output of XX
should _always_ be at least one bit shorter than the input.
Then the compound operation XX(XX(...)) should produce something
two bits shorter than the original input.  If you start with a
N-bit message and apply the XX function N-1 times, you should be
able to compress each and every message down to a single bit.

Here's how to diagram the pigeon-hole argument:
Write down all the three-bit numbers, and demonstrate a lossless
non-compressive encoding:
plaintext codetext
  000 -- 000
  001 -- 001
  010 -- 010
  011 ---\  /=== 011
  100 ===/  \--- 100
  101 -- 101
  110 -- 110
  111 -- 111
Then give an example of a lossy compressive code.  Make
the point that the codetext words are shorter than the
plaintext words.  However, there are necessarily fewer
of them, so the coding scheme is necessarily lossy and
plaintext codetext
  000 -- 00  zero
  001 -- 01  one
  010 -- 10  two
  011 ---=== 11  many
  100 /|
  101 /|
  110 /|
  111 /
Then give an example of a code that might or might not
be compressive, depending on statistical assumptions.
The following code is compressive if zero, one, and two
are considerably more probable than the other three-bit
numbers, but does is anti-compressive if all eight
three-bit numbers are equally likely, or nearly equally

  000 -- 00  zero
  001 -- 01  one
  010 -- 10  two
  011 -- 11000
  100 -- 11001
  101 -- 11010
  110 -- 11011
  111 -- 11100
Average length, for equiprobable plaintext words:
   (2+2+2+5+5+5+5+5) / 8 = 3.875
which is an expansion, not a compression.
Judges like fairness.  Certainly an equiprobable distribution
of input words is fair.  It reflects the bit-strings that
would be generated by tossing fair coins (three at a time),
or tossing a fair eight-sided die.  This fairest of distributions
is incompressible in the usual sense.

Finally, offer a challenge.  Write down all eight three-bit
plaintext words, and all four two-bit codetext words.  Ask
the judge to conclude for himself that it is obviously
impossible to draw lines connecting corresponding words in
a one-to-one fashion.
  000   00
  001   01
  010   10
  011   11

Note that in the last example, the codetext words were only one bit
shorter than the plaintext words.  If you want to pound on the point,
present the samething with four-bit plaintext and two-bit 

Re: First quantum crypto bank transfer

2004-08-24 Thread John Denker
Jerrold Leichter wrote:
... the comments I've seen on this list and elsewhere have been much 
broader, and amount to QM secure bit distribution is dumb, it solves
no problem we haven't already solved better with classical 
Most of the comments on this list are more nuanced than that.
Examples of sensible comments include:
 -- We have seen claims that QM solves the key distribution
  problem.  These claims are false.
 -- _Commercialization_ of QM bit-exchange is dumb, for now
  and for the forseeable future.  I am reminded of a slide
  Whit Diffie showed (in a different context) of an attempt
  to build a picket fence consisting of a single narrow pale
  a mile high ... while the rest of the perimeter remains
  undefended.  That's a dumb allocation of resources.  The
  opposition aren't going to attack the mega-pale;  they are
  going to go around it.  QM doesn't solve the whole problem.
  Sensible research should not be directed toward making the
  tall pale taller;  instead it should be directed toward
  filling in the gaps in the fence.
 Even if some snake-oil salesmen have attached themselves
 to the field doesn't say research in the field is worthless.
Be that as it may, there are other grounds for judging the
commercialization projects to be near-worthless.
Also, there is a world of difference between:
1.  Showing something is possible in principle;
2.  Making it work on the lab bench;
3.  Making it into something that works in the real world.
For QM key exchange, step 1 goes back maybe 10-15 years, and most
people thought it was a curiosity - that you could never maintain
coherence except in free space and over short distances.
That's backwards.  Quantum crypto free in space is hard.  It's
much easier to use a single-mode fiber, over distances such
that there is little total attenuation (which can be a quite
macroscopic distance, since the attenuation is a fraction of
a db/km if you do it right).
Step 2 is a couple of years back, the first surprise being that you
could actually make things work through fiber, then through a couple
of Km of fiber coiled on a bench.
Again, that diametrically misstates the physics.  Propagation
through a couple km of fiber shouldn't have surprised anybody.
BTW, if we look at QM *computation* in comparison, we've barely made
it through Step 1.  There are still plausible arguments that you
can't maintain coherence long enough to solve any interesting
Within a year of the invention of quantum computation,
people were working on quantum error correction.  This
is interesting work and has had spin-offs in the form
of changing how people think about error correction even
in non-quantum systems.  And it has had spin-offs
applicable to quantum cryptography, i.e. showing how it
is possible to survive a modest amount of attenuation.
Some of the papers I've seen solve the problem only in their titles:
They use a QM system, but they seem to only make classical bits
available for general use.   
Huh?  The world abounds in QM systems that produce classical
results, including e.g. transistors, lasers, practically all of
chemistry, etc. etc. etc.  Quantum computers produce classical
results because that is what is desired.
The contrast between this work and QM
key exchange is striking. 
If the intent is to make quantum cryptography sound better
than quantum computation, the point is implausible and
If the intent it so make the best results in quantum crypto
sound better than the lamest parts of quantum computation,
then the comparision is (a) unfair and (b) hardly a ringing
endorsement of quantum crypto.
after all, transistors were invented to build phone lines, not
It's not true that transistors were invented solely for
application to phone lines.  Even if it were true, it would
be irrelevant for mulitple reasons.  For starters, keep
in mind that the big computers built during the 1940s
were built using vast amounts of telecom switch gear.
Bletchley Park relied on engineers from the Post Office
(which was the 'phone company' in those days).
And even if the facts had been otherwise, arguments about
the near-term applicability of one technology are largely
irrelevant to the near-term applicability of another
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

The Call Is Cheap. The Wiretap Is Extra

2004-08-24 Thread John Denker
1) Here's an article from the New York Times.
The headline just about says it all.  Reportedly
THEY want voice-over-internet users to pay for
the privilege of having their calls tapped.
 The Call Is Cheap. The Wiretap Is Extra.
(I cite the version online at The Ledger because
folks can read it there without registering, unlike
the site.)
2) A modest proposal:
I think we should set up the following system:
  a) Users certify to their ISP that they use end-to-end
strong crypto on all their voice-over-internet calls, using
tamper-resistant (or at least tamper-evident) hardware and
  b) The ISP demands such certification from all users.
  c) The ISP declines to install wiretap equipment, and
passes the savings on to the users.
  ... Who could possibly object?
Note that traffic-analysis is still possible, but the
equipment to do that is much cheaper.
Also note that if THEY are going to bugger my endpoints
to defeat the crypto, they might as well do the job right
and change the signalling so that the call goes directly
to THEIR premises ... That way the ISP doesn't need to get
involved, i.e. the ISP has no tap-related costs.
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Al Qaeda crypto reportedly fails the test

2004-08-02 Thread John Denker
News article
says in part:
The BBC's Zaffar Abbas, in Islamabad, says it appears that US
investigators were able to unscramble information on the computers
after Pakistan passed on suspicious encrypted documents.
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Using crypto against Phishing, Spoofing and Spamming...

2004-07-18 Thread John Denker
Enzo Michelangeli wrote:
Can someone explain me how the phishermen escape identification and
prosecution? Gaining online access to someone's account allows, at
most, to execute wire transfers to other bank accounts: but in these
days anonymous accounts are not exactly easy to get in any country,
and anyway any bank large enough to be part of the SWIFT network
would cooperate in the resolution of obviously criminal cases.
Good question.
Actually there are two questions we should consider:
 a) What are the procedures phishermen are using today,
procedures that they manifestly *can* get away with?
 b) Why why why are they allowed to get away with such
Here is something of an answer to question (a):
The details are a bit sketchy, and maybe not entirely to
be trusted since they come from self-described crooks,
but they are plausible.
Still question (b) remains.  The described procedures seem
to be the e-commerce analog of parking your car in a bad
neighborhood with the windows rolled down and the keys in
the ignition.  That is, I expect that most people on this
list could easily think of several things the card-issuers
could do that would shut down these attack-procedures,
significantly raising the phishermen's work-factor and risk
of arrest -- without significantly burdening legitimate
merchands or cardholders.
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Humorous anti-SSL PR

2004-07-15 Thread John Denker
J Harper [EMAIL PROTECTED] wrote:

This barely deserves mention, but is worth it for the humor:
Information Security Expert says SSL (Secure Socket Layer) is Nothing More
Than a Condom that Just Protects the Pipe
To which Eric Rescorla replied:
What's wrong with a condom that protects the pipe? I've used
condoms many times and they seemed to do quite a good job
of protecting my pipe.
The humor just keeps on coming.  It's always amusing to
see an invocation of the principle that I've tried it
on several occasions and it seemed to work, therefore
it must be trustworthy.
What's wrong with this depends, as usual, on the threat
model.  Sometimes it is wise to consider other parts
of the system (not just the pipe) in the threat model.
If we set you up on a blind date with an underfed grizzly,
you might find that protecting your pipe with a condom
doesn't solve all your problems.
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Hyperencryption by virtual satellite

2004-07-11 Thread John Denker
Way back on 05/20/2004 Ivan Krstic wrote:
Michael O. Rabin  lectures on hyper-encryption and provably
everlasting secrets.
View here:
To my surprise, there has been no follow-up discussion
on this list.
  (Hint:  Most people on this list will want to skip
  the first four parts of the presentation, which are
  just an introduction to cryptography, and start with
  the part entitled hyper-encryption and semantic
We should not let Rabin's assertions to go unchallenged.
Here is a brief review:
In this presentation, there are some remarkable claims,
including claims of a method that produces provably
everlasting secrets.  Alas no such proof is provided;
the alleged proof is full of holes.
By way of example, a particularly glaring hole surrounds
the notion that a PSN (page server node) should serve
a given page only twice.  That causes all sorts of
difficulties to the legitimate system users (see the
discussion of page reconciliation) but poses no
particular burden on the attackers, who can passively
copy one of those two issuances.
As another example, the proof of correctness uses
randomness in invalid ways.  It asserts that the
system users will randomly select a subset of the
available pages of random numbers (which is OK as
far as it goes) but alas it further assumes that the
attackers will only be able to select pages by an
_independent_ random process.  Perhaps this comes
from an assumption that attacks will be directed
against the servers.  However, attacks can perfectly
well be directed against the system users' PCs.  The
attacker therefore only needs communication bandwidth
and storage capabilities comparable to the system
users who are under attack (!!not!! all users total).
As far as I can tell, the whole topic of hyper-encryption
would be more usefully discussed under the heading of
_privacy amplification_.  I get 2000 hits from
It remains to be seen whether these authors will make
any useful contributions in this area.  To do so, they
will need to come out with some more modest claims
and/or some stronger proofs.
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: authentication and authorization (was: Question on the state of the security industry)

2004-07-01 Thread John Denker
Ian Grigg wrote:
The phishing thing has now reached the mainstream,
epidemic proportions that were feared and predicted
in this list over the last year or two. 
  For the first
time we are facing a real, difficult security
problem.  And the security experts have shot
their wad.
The object of phishing is to perpetrate so-called identity
theft, so I must begin by objecting to that concept on two
different grounds.
1) For starters, identity theft is a misnomer.  My identity
is my identity, and cannot be stolen.  The current epidemic
involves something else, namely theft of an authenticator ...
or, rather, breakage of a lame attempt at an authentication
and/or authorization scheme.  See definitions and discusions
in e.g. _Handbook of Applied Cryptography_
I don't know of any security experts who would think for a
moment that a reusable sixteen-digit number and nine-digit
number (i.e. credit-card and SSN) could constitute a sensible
authentication or authorization scheme.
2) Even more importantly, the whole focus on _identity_ is
pernicious.  For the vast majority of cases in which people
claim to want ID, the purpose would be better served by
something else, such as _authorization_.  For example,
when I walk into a seedy bar in a foreign country, they can
reasonably ask for proof that I am authorized to do so,
which in most cases boils down to proof of age.  They do
*not* need proof of my car-driving privileges, they do not
need my real name, they do not need my home address, and
they really, really, don't need some ID number that some
foolish bank might mistake for sufficient authorization to
withdraw large sums of money from my account.  They really,
really, reeeally don't need other information such as what
SCI clearances I hold, what third-country visas I hold, my
medical history, et cetera.  I could cite many additional
colorful examples, but you get the idea:  The more info is
linked to my ID (either by writing it on the ID card or
by linking databases via ID number) the _less_ secure
everything becomes.  Power-hungry governments and power-
hungry corporations desire such linkage, because it makes
me easier to exploit ... but any claim that such linkable
ID is needed for _security_ is diametrically untrue.
Returning to:
  For the first
 time we are facing a real, difficult security
 problem.  And the security experts have shot
 their wad.
I think a better description is that banks long ago
deployed a system that was laughably insecure.  (They got
away with it for years ... but that's irrelevant.)  Now
that there is widespread breakage, they act surprised, but
none of this should have come as a surprise to anybody,
expert or otherwise.
Now banks and their customers are paying the price.  As
soon as the price to the banks gets a little higher, they
will deploy a more-secure payment authorization scheme,
and the problem will go away.
(Note that I didn't say ID scheme.  I don't care who
knows my SSN and other ID numbers ... so long as they
cannot use them to steal stuff.  And as soon as there
is no value in knowing ID numbers, people will stop
phishing for them.)
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: TIA Offices Discovered

2004-06-10 Thread John Denker
R. A. Hettinga wrote:
From: Tefft, Bruce [EMAIL PROTECTED]
Where Big Brother Snoops on Americans 24/7

Although employees who work in the building are supposed to keep their
presence there a secret, they regularly sport their DARPA id badges around
their necks when eating at restaurants near the building. The straps
attached to the badges are printed with DARPA in large letters.
That's the main DARPA building.  For driving directions,
see the DARPA web site:
It's not very surprising that folks there wear DARPA badges.
Should we congratulate Hampton and Thompson on their discovery?
Or should we chip in and buy them each a new tinfoil hat?
Their full article may be found at:
I imagine parts of it are actually true.  But then again, a
stopped watch gives the correct time twice a day.
More-reliable accounts of TIA are readily available.  A
useful compendium is:
et cetera.
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

stego in the wild: bomb-making CDs

2003-12-28 Thread John Denker
] Thursday 25 December 2003, 17:13 Makka Time, 14:13 GMT
] Saudis swoop on DIY bomb guide
] Authorities in the kingdom have arrested five people after
] raiding computer shops selling compact disks containing
] hidden bomb-making instructions, a local newspaper reported
] on Thursday.
] Police were questioning four owners of computer shops in the
] southern Jazan region and a fifth person believed to have
] supplied the CDs to the shops, Al-Watan newspaper said.
] Officials were not immediately available for comment.
] The daily said some of the shop owners might not have known
] about the bomb-making tutorial files hidden on the CDs. Only
] someone with technical knowledge would be able to find the
] files.

That was quoted from:
and the same story, almost verbatim, was carried by Reuters.

 1) This is not entirely unprecedented.  Al Qaeda for years has
been hiding recruitment and training footage in the middle
of otherwise-innocuous video tape cassettes.  
 2) OTOH using a commercial distribution channel bespeaks a 
certain boldness ... somebody is thinking big.  
 3) Also: as a rule, anything you do with computers generates 
more headlines than doing the same thing with lower-tech methods.
This is significant to terrorists, who are always looking for
headlines.  Conversely it is significant to us, who have much
to lose when our not-so-fearless leaders over-react.
 4) One wonders how many CDs were distributed before the operation
was terminated.
 5) I wonder how the authorities found out about it.
 6) The article speaks of technical skill ... I wonder how 
much technical skill was required.  Probably not much.
 7) Did it rely entirely on security-by-obscurity, or was there 
crypto involved also?
(The latter is possible;  whatever leak told the authorities
where to look could also have told them the passphrase...
but the article didn't mention crypto.)

I suspect there is a lot more to this story..

The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]