Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-27 Thread David Malone
On Sat, Mar 25, 2006 at 07:26:51PM -0500, John Denker wrote:
 Executive summary:  Small samples do not always exhibit average behavior.

That's not the whole problem - you have to be looking at the right
average too.

For the long-run encodability of a set of IID symbols produced with
probability p_i, the relevant average is the Shannon entropy.  If
you're interested in the mean number of guesses (per symbol) required
to guess a long word formed from these symbols, then you should be
looking at (\sum_i \sqrt{p_i})^2.  Other metrics (min-entropy, work
factor, ...) require other averages.

To see this behaviour, you need both a large sample and the right
type of average for your problem (and I've assumed IID).
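To make the difference concrete, here is a small sketch (the scaled-down
distribution is mine, chosen so the sums are quick to compute):

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits: sum_i p_i * log2(1/p_i)."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

def guessing_measure(probs):
    """Malone's guessing average: (sum_i sqrt(p_i))^2."""
    return sum(math.sqrt(p) for p in probs) ** 2

# A scaled-down version of the thread's skewed distribution:
# one symbol with probability 1/2, 2^16 symbols with probability 2^-17 each.
probs = [0.5] + [2.0 ** -17] * (2 ** 16)

print(shannon_entropy(probs))              # 9.0 bits per symbol
print(math.log2(guessing_measure(probs)))  # ~15.0 bits
```

Both numbers are per-symbol averages, but they are different averages:
about 9 bits by Shannon's measure versus about 15 by the guessing measure.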

David.

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]


Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-26 Thread John Kelsey
From: John Denker [EMAIL PROTECTED]
Sent: Mar 24, 2006 11:57 AM
To: Erik Zenner [EMAIL PROTECTED], cryptography@metzdowd.com
Subject: Re: Entropy Definition (was Re: passphrases with more than 160 bits 
of entropy)

>Erik Zenner wrote:
>
>> 0 occurs with probability 1/2
>> each other number from 1 to 2^{160}+1 happens with
>> probability 2^{-161}.
>>
>> Is anyone aware of whether (and where) this was
>> discussed in the literature, or what other approaches are taken?
>
>This particular problem is contrived or at least exaggerated.

The example is a contrived, pathological case.  And there are a bunch
of easy solutions once you know this is the distribution.  The point
is to demonstrate that Shannon entropy gives you the wrong answer when
you try to use it here.

Now, you might ask if this is a problem in a real-world setting.  So
let's imagine a very simple distribution--sequences of flips of a
biased coin.  There are nice ways to remove the bias, but let's
imagine we're not doing that--instead, we're going to take a sequence
of coin flips, turn it into a 0/1 sequence, and then hash it down to
get a key.  

Suppose the coin is biased so that heads comes up 0.9 of the time, and
that we generate 16-bit sequences at a time.  The Shannon entropy of a
16-bit sequence is about 7.5, but the most likely symbol (all heads)
comes up about 18% of the time.  So if we tried to generate a 7-bit
key, we'd lose on the attacker's first guess 18% of the time. 
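The 16-bit figures above are easy to reproduce; a minimal sketch
(variable names are mine):

```python
import math

p = 0.9   # probability of heads
n = 16    # flips per sample

# Per-flip Shannon entropy of the biased coin, in bits.
h = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

shannon_bits = n * h                      # entropy of a 16-bit sequence
p_most_likely = p ** n                    # the all-heads sequence
min_entropy = -math.log2(p_most_likely)   # hard bound on guessing difficulty

print(round(shannon_bits, 2))    # 7.5
print(round(p_most_likely, 2))   # 0.19, i.e. ~18% first-guess success
print(round(min_entropy, 2))     # 2.43
```

The min-entropy of only ~2.4 bits, not the 7.5 bits of Shannon entropy,
is what bounds the attacker's first-guess success.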

So, let's go further with this.  We want to generate a DES key, with
56 bits of entropy.  A 120-bit sequence produced with this biased coin
has Shannon entropy of just over 56 bits, but an attacker has about a
1/300,000 chance of guessing the DES key we derive from it on his first
try, which is unacceptable for just about any crypto application.
(The min-entropy is only about 18 bits.)

So yes, I used a contrived example to demonstrate the problem, but no,
the problem isn't just an ivory-tower concern.

The intuition here is that Shannon entropy is concerned with
communications channels, where we assume we have to send every
symbol.  So when you have lots of low-probability symbols, and one of
those low-probability symbols is chosen 999 times out of a 1000, the
amount of bandwidth you need to transmit those symbols easily becomes
dominated by them.  Most of the time, you're sending one of a large
set of symbols, and they all have to be distinguished from one
another.  

The situation for the attacker is a lot more like having a channel
where it's acceptable to only send a small fraction of those symbols,
and just drop the rest.  That is, it's okay for the attacker's model
to just ignore the huge set of low-probability symbols that occur
999/1000 of the time, and instead just transmit the highest
probability symbol 1/1000 of the time.  Instead of transmitting it,
he's just guessing it.  When he gets it right, he learns your DES
key.  

>...
>This version serves to illustrate, in an exaggerated way, the necessity
>of not assuming that the raw data words are IID (independent and identically
>distributed).

Yes!  The independent part is much harder to deal with than the
per-symbol distribution, in many real-world applications.  The worst
of these are operating system events (used in Windows and various
Unixes to get random bits).  Peter Gutmann did some really nice work
on this on a bunch of operating systems in a Usenix paper, and he
updated that and has a version of it on his website.  If you're doing
OS-event sampling to generate random bits, you really ought to look at
his stuff.  (But if you can get some hardware support, either directly
or via sampling the microphone like the Turbid design does, you're
probably on much firmer ground.)  

--John Kelsey, NIST




Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-26 Thread John Denker

In the context of

 0 occurs with probability 1/2
 each other number from 1 to 2^{160}+1 happens with
 probability 2^{-161}.

I wrote:

 This ... serves to illustrate, in an exaggerated way, the necessity
 of not assuming that the raw data words are IID (independent and identically
 distributed).

Correction:  IID isn't the issue here.  The raw data words described
above are IID.  That's not the problem.  The problem is that the
distribution is highly skewed.

Executive summary:  Small samples do not always exhibit average behavior.

Let me explain.  There is a very simple way to look at this.

Consider the expression for the entropy of the source,
   S := Sum_i P_i log(1/P_i)  [1]

where i runs over all symbols in the alphabet.

One may, with a little care, attribute log(1/P_i) bits of unpredictability
to the (i)th symbol in the alphabet.  Then S can be interpreted as the
appropriately weighted _average_ unpredictability per symbol.

In the example quoted above, the minimum log(1/P_i) is vastly smaller than
the average log(1/P_i) -- namely 1 versus 161.  So focussing on the average
is unlikely to solve all the world's problems.

In crypto applications (including RNG construction), a crude yet provably
correct approach is to rely on the minimum (not the average) per-symbol
unpredictability.  Using this approach, it would require 80 samples of the
given distribution to produce an output with 80 bits of entropy.
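That accounting is simple enough to sketch (the arithmetic only; this is
not an extractor):

```python
import math

# In the distribution above, the most probable symbol ("0") has p = 1/2,
# so the guaranteed (minimum) unpredictability per sample is:
min_entropy_per_sample = -math.log2(0.5)    # 1 bit

# To accumulate 80 bits of entropy under this worst-case accounting:
target_bits = 80
samples_needed = math.ceil(target_bits / min_entropy_per_sample)
print(samples_needed)   # 80
```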

Things can only get better from here:
 -- With full knowledge of the source statistics, one can distinguish the large
  log(1/P_i) words from the small log(1/P_i) words.  This would allow the number
  of required samples to be closer to the typical value (2) than to the
  worst-case value (80).  An example of this is the Huffman coding I discussed
  in my previous note.
 -- With even modest knowledge of the given source, one can (by histogramming
  if nothing else) estimate the probabilities well enough to yield worthwhile
  improvements in efficiency.
 -- If you want to produce a long sequence of output words, as you often do, a
  reasonable-sized buffer is all you really need to come fairly close to optimal
  efficiency, namely only about 1 sample of the distribution, on average, per
  80-bit output word.  The chance of getting fooled can be made verrry small,
  at a modest cost in terms of buffer-size and extra samples of the
  distribution.  That is, the law of large numbers comes to your rescue sooner
  or later, usually rather soon.
 -- It may be possible to engineer a different source with larger minimum
  log(1/P_i).

Bottom line:  Setting up a highly-skewed distribution and then drawing from
it only a single sample is not guaranteed to produce average behavior.  Duh,
no surprise there!

The source entropy S in equation [1] is a single number that characterizes the
average behavior.  For a small sample from a highly-skewed distribution, S
doesn't tell you everything you need to know.  This has no bearing on the
definition of entropy;  entropy is defined by equation [1].  It just means that
equation [1] by itself doesn't solve all the world's problems.




RE: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-25 Thread John Kelsey
From: Erik Zenner [EMAIL PROTECTED]
Sent: Mar 24, 2006 4:14 AM
To: cryptography@metzdowd.com
Subject: RE: Entropy Definition (was Re: passphrases with more than 160 bits 
of entropy)

...
> [I wrote:]
>> 0 occurs with probability 1/2
>> each other number from 1 to 2^{160}+1 happens with
>> probability 2^{-161}.
>>
>> The Shannon entropy on this distribution is 81.5 bits.  But
>> if you tried to sample it once to generate an 80-bit Skipjack
>> key, half the time, I'd guess your key on my first try.
>
>It's entirely correct that entropy is the wrong measure here, but
>the question is what a good measure would look like.

There are a bunch of different entropy measures.  Shannon entropy is
the right one for compression ratios and channel capacity, but not for
difficulty of guessing the string.

>Assume that you have a sample space with N elements and an intelligent
>attacker (i.e., one that tries the most probable elements first). Then
>what you actually are interested in is that the attacker's probability
>of success after q sampling attempts is as close as possible to the
>lowest possible, namely q/N. A natural way of measuring this
>seems to be some kind of distance between Pr[succ after q samples] and
>the ideal function q/N. Such a measure might allow a designer
>to decide whether a non-perfect distribution is still acceptable or
>simply far out. Is anyone aware of whether (and where) this was
>discussed in the literature, or what other approaches are taken?

The best discussion I've seen of this is in a couple papers by John
Pliam, about something he calls the work function, W(n).  This is just
the probability of having guessed some secret value after n operations
(typically n guesses). You can generalize this to the number of
operations you have to carry out to have a given probability of
violating any security property (repeating a nonce, getting a block
collision in a block cipher chaining mode, etc).  It's a very general
way of thinking about limiting parameters of crypto algorithms.
You're basically heading toward this idea in what you said above.

Let's think of this for an ideal case: a 128-bit key.  When you have
done 2^k guesses of the key, you have a 2^{k-128} chance of having
guessed the key correctly.  So if you graphed the work vs probability
of success on a log/log chart, you'd get a straight line for the ideal
case: W(n) = 2^{-128} n, as you said above.

Now, consider the case where you are generating a 128-bit key from a
bunch of sampled events on your computer that have been concatenated
together in a string S.  Imagine making a list of all the possible
values of that string, and sorting them in descending order of
probability.  You now have the best possible sequence of guesses.
W(n) is the sum of the first n probabilities.  

If W(n) > 2^{-128} n for any n, then the attacker has some point where
it is to his advantage to guess S instead of guessing your key.

So, this is where min-entropy comes in.  Min-entropy is just
-lg(P[max]), the base 2 log of the maximum probability in the
distribution.  You can convince yourself that if the min-entropy of S
is at least 128, then it's never easier to guess S than to guess K.
This is because each possible input string must have probability at
most 2^{-128}, so the sum of the first n satisfies W(n) <= n 2^{-128}.
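Pliam's work function and the min-entropy bound are easy to see on a toy
distribution (the six probabilities are my own example, not from Pliam):

```python
# Work function W(n): attacker's success probability after n best-first guesses.
probs = sorted([0.4, 0.2, 0.2, 0.1, 0.05, 0.05], reverse=True)

W = []
running = 0.0
for p in probs:
    running += p
    W.append(running)

# Min-entropy bound: W(n) can never exceed n * p_max.
p_max = probs[0]
for n, w in enumerate(W, start=1):
    assert w <= n * p_max + 1e-12

print([round(w, 2) for w in W])   # [0.4, 0.6, 0.8, 0.9, 0.95, 1.0]
```

Here the min-entropy is -lg(0.4) ~ 1.3 bits, and the first-guess success
of 0.4 is far above the uniform line 1/6, just as in the skewed
distribution discussed above.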

This doesn't solve the much harder engineering/data analysis problem
of getting some reasonable approximation for the probabilities of S,
unfortunately.  The easy way to solve that is to design a hardware RNG
in such a way that you pretty-much know the probability model to use
and can work out how to sample it to get a good estimate of the
probabilities.  However, it is kind of nice to recognize that you only
have to estimate the largest probability to compute the min-entropy.
(It ought to be the easiest probability to measure and bound!)  

Erik

--
Dr. Erik Zenner   Phone:  +45 39 17 96 06Cryptico A/S
Chief Cryptographer   Mobile: +45 60 77 95 41Fruebjergvej 3
[EMAIL PROTECTED]   www.cryptico.com   DK 2100 Copenhagen

--John Kelsey, NIST




Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-24 Thread John Denker

Hal Finney wrote:
...

>This is true, in fact it is sometimes called the universal distribution
>or universal measure.  In more detail, it is a distribution over all
>finite-length strings.  The measure for a particular string X is defined
>as the sum over all programs that output X of 1/2^L_i, where L_i is the
>length of each such program.


There is no such thing as the universal measure;  rather
there are lots of universal measures.  Universality is a
rather arcane property of the measure; the term doesn't mean
what most people think it means.
  -- Universal does NOT mean all-purpose.
  -- Universal does NOT mean general-purpose.
  -- There are many inequivalent universal distributions, most
   of which are not what you want for any given application.
  -- Certain things that are true asymptotically are not true
   for _any_ practical application.
  -- Ratio converging to 1 does not imply difference converging to 0.

This is probably not the right forum for cleaning out these Augean stables.

...

>The universal measure for a string can also be thought of as the
>probability that, given a fixed Universal Turing Machine (UTM), a randomly
>chosen input program will output that string.
>
>So this is clearly a probability distribution (with some technicalities
>regarding issues of program lengths being glossed over here) as John
>Denker says.  However to go from this to a notion of entropy is more
>questionable.


Not really questionable.  If you have a probability, you have an
entropy.


>Shannon entropy is defined over a probability distribution.  That is,
>given a probability distribution we can find its Shannon entropy by
>summing -p_i log2(p_i) over all members of the distribution.  If we
>approximate the universal distribution by 1/2^n for strings of length
>n, this sum is clearly divergent.  So there is not really a meaningful
>Shannon entropy for this probability distribution, or in other words
>the entropy is infinite.


The entropy _per string_ is unbounded, as it jolly well should be for
random strings of unbounded length.  That's true but uninteresting.
A more interesting quantity is the _entropy density_ aka the entropy
_per symbol_ which for typical long random bit-strings is one bit of
entropy per bit of string-length.  Similarly if you have a string of
symbols over a 32-symbol alphabet, you would expect to see five bits
of entropy per symbol for a typical long random string.




RE: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-24 Thread Erik Zenner

> Shannon entropy is the one most people know, but it's all
> wrong for deciding how many samples you need to derive a key.
> The kind of classic illustration of this is the probability
> distribution:
>
> 0 occurs with probability 1/2
> each other number from 1 to 2^{160}+1 happens with
> probability 2^{-161}.
>
> The Shannon entropy on this distribution is 81.5 bits.  But
> if you tried to sample it once to generate an 80-bit Skipjack
> key, half the time, I'd guess your key on my first try.

It's entirely correct that entropy is the wrong measure here, but
the question is what a good measure would look like.

Assume that you have a sample space with N elements and an intelligent
attacker (i.e., one that tries the most probable elements first). Then
what you actually are interested in is that the attacker's probability
of success after q sampling attempts is as close as possible to the
lowest possible, namely q/N. A natural way of measuring this
seems to be some kind of distance between Pr[succ after q samples] and
the ideal function q/N. Such a measure might allow a designer
to decide whether a non-perfect distribution is still acceptable or
simply far out. Is anyone aware of whether (and where) this was
discussed in the literature, or what other approaches are taken?

Erik

--
Dr. Erik Zenner   Phone:  +45 39 17 96 06Cryptico A/S
Chief Cryptographer   Mobile: +45 60 77 95 41Fruebjergvej 3
[EMAIL PROTECTED]   www.cryptico.com   DK 2100 Copenhagen





Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-24 Thread Ed Gerck

Someone mentioned Physics in this discussion, and this motivated me
to point out something that Shannon, Kolmogorov, and Chaitin overlooked,
and that has been overlooked in this thread as well.

Even though Shannon's data entropy formula looks like an
absolute measure (there is no reference included), the often
confusing fact is that it does depend on a reference. The
reference is the probability model that you assume to fit
the data ensemble. You can have the same data ensemble and
many different (infinite) probability models that fit that
data ensemble, each one giving you a valid but different
entropy value. For example, if a source sends the number 1
1,000 times in a row, what would be the source's entropy?

Aram's assertion that the sequence of bytes from 1-256 has
maximum entropy would be right if that sequence came as one of
the possible outcomes of a neutron counter with a 256-byte
register. Someone's assertion that any data has entropy X
can be countered by finding a different probability model that
also fits the data, even if the entropy is higher (!). In short,
a data entropy value involves an arbitrary constant.

The situation, which seems confusing, improves when we realize that
only differences in data entropy can actually be measured, so that the
arbitrary constant cancels -- if we are careful.

In practice, because data security studies usually (and often
wrongly!) suppose a closed system, only different states of a single
system are ever considered, so to speak automatically.  Under such
circumstances, the probability model is well-defined and the arbitrary
constant *always* cancels.  However, data systems are not really
closed, and probability models are not always ergodic or even
accurate.  Therefore, due care must be exercised when using data
entropy.

I don't want to go into too much detail here (the results will be
available elsewhere), but it is useful to take a brief look into
Physics.

In Physics, Thermodynamics, entropy is a potential [1].
As is usual for a potential, only *differences* in entropy
between different states can be measured. Since the entropy
is a potential, it is associated with a *state*, not with
a process. That is, it is possible to determine the entropy
difference regardless of the actual process which the system
may have performed, even whether the process was reversible or
not.

These are quite general properties. What I'm suggesting is
that the idea that entropy depends on a reference also applies
to data entropy, not just the entropy of a fluid, and it solves
the apparent contradictions (often somewhat acid) found in data
entropy discussions. It also explains why data entropy seems
confusing and contradictory to use. It may actually be a much
more powerful tool for data security than currently used.

Cheers,
Ed Gerck

[1] For example, J. Kestin, A Course in Thermodynamics, Blaisdell,
1966.





Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-24 Thread John Denker

Ed Gerck wrote:


>In Physics, Thermodynamics, entropy is a potential [1].


That's true in classical (19th-century) thermodynamics, but not
true in modern physics, including statistical mechanics.  The
existence of superconductors and superfluids removes all doubt
about the absolute zero of entropy.  For details, see
  http://www.av8n.com/physics/thermo-laws.htm#sec-spectator
  http://www.av8n.com/physics/thermo-laws.htm#sec-secret-s


>As is usual for a potential, only *differences* in entropy
>between different states can be measured.


Not true.


>These are quite general properties.


They are neither general nor relevant to crypto.



Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-23 Thread John Denker

Aram Perez wrote:

> * How do you measure entropy? I was under the (false) impression that
> Shannon gave a formula that measured the entropy of a message (or
> information stream).


Entropy is defined in terms of probability.  It is a measure of
how much you don't know about the situation.  If by message
you mean a particular message that you are looking at, it has
zero entropy.  It is what it is;  no probability required.

If by stream you mean a somewhat abstract notion of an ensemble
of messages or symbols that you can only imperfectly predict, then
it has some nonzero entropy.

> * Can you measure the entropy of a random oracle? Or is that what both
> Victor and Perry are saying is intractable?


That is a tricky question for a couple of reasons.
 a) It will never be completely answerable because the question hinges
  on the word random, which means different things to different people.
  Thoughtful experts use the word in multiple inconsistent ways.
 b) It also depends on just what you mean by measure.  Often it
  is possible to _ascertain_ the entropy of a source ... but direct
  empirical measurements of the output are usually not the best way.


> * Are there units of entropy?


Yes.  It is naturally _dimensionless_, but dimensionless quantities
often have nontrivial units.  Commonly used units for entropy
include
 ++ bits
 ++ joules per kelvin.  One J/K equals 1.04×10^23 bits
For more on this, see
  http://www.av8n.com/physics/thermo-laws.htm#sec-s-units


> * What is the relationship between randomness and entropy?


They are neither exactly the same nor completely unrelated.
A pseudorandom sequence may be random enough for many
applications, yet has asymptotically zero entropy density.
A sequence of _odd_ bytes is obviously not entirely random,
yet may have considerable entropy density.

> * (Apologies to the original poster) When the original poster requested
> passphrases with more than 160 bits of entropy, what was he requesting?


When you apply a mathematical function to an ensemble of inputs, it
is common to find that the ensemble of outputs has less entropy than
the ensemble of inputs. A simple pigeonhole argument suffices to show
that a function whose output is represented in 160 bits cannot possibly
represent more than 160 bits of entropy per output.  So if you want
the ensemble of outputs to have more than 160 bits of entropy, it is
necessary to do something fancier than a single instance of SHA-1.

> * Does processing an 8 character password with a process similar to
> PKCS#5 increase the entropy of the password?


No.


> * Can you add or increase entropy?


Shuffling a deck of cards increases the entropy of the deck.

For more on this, see
  http://www.av8n.com/physics/thermo-laws.htm#sec-entropy



Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-23 Thread Jack Lloyd
On Wed, Mar 22, 2006 at 03:29:07PM -0800, Aram Perez wrote:

 * How do you measure entropy? I was under the (false) impression that  
 Shannon gave a formula that measured the entropy of a message (or  
 information stream).

He did give a formula for the entropy of a source; however, the calculation is
based on the probabilities of each symbol appearing. Unless you know those, you
can't actually apply the formula. So it is computable in theory, just not in
practice for any source that is at all interesting.
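If you do know (or are willing to estimate) the per-symbol frequencies, the
calculation itself is short; a sketch that measures only the empirical
distribution of a sample, not the true source:

```python
import math
from collections import Counter

def empirical_entropy(symbols):
    """Per-symbol Shannon entropy of the *observed* frequencies, in bits.
    Note: this estimates the empirical distribution only; the true source
    probabilities are usually unknowable, as the text says."""
    counts = Counter(symbols)
    n = len(symbols)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

print(empirical_entropy("aabb"))   # 1.0 bit per symbol
print(empirical_entropy("aaaa"))   # 0.0: no uncertainty at all
```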

 * Can you measure the entropy of a random oracle? Or is that what  
 both Victor and Perry are saying is intractable?

A random oracle, by definition, produces a completely random output. However,
since random oracles don't actually exist, that does not seem to be a terribly
interesting thing to be measuring the entropy of.

 * Are there units of entropy?

Bits are usually the most intuitive/useful unit.

 * What is the relationship between randomness and entropy?

I have a vague feeling this question requires a deeper answer than I'm able to
provide.

 * Does processing an 8 character password with a process similar to  
 PKCS#5 increase the entropy of the password?

No, because there are no additional secrets. Knowledge of the password is all
you need to rederive the final output, thus clearly there is no additional
information (ie, entropy) in the output that was not in the original password.

This ignores the salt, iteration count, and the specification of the algorithm
itself, but those are all (usually) public. So while they contribute to the
entropy, they do not contribute to the conditional entropy, which is what we
are usually interested in when thinking about entropy and crypto.

 * Can you add or increase entropy?

Yes. Let's say the contents of tomorrow's NY Times has n bits of entropy (we
probably can't actually calculate n, but we know it is a finite and fixed
value). And the LA Times from the same day will also have some amount of
entropy (call it n'). However, the entropy of the two papers combined would
(probably) not be n+n', but some number x such that min(n,n') <= x <= n+n',
because the two papers will report on many of the same issues, carry some of
the same AP stories, and so forth. This redundant information doesn't increase
the entropy (reading the same AP story in a second newspaper wouldn't give you
any new information).

A book you may find interesting is Elements of Information Theory by
Cover & Thomas.

-Jack



Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-23 Thread Sandy Harris
Aram Perez [EMAIL PROTECTED] wrote:

 So, if you folks care to educate me, I have several questions related
 to entropy and information security (apologies to any physicists):

I'll answer the easier questions. I'll leave the harder ones for someone
with a better grounding in information theory.

 * What is the relationship between randomness and entropy?

Roughly, they both measure unpredictability. Something that is hard
to predict is random, or has high entropy. There are mathematical
formulations that make this a lot more precise, but that's the basic
idea.

 * Does processing an 8 character password with a process similar to
 PKCS#5 increase the entropy of the password?

Absolutely not!

At best, you preserve the original entropy, just distributing it
differently. If you get the processing wrong, you can reduce or
even entirely destroy it, but no amount of any kind of processing
can increase it.

 * Can you add or increase entropy?

You can add more entropy, either from another source or more
from the same source. That is the only way to increase it.

--
Sandy Harris
Zhuhai, Guangdong, China



Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-23 Thread John Kelsey
From: Jack Lloyd [EMAIL PROTECTED]
Sent: Mar 22, 2006 11:30 PM
To: cryptography@metzdowd.com
Subject: Re: Entropy Definition (was Re: passphrases with more than 160 bits 
of entropy)

...

As an aside, this whole discussion is confused by the fact that there
are a bunch of different domains in which entropy is defined.  The
algorithmic information theory sense of entropy (how long is the
shortest program that produces this sequence?) is miles away from the
information theory sense of entropy, and even that has a bunch of
different flavors.  

For the information theory meanings of entropy, what you're really
talking about is a property of a probability distribution.  You can do
that in terms of Shannon entropy if you want to know about bounds on
average bandwidth requirements for transmitting symbols from that
distribution.  You can look at guessing entropy if you want to know
the -lg(expected work to guess the next symbol).  You can look at
min-entropy if you want a hard bound on how many symbols you need to
sample to derive a 128-bit key that won't ever expose you to
weaknesses based on how you selected it.  And so on.  

Shannon entropy is the one most people know, but it's all wrong for
deciding how many samples you need to derive a key.  The kind of
classic illustration of this is the probability distribution:

0 occurs with probability 1/2
each other number from 1 to 2^{160}+1 happens with probability
2^{-161}.

The Shannon entropy on this distribution is 81.5 bits.  But if you
tried to sample it once to generate an 80-bit Skipjack key, half the
time, I'd guess your key on my first try.  

...

>Yes. Let's say the contents of tomorrow's NY Times has n bits of entropy (we
>probably can't actually calculate n, but we know it is a finite and fixed
>value). And the LA Times from the same day will also have some amount of
>entropy (call it n'). However, the entropy of the two papers combined would
>(probably) not be n+n', but some number x such that min(n,n') <= x <= n+n',
>because the two papers will report on many of the same issues, carry some of
>the same AP stories, and so forth. This redundant information doesn't increase
>the entropy (reading the same AP story in a second newspaper wouldn't give
>you any new information).

Right.  If the probabilities are independent, you can add the
entropies.  That's why they're defined in log terms.  

--John




Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-23 Thread John Denker

John Kelsey wrote:

>As an aside, this whole discussion is confused by the fact that there
>are a bunch of different domains in which entropy is defined.  The
>algorithmic information theory sense of entropy (how long is the
>shortest program that produces this sequence?) is miles away from the
>information theory sense of entropy, and even that has a bunch of
>different flavors.


I would have said almost the opposite.  The great power and beauty
of the entropy idea is that it has the *same* meaning across many
domains.  Entropy is defined in terms of probability, period.  Any
other definition is either
  a) exactly equivalent,
  b) an approximation, valid under this-or-that restrictive conditions, or
  c) wrong.

With some slight fiddling to get the normalization right, 1/2
raised to the power of (program length) defines a probability
measure.  This may not be the probability you want, but it
is a probability, and you can plug it into the entropy definition.

The problem is almost never understanding the definition of
entropy.  The problem is almost always ascertaining what is
the correct probability measure applicable to the problem
at hand.

Don't get me started about universal probability measures.
That has an arcane narrow technical meaning that is verrry widely
misunderstood.



0 occurs with probability 1/2
each other number from 1 to 2^{160}+1 happens with probability
2^{-161}.

The Shannon entropy on this distribution is 81.5 bits. 


I think it is very close to 81 bits, not 81.5, but that is a minor
point that doesn't change the conclusion:
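
A quick check of the arithmetic (a sketch over the distribution Erik posed)
bears this out, and also shows why the min-entropy view is so much gloomier
for single-sample key generation:

```python
import math

# The contrived distribution: 0 with probability 1/2, and each of the
# 2^160 + 1 other values with probability 2^-161.
p0 = 0.5
n_other = 2**160 + 1
p_other = 2.0**-161

# Shannon entropy: -sum p_i log2 p_i, grouping the identical terms.
# Each of the rare values contributes p_other * 161 bits.
shannon = -p0 * math.log2(p0) + n_other * p_other * 161

# Min-entropy: -log2(max_i p_i), the measure relevant to one guess.
min_entropy = -math.log2(p0)

print(shannon)      # ~81.0 bits, not 81.5
print(min_entropy)  # 1 bit: half the time the sample is simply 0
```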


 But if you
tried to sample it once to generate an 80-bit Skipjack key, half the
time, I'd guess your key on my first try.  


Yeah, but if I sample it N times, with high probability I can generate
a large number of very good keys.  This problem is faced by (and solved
by) any decent TRNG.




Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-23 Thread Greg Rose

At 22:09  -0500 2006/03/22, John Denker wrote:

Aram Perez wrote:


* Can you add or increase entropy?


Shuffling a deck of cards increases the entropy of the deck.


As a minor nit, shuffling *in an unpredictable manner* adds entropy, 
because there is extra randomness being brought into the process. If 
I was one of those people who can do a perfect riffle shuffle, 
reordering the cards in this entirely predictable manner does not 
increase or decrease the existing entropy.


So in one sense, the answer is a simple no... nothing you can do to 
a passphrase can increase its (that is, the passphrase's) entropy. 
You can add randomness from another source, and increase the total 
entropy, but I don't think that is relevant to the original question.


Greg.



Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-23 Thread John Kelsey
From: John Denker [EMAIL PROTECTED]
Sent: Mar 23, 2006 1:44 PM
To: John Kelsey [EMAIL PROTECTED], cryptography@metzdowd.com
Subject: Re: Entropy Definition (was Re: passphrases with more than 160 bits 
of entropy)

...
With some slight fiddling to get the normalization right, 1/2
raised to the power of (program length) defines a probability
measure.  This may not be the probability you want, but it
is a probability, and you can plug it into the entropy definition.

No, this isn't right for the algorithmic information theory meaning at
all.  For that measure, we can intelligently discuss the entropy of a
specific random string, without reference to a probability model.
Indeed, what's the probability distribution of the sequence of bits
defined by Chaitin's Omega?  

You can certainly complain that they should have used a different term
than entropy here, but you're not going to get these to be the same.  

The problem is almost never understanding the definition of
entropy.  The problem is almost always ascertaining what is
the correct probability measure applicable to the problem
at hand.

Well, you need to decide which of the probability distribution kinds
of entropy measures to use, and that differs in different
applications.  If you use min-entropy or guessing entropy to estimate
the limits on how well some sequence of symbols will compress, you'll
get a pretty lousy answer.  The same goes for using Shannon entropy to
determine whether you have collected enough entropy in your pool to
generate a 128-bit AES key.  

...
 0 occurs with probability 1/2
 each other number from 1 to 2^{160}+1 happens with probability
 2^{-161}.
 
 The Shannon entropy on this distribution is 81.5 bits. 

I think it is very close to 81 bits, not 81.5, but that is a minor
point that doesn't change the conclusion:

  But if you
 tried to sample it once to generate an 80-bit Skipjack key, half the
 time, I'd guess your key on my first try.  

Yeah, but if I sample it N times, with high probability I can generate
a large number of very good keys.  This problem is faced by (and solved
by) any decent TRNG.

Hmmm.  I've seen plenty of people get this wrong.  If you use Shannon
entropy as your measure, and then say when I have collected 128 bits
of Shannon entropy in my pool, I can generate a 128-bit AES key, you
will generate keys that aren't as secure as you think they are.  Now,
most TRNGs seem to evade this and many other problems by designing the
hardware to give a relatively simple probability model.  If your
probability model is close to uniform, then all these probability
distribution based entropy measurements converge (with a constant
difference).   

In the case above, we had better specify N.  If you sample it 16
times, then you have a 2^{-16} chance of still being in a weak state.
Once you get enough samples that the probability of being in the
pathological worst case is negligible (whatever that means in your
application), then you can start generating output bits.  That's
probably somewhere between N=32 and N=64.  
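
For a rough feel for how N trades off against the residual risk (a sketch;
the "weak state" here is the event that every one of the N samples was the
probability-1/2 value 0):

```python
import math

def samples_needed(failure_prob):
    """Smallest N such that the chance that all N samples from the
    pathological source were the heavily-weighted value 0 (probability
    1/2 each) is at or below failure_prob, i.e. (1/2)^N <= failure_prob."""
    return math.ceil(math.log2(1.0 / failure_prob))

print(samples_needed(2.0**-16))  # 16 samples leave a 2^-16 residual risk
print(samples_needed(2.0**-64))  # 64 samples for a 2^-64 risk
```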

--John



Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-23 Thread John Denker

I wrote:

With some slight fiddling to get the normalization right, 1/2
raised to the power of (program length) defines a probability
measure.  This may not be the probability you want, but it
is a probability, and you can plug it into the entropy definition.

John Kelsey wrote:


No, this isn't right for the algorithmic information theory meaning at
all.  


OK, in a moment we will have gone through four plies of no-it-isn't
yes-it-is no-it-isn't yes-it-is.  Let's get serious.  The axiomatic
definition of a measure is
  -- a mapping from sets to numbers
  -- positive
  -- additive on the countable union of disjoint sets

And a probability measure has the further property of being
  -- bounded above

I have checked that -- with due attention to trivial details --
.5 ^ (program length) satisfies this definition.  If you wish to
renew the assertion that there is no such probability measure, please
explain which of the axiomatic requirements is not met.  Please be
specific.

 For that measure, we can intelligently discuss the entropy of a
 specific random string, without reference to a probability model.

That's like saying we can talk about three-dimensional force, velocity,
and acceleration without reference to vectors.

Measure theory is the tried-and-true formalism for dealing with random
strings.  It would be spectacularly contrary to ignore the formalism,
and just plain wrong to say the formalism is inapplicable.


Indeed, what's the probability distribution of the sequence of bits
defined by Chaitin's Omega?  


Huh?  Omega is so obviously a probability that usually the word probability
is used in its definition.  See e.g.
  http://mathworld.wolfram.com/ChaitinsConstant.html
I suppose a masochistic nitpicker could demand to see a proof that this word
is justified, but I'm not going to bother, for reasons given above.




Re: Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-23 Thread Hal Finney
This is getting pretty far afield from cryptography but it is a topic
I find very interesting so I can't resist jumping in.

John Denker writes:
 OK, in a moment we will have gone through four plies of no-it-isn't
 yes-it-is no-it-isn't yes-it-is.  Let's get serious.  The axiomatic
 definition of a measure is
-- a mapping from sets to numbers
-- positive
-- additive on the countable union of disjoint sets

 And a probability measure has the further property of being
-- bounded above

 I have checked that -- with due attention to trivial details --
 .5 ^ (program length) satisfies this definition.  If you wish to
 renew the assertion that there is no such probability measure, please
 explain which of the axiomatic requirements is not met.  Please be
 specific.

This is true, in fact it is sometimes called the universal distribution
or universal measure.  In more detail, it is a distribution over all
finite-length strings.  The measure for a particular string X is defined
as the sum over all programs that output X of 1/2^L_i, where L_i is the
length of each such program.

Often the algorithmic complexity of a string is defined as the length
of the shortest program to output the string.  The universal measure is
based on the same idea, but takes into consideration that there may be
multiple programs that output the same string.  Each program of length L_i
contributes 1/2^L_i to the string's measure.  If there is only one short
program and all others are much longer, then the probability measure
is essentially 1/2^C where C is the length of this shortest program,
i.e. the algorithmic complexity.

The universal measure for a string can also be thought of as the
probability that, given a fixed Universal Turing Machine (UTM), a randomly
chosen input program will output that string.

So this is clearly a probability distribution (with some technicalities
regarding issues of program lengths being glossed over here) as John
Denker says.  However to go from this to a notion of entropy is more
questionable.

There are a countably infinite number of finite strings, and all of
them have non-zero probabilities under this distribution.  This means
that for most strings, the probability must be very low, asymptotically
approaching zero.  In fact you can show that for most strings of length n,
their measure is 1/2^n; this is equivalent to saying that most strings
are effectively random and have no shorter program to output them.

Shannon entropy is defined over a probability distribution.  That is,
given a probability distribution we can find its Shannon entropy by
summing -p_i * log2(p_i) over all members of the distribution.  If we
approximate the universal distribution by 1/2^n for strings of length
n, this sum is clearly divergent.  So there is not really a meaningful
Shannon entropy for this probability distribution, or in other words
the entropy is infinite.

John Kelsey asked:
  Indeed, what's the probability distribution of the sequence of bits
  defined by Chaitin's Omega?  

This probability distribution is defined only over finite strings and so
Omega is not within the universe of this distribution.  It should also be
noted that it is impossible for an n bit program to output more than n
bits of Omega (plus or minus an additive constant specific to the UTM).
Hence even if we consider successive approximations to Omega of ever
increasing length, their measures would tend asymptotically to zero.

Hal Finney



Re: passphrases with more than 160 bits of entropy

2006-03-22 Thread Taral
On 3/21/06, Travis H. [EMAIL PROTECTED] wrote:
 Does anyone have a good idea on how to OWF passphrases without
 reducing them to lower entropy counts?

I've frequently seen constructs of this type:

H(P), H(0 || P), H(0 || 0 || P), ...

If entropy(P) > entropy(H), the entries will be independent, theoretically.
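
A minimal sketch of that construct, assuming H is SHA-1 and the "0" prefixes
are zero bytes (the post doesn't pin either choice down):

```python
import hashlib

def expand(passphrase: bytes, n_blocks: int) -> bytes:
    """Derive n_blocks outputs as H(P), H(0 || P), H(0 || 0 || P), ...
    SHA-1 is used for concreteness; any hash works the same way."""
    out = []
    prefix = b""
    for _ in range(n_blocks):
        out.append(hashlib.sha1(prefix + passphrase).digest())
        prefix += b"\x00"  # one more zero byte prepended each round
    return b"".join(out)

key_material = expand(b"correct horse battery staple", 4)
print(len(key_material))  # 4 * 20 = 80 bytes of derived material
```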

--
Taral [EMAIL PROTECTED]
You can't prove anything.
-- Gödel's Incompetence Theorem



Re: passphrases with more than 160 bits of entropy

2006-03-22 Thread Joseph Ashwood
- Original Message - 
From: Travis H. [EMAIL PROTECTED]

Subject: passphrases with more than 160 bits of entropy



I was thinking that one could hash the first block, copy the
intermediate state, finalize it, then continue the intermediate result
with the next block, and finalize that.  Is this safe?  Is there a
better alternative?


Use a bigger hash, SHA-512 should be good to basically 512-bits, same for 
Whirlpool. If those aren't enough for you, use the chaining mode from 
Whirlpool and Rijndael with appropriate size.
   Joe 






Re: passphrases with more than 160 bits of entropy

2006-03-22 Thread Stefan Lucks
 Does anyone have a good idea on how to OWF passphrases without
 reducing them to lower entropy counts?  That is, I've seen systems
 which hash the passphrase then use a PRF to expand the result --- I
 don't want to do that.  I want to have more than 160 bits of entropy
 involved.

What kind of application are you running, that > 150 bits of entropy is
insufficient?

 I was thinking that one could hash the first block, copy the
 intermediate state, finalize it, then continue the intermediate result
 with the next block, and finalize that.  Is this safe?  Is there a
 better alternative?

As I understand your proposal, you split up the passphrase P into L Blocks
P_1, ..., P_L, (padding the last block P_L as neccessary) and then you
output L*160 bit like this:

F(P) = ( H(P_1), H(P_1,P_2), ..., H(P_1, P_2, ..., P_L) ).

This does not look like a good idea to me:

   1.   If the size of the P_i is the internal block size of the hash
function (e.g., 512 bit for SHA-1, SHA-256, or RIPEMD-160) and
your message P=P_1 is just one block long, you definitely end
up with (at most) 160 bits of entropy, no matter how large the
entropy in P_1 is (it could be 512 bits).

   2.   If the local entropy in each block P_i (or even the conditional
entropy in P_i, given P_1, ..., P_{i-1}) is low, then you can
find P step by step, block by block. This function F(P) is thus
*much* *worse* than its simple and straightforward counterpart H(P).

   3.   In fact, to calculate the entropy of F(P), you can look at the
conditional entropies of the P_i. The entropy of F(P) is close to
the maximum of these conditional entropies ...


Any better solution I can think of will be significantly less performant
than just applying H(P). One idea of mine would be the function G:

   *Let i be a counter of some fixed size, say 32 bit.

   *Let J+1 be the number of 160-bit values you need (e.g., J = 4*L).

   *G(P) = ( H(P_1,0,P_1,P_2,0,P_2, ..., P_L,0,P_L),
 H(P_2,1,P_2, ..., P_L,1,P_L,P_1,1,P_1),
...
 H(P_L,J,P_L,P_1,J,P_1, ..., P_{L-1},J,P_{L-1})
   )

Would that be OK for your application?
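
Here is one possible reading of G in Python (my interpretation of the
notation above: SHA-1 as H, each block fed as (P_i, j, P_i) with j a 32-bit
counter, and blocks taken in cyclic order starting at block j mod L):

```python
import hashlib

def G(blocks, J):
    """Sketch of the G construction: blocks is the list [P_1, ..., P_L]
    of passphrase blocks; returns J+1 SHA-1 values.  Output j hashes the
    blocks in cyclic order starting at index (j mod L), writing each block
    as (P_i, j, P_i) with j encoded as a 32-bit big-endian counter."""
    L = len(blocks)
    digests = []
    for j in range(J + 1):
        h = hashlib.sha1()
        ctr = j.to_bytes(4, "big")
        for k in range(L):
            p = blocks[(j + k) % L]
            h.update(p + ctr + p)
        digests.append(h.digest())
    return digests

out = G([b"block-one", b"block-two"], J=7)
print(len(out), len(out[0]))  # 8 outputs of 20 bytes each
```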

In any case, I think that using a 160-bit hash function as a building
block for a universal one-way function with (potentially) much more than
160-bit of entropy is a bit shaky.




-- 
Stefan Lucks  Th. Informatik, Univ. Mannheim, 68131 Mannheim, Germany
e-mail: [EMAIL PROTECTED]
home: http://th.informatik.uni-mannheim.de/people/lucks/
--  I  love  the  taste  of  Cryptanalysis  in  the  morning!  --




Re: passphrases with more than 160 bits of entropy

2006-03-22 Thread Aram Perez

On Mar 22, 2006, at 4:28 AM, Thierry Moreau wrote:


Travis H. wrote:

Hi,
Does anyone have a good idea on how to OWF passphrases without
reducing them to lower entropy counts?  That is, I've seen systems
which hash the passphrase then use a PRF to expand the result --- I
don't want to do that.  I want to have more than 160 bits of entropy
involved.


More than 160 bits is a wide-ranging requirement.

Entropy is a highly discussed unit of measure.


And very often confused. While you do want maximum entropy, maximum  
entropy is not sufficient. The sequence of the consecutive numbers 0  
- 255 have maximum entropy but have no randomness (although there is  
finite probability that a RNG will produce the sequence).


Regards,
Aram Perez




Re: passphrases with more than 160 bits of entropy

2006-03-22 Thread Aram Perez

On Mar 22, 2006, at 9:04 AM, Perry E. Metzger wrote:



Aram Perez [EMAIL PROTECTED] writes:

Entropy is a highly discussed unit of measure.


And very often confused.


Apparently.


While you do want maximum entropy, maximum
entropy is not sufficient. The sequence of the consecutive numbers 0
- 255 have maximum entropy but have no randomness (although there is
finite probability that a RNG will produce the sequence).


One person might claim that the sequence of numbers 0 to 255 has 256
bytes of entropy.


It could be, but Shannon would not.


Another person will note that the phrase "the sequence of numbers 0-255"
completely describes that sequence and is only 30 bytes long.


I'm not sure I see how you get 30 bytes long.


Indeed, more
compact ways yet of describing that sequence probably
exist. Therefore, we know that the sequence 0-255 does not, in fact,
have maximum entropy in the sense that the entropy of the sequence
is far lower than 256 bytes and probably far lower than even 30 bytes.


Let me rephrase my sequence. Create a sequence of 256 consecutive  
bytes, with the first byte having the value of 0, the second byte the  
value of 1, ... and the last byte the value of 255. If you measure  
the entropy (according to Shannon) of that sequence of 256 bytes, you  
have maximum entropy.



Entropy is indeed often confusing. Perhaps that is because both the
Shannon and the Kolmogorov-Chaitin definitions do not provide a good
way of determining the lower bound of the entropy of a datum, and
indeed no such method can exist.


No argument from me.

Regards,
Aram Perez



Re: passphrases with more than 160 bits of entropy

2006-03-22 Thread leichter_jerrold
| Let me rephrase my sequence. Create a sequence of 256 consecutive  
| bytes, with the first byte having the value of 0, the second byte the  
| value of 1, ... and the last byte the value of 255. If you measure  
| the entropy (according to Shannon) of that sequence of 256 bytes, you  
| have maximum entropy.
Shannon entropy is a property of a *source*, not a particular sequence
of values.  The entropy is derived from a sum of equivocations about
successive outputs.

If we read your create a sequence..., then you've described a source -
a source with exactly one possible output.  All the probabilities will
be 1 for the actual value, 0 for all other values; the equivocations are
all 0.  So the resulting Shannon entropy is precisely 0.

-- Jerry



Re: passphrases with more than 160 bits of entropy

2006-03-22 Thread Perry E. Metzger

Aram Perez [EMAIL PROTECTED] writes:
 On Mar 22, 2006, at 9:04 AM, Perry E. Metzger wrote:


 Aram Perez [EMAIL PROTECTED] writes:
 Entropy is a highly discussed unit of measure.

 And very often confused.

 Apparently.

 While you do want maximum entropy, maximum
 entropy is not sufficient. The sequence of the consecutive numbers 0
 - 255 have maximum entropy but have no randomness (although there is
 finite probability that a RNG will produce the sequence).

 One person might claim that the sequence of numbers 0 to 255 has 256
 bytes of entropy.

 It could be, but Shannon would not.

No, he wouldn't. You did, however. The maximum entropy a string of 256
bytes could have would be 256*8 bits. Since we're talking about a
sequence with far less entropy than 256*8 bits, it is not a sequence
of maximum entropy. There are, of course, trivially produced sequences
of maximum entropy.

 Another person will note the sequence of numbers 0-255 completely
 describes that sequence and is only 30 bytes long.

 I'm not sure I see how you get 30 bytes long.

$ echo "the sequence of numbers 0-255" | wc -c
  30

Now, of course, there are probably not more than 1.5 bits of entropy
per letter in that sentence fragment, so really we're probably talking
about ~6 bytes of information. Doubtless, though, cleverer people
could do better.

 Indeed, more
 compact ways yet of describing that sequence probably
 exist. Therefore, we know that the sequence 0-255 does not, in fact,
 have maximum entropy in the sense that the entropy of the sequence
 is far lower than 256 bytes and probably far lower than even 30 bytes.

 Let me rephrase my sequence. Create a sequence of 256 consecutive
 bytes, with the first byte having the value of 0, the second byte the
 value of 1, ... and the last byte the value of 255.

We heard you the first time.

The Shannon information of a message is the negative of the log (base
2) of the probability of the message. Of course, that definition is
only really useful if you're talking about a sequence of messages. The
Kolmogorov-Chaitin information of a text is (roughly) the smallest
program that can generate the text.

Both of these definitions are getting at can be informally described
as the smallest representation of a piece of information.
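
Numerically, the Shannon side of this is just a logarithm (a trivial
sketch):

```python
import math

def shannon_information(p):
    """Shannon information (surprisal) in bits of a message that
    occurs with probability p: -log2(p)."""
    return -math.log2(p)

# A message drawn uniformly from 4 possibilities carries 2 bits;
# a near-certain message carries almost none.
print(shannon_information(1 / 4))  # 2.0
print(shannon_information(0.999))  # ~0.0014
```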

If Alice asks Bob "what color are your eyes", Bob could send a 10M
high resolution image of his eye, a precise measurement of the
spectrum reflected by his eyes, the word "blue", or perhaps even
something shorter, like a couple of bits that, according to a
pre-agreed table, represent an eye color from a table of eye
colors. The smallest possible representation of just the eye color,
the couple of bits from a table of eye color codes, is likely closest
to the information content of someone's eye color, though a precise
measurement is impossible since it is highly context dependent.

 If you measure the entropy (according to Shannon) of that sequence
 of 256 bytes, you have maximum entropy.

Clearly you don't, since the sequence can be described with far less
information than 256 bytes. A completely random and incompressible
sequence of 256 bytes would have maximum entropy, since it is
impossible to compress to less than 256*8 bits, but the sequence
0..255 has very little entropy because it is easily compressed to a
smaller size. This should be obvious. If I start reciting to you
zero. one. two... for a few iterations the probability of the next
byte will be 1/256 or close to it. Shortly, though, anyone who isn't
an idiot will guess what the next value (with nearly probability 1) in
the sequence is, and the information content of subsequent bytes falls
to far less than one bit per byte. This is just another way of saying
that the smallest program that generates 0..255 is quite small, or
that you can easily compress 0..255 into a description that fits in
far less than 256 bytes.
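
One way to make this concrete (a sketch; note that zlib by itself can't
model the counting pattern, so the predictability is exposed here by
delta-coding first):

```python
import os
import zlib

# The sequence 0..255 is trivially predictable: its byte-to-byte deltas
# are all 1, so after delta-coding it compresses to almost nothing.
counting = bytes(range(256))
deltas = bytes((counting[i + 1] - counting[i]) % 256 for i in range(255))
small = len(zlib.compress(deltas))

# A random 256-byte string is (with overwhelming probability)
# incompressible; its deltas are just as random as it is.
rand = os.urandom(256)
rand_deltas = bytes((rand[i + 1] - rand[i]) % 256 for i in range(255))
big = len(zlib.compress(rand_deltas))

print(small, big)  # a handful of bytes for 0..255, ~260 for random data
```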

 Entropy is indeed often confusing. Perhaps that is because both the
 Shannon and the Kolmogorov-Chaitin definitions do not provide a good
 way of determining the lower bound of the entropy of a datum, and
 indeed no such method can exist.

 No argument from me.

I thought you were, indeed, still arguing.


Perry



Re: passphrases with more than 160 bits of entropy

2006-03-22 Thread Perry E. Metzger

[EMAIL PROTECTED] writes:
 | Let me rephrase my sequence. Create a sequence of 256 consecutive  
 | bytes, with the first byte having the value of 0, the second byte the  
 | value of 1, ... and the last byte the value of 255. If you measure  
 | the entropy (according to Shannon) of that sequence of 256 bytes, you  
 | have maximum entropy.

 Shannon entropy is a property of a *source*, not a particular sequence
 of values.  The entropy is derived from a sum of equivocations about
 successive outputs.

 If we read your create a sequence..., then you've described a source -
 a source with exactly one possible output.  All the probabilities will
 be 1 for the actual value, 0 for all other values; the equivocations are
 all 0.  So the resulting Shannon entropy is precisely 0.

Shannon information certainly falls to zero as the probability with
which a message is expected approaches 1. Kolmogorov-Chaitin
information cannot fall to zero, though it can get exceedingly small.

In either case, though, I suspect we're in agreement on what entropy
means, but Mr. Perez is not familiar with the same definitions that
the rest of us are using.

Perry



Re: passphrases with more than 160 bits of entropy

2006-03-22 Thread Matt Crawford
Let me rephrase my sequence. Create a sequence of 256 consecutive  
bytes, with the first byte having the value of 0, the second byte  
the value of 1, ... and the last byte the value of 255. If you  
measure the entropy (according to Shannon) of that sequence of 256  
bytes, you have maximum entropy.


I so often get irritated when non-physicists discuss entropy.  The  
word is almost always misused. I looked at Shannon's definition and  
it is fine, from a physics point of view.  But if you apply it
thoughtfully to a single fixed sequence, you correctly get the answer
zero.


If your sequence is defined to be { 0, 1, 2, ..., 255 }, the  
probability of getting that sequence is 1 and of any other sequence,  
0.  Plug it in.


If you have a generator of 8-bit random numbers and every sample is  
independent and uniformly distributed, and you ran this for a  
gazillion iterations and wrote to the list one day saying the special  
sequence { 0, 1, 2, ..., 255 } had appeared in the output, that's a  
different story.  But still, we would talk about the entropy of your  
generator, not of one particular sequence of outputs.





Re: passphrases with more than 160 bits of entropy

2006-03-22 Thread Perry E. Metzger

Victor Duchovni [EMAIL PROTECTED] writes:
 Actually calculating the entropy for real-world functions and generators
 may be intractable...

It is, in fact, generally intractable.

1) Kolmogorov-Chaitin entropy is just plain intractable -- finding the
   smallest possible Turing machine to generate a sequence is not
   computable.
2) Shannon entropy requires a precise knowledge of the probability of
   all symbols, and in any real world situation that, too, is
   impossible.

Usually, the best you can do is produce (bad) bounds, and sometimes
not even that.

One thing that can be done, of course, is that you can prove, under
certain assumptions, that it would take an intractable amount of
computation to distinguish a particular PRNG from a true random
sequence with greater than 50% probability. However, this is very much
NOT the same as showing that the PRNG sequence contains an endless
stream of entropy -- in fact, the unicity distance very clearly shows
you how much real entropy you have, and it usually isn't
much. Merely because too much computation would be needed does not
mean that you've created entropy -- you've just made it hard for the
opponent to get at your PRNG seed.

"Anyone who considers arithmetical methods of producing random digits
is, of course, in a state of sin." -- John von Neumann

Perry



Entropy Definition (was Re: passphrases with more than 160 bits of entropy)

2006-03-22 Thread Aram Perez

On Mar 22, 2006, at 2:05 PM, Perry E. Metzger wrote:


Victor Duchovni [EMAIL PROTECTED] writes:
Actually calculating the entropy for real-world functions and  
generators

may be intractable...


It is, in fact, generally intractable.

1) Kolmogorov-Chaitin entropy is just plain intractable -- finding the
   smallest possible Turing machine to generate a sequence is not
   computable.
2) Shannon entropy requires a precise knowledge of the probability of
   all symbols, and in any real world situation that, too, is
   impossible.


I'm not a cryptographer nor a mathematician, so I stand duly  
corrected/chastised ;-)


So, if you folks care to educate me, I have several questions related  
to entropy and information security (apologies to any physicists):


* How do you measure entropy? I was under the (false) impression that  
Shannon gave a formula that measured the entropy of a message (or  
information stream).
* Can you measure the entropy of a random oracle? Or is that what  
both Victor and Perry are saying is intractable?

* Are there units of entropy?
* What is the relationship between randomness and entropy?
* (Apologies to the original poster) When the original poster  
requested passphrases with more than 160 bits of entropy, what was  
he requesting?
* Does processing an 8 character password with a process similar to  
PKCS#5 increase the entropy of the password?

* Can you add or increase entropy?

Thanks in advance,
Aram Perez



Re: passphrases with more than 160 bits of entropy

2006-03-22 Thread John Denker

Matt Crawford wrote:

I so often get irritated when non-physicists discuss entropy.  The  word 
is almost always misused. 


Yes, the term entropy is often misused ... and we have seen some
remarkably wacky misusage in this thread already.  However, physicists
do not have a monopoly on correct usage.  Claude S was not a physicist,
yet he definitely knew what he was talking about.  Conversely, I know
more than a few card-carrying physicists who have no real feel for what
entropy is.

I looked at Shannon's definition and  it is 
fine, from a physics point of view.  


Indeed.

But if you apply it thoughtfully to a
single fixed sequence, you correctly get the answer zero.


I agree with all that, except for the But.  Shannon well knew that
the entropy was zero in such a situation.

If your sequence is defined to be { 0, 1, 2, ..., 255 }, the  
probability of getting that sequence is 1 and of any other sequence,  
0.  Plug it in.


Indeed.

If you have a generator of 8-bit random numbers and every sample is  
independent and uniformly distributed, and you ran this for a  gazillion 
iterations and wrote to the list one day saying the special  sequence { 
0, 1, 2, ..., 255 } had appeared in the output, that's a  different 
story.  But still, we would talk about the entropy of your  generator, 
not of one particular sequence of outputs.


Yes.  Shannon called it the source entropy, i.e. the entropy of
the source, i.e. the entropy of the generator.


Perry Metzger wrote:


Usually, the best you can do is produce (bad) bounds, and sometimes
not even that.


Huh?  What's your metric for usually?  I'll agree as a matter of
principle that whatever you're doing, you can always do it wrong.
But that doesn't prevent me from doing it right.  I can use physics
to produce good bounds, that is,
  http://www.av8n.com/turbid/



===

The problem posed by the OP is trivial, and good solutions have already
been posted.  To recap: SHA-512 exists, and if that isn't good enough,
you can concatenate the output of several different one-way functions.
You can create new hash functions at the drop of a hat by prepending
something (a counter suffices) to the input to the hash.

Example:  result = hash(1 || pw)  ||  hash(2 || pw)  ||  hash(3 || pw)
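
That recipe is a few lines in practice (a sketch with SHA-1 standing in for
your preferred hash, and the counter encoded as a single byte for
simplicity):

```python
import hashlib

def stretch(pw: bytes, out_len: int) -> bytes:
    """Concatenate hash(1 || pw), hash(2 || pw), hash(3 || pw), ...
    until out_len bytes have been produced, then truncate."""
    out = b""
    counter = 1
    while len(out) < out_len:
        out += hashlib.sha1(bytes([counter]) + pw).digest()
        counter += 1
    return out[:out_len]

result = stretch(b"my long passphrase", 60)
print(len(result))  # 60
```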

