Re: Looking through a modulo operation

Matt Ball matt.ball ieee.org wrote

Here is a C implementation of __random32:

typedef unsigned long u32;
struct rnd_state { u32 s1, s2, s3; };
static u32 __random32(struct rnd_state *state)
{
#define TAUSWORTHE(s,a,b,c,d) ((sc)d) ^ (((s a) ^ s)b)

state-s1 = TAUSWORTHE(state-s1, 13, 19, 4294967294UL, 12);
state-s2 = TAUSWORTHE(state-s2,  2, 25, 4294967288UL, 4);
state-s3 = TAUSWORTHE(state-s3,  3, 11, 4294967280UL, 17);

return (state-s1 ^ state-s2 ^ state-s3);
}

I see TAUSWORTHE (briefly tested with the above constants) isn't a
permutation of the 32-bit input state and is going to get very dull
when s is 0.

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Looking through a modulo operation

On Mon, Jul 21, 2008 at 8:33 AM, Matt Ball [EMAIL PROTECTED] wrote:

If someone uses the __random32 function as defined in the 2.6.26
Linux kernel, and leaks to you the result of taking successive outputs
modulo 28233 (= 9 * 3137), can you determine the probable 96-bit
internal state with fewer than 1000 outputs and with modest processing
power (e.g., a laptop computer running less than a day)?

Another attacking avenue is the 32-bit initial seed.  If the
implementation re-seeds frequently, or leaks to you the first outputs
after initialization, then you only have to brute-force the 32-bit
seed space, times the number of samples since reseeding.

Here is the function that performs the initial seeding, based on a 32-bit seed:

static void __set_random32(struct rnd_state *state, unsigned long s)
{
if (s == 0)
s = 1;  /* default seed is 1 */
#define LCG(n) (69069 * n)
state-s1 = LCG(s);
state-s2 = LCG(state-s1);
state-s3 = LCG(state-s2);
/* warm it up */
__random32(state);
__random32(state);
__random32(state);
__random32(state);
__random32(state);
__random32(state);
}

For those who want to get cleaver, I suspect there is an optimization
for running __random32 six times.

--
Thanks!
Matt Ball, IEEE P1619.x SISWG Chair
M.V. Ball Technical Consulting, Inc.
Phone: 303-469-2469, Cell: 303-717-2717
http://www.mvballtech.com

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Looking through a modulo operation

Matt Ball writes:
Another attacking avenue is the 32-bit initial seed.  If the
implementation re-seeds frequently, or leaks to you the first outputs
after initialization, then you only have to brute-force the 32-bit
seed space, times the number of samples since reseeding.

Well, that's good and broken then, isn't it?  No ifs about it.
It doesn't matter whether the implementation re-seeds frequently or
rarely.  It doesn't matter whether it leaks to you the first outputs
after initialization or only later ones.  If it's using a 32-bit seed,
this sucker is just plain broken, period.

The attack is trivial -- it's just exhaustive keysearch.

Attack: Given ~ 3 known outputs from the RNG, try all possible 32-bit
seeds.  You'll be able to recognize when your guess at the seed is correct
because the output from your guessed seed will match the observed output.
This assumes you know the offsets of your outputs (how many times the
RNG has been cranked before producing your known outputs), but even
if you only have partial information about that you can still make a
variant of this attack work.  The exhaustive search attack has to try
only 2^32 possibilities, so I'm guessing this attack should only take
minutes (at worst, hours) on a fast computer.

My earlier email about fancy attacks on this scheme is irrelevant.
There's no need to bother with fancy attacks, when the PRNG only uses
a 32-bit seed -- that's just blatantly insecure.  This is a textbook
error, one of the oldest mistakes in the book.  (For example, look
here:  http://www.cs.berkeley.edu/~daw/papers/ddj-netscape.html)

Using this PRNG for security-critical purposes would not be wise.

I'll include your code snippet for seeding the PRNG below.  Note: I'm

Here is the function that performs the initial seeding, based on a 32-bit seed:

static void __set_random32(struct rnd_state *state, unsigned long s)
{
if (s == 0)
s = 1;  /* default seed is 1 */
#define LCG(n) (69069 * n)
state-s1 = LCG(s);
state-s2 = LCG(state-s1);
state-s3 = LCG(state-s2);
/* warm it up */
__random32(state);
__random32(state);
__random32(state);
__random32(state);
__random32(state);
__random32(state);
}

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Looking through a modulo operation

On Sun, Jul 20, 2008 at 04:14:33PM -0600, Matt Ball wrote:

From a little bit of off-line discussion, I think I've got a restatement of
the problem that is more suitable to those with a stronger programming
background than mathematical background:

If someone uses the __random32 function as defined in the 2.6.26 Linux
kernel, and leaks to you the result of taking successive outputs modulo
28233 (= 9 * 3137), can you determine the probable 96-bit internal state
with fewer than 1000 outputs and with modest processing power (e.g., a
laptop computer running less than a day)?

Here is a C implementation of __random32:

typedef unsigned long u32;
struct rnd_state { u32 s1, s2, s3; };
static u32 __random32(struct rnd_state *state)
{
#define TAUSWORTHE(s,a,b,c,d) ((sc)d) ^ (((s a) ^ s)b)

state-s1 = TAUSWORTHE(state-s1, 13, 19, 4294967294UL, 12);
state-s2 = TAUSWORTHE(state-s2,  2, 25, 4294967288UL, 4);
state-s3 = TAUSWORTHE(state-s3,  3, 11, 4294967280UL, 17);

return (state-s1 ^ state-s2 ^ state-s3);
}

After any consecutive 96 outputs, the 97th is a fixed linear function of
those just observed. It is not necessary to determine the internal state.

The internal state is s_n = (A**n)(s_0) for a fixed 96x96 matrix A (the
fact that it is a direct product of 3 32-bit matrices is not important).
This matrix has a minimum polynomial of degree at most 96.

A**96 = c_95 * A**95 + c_94 * A**94 ... + c_0 * I

The 32-bit output then also satisfies:

x_96 = c_95 * x_95 + c_94 * x_94 ... + c_0

for the same coefficients.

--

/\ ASCII RIBBON  NOTICE: If received in error,
\ / CAMPAIGN Victor Duchovni  please destroy and notify
X AGAINST   IT Security, sender. Sender does not waive
/ \ HTML MAILMorgan Stanley   confidentiality or privilege,
and use is prohibited.

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: Looking through a modulo operation

Florian Weimer writes:
I've got a function f : S - X x S where S = (Z/2Z)**96 and
X = (Z/2Z)**32.  Suppose that s_0 is fixed and (x_i, s_i) = f(s_{i-1}).
(f implements a PRNG.  The s_i are subsequent internal states and the
x_i are results.)

Now f happens to be linear.  I know the values of x_i, x_{i+1}, ...,
x_{i+k} modulo N, [where N = 28233].
Is it possible to recover s_i with reasonable effort (better than brute
force, and k should be in the hundreds, not thousands)?

Yes.  I will show two attacks.  The results: Attack #1 is a
straightforward improvement to exhaustive search, and takes 2^52 work (~
10-20 CPU-years?).  Attack #2 is a more sophisticated meet-in-the-middle
attack, and takes 2^37 work and 2^35 space; the best instantiation I've
found looks like it would involve 16GB of RAM and a few days or weeks
of computation.  Both attacks recover the entire state of the PRNG,
given only a handful of known outputs.  It's possible there may be
other, better attacks.

My conclusion is that this PRNG is not cryptographically strong and
should not be used anywhere that you may face an adversary who is
motivated to break the PRNG.  In short, this PRNG is broken.

Attack #1: Given x mod N, there are only about 2^32/28233 = 2^17.2
possibilities for x, and it's easy to enumerate them all.  Suppose we
know x_1 mod N, x_2 mod N, ..., x_7 mod N.  Enumerate all
(2^17.2)^3 = 2^51.6 possibilities for x_1, x_2, and x_3.  For each
such possibility, you know 96 bits of output from a linear system,
and there are 96 unknowns, so you can solve for s_0.  Given s_0,
you can compute the predicted value of x_4, .., x_7 and compare them
to the observed values of x_4 mod N, etc.  If everything matches,
you've found the original seed s_0 and broken the PRNG.

This requires trying 2^51.6 possibilities, so it's a bit less than
2^51.6 work.  That's already a small enough number that the system
is not cryptographically secure.

This can be sped up slightly by noting that x_4 is a linear function
of x_1,x_2,x_3.  We can precompute the form of that linear function
and express it as a 32x96 matrix.  The best approach I can see will take
something like 500 CPU cycles to compute x_4 from x_1,..,x_3, so I'd
expect the total number of CPU cycles needed at something like 2^60.6
or so.  On a 4GHz machine, that's like 13 CPU-years.  Of course it is
trivially parallelizable.

Attack #2: If we were given inferred values of x_1, x_2, x_3, and x_4,
we could check them for consistency, since they represent 128
equations in 96 unknowns.  In particular, we can find a linear function
g from 128 bits to 32 bits such that g(x_1,x_2,x_3,x_4) = 0 if
x_1,..,x_4 are consistent with having been produced by this PRNG.
In fact, this can be broken down further, so that if x_1,..,x_4
are consistent, they will satisfy this equation:
h(x_1,x_2) = h'(x_3,x_4).
Call this the check equation.  Here h,h' are linear functions from 64
bits to 32 bits, so if x_1,..,x_4 are not consistent, they have only a
2^-32 chance of satisfying this check equation.

Suppose we are given x_1 mod N, .., x_8 mod N.  We'll enumerate all
2^34.4 possibilities for x_1,x_2, compute h(x_1,x_2) for each, and
store them in a hash table or sorted list keyed on the 32-bit value
of h(x_1,x_2).  Once that table is built, we'll enumerate all 2^34.4
possibilities for x_3,x_4, compute h'(x_3,x_4) for each, and find
all occurrences in the table where
h(x_1,x_2) = h'(x_3,x_4).
On average, we expect to find about 2^34.4/2^32 = 2^2.4 matches.
For each such match, compute the predicted value of x_5 as a function
of x_1,..,x_4 and see whether it agrees with the observed value of
x_5 mod N, etc., to weed out false matches.  In total, you'll need
to explore 2^34.4 + 2^34.4*2^2.4 ~= 2^37 possibilities, and you'll
need space for 2^34.4 entries in the table.

We can store the table as a sorted list.  This list will take up about
1600 GB, and you could buy enough hard disks for that at ~ \$2000, say.
You can add an index in memory that tells you, for each value of
h(x_1,x_2), which block or sector on disk those entries of the list is
found at.  You'll need to make 2^34.4 random seeks into this hard disk
array.  With good disk head scheduling algorithms you might be able to
get the seek time down a bit, but even optimistically we're probably
still talking about a year of computation.

Fortunately, we can trade time for space.  With 16GB of RAM, we can
store 1/128th of the list: namely, all values where the low 7 bits
of h(x_1,x_2) are some fixed value V.  We cycle through all choices
of V, repeating the attack once for each V.  With this choice, I expect
the amount of time spent waiting for RAM to be comparable to the CPU
cost of the attack.  For each V, we have to evaluate a linear function
2^35.4 times.  It looks to me like this will take a few CPU-days.

Disclaimers: I have not checked any of the above analysis.  I have
not implemented it to test whether these attacks actually work.  Even
if the

Re: Looking through a modulo operation

On Mon, Jul 21, 2008 at 12:03:50PM -0400, Victor Duchovni wrote:

On Sun, Jul 20, 2008 at 04:14:33PM -0600, Matt Ball wrote:

From a little bit of off-line discussion, I think I've got a restatement of
the problem that is more suitable to those with a stronger programming
background than mathematical background:

If someone uses the __random32 function as defined in the 2.6.26 Linux
kernel, and leaks to you the result of taking successive outputs modulo
28233 (= 9 * 3137), can you determine the probable 96-bit internal state
with fewer than 1000 outputs and with modest processing power (e.g., a
laptop computer running less than a day)?

I skipped over this part when reading the original message. Expecting per
Florian's original message the outputs to be a linear function of the
inputs, but they are not.

After any consecutive 96 outputs, the 97th is a fixed linear function of
those just observed. It is not necessary to determine the internal state.

This of course applies to the 32-bit output of the RNG. The operation
of reducing the 32-bit output modulo 28333, is not linear (over the
F_2 bit string vector-space). While:

x_96 = c_95 * x_95 + ... c_0 * x_0

this is only true bitwise modulo 2. It is not obvious how one might
recover the full 32-bit outputs from the truncated outputs.

--

/\ ASCII RIBBON  NOTICE: If received in error,
\ / CAMPAIGN Victor Duchovni  please destroy and notify
X AGAINST   IT Security, sender. Sender does not waive
/ \ HTML MAILMorgan Stanley   confidentiality or privilege,
and use is prohibited.

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Looking through a modulo operation

I've got a function f : S - X x S where S = (Z/2Z)**96 and
X = (Z/2Z)**32.  Suppose that s_0 is fixed and (x_i, s_i) = f(s_{i-1}).
(f implements a PRNG.  The s_i are subsequent internal states and the
x_i are results.)

Now f happens to be linear.  I know the values of x_i, x_{i+1}, ...,
x_{i+k} module N, for a fixed, known N.  N is odd (but divisible by 9).
Is it possible to recover s_i with reasonable effort (better than brute
force, and k should be in the hundreds, not thousands)?  And if yes, how?
Prediction of candidates for x_{i+k+1} with high probability would be