Matthew,
Your approach is interesting, but it claims a gain of about 100 in reduced
cycle time, while giving up the ability to use the knowledge that ten key
bits are zero, which is a factor of 1000. So it is not a win in that sense.
However your analysis does suggest that going to a full 64 bit key would
not yield a corresponding gain in strength for A5/1.
I did think about precomputing the shift register state after part of the
key is setup and then loading that in parallel, but the extra circuitry
needed to enable parallel load of the shift registers might eat up most of
the thruput savings.
Also if you have 64 circuits running in parallel synchronously, then you
can't assume they will all fail in two cycles on the average. If you want
the engines to run asynchronously, then you need a 64-bit counter per
engine which would increase the footprint by a factor of two or so.
Finally, it is not clear to me that you can run A5 backwards after the 100
mixing steps because then the non-linear majority voting circuit is
operating. I haven't thought that question through.
Regards,
Arnold Reinhold
At 8:24 PM +0000 5/11/99, Matthew Francey wrote:
>You wrote:
>
>>The bad news is that Weiner's engine could do one key comparison per clock.
>>Here is an estimate of the number of clocks required to check an A5/1 key:
>>
>>Key load 64
>>Frame Number load 22
>
>There are only 64 bits of state in the A5 "engine". The extra 22 bits
>of framing are actually bogus and can be "wrapped" with the 64 bits of
>key. That is: given any A5 state, one can compute a key, given a frame
>number of choice. It is a simple linear system.
>
>Hence, 22 cycles go away.
>
>However, the 64 bits of key themselves are bogus. Rather than loading
>the key bit-by-bit, it would make more sense to directly increment a
>64 bit quantity and mass-load into the 3 shift registers for setup.
>
>Hence, 64 cycles go away.
>
>>Setup mix 100
>
>One can also dispense with these 100 cycles, since they too are
>pre-determined. That is, given the result of the previous 64 + 22 + 100
>cycles, one can compute the input key, given a frame number. One just
>runs A5 backwards 100 steps, then solves the previously mentioned linear
>system.
>
>Hence, 100 more cycles go away.
>
>>Output compare 24
>
>Output comparison is 1 cycle per bit. Simply step the engine until
>a bit-miscompare occurs. Half fail on first bit, half of the half on
>the next bit, etc. The expected value of the clock count to a mis-compare
>is 2 cycles.
>
>So 22 more cycles are saved.
>
>>Reset 2
>>
>>Total 212
>
>Given 1 cycle to transfer from counter to state registers then
>increment counter, and then O(2) cycles to check the key, and
>say 1 cycle for recovery/overhead/whatever, we have, say, 4
>cycles per A5 key.
>
>The bit-comparison might be reduced to almost exactly one cycle
>in almost all cases with suitable "lookahead" logic, instead of
>stepping the A5 engine once per bit.
>
>Given suitable pipelining, the entire mechanism is could be reduced
>to one A5 key per cycle.
>
>>Performance ratio: 212/64 = 3.3 times worse than the Wiener DES
>>search design.
>
>Which leaves us at 16-64 times faster than a DES search engine. So
>we have, in effect, O(60) key bits.
>
>The "problem" with this approach is that the use of the 10 "free"
>key bits is difficult to imagine -- they have been mangled into the
>64 bit free-counter. Perhaps saving the situation is that all of this
>logic is extremely simple, and (except for the look-head idea) requires
>no memory accesses or other latencies. This may permit clocking the
>whole system at very high rates -- billions of keys per second *per
>A5 engine*. One is tempted to bread-board a 100MHz version...
>
>[Note: I monitor Cypherpunks at
>http://www.inet-one.com/cypherpunks/current/index.html; the list volume
>is too high for my tastes. You can submit this rambling missive to the
>list if you think it would be interesting to them.]