Re: [A51] 2.5x efficiency increase by removing illegal states

M vd S Sun, 10 Jan 2010 08:16:09 -0800

On 1/9/10 7:30 PM, sascha wrote:
> Help me understand this please:
>
> Not all of the 2^64 possible A5/1 register states is valid, considering
> the 100+ extra clockings before. Ok that makes sense.
> You say 40% out of all states can be clocked backwards more than 16 times,
> lets assume that is also the number of states that can be clocked backwards
> 100+ times (do you have a better number here?)
>   
Here you go:


16 back: 57% illegal states
32 back: 68% illegal states
48 back: 74% illegal states
64 back: 78% illegal states
80 back: 82% illegal states
96 back: 84% illegal states

These numbers are based on one chain. (de001bc0006f0000 [0] -> 
36dc483fb4fe0000)

I verified superficially with chains
de001bc034563200 [0] -> d59ebd1df40d8000
f1234678e5f7d600 [0] -> 61df56b387178000

and both give the same pattern. Clocking back 150 times (the maximum in 
practice) gives 89% illegal states.

Apart from the optimization, we can also observe that the state space 
converges to 2^64 * (1-84%) before generating the keystream. This must 
also mean that only 16% of 2^64 possible 64 bit keystream can exist. I 
wouldn't know how this can be of value to us right now, but it is worth 
noting.

> Our keyspace shrinks down to 2^64 * 0.4, provided that we can eliminate the
> illegal states. 16 extra forward clockings would slow down the table
> generation by 25%, do you think this would rule out the majority of the
> illegal states? I think backclocking is out of the question for the table
> generation, as it would slow down everything too much.
>   
There is no real need for backclocking to find illegal states. A 32kb 
lookup table will suffice to know if a state can be clocked back 6 times 
(2^(3*6) bits = 2^15 bytes). This would rule out some 47% of states as 
illegal, leaving 37% illegal states alongside 16% legal states. This 
would be a factor 2 improvement at the cost of 32kb of memory and a 
fairly simple lookup. But for bitsliced implementations this maybe isn't 
so easy at all.

Forward clocking to sanitize the state is something I tried, but we must 
be sure that doesn't lead to another subset of state space which does 
not encompass the actual observed state space in practice.

Since there is no need to generate keystream for these forward 
clockings, the cost might be slightly less than that for 
keystream-generating clocking. Sanitizing only illegal states by forward 
clocking doesn't sound like a great idea.


Simply clocking forward N times modifies the 96 backclock figure as follows:
N    S (96 back illegal states)
0  84%
1  75%
2  72%
4  68%
8  63%
16 53%
32 37%
64 12%
96 0%

defining cpu cost factor as (1+N/64), and table efficiency factor as 
(1-S(N))/(1-S(0)) (so for N=4 we get 32% legal states which is twice 16% 
at N=0, and thus eff.factor=2)

and taking total efficiency as (1-S(N))/(1-S(0)) / (1+N/64), we get:
N    S    cpu cost    table eff    total eff
0    84%    1.00    1.00    1.00
1    75%    1.02    1.56    1.54
2    72%    1.03    1.75    1.70
4    68%    1.06    2.00    1.88
8    63%    1.13    2.31    2.06
16    53%    1.25    2.94    2.35
32    37%    1.50    3.94    2.63
64    12%    2.00    5.50    2.75
96    0%    2.50    6.25    2.50

Which looks to me like we should clock forward 64 times before taking 
the keystream. I don't think this will actually double the cpu cost, 
since we don't need to xor the keystream together and store the 
resulting bits.

If someone could give a better estimation of the 1+N/64 figure for the 
actual cuda implementation this would be great. In the formula the extra 
cost should be related not only to the generation of one round, but 
everything in the chain generating loop as well. Only if the figure 
would go to 1+N/192, there is no reason left to not just clock forward 
100 times.

(I thought about clocking forward 150 times, but that would reduce the 
chance to find a first 64 bit keystream state to 25%, since after 50 
steps the state space is 75% smaller)

> Another strategy would be to flip bits of an illegal state to change it
> into the closest (in term of bit flips) legal state. This would be very
> fast and should reduce the number of left over illegal states by quite a bit.
>
>   
This is possible, but this is not trivial. I think the risk here would 
be that you'd introduce some bias towards certain states, introducing 
convergence and therefore more chain mergers.

> Also note that the great majority of values in a table are never looked up
> but exist only as a link between the state we are interested in and the
> end value that is looked up in the data base. A false positive that does
> not pass the backclocking test is a rare case and does not influence the
> attack time very much. (is this true? how long does it take to to the
> backclocking?). Still we would need 2 times the storage if we use the old
> method.
>   
I think that effective cpu time is cut in half as well, which seems to 
me to be the biggest advantage.

> Your optimization seems to suggest that we can cut down our search space
> by roughly 50% if we invest a few more (processor) clock cycles at each
> link. If you would find a fast way to turn illegal states into legal
> (or probable legal) states then this could be used for new tables.
> Since you already have code running, could you try various strategies, like
> flipping bits, clocking N (N between 1 and say 16) forward and see how that
> influences the ability to backclock the state? This extra computation should
> obviously not increase generation time by much more than 100%. 120% would 
> still
> be ok, since we also reduce our storage requirement by 50%.
> Also note that there is no way to do the extra calculation only on illegal
> states, since they are bitsliced. You can of course only modifiy illegal
> states but you would have to do the calculation with every chain (and then
> ignore the result for legal states).
>
>   
Yes, therefore I'm sticking to investigating the effects of forward 
clocking, which I think improves hit rates of any (illegal and legal) 
state, considering the state convergance observed above. Code 
modifications are trivial.

Regards,
M.

> On Sat, Jan 09, 2010 at 02:01:30PM +0100, M vd S wrote:
>   
>> Hi,
>>
>> I've been studying the A5/1 cipher in some more detail. As I noted 
>> before, 24 out of 64 randomly generated states cannot be reversed 1 
>> step. Such a state I would like to call an illegal state, since there is 
>> no way that the state in an actual a5/1 implementation will ever have 
>> that value after clocking 100 times with majority clocking.
>>
>> When considering clocking back more times, the number of illegal states 
>> rises to some 60-65% (in theory). Note that an illegal state *does* 
>> produce a 64 bit keystream nonetheless.
>>
>> This means that when you find a hit while cracking, there's a 60% chance 
>> of finding an illegal a5/1 state, which was certainly not the state the 
>> phone had in mind before generating the ciphertext, and which is 
>> certainly useless in trying to decrypt a conversation. (assuming random 
>> keystream distribution in practice)
>>
>> It also means that 60% of a5/1 rounds are calculated in vain. Detecting 
>> an invalid state is trivial. Transforming an invalid state into a valid 
>> one as well. The easiest way would be to just sanitize the state by 
>> stepping *forward* a few times, in the hope of opening up at least one 
>> new pathway on the way back. It would require very little changes to 
>> existing code, cost very little cpu time, while not invalidating 
>> existing tables (although lookup code should be aware of the two table 
>> versions thus introduced). It is open to discussion if you should only 
>> sanitize illegal states this way, or just clock forward every state a 
>> few times before taking the 64 bits of keystream as the next state.
>>
>> If such an approach would be taken, every hit will at least generate a 
>> reversible state. Every chain will be 2.5 times more useful. With only 
>> 40% of these new tables, you would have the same success you have in 
>> decrypting with the current tables.
>>
>>
>> To demonstrate that the stated percentages are real, I've modified a 
>> reference C implementation of chain generating code (kindly provided by 
>> Frank) to put this into practice, and count the number of ancestor 
>> states when clocking back N times, as well as the number of 
>> back-clockings before hitting an illegal state (/only illegal states 
>> after branching)
>>
>> These are the preliminary results: (for 1 chain, N=16)
>>
>> Evaluated 504062 (57%) 16-step illegal states, 870384 total.
>> clocking back 16 times gave:
>>  0 pathways in 504062 cases (57.91%)
>>  1 pathways in 186024 cases (21.37%)
>>  2 pathways in 48875 cases (5.62%)
>>  3 pathways in 62250 cases (7.15%)
>>  4 pathways in 27368 cases (3.14%)
>>  5 pathways in 15404 cases (1.77%)
>>  6 pathways in 7952 cases (0.91%)
>>  7 pathways in 7360 cases (0.85%)
>>  8 pathways in 3166 cases (0.36%)
>>  9 pathways in 2284 cases (0.26%)
>> 10 pathways in 2071 cases (0.24%)
>> 11 pathways in 930 cases (0.11%)
>> 12 pathways in 724 cases (0.08%)
>> 13 pathways in 601 cases (0.07%)
>> 14 pathways in 353 cases (0.04%)
>> 15 pathways in 262 cases (0.03%)
>> 16 pathways in 179 cases (0.02%)
>> 17 pathways in 141 cases (0.02%)
>> 18 pathways in 91 cases (0.01%)
>> 19 pathways in 75 cases (0.01%)
>> 20 pathways in 45 cases (0.01%)
>> 21 pathways in 38 cases (0.00%)
>> 22 pathways in 32 cases (0.00%)
>> 23 pathways in 27 cases (0.00%)
>> 24 pathways in 11 cases (0.00%)
>> 25 pathways in 12 cases (0.00%)
>> 26 pathways in 11 cases (0.00%)
>> 27 pathways in 10 cases (0.00%)
>> 28 pathways in 10 cases (0.00%)
>> 29 pathways in 3 cases (0.00%)
>> 30 pathways in 4 cases (0.00%)
>> 31 pathways in 4 cases (0.00%)
>> 32 pathways in 3 cases (0.00%)
>> 33 pathways in 1 cases (0.00%)
>> 34 pathways in 1 cases (0.00%)
>> depth reached:
>> depth  0 reached 326278 times (37.49%)
>> depth  1 reached 40753 times (4.68%)
>> depth  2 reached 15433 times (1.77%)
>> depth  3 reached 12179 times (1.40%)
>> depth  4 reached 11289 times (1.30%)
>> depth  5 reached 10856 times (1.25%)
>> depth  6 reached 10251 times (1.18%)
>> depth  7 reached 9817 times (1.13%)
>> depth  8 reached 9564 times (1.10%)
>> depth  9 reached 9272 times (1.07%)
>> depth 10 reached 8832 times (1.01%)
>> depth 11 reached 8357 times (0.96%)
>> depth 12 reached 8270 times (0.95%)
>> depth 13 reached 7985 times (0.92%)
>> depth 14 reached 7588 times (0.87%)
>> depth 15 reached 7338 times (0.84%)
>> depth 16 reached 366322 times (42.09%)
>> de001bc0006f0000 [0] -> 36dc483fb4fe0000
>> Completed 1 chains
>>
>> So 57% of the states cannot be clocked back more than 16 times - and we 
>> need at least 100 steps to get to the point where the key and frame 
>> number is mixed in.
>>
>> 37.49% of the states cannot be clocked back even once (theory says 
>> 24/64=37.5%), which is determined easily (6 bit lookup or simple bit 
>> logic). If only this test would be performed, the efficiency increase 
>> would be 60%. If testing is done rigorously (32 steps back, giving 68% 
>> illegal states) the efficiency increase would be 212%, of 3.12 times 
>> more efficient. Taking 16 steps back as a tradeoff, the process will be 
>> about 2.5x more efficient. Not testing at all but just stepping forward 
>> a few times might be the easiest workaround. The optimum number of steps 
>> would have to be determined emperically.
>>
>> (Another worry I have is the number of ancestor states. The expected 
>> value of ancestor states 16 steps back, or E[anc(16)], in the above 
>> example is 1.0015, but when we leave out the illegal states (which do 
>> not occur in practice) I get E[anc(16)!=0] = 2.38. This would mean 
>> stepping back e.g. 128 times takes this to the power of 8, leading to 
>> some 1000 possible Kc - still manageable but way more than the reported 
>> 1-10 someone stated before)
>>
>> Kind regards,
>> M.
>>
>>
>> _______________________________________________
>> A51 mailing list
>> [email protected]
>> http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51
>>     
> _______________________________________________
> A51 mailing list
> [email protected]
> http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51
>
>   
_______________________________________________
A51 mailing list
[email protected]
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51

Re: [A51] 2.5x efficiency increase by removing illegal states

Reply via email to