Cryptography-Digest Digest #926, Volume #8       Mon, 18 Jan 99 11:13:02 EST

Contents:
  Re: Metaphysics Of Randomness (R. Knauer)
  Re: On the Generation of Pseudo-OTP (Mok-Kong Shen)
  General (but detailed enough) description of RC4... (Bartosz /Lakomiec)
  Re: Metaphysics Of Randomness (R. Knauer)
  Re: long numbers math - how to ? (Terje Mathisen)
  Some Thoughts on the Design of a Proposed Stream Cipher (Mok-Kong Shen)

----------------------------------------------------------------------------

From: [EMAIL PROTECTED] (R. Knauer)
Subject: Re: Metaphysics Of Randomness
Date: Mon, 18 Jan 1999 13:46:22 GMT
Reply-To: [EMAIL PROTECTED]

On 16 Jan 1999 03:59:01 GMT, [EMAIL PROTECTED] () wrote:

>From a *metaphysical* and *theoretical* point of view, and from 
>my point of view, randomness doesn't exist. What we usually 
>call "a random phenomenon" is a phenomenon which obeys 
>unknown or unpredictable rules.

The decay of a radioisotope is completely random - there is no rule
that tells you when a particular nucleus will decay. The probability
for a decay in the interval t -> t + dt is a constant, independent of
the time t. That is the meaning of random.
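For illustration, a minimal simulation sketch (the decay constant k
below is an arbitrary assumed value): the waiting time until the next
decay is exponentially distributed, and the distribution of the
remaining wait is the same no matter how long one has already waited.

    import random

    RATE = 0.1                       # assumed decay constant k, per unit time

    def waiting_times(n, k=RATE):
        """Draw n inter-decay waiting times; exponential <=> memoryless."""
        return [random.expovariate(k) for _ in range(n)]

    # Memoryless property: P(T > s + t | T > s) = P(T > t) = exp(-k*t),
    # so having already waited s units tells you nothing about the rest.
    if __name__ == "__main__":
        print(waiting_times(5))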

>My opinion is that all phenomena obey rules even if the rules 
>aren't currently known. OK, it is just my opinion and I don't have proof.

Whether a Turing Machine program halts or not is random. There is no
way to prove that it will halt or will not halt by formal methods -
rules as you call them.

>Under this assumption, all the security protocols which required
>randomness (e.g. OTP and more generally any key generation) are
>not secure.

The OTP is provably secure because any possible message can be
encrypted to a given ciphertext - and there is no way for the
cryptanalyst to know which message is the one you intended. That's
because the OTP key is perfectly random.
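A minimal sketch of that argument: for any candidate plaintext of the
right length there is a key that 'explains' the observed ciphertext
equally well (the two plaintexts below are arbitrary examples, and
os.urandom merely stands in for a truly random pad).

    import os

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    msg = b"ATTACK AT DAWN"
    key = os.urandom(len(msg))          # stand-in for a truly random pad
    ct  = xor(msg, key)

    # Any other plaintext of the same length is just as consistent with ct:
    candidate = b"RETREAT AT TEN"
    fake_key = xor(ct, candidate)
    assert xor(ct, fake_key) == candidate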

Bob Knauer

"Liberty lies in the hearts of men and women.  When it dies there,
no constitution, no law, no court can save it."
--Justice Learned Hand


------------------------------

From: Mok-Kong Shen <[EMAIL PROTECTED]>
Subject: Re: On the Generation of Pseudo-OTP
Date: Tue, 12 Jan 1999 20:18:58 +0100

R. Knauer wrote:
> 
> On Tue, 12 Jan 1999 11:52:16 +0100, Mok-Kong Shen
> <[EMAIL PROTECTED]> wrote:
> 
> >A true OTP is thought to attain the same undecidability.
> 
> "True OTP" is redundant - an OTP is either "true" or it is not an OTP.
> 
> And yes, the whole idea behind the OTP cryptosystem is that every
> possible plaintext, including those that are intelligible, has the
> same probability of occurrence in the cipher - and therefore the
> cryptanalyst has no reason to choose one over the other.
> 
> That is the reason the OTP is provably secure.

But what do you say to the fact that in the case I described
the analyst can't say which of the three possible messages
is the true one?


> 
> >Even if there is a truly random physical process (I 'assume' that a
> >certain definition of true randomness can be verified, though I
> >doubt that),
> 
> I already discussed this a couple times earlier on this very thread.
> 
> In general radioactive decay processes follow a first order rate
> equation precisely - that can be verified experimentally to any level
> of precision you are willing to pay to get. And since scientists use
> OPM, they have done it to levels of precision way beyond anything we
> need to be concerned about in crypto. Give those guys a few megabucks
> and they will give you ten significant figures if you want.
> 
> If the decay process follows a first order rate equation, then the
> decay process is completely, totally, 100% bona fide, no doubt about
> it, unpredictably random with regard to when a particular decay will
> occur. The reason is that a first order rate equation is of the form:
> 
> dN/dt = - k N, where k is a constant.
> 
> The fact that k is a constant says that the probability that a decay
> occurs in the interval t -> t + dt is independent of t, that is, it
> can happen at any time, and therefore is completely unpredictable.
> 
> That is about as random as it gets, folks.

Fine that you say now the word 'about'.

> 
> >the actual registration of the process, i.e. the
> >obtaining of data, needs measuring apparatus. Since no man-made
> >apparatus is perfect, it follows that the sequence obtained is
> >subject to errors, hence not perfect. Hence one can at best accept
> >the sequence as truly random with a certain statistical confidence
> >level.
> 
> If your TRNG is designed properly, you can achieve the level of
> randomness inherent in the radioactive decay process to practical
> levels. What is the significance of an error that is so small that it
> makes no practical difference to the security of the OTP?

But so can, I believe, a properly designed 'intended approximation
to an ideal OTP' of mine! Both CAN be good to 'practical levels'.

M. K. Shen

------------------------------

From: Bartosz /Lakomiec <[EMAIL PROTECTED]>
Subject: General (but detailed enough) description of RC4...
Date: Mon, 18 Jan 1999 16:01:33 +0100

 
 Hi, everybody,

 I'm not interested in RC4 pseudo/source code and inner workings
 but I would like to learn its 'external' specifications that
 are needed from the cryptography implementors' point of view:

 - input data formatting required (if any)
 - key size
 - key generation method
 - encryption/decryption synchronization method
 - ...

 During the last several months I have browsed tens of cryptography
 sites, but there is no word about this algorithm anywhere.
 I know that RSA published a technical report on RC4 (TR-401, 1992)
 but I can't find it on-line.
 If you could point out any Web resource on RC4, I would be incredibly
 grateful.

 I'm working on a Point-to-Point Protocol encryption extension 
 and (apart from block ciphers) I would like to consider RC4 
 as an example of a stream cipher. As far as I know, it is the only 
 stream cipher found in several serious networking products.

                            
=========================================================================
Bartek Lakomiec
[EMAIL PROTECTED]
=========================================================================


------------------------------

From: [EMAIL PROTECTED] (R. Knauer)
Subject: Re: Metaphysics Of Randomness
Date: Mon, 18 Jan 1999 13:49:35 GMT
Reply-To: [EMAIL PROTECTED]

On Sat, 16 Jan 1999 11:23:48 -0500, "Kazak, Boris" <[EMAIL PROTECTED]>
wrote:


>There is also another detail which is often overlooked in such 
>discussions. The systems which produce apparent randomness fall
>in one of two broad categories - with memory and without memory.
>All PRNG's are systems with memory, unlike dice, roulette
>and radioactive decay. The concept of correlation length gives an
>idea of how far the influence of a particular event extends.
>   So in my understanding, any system with equiprobable outcomes
>and without memory will produce "true" randomness, in the sense 
>that any particular outcome will not depend on the past and will
>not influence the future. This seems to be a very practical 
>approach, for cryptography and other applications.

A Turing Machine program has memory associated with it - the tape. Yet
whether any particular program halts or does not halt is totally
random. See Chaitin, op. cit.

Bob Knauer

"Liberty lies in the hearts of men and women.  When it dies there,
no constitution, no law, no court can save it."
--Justice Learned Hand


------------------------------

From: Terje Mathisen <[EMAIL PROTECTED]>
Subject: Re: long numbers math - how to ?
Date: Mon, 18 Jan 1999 15:21:49 +0100

Rx Video wrote:
> 
> Hello,
> 
> I'm looking for some information on how to implement long-number
> calculations. I have some ideas of my own, but I'm sure there are some
> better solutions. If anybody knows of some articles on the subject, or
> has tried to do it, please, let me know. I'm looking for conceptual
> (theoretical) solutions.

You must store each number as an array of smaller units, which are
individually small enough to be handled directly by the CPU you are
going to use.

After this it becomes mostly a problem of defining the different basic
math operations on these (arbitrarily) long arrays:

Working in binary, addition and subtraction are easy if you have
add-with-carry/sub-with-borrow instructions; otherwise you must either
work with smaller data units, or detect carry/borrow in some other way.
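A minimal sketch of the limb-array representation and ripple-carry
addition (32-bit limbs assumed, stored least significant limb first):

    BASE_BITS = 32
    MASK = (1 << BASE_BITS) - 1

    def add_long(a, b):
        """Add two numbers stored as lists of 32-bit limbs, LS limb first."""
        result, carry = [], 0
        for i in range(max(len(a), len(b))):
            s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
            result.append(s & MASK)        # keep the low 32 bits of the limb sum
            carry = s >> BASE_BITS         # 0 or 1, propagated to the next limb
        if carry:
            result.append(carry)
        return result

Subtraction is the same loop with a borrow instead of a carry.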

Multiplication is a _lot_ more interesting, and this is where people
have spent most of their time optimizing code.

For smaller numbers you can just do N*M multiply-adds to generate the
N+M long product, but when the numbers become very large, you should
look into things like FFT-based multiplication.
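A sketch of the schoolbook O(N*M) method on the same limb
representation; FFT-based methods only pay off for very large operands:

    BASE_BITS, MASK = 32, (1 << 32) - 1    # same limb size as in the addition sketch

    def mul_long(a, b):
        """Schoolbook multiply of two limb arrays (LS limb first)."""
        result = [0] * (len(a) + len(b))
        for i, ai in enumerate(a):
            carry = 0
            for j, bj in enumerate(b):
                t = result[i + j] + ai * bj + carry
                result[i + j] = t & MASK   # low limb of the partial product
                carry = t >> BASE_BITS     # high part carried into the next column
            result[i + len(b)] += carry    # final carry of this row
        return result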

Division is something I usually handle either by successive
approximation and back-multiplication/subtraction, or by first
generating a reciprocal value and then multiplying by it.
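As an easy-to-verify baseline (much simpler, and much slower, than the
reciprocal approach described above), plain restoring binary long
division also works; Python integers stand in for limb arrays here:

    def divmod_long(n, d):
        """Restoring binary long division on nonnegative integers."""
        if d <= 0:
            raise ValueError("divisor must be positive")
        q, r = 0, 0
        for i in reversed(range(n.bit_length())):
            r = (r << 1) | ((n >> i) & 1)   # bring down the next dividend bit
            q <<= 1
            if r >= d:                      # trial subtraction succeeds
                r -= d
                q |= 1
        return q, r

When dividing many times by the same d, computing a reciprocal once by
Newton iteration and then multiplying amortizes much better.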

Square root is probably quickest if you start with an inverse square
root (1/sqrt(x)) and then do a multiplication by (x) to get the final
result.
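As an alternative sketch (division-based, unlike the division-free
inverse-square-root route described above), the classic integer Newton
iteration computes the square root directly:

    def isqrt_long(n):
        """Integer square root by Newton's method: returns floor(sqrt(n))."""
        if n < 0:
            raise ValueError("n must be nonnegative")
        if n < 2:
            return n
        x = 1 << ((n.bit_length() + 1) // 2)    # initial guess >= floor(sqrt(n))
        while True:
            y = (x + n // x) // 2
            if y >= x:                          # converged: x is floor(sqrt(n))
                return x
            x = y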

This is so much fun that I've implemented 4 or 5 different long/arbitrary
precision packages over the years. :-)

Terje

PS. For crypto apps you need some other basic operations as well, like
modular exponentiation, but I've never tried to implement this directly.
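For reference, the standard square-and-multiply method is a short
sketch on top of multiplication and division (written with plain
Python integers for brevity; Python's built-in pow(b, e, m) does the
same thing):

    def modexp(base, exp, mod):
        """Right-to-left binary (square-and-multiply) modular exponentiation."""
        result, base = 1, base % mod
        while exp > 0:
            if exp & 1:                 # current exponent bit is set
                result = (result * base) % mod
            base = (base * base) % mod  # square for the next bit
            exp >>= 1
        return result

    assert modexp(7, 560, 561) == pow(7, 560, 561)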

-- 
- <[EMAIL PROTECTED]>
Using self-discipline, see http://www.eiffel.com/discipline
"almost all programming can be viewed as an exercise in caching"

------------------------------

From: Mok-Kong Shen <[EMAIL PROTECTED]>
Subject: Some Thoughts on the Design of a Proposed Stream Cipher
Date: Mon, 18 Jan 1999 17:04:25 +0100

PRELUDE. This is a major rework of an earlier article by the author
entitled 'On the Generation of Pseudo-OTP'. Its title has been changed
because of possible misleading interpretations/connotations of the
term 'pseudo-OTP'.

I.  INTRODUCTION.

While the general tendency of modern developments in cryptography has
been towards block ciphers of increasingly larger key length, with
e.g. AES demanding a minimum of 128 bits, a recent agreement of the
Wassenaar countries to restrict export of crypto products to those
having key lengths not exceeding 56 bits will invariably lead to a
more or less long-lasting and very undesirable situation in which
innocent and law-abiding people outside of the 33 signatory nations
will have only relatively weak ciphers at their disposal, insufficient
to offer an adequate level of security for their private confidential
communications (either among themselves or with people of the 33
countries). For illegally imported strong crypto products could have
problems of general availability, cost, maintenance, integrity
(viruses/bugs implanted by malicious agencies) and, last but not
least, legality. On the other hand, these countries are on average
currently fairly poor in scientific and technical know-how, so that it
would be extremely difficult to build up there, in the short term, an
adequate crypto infrastructure comparable to that of the technically
superior export-regulating nations.

One viable way for people in the said technically underdeveloped
region to escape from the dilemma appears to be to resort to
superencipherment, i.e. an encryption system consisting of a
combination of an easily implementable algorithm with a sufficiently
good 56-bit algorithm (which can be legally obtained from the 33
countries), in the hope that the resulting strength will exceed the
computational capability of the mighty eavesdroppers.

It is the purpose of this paper to present some preliminary ideas on
the realization of such an easily implementable complementary
algorithm in the form of a stream cipher whose keystream can be easily
generated by both partners of a given communication connection without
involving any problems of transport and secure storage, thus
possessing the advantages of immunity to (whatever) export controls
and of simplicity of handling by (possibly unsophisticated) users of
the encryption scheme. (This is to be contrasted, in particular, with
keystreams obtained from physical random processes, which normally
incur severe management problems of transport and secure storage.)

II.  RAW MATERIALS FOR OBTAINING KEYSTREAMS.

For good cryptological security of a stream cipher, the keystream
generated should be hard to infer, i.e. it should be difficult for the
analyst to predict any bit of the keystream from the other bits. As is
well known, this crucial problem is unfortunately extremely difficult
to deal with. We regret (as the reader may have foreseen from our
introduction) not being able to offer any satisfactory solution to it
in the present paper. Instead we suggest appealing to the admittedly
crude (imprecise) heuristic argument that by rendering each bit of the
keystream a sufficiently complex (complicated) function of a
sufficiently large set of independent and sufficiently random values
(variables), the dependency of any bit on the other bits of the entire
sequence can be made arbitrarily small, so as to be beyond the
inference capability of the analyst.

Assuming that the above heuristic argument can be accepted and noting
that the dependency can be expressed in terms of auto-correlations, we
suggest using the following coarse plan to generate keystreams for
our purpose: Start with some more or less random raw materials
(sequences of bits/bytes) that are easily available albeit inherently
poor (i.e. comparatively strongly auto-correlated) and apply a
sequence of suitably chosen refining processes (to be treated in the
next section) aimed at reducing the correlations so that the end-stage
output will be of satisfactory quality (progressive refinement).

Conceptually this is akin to obtaining potable water from a muddy
source by employing a series of different filtering and disinfecting
processes, each augmenting the functionality of the others, such that
the end product will hopefully be sufficiently free of ingredients
that are detrimental to human health.

That said, we shall presently look for ensembles of fluctuating
quantities that can be easily and perfectly legally acquired by both
communication partners in identical copies, independent of
geographical location, and that are apparently in some sense random,
i.e. roughly speaking non-systematic, disordered and without a
discernible periodicity. The following materials suggest themselves:

1. Natural language texts (books, news, public documents, etc.)

2. Artificial language texts (source programs, digit sequences of
   transcendental numbers, PostScript files, etc.)

3. Sequences of data (physical measurements, economic time series,
   telephone numbers, results of numerical computations (with
   identical input, algorithms and computational accuracy), scientific
   data bases, etc.)

4. Other digitized materials (voice, sound, photos, pictures,
   graphics, satellite and medical images, binary program files, etc.)

With the possible exception of the digit sequences of transcendental
numbers, these items are mostly fairly strongly auto-correlated. In
fact, some well-known methods of analysis are based on the frequency
distributions of bigrams, trigrams, etc. of the natural languages in
which the plaintext messages are written. In the next section we
shall therefore deal with a number of techniques aimed at efficiently
reducing the auto-correlations, hopefully leading to keystreams that
are of satisfactory quality for our purpose. (The envisaged, though
certainly not fully attainable, goal is of course white noise, whose
auto-correlation function is defined to have the value zero everywhere
except at a single point.)

III.  TECHNIQUES FOR REDUCING AUTO-CORRELATIONS.

For simplicity of presentation we shall confine ourselves in this
section to natural language texts (sequences of characters in 8-bit
bytes) as raw materials for obtaining keystreams, since these are
very easily available and since the other sources can be treated in
essentially the same manner. The following techniques, being largely
taken from classical elementary cryptography and readily implementable
by persons of modest programming proficiency, appear to be promising
candidates for reducing the auto-correlations that are present in the
texts:

1. Apply substitutions, e.g. polyalphabetic substitutions of bytes.

2. Apply permutations, i.e. transpositions, including reversal.

3. Apply homophones, utilizing the fact that the alphabet of the
   texts is smaller than the full 256 character set.

4. Combine two text streams through:

   a. Bitwise XOR.

   b. Addition mod 2^32 (assuming 32 bit word length).

   c. Substitution, e.g. mapping a pair of bytes (or halfbytes), one
      from each stream, to a pair of bytes (or halfbytes).

5. Compress through:

   a. Commonly available data compression algorithms.

   b. XORing groups of bits, e.g. 32 bits, i.e. taking their parity.

6. Use every second or third character of the texts.

7. Apply other simple encryption techniques, e.g. auto-keying
   applied to a group of words and circular shift of bits in a word.

Note that with (4) an arbitrary number of text streams can be combined
and that all techniques can be applied recursively; for example, (4b)
can be applied after (5b). Thus the proposed scheme for the generation
of keystreams can apparently indeed be made arbitrarily complex,
satisfying the requirement mentioned in our heuristic argument of
Section II. Of course, at this point one naturally recalls the issue
of processing efficiency, which has until now been among the essential
selection criteria for modern encryption algorithms. However, as the
author recently pointed out [1,2], one practical and simple way of
coping with the 56-bit limitation is to make use of a new paradigm,
namely 'security through inefficiency', i.e. paying the price of a
tolerably longer processing time. This can certainly be justified in a
substantial part of civilian applications, e.g. encrypting e-mails,
and can evidently be applicable even to certain relatively
time-critical encryptions as well, if multiple hardware is available
for processing parallel jobs.
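For concreteness, a minimal sketch of two of these stages, (4a)
followed by (5b), with made-up placeholder strings standing in for
texts read from the chosen offsets; the other techniques of Section
III would be added as further stages of the same pipeline:

    def xor_streams(t1: bytes, t2: bytes) -> bytes:
        """Technique (4a): combine two text streams bytewise with XOR."""
        return bytes(a ^ b for a, b in zip(t1, t2))

    def parity_compress(data: bytes, group_bits: int = 32) -> list:
        """Technique (5b): reduce each group of 32 bits to its parity bit."""
        bits = []
        for byte in data:
            bits.extend((byte >> i) & 1 for i in range(7, -1, -1))
        return [sum(bits[i:i + group_bits]) % 2
                for i in range(0, len(bits) - group_bits + 1, group_bits)]

    # Placeholder excerpts standing in for two library texts.
    text_a = b"It was the best of times, it was the worst of times, ..."
    text_b = b"Call me Ishmael. Some years ago - never mind how long -"
    keystream_bits = parity_compress(xor_streams(text_a, text_b))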

IV.  MISCELLANEOUS REMARKS.

The 'key', i.e. the secret information to be kept by both
communication partners, consists of the names or ISBNs of the texts
being used and the arbitrarily chosen initial offsets from the
beginning of the texts. The keystream is preferably to be generated
online rather than generated once in great length and used for a
number of subsequent sessions, since otherwise it would have to be
securely stored, resulting in substantial management problems. This
means that the user keeps a running record of the offsets into the
individual texts, i.e. the character positions of the texts after the
previous message has been processed. (As a variation one could
introduce some amount of skipping after each message processed.) No
other information needs to be securely stored. (The texts can in fact
be freely displayed and read by the user for entertainment when he is
not doing encryption.)

Since one's library may contain quite a large number of texts
(possibly also in several different languages, which is advantageous
from the point of view of combination), each having the potential of
being combined with the others to generate keystreams for encryption,
the mere knowledge of the content of the user's library is virtually
useless to the analyst. Simple combinatorial considerations indicate
that an almost unlimited number of keystreams (or equivalently a
non-repeating (non-periodic) keystream of practically infinite length)
can be generated from a library of non-trivial size. This is to be
contrasted with keystreams obtained from one or a few pseudo-random
number generators (LFSRs etc.; see [3]) that normally have fairly
limited period lengths. (For a compound PRNG designed to accept an
arbitrarily long seed and produce pseudo-random number sequences of
arbitrarily long period lengths, see [4].) In other words, on the
assumption that the user commits no procedural errors, the keystream
obtained is of 'one-time' nature.

Note: The 'one-time' qualification above should not be misunderstood
by the reader to induce conceptual associations with the one-time pad
(OTP). Our proposed scheme is simply a stream cipher. The OTP, on the
other hand, is an ideal and purely theoretical concept that is never
fully realizable in practice, though it is commonly believed that the
OTP can be approximated to a satisfactory degree through adequately
post-processing bit sequences obtained from some physical random
processes. (Our comparatively much less ambitious goal, though, is to
be able to approximate white noise well.)

Maurer's universal statistical test [5], being fairly easily
implementable, appears to be particularly well suited for testing the
quality of the keystreams obtained with a given design configuration.
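To indicate the implementation effort involved, a compact sketch of
the test statistic as the author understands it (block length L = 8
assumed; the pass/fail comparison against the tabulated expectation
and variance given in [5] is omitted):

    import math

    def maurer_statistic(bits, L=8, Q=None):
        """Sketch of Maurer's universal test statistic f_TU for a 0/1 sequence.
        L: block length in bits; Q: initialisation blocks (default 10 * 2**L)."""
        if Q is None:
            Q = 10 * (1 << L)
        K = len(bits) // L - Q              # number of test blocks
        if K <= 0:
            raise ValueError("sequence too short for the chosen L and Q")

        def block_value(i):                 # i is a 1-based block index
            v = 0
            for bit in bits[(i - 1) * L : i * L]:
                v = (v << 1) | bit
            return v

        last_seen = [0] * (1 << L)          # last block index of each pattern
        for i in range(1, Q + 1):           # initialisation segment
            last_seen[block_value(i)] = i

        total = 0.0
        for i in range(Q + 1, Q + K + 1):   # test segment
            v = block_value(i)
            total += math.log2(i - last_seen[v])
            last_seen[v] = i
        return total / K                    # compare with E[f_TU] (about 7.18 for L = 8)

With L = 8 and the usual segment sizes the test wants on the order of
a couple of megabits of keystream, which the proposed scheme can
supply without difficulty.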

More than one offset can be applied to the same text. This may be
useful in the special case where only a single text is available. (As
a matter of interest, we mention that the same technique can be
applied to a single record of a hardware-generated keystream: by
combining (e.g. with XOR) the subsequences thus obtained, a number of
different keystreams can be derived from the same record.)

It is not necessary to maintain a large library at the user's place.
Certain texts may be downloaded from publicly accessible servers, if
care is taken not to thereby leak to the analyst a sufficient amount
of exploitable information as to which texts actually participate in
the keystream generation process.

Pi appears to have an advantage over the other transcendental numbers
in the present context in that one can compute its digits starting
from arbitrarily given offsets, thus dispensing with the need to store
its digit sequence (albeit at the price of higher computing cost).
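The property referred to is presumably that of the Bailey-Borwein-
Plouffe (BBP) formula, which yields hexadecimal (rather than decimal)
digits of Pi at an arbitrary position; a rough floating-point sketch,
reliable only for moderate offsets:

    def pi_hex_digit(n):
        """Hex digit of Pi at position n+1 after the point (BBP formula, sketch)."""
        def series(j):
            # fractional part of sum_k 16**(n-k) / (8k + j)
            s = 0.0
            for k in range(n + 1):                   # terms with nonnegative exponent
                s = (s + pow(16, n - k, 8 * k + j) / (8 * k + j)) % 1.0
            k = n + 1
            while True:                              # rapidly vanishing tail
                term = 16.0 ** (n - k) / (8 * k + j)
                if term < 1e-17:
                    break
                s += term
                k += 1
            return s % 1.0
        x = (4 * series(1) - 2 * series(4) - series(5) - series(6)) % 1.0
        return "0123456789abcdef"[int(x * 16)]

    # Pi = 3.243f6a8885a308d... in hexadecimal
    assert "".join(pi_hex_digit(i) for i in range(6)) == "243f6a"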

Text file compression software is easily obtainable and not subject to
export regulations. Implementing it oneself may however incur some
non-trivial programming expenditure. Note that such compressions are
reversible (through the corresponding decompression processes). For
multi-media applications, a large number of methods, some involving
sophisticated mathematical transformations, are available that are
irreversible but ensure reproduction of acceptable visual quality (see
[6,7]). It should be noted, however, that in our case we need care
about neither reversibility nor satisfactory reproduction. (Likewise
we are not concerned at all with error detection/correction codes.)
Hence the extremely simple compression technique of (5b) of Section
III appears to be particularly valuable for our purpose. (It may be
mentioned that (5b) is utilized in the author's humble design WEAK4-EX
[8], a stream cipher based on pseudo-random number generation with a
56-bit key as seed.)

We take this opportunity to raise a question that could be of some
indirect relevance to our topic: Suppose there are three character
strings as follows:

1. Purchase 500 million Yens.
2. Sell all Microsoft shares.
3. Cut UK offer by 7 percent.

Suppose string 1 is the proper message to be communicated. Using (4a)
of Section III we obtain a keystream from strings 2 and 3 and XOR it
with string 1 to produce the ciphertext, i.e. the result of XORing
all three strings together. Suppose now that the analyst has a method
to systematically obtain all meaningful possible plaintexts
corresponding to the given ciphertext. Then all three character
strings above are certainly among the set of candidate plaintexts
found. How can he know which is the true message? This ambiguity seems
to imply that it may not be unconditionally necessary to strive for an
excellent approximation to white noise, but that a comparatively
inferior quality could be sufficient. The author does not know the
answer to this seemingly paradoxical issue and would very much
appreciate hints and explanations from the experts.
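For concreteness, the construction in code (the three strings happen
to have equal length, so no padding is needed):

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    s1 = b"Purchase 500 million Yens."
    s2 = b"Sell all Microsoft shares."
    s3 = b"Cut UK offer by 7 percent."

    keystream  = xor(s2, s3)            # technique (4a) applied to strings 2 and 3
    ciphertext = xor(s1, keystream)     # equals s1 XOR s2 XOR s3

    # By the symmetry of XOR, each string is recoverable from the other two,
    # so all three are equally plausible plaintexts for this ciphertext.
    assert xor(ciphertext, xor(s2, s3)) == s1
    assert xor(ciphertext, xor(s1, s3)) == s2
    assert xor(ciphertext, xor(s1, s2)) == s3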

V.  CONCLUSION.

As our title clearly indicates, this paper does not aim to present any
finalized design but simply sketches some apparently promising
techniques for realizing one. Based on our more or less fuzzy
heuristic argument (as given in Section II) alone, it is evidently
entirely futile to predict which of the techniques (listed in Section
III) are efficient and which (possibly recursive) combination of them
will actually lead to an optimal and practically useful stream cipher.
Only actual experiments will be able to tell. However, it is the
author's intuitive conviction (partly based on some test results of
his WEAK series of algorithms [1,8,9] that utilize part of the
techniques listed in this paper) that the direction we have been
pursuing is a rewarding one, worthy of serious further investigation.

The author wishes to thank CWL for reading a draft of the present
paper.

Comments, critiques and suggestions for improvement are sincerely
solicited.

M. K. Shen


LITERATURE.

[1] http://www.stud.uni-muenchen.de/~mok-kong.shen/#paper13

[2] http://www.stud.uni-muenchen.de/~mok-kong.shen/#paper17

[3] A. J. Menezes et al., Handbook of Applied Cryptography. CRC Press,
    1997.

[4] http://www.stud.uni-muenchen.de/~mok-kong.shen/#paper9 

[5] U. M. Maurer, A Universal Statistical Test for Random Bit
    Generators. J. Cryptology (1992) 5:89-105.

[6] W. Kou, Digital Image Compression. Kluwer, 1995.

[7] V. Bhaskaran, K. Konstantinides, Image and Video Compression
    Standards. Kluwer, 1995.

[8] http://www.stud.uni-muenchen.de/~mok-kong.shen/#paper16

[9] http://www.stud.uni-muenchen.de/~mok-kong.shen/#paper15

------------------------------


** FOR YOUR REFERENCE **

The service address, to which questions about the list itself and requests
to be added to or deleted from it should be directed, is:

    Internet: [EMAIL PROTECTED]

You can send mail to the entire list (and sci.crypt) via:

    Internet: [EMAIL PROTECTED]

End of Cryptography-Digest Digest
******************************
