Cryptography-Digest Digest #193, Volume #12      Mon, 10 Jul 00 14:13:01 EDT

Contents:
  Re: Comment/Analysis requested Password to RawBinarykey method... (Mark Wooding)
  Re: Quantum Computing (Was: Newbie question about factoring) ("Tony T. Warnock")
  Re: Use of EPR "paradox" in cryptography (Roger Schlafly)
  Re: Quantum Computing (Was: Newbie question about factoring) (Bob Silverman)
  Re: Another AONT (less expensive, this time) (John Myre)
  Re: Another AONT (less expensive, this time) (Mark Wooding)
  Re: Quantum Computing (Was: Newbie question about factoring) (Roger Schlafly)
  Re: Another AONT (less expensive, this time) (Roger Schlafly)
  Steganographic encryption system (phil hunt)
  Re: key dependent s-boxes (Mack)
  Re: Proposal of some processor instructions for cryptographical applications ("Scott Fluhrer")
  Re: Use of EPR "paradox" in cryptography (Bill Unruh)
  Re: cray and time needed to attack (Jerry Coffin)
  Re: Comment/Analysis requested Password to RawBinarykey method... (David A. Wagner)
  Re: Proposal of some processor instructions for cryptographical/other applications (Ian Kemmish)

----------------------------------------------------------------------------

From: [EMAIL PROTECTED] (Mark Wooding)
Subject: Re: Comment/Analysis requested Password to RawBinarykey method...
Date: 10 Jul 2000 14:10:04 GMT

Jay Summet <[EMAIL PROTECTED]> wrote:

> This is somewhat straightforward. However, I am generating a key based
> upon a user supplied pass phrase. I want a different (binary/raw key)
> for each algorithm (2 different keys from the same passphrase).
> 
> So, I built my own String -> raw binary array of bits/bytes method,
> using hash functions (MD5 and SHA, one for each cipher). A link to
> direct source code is provided at end of post, here is the overview.

[complicated iterative thing]

> We repeat this until the bytes of the key are filled up (24 for
> DESede, 56 for blowfish). (one version uses MD5, one version uses
> SHA1)

First of all, my traditional warning: MD5 isn't a good thing to use in
new applications.  Good hash algorithms to choose from are SHA1,
RIPEMD-160 and Tiger.  It may well be that in the complete-overkill
method you're using, the weaknesses in MD5 aren't a concern; it still
gives me the willies.

If I were you, I'd just hash the passphrase twice, either with different
good hashes or the same good hash but different prefixes.
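
Concretely, something like this (a sketch using Python's hashlib; the
prefix strings here are arbitrary examples, not a standard):

    import hashlib

    passphrase = b"my pass phrase"     # example only

    # Two independent keys from one passphrase: one good hash,
    # two different prefixes.
    k1 = hashlib.sha1(b"key1:" + passphrase).digest()   # 20 bytes
    k2 = hashlib.sha1(b"key2:" + passphrase).digest()   # 20 bytes

Blowfish will take the 20 bytes as they stand; the DESede case is
discussed below.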

The `choose a pseudorandom byte from the hash' procedure strikes me as
very strange.  If the hash is good, then this won't produce better
results than just choosing the first byte; and if it's not a good
hash, then you shouldn't be using it at all.  Similar comments apply
to the iteration scheme.

The only valid complaint I think you could make about my suggestion is
that the hash outputs don't fill the ciphers' keyspaces.  Note that
Blowfish is designed to accept variable-length keys, so that's not a
problem; you're missing 8 bits of DES key, though.  Using something like
RIPEMD160-MGF1 or SHA1-MGF1 to generate key blocks from the passphrase
will fix this much more simply.  If you really want to push the boat
out, you could use the function from SSL (or, even more excessively,
from TLS) for generating cipher keys from the shared random data block.
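
For reference, MGF1 is just the hash run in counter mode; a sketch
with SHA1, following the PKCS#1 construction (the `des3:' prefix is
again an arbitrary example):

    import hashlib, struct

    def mgf1_sha1(seed, length):
        # Concatenate SHA1(seed || counter) blocks until we have
        # enough output; counter is a 4-byte big-endian integer.
        out = b""
        counter = 0
        while len(out) < length:
            out += hashlib.sha1(seed + struct.pack(">I", counter)).digest()
            counter += 1
        return out[:length]

    key = mgf1_sha1(b"des3:" + b"my pass phrase", 24)

This fills all 24 key bytes for DESede.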

-- [mdw]

------------------------------

From: "Tony T. Warnock" <[EMAIL PROTECTED]>
Crossposted-To: comp.theory
Subject: Re: Quantum Computing (Was: Newbie question about factoring)
Date: Mon, 10 Jul 2000 08:54:46 -0600
Reply-To: [EMAIL PROTECTED]

Jeff Erickson wrote:

> ca314159 <[EMAIL PROTECTED]> writes:
> | D. Mermin poses a useful problem for only one qubit:
> | to determine the millionth bit of the binary expansion
> | of sqrt(2+x). That would be big news.
>
> Hardly.  Given the recent computation of the trillionth(!) bit of pi
> using classical computers, why should anyone think computing bits of
> sqrt(2+x) is hard?

I'm going to guess that it's much harder to compute the bits of an
algebraic number than the bits of Pi or some other transcendentals.


------------------------------

From: Roger Schlafly <[EMAIL PROTECTED]>
Crossposted-To: sci.physics
Subject: Re: Use of EPR "paradox" in cryptography
Date: Mon, 10 Jul 2000 08:26:39 -0700

Benjamin Goldberg wrote:
> > And your attacker can just intercept one of the EPR pairs,
> > read it and send on to you the output.
> 
> True about interception, but it's easy to detect if this has
> occurred.  If someone intercepts and re-sends data, then
> there is a 25% chance of polarization being changed (per bit
> intercepted).

Which means that someone can grab a few bits and have a
decent chance of not being caught. In some cryptographic
applications, losing just a few bits is catastrophic.
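
To put a number on it, here is a toy classical simulation of that
25%-per-bit figure (a sketch: Eve measures each photon in a random
basis and resends it; Bob measures in Alice's basis):

    import random

    trials, errors = 100_000, 0
    for _ in range(trials):
        bit = random.randrange(2)            # Alice's bit
        alice_basis = random.randrange(2)
        eve_basis = random.randrange(2)
        # A wrong-basis measurement randomizes the resent photon, so
        # Bob (measuring in Alice's basis) then reads a random bit.
        bob = bit if eve_basis == alice_basis else random.randrange(2)
        errors += (bob != bit)
    print(errors / trials)                   # about 0.25

An eavesdropper who touches only n bits escapes detection with
probability about (3/4)^n -- decent odds for small n.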

------------------------------

From: Bob Silverman <[EMAIL PROTECTED]>
Crossposted-To: comp.theory
Subject: Re: Quantum Computing (Was: Newbie question about factoring)
Date: Mon, 10 Jul 2000 15:33:09 GMT

In article <[EMAIL PROTECTED]>,
  [EMAIL PROTECTED] wrote:
> Jeff Erickson wrote:
>
> > ca314159 <[EMAIL PROTECTED]> writes:
> > | D. Mermin poses a useful problem for only one qubit:
> > | to determine the millionth bit of the binary expansion
> > | of sqrt(2+x). That would be big news.
> >
> > Hardly.  Given the recent computation of the trillionth(!) bit of pi
> > using classical computers, why should anyone think computing bits of
> > sqrt(2+x) is hard?
>
> I'm going to guess that it's much harder to compute the bits of an
> algebraic number than the bits of Pi or some other transcendentals.


No.  It is because an algorithm due to Simon Plouffe (the
Bailey-Borwein-Plouffe formula) allows computation of the k'th bit of
Pi **without** having to compute bits 1..k-1 first.
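
Here is a rough sketch of the hex-digit extraction in Python -- good
only while floating-point precision holds out:

    def pi_hex_digit(n):
        # n-th hex digit of Pi after the point (n >= 1), computed
        # without computing digits 1..n-1.
        def frac_series(j):
            s = 0.0
            for k in range(n):               # modular exponentiation part
                d = 8 * k + j
                s = (s + pow(16, n - 1 - k, d) / d) % 1.0
            k = n                            # small tail terms
            while 16.0 ** (n - 1 - k) / (8 * k + j) > 1e-17:
                s += 16.0 ** (n - 1 - k) / (8 * k + j)
                k += 1
            return s
        x = (4 * frac_series(1) - 2 * frac_series(4)
             - frac_series(5) - frac_series(6)) % 1.0
        return "0123456789abcdef"[int(x * 16)]

    # Pi = 3.243f6a8885a3... in hex
    print("".join(pi_hex_digit(n) for n in range(1, 13)))
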
--
Bob Silverman
"You can lead a horse's ass to knowledge, but you can't make him think"



------------------------------

From: John Myre <[EMAIL PROTECTED]>
Subject: Re: Another AONT (less expensive, this time)
Date: Mon, 10 Jul 2000 09:36:16 -0600

Mark Wooding wrote:
> 
> Benjamin Goldberg <[EMAIL PROTECTED]> wrote:
> 
> > The following is both to transform, and to reverse the transform:
> > ( a, x ) = ( first-20-bytes-of-message, rest-of-message )
> >   b = a XOR SHA1( x )
> >   y = ARC4( b, x )
> >   c = b XOR SHA1( y )
> > transformed-message = c || y
> 
> This is OAEP with a pointless extra round.  OAEP already has proven
> all-or-nothing properties.

Well - aren't there important differences?  First, OAEP requires adding
a random number, instead of using part of the message.  Wouldn't that
invalidate the proofs?  Second, would ARC4 qualify for use by OAEP?  The
proofs require an awfully good primitive.

> (Alternatively, you could look at it as unkeyed BEAR (or is it LION? --
> I can never remember which is which), but that's not helpful.)
> 
> -- [mdw]

(It's BEAR.)
Why do you say "that's not helpful"?  Do you mean, that point of view
would teach us nothing about the AON properties?

JM

------------------------------

From: [EMAIL PROTECTED] (Mark Wooding)
Subject: Re: Another AONT (less expensive, this time)
Date: 10 Jul 2000 16:29:49 GMT

John Myre <[EMAIL PROTECTED]> wrote:
> Mark Wooding wrote:
>
> > Benjamin Goldberg <[EMAIL PROTECTED]> wrote:
> >
> > >   b = a XOR SHA1( x )
> > >   y = ARC4( b, x )
> > >   c = b XOR SHA1( y )
> > > transformed-message = c || y
> > 
> > This is OAEP with a pointless extra round.  OAEP already has proven
> > all-or-nothing properties.
> 
> Well - aren't there important differences?  First, OAEP requires
> adding a random number, instead of using part of the message.
> Wouldn't that invalidate the proofs?  Second, would ARC4 qualify for
> use by OAEP?  The proofs require an awfully good primitive.

The proofs require an unattainably good primitive.  In practice, we have
to use hash functions and other similar things, and we have to use
analysis to determine components which might work adequately.  I suspect
that RC4 (appropriately modified to discard the first kilobyte of data)
is good enough according to our current knowledge.

I was too hasty in my article.  You're right: it's not OAEP, and needs
looking at in its own right.  In fact, it's nowhere near as strong as
OAEP.  The critical difference between Goldberg's construction and both
OAEP and Rivest's package transform is that both of the latter are
randomized transformations, whereas Goldberg's is deterministic.  And I
don't believe that a deterministic transformation can meet the
requirements for an all-or-nothing transform, as defined in the paper on
OAEP-as-AONT.  Indeed, we can see that no deterministic transform can
be as strong as a randomized one, even in a weaker scenario than the
non-adaptive case given: given two messages x and y and partial data of
their transforms, the adversary can compute the full transform of each
of x and y and compare against the partial transforms given.  Compare
this against the definition in which the adversary is permitted to
/choose/ the two messages to be transformed.
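
For concreteness, here is the quoted construction written out (a
sketch in Python; plain RC4, with the keystream drop I suggested
available but switched off to match the original):

    import hashlib

    def rc4(key, data, drop=0):
        # RC4: key schedule, then XOR data with the keystream,
        # optionally discarding the first `drop' keystream bytes.
        S = list(range(256))
        j = 0
        for i in range(256):
            j = (j + S[i] + key[i % len(key)]) % 256
            S[i], S[j] = S[j], S[i]
        out, i, j = bytearray(), 0, 0
        for n in range(drop + len(data)):
            i = (i + 1) % 256
            j = (j + S[i]) % 256
            S[i], S[j] = S[j], S[i]
            if n >= drop:
                out.append(data[n - drop] ^ S[(S[i] + S[j]) % 256])
        return bytes(out)

    def xor(p, q):
        return bytes(a ^ b for a, b in zip(p, q))

    def goldberg(msg):
        # One procedure both applies and reverses the transform.
        a, x = msg[:20], msg[20:]
        b = xor(a, hashlib.sha1(x).digest())
        y = rc4(b, x)
        c = xor(b, hashlib.sha1(y).digest())
        return c + y

    m = b"twenty header bytes." + b"and the rest of the message"
    t = goldberg(m)
    assert goldberg(t) == m     # self-inverse
    assert goldberg(m) == t     # deterministic: same input, same output

The second assertion is exactly the problem: with no randomness
anywhere, an adversary can transform candidate messages himself and
compare.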

> > (Alternatively, you could look at it as unkeyed BEAR (or is it LION?
> > -- I can never remember which is which), but that's not helpful.)
> 
> (It's BEAR.)
> Why do you say "that's not helpful"?  Do you mean, that point of view
> would teach us nothing about the AON properties?

The proofs for BEAR and LION show that key recovery for the construction
can be used to recover the keys for both underlying primitives.  Since
this construction is unkeyed, the proofs are not applicable.  Other work
has demonstrated that BEAR and LION can be weak in other ways given
appropriately `clobbered' primitive operations.

-- [mdw]

------------------------------

From: Roger Schlafly <[EMAIL PROTECTED]>
Crossposted-To: comp.theory
Subject: Re: Quantum Computing (Was: Newbie question about factoring)
Date: Mon, 10 Jul 2000 09:42:49 -0700

"Tony T. Warnock" wrote:
> > Hardly.  Given the recent computation of the trillionth(!) bit of pi
> > using classical computers, why should anyone think computing bits of
> > sqrt(2+x) is hard?
> 
> I'm going to guess that it's much harder to compute the bits of an
> algebraic number than the bits of Pi or some other transcendentals.

As Bob S explains, Pi just happens to be easy because of some
special formulas.

But I think the point is that the output should be some function
of the input. Ideally, if quantum computers were possible, one
could compute sqrt(2+x) by doing the bulk of the computation
without even knowing what x is. The result would be a quantum
superposition of states that could then be used to quickly
read off part of sqrt(2+x), given x.

------------------------------

From: Roger Schlafly <[EMAIL PROTECTED]>
Subject: Re: Another AONT (less expensive, this time)
Date: Mon, 10 Jul 2000 09:48:18 -0700

Mark Wooding wrote:
>  In fact, it's nowhere near as strong as
> OAEP.  The critical difference between Goldberg's construction and both
> OAEP and Rivest's package transform is that both of the latter are
> randomized transformations, whereas Goldberg's is deterministic.  And I
> don't believe that a deterministic transformation can meet the
> requirements for an all-or-nothing transform, as defined in the paper on
> OAEP-as-AONT.

Why not?

> Indeed, we can see that no deterministic transform can
> be as strong as a randomized one, even in a weaker scenario than the
> non-adaptive case given: given two messages x and y and partial data of
> their transforms, the adversary can compute the full transform of each
> of x and y and compare against the partial transforms given.  Compare
> this against the definition in which the adversary is permitted to
> /choose/ the two messages to be transformed.

A deterministic AONT would still be useful. The user could
always just add his own randomness, if he wanted, by appending
some random bytes to the end of the message.

------------------------------

From: [EMAIL PROTECTED] (phil hunt)
Crossposted-To: comp.os.linux.development.apps,uk.comp.os.linux
Subject: Steganographic encryption system
Date: Mon, 10 Jul 2000 18:07:37 +0100
Reply-To: [EMAIL PROTECTED]

I am thinking of developing a steganographic encryption system
(provisionally named 'stes') that will enable a particular cyphertext
to be decrypted into two or more different plaintexts, depending on
which key is used. To be more precise:

   $ stes --create ctf hello myletter.txt goodbye apicture.jpg

This creates cypher text file <ctf> which contains an encrypted form of
the data in the plaintext files <myletter.txt> and <apicture.jpg>. The
data in <myletter.txt> is accessible by the key "hello", the data in
<apicture.jpg> by the key "goodbye".

Note that if you have the file <ctf>, there is no way of knowing how
many different files are encrypted inside it; cyphertext files will
contain 'padding' so you can't infer the contents by length, either.

   $ stes --decode ctf goodbye xxx.jpg

This looks in the file <ctf> and finds out whether there is any data
in it under the key "goodbye". If there is, it is put into the file
<xxx.jpg> (in this example there is: the contents of <apicture.jpg>).
Note that stes doesn't know what a jpg file is, or anything else
about file types.

   $ stes --alter ctf hello anotherfile.doc 

This looks in the file <ctf> and finds if there is any data for the
key "hello". In this example, there is, so the old data stored under
the key "hello" is lost, and replaced with the new data from 
<anotherfile.doc>. Note that if <ctf> didn't already have the key "hello",
it could not be added; keys can only be put into a cyphertext file when
it is created.
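
Here is a rough sketch of the sort of container layout that would give
these semantics (Python pseudocode standing in for the eventual C++;
the 8-slots-of-64K geometry is arbitrary, and the SHA1 counter-mode
keystream is only a placeholder for a real cipher):

    import hashlib, os, struct

    SLOTS, SLOT_SIZE = 8, 65536            # arbitrary fixed geometry
    MAGIC = b"stes"

    def keystream(key, nonce, length):
        # Placeholder stream cipher: SHA1 in counter mode.
        out, ctr = b"", 0
        while len(out) < length:
            out += hashlib.sha1(key + nonce + struct.pack(">I", ctr)).digest()
            ctr += 1
        return out[:length]

    def seal(key, data):
        # One slot: random nonce || encrypted (MAGIC, length, data, pad).
        nonce = os.urandom(16)
        body = MAGIC + struct.pack(">I", len(data)) + data
        body += os.urandom(SLOT_SIZE - 16 - len(body))
        ks = keystream(hashlib.sha1(key).digest(), nonce, len(body))
        return nonce + bytes(a ^ b for a, b in zip(body, ks))

    def unseal(key, slot):
        nonce, ct = slot[:16], slot[16:]
        ks = keystream(hashlib.sha1(key).digest(), nonce, len(ct))
        body = bytes(a ^ b for a, b in zip(ct, ks))
        if body[:4] != MAGIC:
            return None                    # wrong key for this slot
        (n,) = struct.unpack(">I", body[4:8])
        return body[8:8 + n]

    def create(entries):
        # entries: list of (key, data); each data must fit in a slot.
        slots = [os.urandom(SLOT_SIZE) for _ in range(SLOTS)]
        for i, (key, data) in enumerate(entries):
            slots[i] = seal(key, data)
        return b"".join(slots)

    def decode(ctf, key):
        for i in range(SLOTS):
            data = unseal(key, ctf[i * SLOT_SIZE:(i + 1) * SLOT_SIZE])
            if data is not None:
                return data
        return None

    ctf = create([(b"hello", b"my letter"), (b"goodbye", b"a picture")])
    assert decode(ctf, b"goodbye") == b"a picture"
    assert decode(ctf, b"wrong key") is None

Without a key, a sealed slot and a padding slot are indistinguishable,
which is what gives the "can't tell how many files are inside"
property.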

I have some questions:


(1) does anything like this exist already?

(I am aware of StegFS, which is a steganographic file system for Linux.
My system is different in that it encrypts files, not file systems, so
it can be used under any OS; also, in StegFS keys have "levels of
secretness": if you decrypt with one key, you automatically have access
to all the data encrypted with "less secret" keys -- in stes, all keys
are equally secret).


(2) What's a good private-key cypher algorithm to use?

A good algorithm will have these features:

* strong encryption -- no-one can crack it (what key size do I need for this?)
* not encumbered by patent or similar restrictions
* an open source implementation, preferably one that is well documented
* popular. The more eyeballs looking at it, the fewer security holes
  will be in it.
* efficient on processor time (this is not as important as the other 
  requirements, it's an "all other things being equal" thing)

Would Blowfish be a good algorithm to use? If so, which implementation?
(Bearing in mind that my program will be written in C++ and aims to be as 
portable as possible across OSes, so any code that relies on, for example,
endianness would be non-optimal).

-- 
***** Phil Hunt ***** send email to [EMAIL PROTECTED] *****
Moore's Law: hardware speed doubles every 18 months
Gates' Law: software speed halves every 18 months 

------------------------------

From: [EMAIL PROTECTED] (Mack)
Subject: Re: key dependent s-boxes
Date: 10 Jul 2000 17:21:06 GMT

Vladimir Castro Alves <[EMAIL PROTECTED]> wrote:


>
>Hi,
>
>I am looking for references on key dependent s-boxes
>
>and their robustness compared to "conventional" s-boxes.
>Thanks for any hint on the subject.
>Vladimir Castro.
>[EMAIL PROTECTED]
>
>

Don't know of any.  I don't think that you can
really make a blanket statement comparing the
two, except maybe that finding "good" fixed
s-boxes is hard.



Mack
Remove njunk123 from name to reply by e-mail

------------------------------

From: "Scott Fluhrer" <[EMAIL PROTECTED]>
Crossposted-To: comp.arch
Subject: Re: Proposal of some processor instructions for cryptographical applications
Date: Mon, 10 Jul 2000 10:12:39 -0700


Douglas A. Gwyn <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]...
> Mok-Kong Shen wrote:
> > > This is called bit reversal.  It is common in digital signal
> > > processing.
> > I implicitly assume a common processor, e.g. what people normally
> > have access to, e.g. a PC.
>
> Bit reversal isn't very expensive on traditional architectures,
> especially if you can use assembly language so that you can get
> at the carry bit.  But in any case, DSP peripherals are not very
> expensive, and they would be cheaper still if many people really
> wanted them.  Why should much of the CPU design be geared toward
> your particular application if nobody is clamoring for it?
>
> What would serve a wide variety of applications, including
> cryptology, better would be good support for bit addressing.

Actually, I suspect that bit addressing would not be all that interesting
for cryptography.

For example, in terms of private key encryption systems, we already know
how to do secure (at least, to the best of our knowledge) encryption in
10-20 cycles per byte encrypted (e.g. the AES finalists).  There is
little point in seriously looking at any slower schemes at this point.
If you do anything with bit addressing, then we are looking at at least
1-2 cycles per bit examined.  Unless you look at most of the bits using
your bit addressing scheme, an attacker is likely to be able to come up
with a differential that avoids all your bit addresses.  And if you do a
bit lookup for every bit in the message, you're looking at 8-16 cycles
per byte just to do that, not to mention anything else your cipher would
need to do.

Now, bit addressing may be useful in some other aspect of cryptography
(say, a public key crypto system, or cryptanalysis).  However, it's not
at all clear how...


--
poncho

------------------------------

From: Bill Unruh <[EMAIL PROTECTED]>
Crossposted-To: sci.physics
Subject: Re: Use of EPR "paradox" in cryptography
Date: Mon, 10 Jul 2000 10:40:32 -0700

> Bill Unruh wrote:
> [snip]
> > And your attacker can just intercept one of the EPR pairs,
> > read it and send on to you the output.
> 
> True about interception, but it's easy to detect if this has
> occurred.  If someone intercepts and re-sends data, then
> there is a 25% chance of polarization being changed (per bit
> intercepted).

Not if the polarisation direction is agreed upon beforehand, and the
attacker has learned of that direction. This is the reason for the
complex protocol in which the participants measure the value in
random, different directions and communicate only afterwards as to
what that direction was, using only those bits on which the directions
agreed.
I.e., if the attacker knows the polarisation direction used to convey
the info, then the scheme is as insecure against eavesdropping as any
conventional scheme.


------------------------------

From: Jerry Coffin <[EMAIL PROTECTED]>
Subject: Re: cray and time needed to attack
Date: Mon, 10 Jul 2000 11:41:42 -0600

In article <8kb67e$l77$[EMAIL PROTECTED]>, [EMAIL PROTECTED] says...

[ ... ] 

> You seem excessively focused on cycle time, as if it translates
> directly into computer speed. I will give a hint: It doesn't.
> Further, the block lanczos algorithm on a 64-bit VECTOR machine
> will far outperform even a faster, scalar processor with lower cycle
> time.

The cycle time of the main memory IS very significant to the speed in 
this case.  I'm not sure why you're talking about vector vs. scalar 
processors -- nobody (to my knowledge) has suggested that this should 
be done on a scalar machine anyway.
 
> > Right now, Samsung is working on the Alpha 21464, which will include
> > memory with a cycle time of under .3 ns.
> 
> Perhaps for on-chip cache. What about the 10 to 100 Terabytes
> of memory that will be needed to hold the matrix?

If it can be built into a cache, it can be done outside the cache as 
well.  Yes, you're likely to incur some slowdown (usually on the 
order of 10 to 20%) due to extra buffering needed.  To keep close to 
the same speed, you have to ensure that the loading (especially 
capacitance) in the connection from the memory to the CPU is kept 
extremely low.  This requires wiring with low-k dielectric, careful 
design (basically treating all the wires almost like transmission 
lines) and so on.  It's expensive, but all of it is stuff that's been 
known for a LONG time.  Probably more significant than any of this is 
the simple addition of propagation delay due to having to transmit 
the signal a greater distance.

> > Now, if you want to talk about arithmetic ability, try this: explain
> > how 30 divided by .3 gives only the 10:1 improvement you cite above.
> 
> Retard.  All I said was that a 10:1 speed improvement would reduce
> the problem mod 2 to 6 million days. I gave no assumptions about what
> increase is actually achievable.  That is you putting words in
> my mouth.

I'd already said that the overall improvement available was roughly 4 
orders of magnitude, and you talked about what would happen if it was 
as large as one order of magnitude.  How does that make you feel 
entitled to talk about anybody else as being a "retard" or anything 
similar?
 
> And you are assuming that lowering cycle time by a factor of 100
> will decrease run time by the same amount.  I'll give you a hint:
> It won't come close except for problems that fit entirely in cache.

Nonsense.  I'm talking about the cycle time of the main memory.  If I 
was talking about the cycle time ONLY of the processor core, you'd 
have a point.  I'm not, and I never have been.
 
> If you think we will have Cray
> class (i.e. large memory, low latency, vector machines) that perform
> 10x faster than current Crays within the near future, then there is
> a bridge I know for sale in Brooklyn....

I never said such a thing was likely to be built.  I have VERY 
carefully said that I was talking about what was theoretically 
possible with current technology, NOT what was likely to be built or 
was economically feasible.  How much more simply will I have to make 
this statement before you understand it?

> (1) When working mod 2 one can add 64 columns at once via a single
> xor.  Working mod p means doing 64 separate additions mod p. And
> mod p additions are more than twice as slow as an xor.

Doing 64 separate additions does NOT mean taking 64 times longer 
unless there's a dependency between the additions.  In case you'd 
forgotten the fact, the whole basic notion of a vector machine is to 
allow doing things like this in parallel.

In case you'd forgotten still other facts, even if there are 
dependencies between the additions, you STILL don't usually have to 
take 64 times as long to do them.  You can do all 64 in parallel and 
save all the carries.  In a second operation, you add the carries in.  
You continue only until there are no carries.  In the worst case, you 
MIGHT need 64 operations to finish the job, but in most cases you can 
finish it sooner.  If you honestly wrote code for Crays as you claim, 
you should be well aware of this, as it's an important optimization 
for almost any vector machine.
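
In code, the carry-save loop looks like this (a Python sketch; each
operation in the loop body is pure bitwise work, which a vector unit
applies across all 64 lanes at once):

    def add_via_carries(a, b):
        # XOR is the carry-less sum; AND, shifted left, is the carries.
        # Repeat until no carries remain -- rarely the full 64 rounds.
        while b:
            a, b = a ^ b, (a & b) << 1
        return a

    assert add_via_carries(0xDEADBEEF, 0x12345678) == 0xDEADBEEF + 0x12345678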

It's true that adding mod p is slower than an XOR.  Even with 
specially designed hardware, it's likely to remain somewhat slower.  
OTOH, you started out claiming that the overall difference was at 
least an order of magnitude.  It appears that in reality it's 
considerably less than that, even at worst.

> (2) You have to solve the matrix 33 times, then paste the results
> together with the CRT.

Yes, and those 33 solutions can also be done in parallel, can't they?  
As such, looking at purely theoretical possibilities, the only time 
added here is the final step of the CRT itself.

> So. In going from mod 2 to mod 32-bit prime loses a factor of "about"
> 128 in adding columns of the matrix together (and the Block Lanczos
> method is very efficient in handling 64 x 64 bit sub-blocks mod 2 on
> the CRAY).  You also get another factor of 33 from having to solve
> the matrix multiple times mod single precision primes.
> Machines which provide full support for 64 bit arithmetic can cut
> the factor of 33 down to 17.

In reality, nearly NONE of these factors needs to apply at all.  If 
cost is not an object (and I've made it VERY clear from the beginning 
that we're talking ONLY about what's theoretically technically 
possible, NOT what's economically feasible) the only time that's 
really added is roughly doubling the time due to addition mod p 
rather than mod 2, and adding on the time for the CRT at the end.  
Between the two, we're not talking about anything as large as even a 
factor of 10, not to mention the factor of over 4,000 you're talking 
about above.  Again, since we're talking about what's theoretically 
possible, discussing a machine that doesn't fully support 64-bit 
arithmetic is foolish: the technology to support 64-bit (or even 
larger) arithmetic has been known for many years.
 
> Therefore, a machine 1000 times faster than the CRAY will only
> take 240 million days to solve it.
> 
> This is what I meant when I said that you couldn't do arithmetic.

Doing math usefully consists of two phases: first deciding what math 
to do, and then carrying out the calculations.  You've done the 
second flawlessly, but failed utterly in the first regard.  You've 
started with grossly incorrect assumptions that have led you to an 
equally incorrect answer.

You've argued about things like whether anybody's likely to ship a 
machine faster than a current Cray in the near future.  Given that I 
don't really think Cray-class machines have been profitable in the 
last decade or so, I agree completely that it's unlikely to happen.

That has NOTHING to do with what's theoretically possible though.  If 
Intel (or AMD, or whoever) decided to build a 128-bit, 256-way vector 
machine using .3 ns SRAM for its main memory, they could do so.  
They're NOT going to do it, because it makes no sense economically, 
but that doesn't change the fact that the technology exists to make 
it theoretically possible.

-- 
    Later,
    Jerry.
 
The universe is a figment of its own imagination.

------------------------------

From: [EMAIL PROTECTED] (David A. Wagner)
Subject: Re: Comment/Analysis requested Password to RawBinarykey method...
Date: 10 Jul 2000 10:43:04 -0700

In article <[EMAIL PROTECTED]>,
Jay Summet  <[EMAIL PROTECTED]> wrote:
> We take the passphrase, and give it to the hash (say MD5). We get a hash
> value out. We use the first byte of the hash value to index into the hash
> value (passphrase dependent, of course) and use that byte as the first byte
> of our key.

I think this byte-indexing scheme is not such a great idea.
It seems unlikely to improve the strength of the hash much, and
it _does_ introduce a small bias: the resulting byte will be biased
towards zero in the low four bits.

(Why the bias?  Let D[0..15] be the digest.  We use B = D[D[0] mod 16]
as our output byte.  If D[0] mod 16 = 0, then B = D[0], so B mod 16 = 0,
always.  If D[0] mod 16 != 0, then B is effectively random, so B mod 16
is uniformly distributed, and B mod 16 is zero 1/16 of the time.  From
these two observations, we can compute the total chances that B mod 16
will be zero; the result is (1/16)*1 + (15/16)*1/16 = 31/256.  This means
that seeing a zero in the low four bits of the output byte is about twice
as likely as it should be.)
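
A quick way to check the 31/256 figure, modelling the digest as 16
uniformly random bytes (a sketch):

    import random

    trials, hits = 1_000_000, 0
    for _ in range(trials):
        D = [random.randrange(256) for _ in range(16)]
        B = D[D[0] % 16]
        hits += (B % 16 == 0)
    print(hits / trials, 31 / 256)   # both about 0.121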

------------------------------

Crossposted-To: comp.arch
Subject: Re: Proposal of some processor instructions for cryptographical/other applications
From: [EMAIL PROTECTED] (Ian Kemmish)
Date: Mon, 10 Jul 2000 18:01:44 GMT

In article <8k9q8f$qpn$[EMAIL PROTECTED]>, [EMAIL PROTECTED] says...
>
>"Mok-Kong Shen" <[EMAIL PROTECTED]> wrote
>
>> Transposition is one of the basic operations in cryptography.
>
>Is it, any more? Having a look at the AES candidates, most of them carefully
>refrain from calling for bit transpositions simply because they're rather
>hard to implement.

>> Another processor instruction that I think is desirable is
>> to obtain the parity of groups of bits (e.g. 4, 8, etc.) from
>> consecutive words in memory and accumulate these into a word.
>
>Why not rearrange the problem (especially if you're using a bit sequence
>with lots of entropy to begin with), assume some nice person has transposed
>the bit matrix for you, and just do V = U[0] ^ U[1] ^ U[2] ^ U[3]?

Both of these topics say `lookup table' to me.  As does much of my current 
musical code, requiring millions of logs and sines per second (and FFTs of 
course).  As does much of my past graphical code, with stencilling and 
colouring code, and colour coordinate conversion.
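
For the parity proposal, the table version is about as simple as it
gets (a sketch in Python, assuming 32-bit words and 8-bit groups):

    # 256-entry table: parity of each byte value.
    PARITY = [bin(v).count("1") & 1 for v in range(256)]

    def packed_byte_parities(words):
        # Parity of each 8-bit group of each 32-bit word, accumulated
        # into one result word, as in the proposed instruction.
        out = 0
        for w in words:
            for shift in (24, 16, 8, 0):
                out = (out << 1) | PARITY[(w >> shift) & 0xFF]
        return out

    assert packed_byte_parities([0x01010101]) == 0b1111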

One thought I have had, which is probably not of much value, is that one 
might extend the currently fashionable `prefetch' instructions (e.g. in 
the PIII instruction set) to include some way of ``declaring'' that an 
area of memory is a lookup table liable to be used inside loopy code, so 
that when it is needed, you could load the body of the table into cache 
during the first few iterations of the loop, and avoid cache misses in 
later iterations....


BTW -- about 20 years ago I had occasion to use a Harris H-100 minicomputer 
(one of the more successful products to use an AMD 2901 bit-slice).  There was a 
rumour that this had a single instruction for DES encryption - does anyone know 
if this was true, or just an Urban Myth?


- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Ian Kemmish                   18 Durham Close, Biggleswade, Beds SG18 8HZ, UK
[EMAIL PROTECTED]                Tel: +44 1767 601 361
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Behind every successful organisation stands one person who knows the secret
of how to keep the managers away from anything truly important.


------------------------------


** FOR YOUR REFERENCE **

The service address, to which questions about the list itself and requests
to be added to or deleted from it should be directed, is:

    Internet: [EMAIL PROTECTED]

You can send mail to the entire list (and sci.crypt) via:

    Internet: [EMAIL PROTECTED]

End of Cryptography-Digest Digest
******************************
