Hi,

> 1/ The new patch is against the openssl head cvs.
> In fact, I didn't see your sha1-mips in cvs dev version, otherwise I
> wouldn't have spent time on the same subject.
> Moreover, we used the same optimizations, mainly using the MIPS large
> register set to avoid storing the SHA1 internal state in memory, so the
> speed of both versions is exactly the same.
> I made a quick and dirty patch to test your code with GCC (in attachment).
> Since I don't know perl at all, I cannot make a patch proposal so that
> the code could work on both IRIX env and Linux env.

Assembler programming is an exhausting exercise, and it makes no sense
not to try to make the result as reusable as possible. Quoting
http://cvs.openssl.org/chngview?cn=19905:

"
# There is a number of MIPS ABI in use, O32 and N32/64 are most
# widely used. Then there is a new contender: NUBI. It appears that if
# one picks the latter, it's possible to arrange code in ABI neutral
# manner. Therefore let's stick to NUBI register layout...
# ... [and let's follow certain] coding rules [that] facilitate
# interoperability...
"

This more or less means that you'd have to adapt your modules one more
time (which is why I suggested taking smaller steps and starting by
figuring out what the sensible thing to do is:-)...

The idea is to pass the ABI mnemonic as the first argument to the
script. The recognized values are o32, n32, 64, nubi32 and nubi64. The
question is, what's your ABI? If you don't know it by heart, compile the
following snippet with gcc -O -S and submit the assembler output:

int foo(int a,int b,int c,int d,int e,int f)
{ return a+b+c+d+e+f; }
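
As a complementary check (a sketch, not a replacement for looking at the
assembler output), the compiler's predefined macros can also hint at the
ABI. This assumes GCC's MIPS macros _MIPS_SIM, _ABIO32, _ABIN32 and
_ABI64; the function name mips_abi_name is made up for illustration, and
NUBI is not covered by these macros as far as I know:

```c
#include <stddef.h>

/*
 * Report the ABI mnemonic suggested by the compiler's predefined macros.
 * Assumption: GCC-style MIPS macros. On any other target, or if the
 * macros are missing, this falls back to "unknown".
 */
static const char *mips_abi_name(void)
{
#if defined(_MIPS_SIM) && defined(_ABIO32) && _MIPS_SIM == _ABIO32
    return "o32";
#elif defined(_MIPS_SIM) && defined(_ABIN32) && _MIPS_SIM == _ABIN32
    return "n32";
#elif defined(_MIPS_SIM) && defined(_ABI64) && _MIPS_SIM == _ABI64
    return "64";
#else
    return "unknown";
#endif
}
```

The returned string uses the same mnemonics as the script argument, so
it can be compared directly with what the assembler output suggests.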

Note that the above-mentioned commit covers not only sha1-mips.pl, but
also mips-mont.pl. Could you benchmark the latter on your system? To do
that, generate the assembler module, compile it and add it to
libcrypto.a. Then 'cd crypto/bn; rm bn_mont.o; make'. Note the compiler
command line used to rebuild bn_mont.o, copy-n-paste it to the command
line, add -DOPENSSL_BN_ASM_MONT and execute it. Finally 'cd ../..; make;
apps/openssl speed rsa dsa'.

> 3/4/ On possible vulnerability of the proposed patch:
> - those optimisations are made for common SoC usage (i.e. internet
> gateway...): the risk is exactly the same as the risk of the original C
> version. Using this code for e.g. a smartcard would be insane.
> - I didn't see a timing attack on this code but peer review is important
> and welcome.

The weakness derives from large partitioned tables. It doesn't matter if
it's assembler or C.

> - this code is not cache attack resistant,
> but the performance penalty would be just too high.

"Too high" is not an objective measure:-) Anyway, the "compressed
tables" I was referring to don't necessarily mean the minimal 256B one.
I was rather referring to 1KB or 2KB tables. I mean, what is the
relation between Te0-3? A rotate operation, i.e. something that can be
derived from a single table. See other assembler modules for examples.
Implementations using compressed 1KB or 2KB tables were observed to
perform adequately, see the sparcv9, parisc and arm4 modules...
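
To illustrate the rotate relation, here is a minimal sketch in C. It
assumes the byte layout of the reference C tables, i.e. a Te0 entry
packs (2*s, s, s, 3*s) from the most significant byte down, where s is
the S-box output. All function names are made up for illustration and
are not taken from any of the modules mentioned above:

```c
#include <stdint.h>

/* Multiply by 2 in GF(2^8) modulo the AES polynomial 0x11b. */
static uint8_t xtime(uint8_t a)
{
    return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0x00));
}

/* Rotate a 32-bit word right by n bits, 0 < n < 32. */
static uint32_t rotr32(uint32_t v, unsigned n)
{
    return (v >> n) | (v << (32 - n));
}

/* Build the Te0 entry for S-box output s: bytes (2*s, s, s, 3*s). */
static uint32_t te0_from_sbox(uint8_t s)
{
    uint8_t s2 = xtime(s);          /* 2*s */
    uint8_t s3 = (uint8_t)(s2 ^ s); /* 3*s = 2*s xor s */

    return ((uint32_t)s2 << 24) | ((uint32_t)s << 16) |
           ((uint32_t)s << 8) | (uint32_t)s3;
}

/* Te1..Te3 entries are byte rotations of the matching Te0 entry. */
static uint32_t te1_from_te0(uint32_t t) { return rotr32(t, 8); }
static uint32_t te2_from_te0(uint32_t t) { return rotr32(t, 16); }
static uint32_t te3_from_te0(uint32_t t) { return rotr32(t, 24); }
```

With only Te0 stored (1KB), the other three lookups cost one extra
rotate each, which is the trade-off the compressed-table modules make.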

> In the real world, a cache attack needs a local system
> account on the target. For common SoC systems, it would mean that the
> whole system has been compromised anyway.

The feasibility of a remote attack was discussed too. The thing about
SoC systems is that they commonly operate at relatively low frequencies.
The lower the frequency -> the easier it is to measure timing
variations -> the less statistical data is required to perform the
analysis.

> 6/
> I moved the bn_xxx_word to separate bn/asm/(sh4/mips32).S files, only
> the comba functions were left as C file.

It's possible to compile without the comba functions. This can be
achieved e.g. by compiling with -DOPENSSL_SMALL_FOOTPRINT, which is more
than appropriate for embedded systems. Benchmark that too...

More comments are likely to come... A.

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [email protected]
