Re: [patch] make AES-cfb128-encrypt faster by uglifying it
In message [EMAIL PROTECTED] on Thu, 25 May 2006 22:50:15 -0700 (PDT), Alex Dubov [EMAIL PROTECTED] said:

oakad> I thought all major compilers have sort of long long,
oakad> didn't them? After all, emulated long long is still
oakad> only two integer xors as opposed to 8 with char.

If you look in the script Configure, you'll see what kinds of platforms we claim to support. That means that we have to be careful with the kind of assumptions we make. For example, your patch would fail miserably on VMS for VAX (which I know is still used out there).

However, nothing stops you from making variants with different types of integers, maybe with some help from the macros used and defined in crypto/bn/bn.h, which are correctly defined for each platform, as far as we know.

Cheers,
Richard

-----
Please consider sponsoring my work on free software.
See http://www.free.lp.se/sponsoring.html for details.

-- 
Richard Levitte [EMAIL PROTECTED]
http://richard.levitte.org/

"When I became a man I put away childish things, including the fear of childishness and the desire to be very grown up."
-- C.S. Lewis

__
OpenSSL Project http://www.openssl.org
Development Mailing List openssl-dev@openssl.org
Automated List Manager [EMAIL PROTECTED]
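A rough sketch of what "variants with different types of integers" might look like, for readers following the thread. It does not use the actual macros from crypto/bn/bn.h; instead it assumes that the compiler's <limits.h> defines ULLONG_MAX whenever long long is available. The names xor_word, XOR_BLOCK, XOR_WORDS and xor_block16 are hypothetical and do not come from the patch. The bytes are staged through local word arrays with memcpy() so that the sketch makes no assumption about the alignment of the caller's buffers.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type selection (an assumption, not the bn.h macros):
 * use long long only where the compiler advertises it. */
#if defined(ULLONG_MAX)
typedef unsigned long long xor_word;
#else
typedef unsigned long xor_word;
#endif

#define XOR_BLOCK 16
#define XOR_WORDS (XOR_BLOCK / sizeof(xor_word))

/* out = a XOR b over one 16-byte block, processed in xor_word chunks. */
static void xor_block16(unsigned char *out, const unsigned char *a,
                        const unsigned char *b)
{
    xor_word wa[XOR_WORDS], wb[XOR_WORDS];
    size_t i;

    /* Assumption: the word size divides the block size evenly. */
    assert(XOR_BLOCK % sizeof(xor_word) == 0);

    /* Copy into aligned locals, so unaligned a, b or out are safe. */
    memcpy(wa, a, XOR_BLOCK);
    memcpy(wb, b, XOR_BLOCK);
    for (i = 0; i < XOR_WORDS; i++)
        wa[i] ^= wb[i];
    memcpy(out, wa, XOR_BLOCK);
}
```

On a compiler without long long, the same source falls back to unsigned long chunks with no other change, which is the property Richard is asking for.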
Re: [patch] make AES-cfb128-encrypt faster by uglifying it
Ok. How about now?

Attachment: aes_cfb.c.diff (Description: 441793709-aes_cfb.c.diff)
Re[2]: [patch] make AES-cfb128-encrypt faster by uglifying it
Hello Alex,

Friday, May 26, 2006, 9:50:15 AM, you wrote:

AD> I thought all major compilers have sort of long long,
AD> didn't them?

I'm on QNX4 with Watcom C v10.6B, which has neither int_64 nor long long. So I'm very anxious about not being able to keep my port current after such improvements...

AD> After all, emulated long long is still only two integer xors as
AD> opposed to 8 with char.

Please, invent something a bit more portable but still faster!

-- 
Best regards, Tony. mailto:[EMAIL PROTECTED]
Re: Re[2]: [patch] make AES-cfb128-encrypt faster by uglifying it
On Fri, 26 May 2006 14:32:36 +0400, [EMAIL PROTECTED] wrote:

>AD> I thought all major compilers have sort of long long,
>AD> didn't them?
>
>I'm on QNX4 with Watcom C v10.6B which has neither int_64 nor long long.
>So, I'm very anxious about not being able to keep my port current after
>such improvements...

Can't you use OpenWatcom? It's had long long for some time and appears to still support QNX.

-- 
Brian Havard [EMAIL PROTECTED]
"He is not the messiah! He's a very naughty boy!" - Life of Brian
Re[4]: [patch] make AES-cfb128-encrypt faster by uglifying it
Hello Brian,

Friday, May 26, 2006, 5:55:34 PM, you wrote:

BH> Can't you use OpenWatcom? It's had long long for some time and
BH> appears to still support QNX.

Indeed, it still knows QNX4. The problem is that OW has not been ported to QNX4 yet (and never will be, I'm afraid), so one has to cross-compile on Windows. I use this approach for simpler things, but serious projects do not work as expected when cross-compiled... I do not know why...

Thank you for the suggestion. Still, I'd like OpenSSL to be a bit more portable... Currently I cannot compile sha512 here. It would be very sad if more and more code had to be configured out just to get the package built...

-- 
Best regards, Tony. mailto:[EMAIL PROTECTED]
[openssl.org #1333] ssl 64 bit compilation on windows 2003 XP and Visual Studio 2005
Please delete this case. I was able to resolve this error.

[EMAIL PROTECTED] - Sat May 20 19:05:06 2006]:

> Hi,
>
> The 32-bit openssl compiled fine. I am trying to compile ssl 0.9.8b on a Windows 2003 XP 64 system with MS Visual Studio 2005. I am getting the following compilation error.
>
>   cl /Fotmp32dll\e_ubsec.obj -Iinc32 -Itmp32dll /MD /Ox /W3 /Gs0 /GF /Gy /nologo -DWIN32_LEAN_AND_MEAN -DL_ENDIAN -DDSO_WIN32 -DOPENSSL_SYSNAME_WIN32 -DOPENSSL_SYSNAME_WINNT -DUNICODE -D_UNICODE -D_CRT_SECURE_NO_DEPRECATE -D_CRT_NONSTDC_NO_DEPRECATE -DOPENSSL_USE_APPLINK -I. /Fdout32dll -DOPENSSL_NO_RC5 -DOPENSSL_NO_MDC2 -DOPENSSL_NO_KRB5 -DOPENSSL_NO_DYNAMIC_ENGINE -D_WINDLL -DOPENSSL_BUILD_SHLIBCRYPTO -c .\engines\e_ubsec.c
>   e_ubsec.c
>   ml /c ms\uptable.asm
>   Microsoft (R) Macro Assembler Version 8.00.50727.42
>   Copyright (C) Microsoft Corporation. All rights reserved.
>   Assembling: ms\uptable.asm
>   ms\uptable.asm(45) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(62) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(79) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(96) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(113) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(130) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(147) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(164) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(181) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(198) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(215) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(232) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(249) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(266) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(283) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(300) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(317) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(334) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(351) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(368) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(385) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(402) : error A2024: invalid operand size for instruction
>   ms\uptable.asm(32) : error A2006: undefined symbol : r9
>   ms\uptable.asm(33) : error A2006: undefined symbol : r8
>   ms\uptable.asm(34) : error A2006: undefined symbol : rdx
>   ms\uptable.asm(35) : error A2006: undefined symbol : rcx
>   ms\uptable.asm(36) : error A2006: undefined symbol : rsp
>   ms\uptable.asm(37) : error A2006: undefined symbol : rcx
>   ..
>   ms\uptable.asm(135) : error A2006: undefined symbol : r8
>   ms\uptable.asm(136) : error A2006: undefined symbol : rdx
>   ms\uptable.asm(137) : error A2006: undefined symbol : rcx
>   ms\uptable.asm(138) : error A2006: undefined symbol : rsp
>   ms\uptable.asm(139) : error A2006: undefined symbol : rcx
>   ms\uptable.asm(140) : fatal error A1012: error count exceeds 100; stopping assembly
>   NMAKE : fatal error U1077: 'C:\Program Files (x86)\Microsoft Visual Studio 8\VC\BIN\ml.EXE' : return code '0x1'
>   Stop.
>   gmake: *** [compile64] Error 1
Re: [patch] make AES-cfb128-encrypt faster by uglifying it
Thus spake Alex Dubov [EMAIL PROTECTED]
> Ok. How about now?

I'm curious if there's a significant performance difference between using u32 and u64; the former should be portable to all supported platforms, and may make the latter unnecessary.

Plus, if we're going to go that route, we should consider that some platforms have 128-bit XOR support in hardware; is it worth implementing that too?

How much of this should be extended to other ciphers? Should xorN() and moveN() be part of the bignum code for reuse in other modules?

IIRC, I copied the CFB code from another module (DES? IDEA? I forget) with only slight changes; I didn't grok enough of it at the time to worry about performance, just maintaining correctness.

S

Stephen Sprunk, CCIE #3723, K5SSS
"Stupid people surround themselves with smart people. Smart people surround themselves with smart people who disagree with them." --Aaron Sorkin
Re: [patch] make AES-cfb128-encrypt faster by uglifying it
>> Ok. How about now?

Subject to SIGBUS on most platforms. It's easy to get carried away and score on x86 while rendering support for other platforms void, isn't it? I mean, do mind unaligned access!

> I'm curious if there's a significant performance difference between using
> u32 and u64; the former should be portable to all supported platforms, and
> may make the latter unnecessary.

I'd recommend [or even insist] on for (i=0;i<16/sizeof(long);i++) loops and letting the compiler unroll them: 4x4-byte chunks on 32-bit platforms and 2x8-byte chunks on 64-bit ones, without a single shred of #if that-or-that spaghetti and no unnecessary dependency on the totally unrelated bn.h. And once again, unaligned input/output is to be treated byte by byte.

> Plus, if we're going to go that route, we should consider that some
> platforms have 128-bit XOR support in hardware; is it worth implementing
> that too?

Is it really that widely used or important a mode, to justify that much extra complexity for little gain?

> How much of this should be extended to other ciphers? Should xorN() and
> moveN() be part of the bignum code for reuse in other modules?

I'd be opposed to this. If performance gets that important, a function call will hardly beat inline code anyway, even if the function is, say, 128-bit SSE2 and the inline code is just 4x32-bit.

A.
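The loop Andy describes can be sketched roughly as follows. This is an illustration only, not the code from the patch: the names BLOCK_SIZE, is_long_aligned and xor_block are made up, and the alignment test assumes a C99 <stdint.h> providing uintptr_t, which some of the older compilers mentioned in this thread would not have. The word path also assumes the buffers may legitimately be accessed as unsigned long; code that needs to be strict about aliasing would copy through memcpy() instead.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 16

/* Nonzero when p can be accessed as unsigned long without a bus error.
 * Assumption: converting a pointer to uintptr_t exposes its alignment. */
static int is_long_aligned(const void *p)
{
    return ((uintptr_t)p % sizeof(unsigned long)) == 0;
}

/* out = in XOR ks over one 16-byte block. */
static void xor_block(unsigned char *out, const unsigned char *in,
                      const unsigned char *ks)
{
    size_t i;

    if (is_long_aligned(out) && is_long_aligned(in) && is_long_aligned(ks)) {
        /* Word path: 4 iterations with 4-byte long, 2 with 8-byte long.
         * The trip count is a compile-time constant, so the compiler is
         * free to unroll the loop. */
        unsigned long *o = (unsigned long *)out;
        const unsigned long *a = (const unsigned long *)in;
        const unsigned long *b = (const unsigned long *)ks;

        for (i = 0; i < BLOCK_SIZE / sizeof(unsigned long); i++)
            o[i] = a[i] ^ b[i];
    } else {
        /* Unaligned input or output is handled byte by byte. */
        for (i = 0; i < BLOCK_SIZE; i++)
            out[i] = (unsigned char)(in[i] ^ ks[i]);
    }
}
```

The same source serves 32-bit and 64-bit platforms without any #if, at the price of three alignment tests per block.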
Re: [patch] make AES-cfb128-encrypt faster by uglifying it
Thus spake Andy Polyakov [EMAIL PROTECTED]
>>> Ok. How about now?
>
> Subject to SIGBUS on most platforms. It's easy to carry away and score on
> x86 and render support for other platforms void, isn't it? I mean do mind
> unaligned access!

Ah, that may have been why I didn't fix that code to use u32. More likely, it was a happy accident that I inherited the portability of the code I copied. I certainly introduced a few logic bugs of my own (which were quickly fixed by others)...

>> I'm curious if there's a significant performance difference between using
>> u32 and u64; the former should be portable to all supported platforms, and
>> may make the latter unnecessary.
>
> I'd recommend [or even insist] on for (i=0;i<16/sizeof(long);i++) loops
> and let compiler unroll them. 4x4-byte chunks on 32-bit platforms and
> 2x8-byte chunks - on 64-bit ones without a single shred of #if
> that-or-that spaghetti and no unnecessary dependency on totally unrelated
> bn.h. And once again, unaligned input/output is to be treated byte by byte.

My experience is that, for blocks as short as the ones we're discussing here, the tests for unaligned blocks usually defeat the benefit you get in the aligned case. Functions like memcpy() generally require a minimum size before they try any such trickery because of the cost of the test, and 16 bytes is probably on the edge for most platforms. If you're using a platform that transparently handles unaligned access (either in hardware or software), it's worth it, but IMHO not in code that has to work on platforms that don't.

>> Plus, if we're going to go that route, we should consider that some
>> platforms have 128-bit XOR support in hardware; is it worth implementing
>> that too?
>
> Is it really that widely used/important mode? To justify that much extra
> complexity for little gain?

I hacked up a version of the AES code a while back that used SSE registers to pass the blocks around, do bitwise operations, etc. It was faster than the current version, but (IMHO) not enough to justify adding so much unportable hackery to the project. If one desperately needs speed, the existing approach is to use platform-specific asm, and that seems sufficient.

>> How much of this should be extended to other ciphers? Should xorN() and
>> moveN() be part of the bignum code for reuse in other modules?
>
> I'd be opposed to this. If performance gets that important, function call
> will hardly beat inline code anyway. Even if function is say 128-bit SSE2
> and inline is just 4x32-bit. A.

When I find such things useful, I tend to put them in a module's headers as a static inline function; that gets the speed of a macro with the semantics and safety of a real function. Unfortunately, that approach probably won't work on all of the platforms OpenSSL supports, because of all the ancient compilers floating around.

S

Stephen Sprunk, CCIE #3723, K5SSS
"Stupid people surround themselves with smart people. Smart people surround themselves with smart people who disagree with them." --Aaron Sorkin
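The header-level static inline idea Stephen mentions might be sketched like this. The macro name AES_LOCAL_INLINE and the function xor16 are hypothetical, not taken from OpenSSL; the sketch assumes that a compiler supporting inline announces it through __STDC_VERSION__, __GNUC__ or _MSC_VER, and it degrades to a plain static function elsewhere, which still compiles on an old compiler but loses the guaranteed speed benefit. The body is deliberately the plain byte-wise XOR, so it sidesteps the alignment question discussed above.

```c
#include <stddef.h>

/* Hypothetical portability shim: use an inline keyword where one is
 * known to exist, otherwise fall back to an ordinary static function. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
# define AES_LOCAL_INLINE static inline
#elif defined(__GNUC__)
# define AES_LOCAL_INLINE static __inline__
#elif defined(_MSC_VER)
# define AES_LOCAL_INLINE static __inline
#else
# define AES_LOCAL_INLINE static
#endif

/* out = in XOR ks over one 16-byte block, byte by byte, so it is safe
 * for any alignment and for out == in. */
AES_LOCAL_INLINE void xor16(unsigned char *out, const unsigned char *in,
                            const unsigned char *ks)
{
    size_t i;

    for (i = 0; i < 16; i++)
        out[i] = (unsigned char)(in[i] ^ ks[i]);
}
```

Unlike a macro, the function evaluates each argument exactly once and is type-checked, which is the "semantics and safety" Stephen refers to.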