Forgive my lack of knowledge of your existing code, but is it really designed with optimization in mind? What was the driving force for the C function?
If it is optimized, what is the time required? I jumped way too early to the "fast" conclusion, I must admit, because I really never had speed in mind. As I explained, my goal is to make it easy to understand. If it has any performance advantage, it is purely a side effect. (You never answered my comment about performance in my last email, so I can only guess what the design intent was for your code.) I mean, if you choose to optimize my code for speed, it's perfectly doable, and I have full confidence that anyone else who has read this email thread can do it. But again, I have no idea how much time you spent on your routine, so I guess I should refrain from dissing it. My mistake once again.

What else will you be teaching me today? =)

David

On 7/8/05, Andy Polyakov <[EMAIL PROTECTED]> wrote:
> > Please do not use previously mentioned routine, it missed 1 corner
> > case where 32=num_bits_word(d)
> >
> > Revised routine that passes (cd test; make bntest).
>
> Does it mean that previous version didn't actually pass the test? I mean
> if it did on your CPU, but not mine, probably we could learn something
> else about ways PPC can be implemented...
>
> > All I had to do is add one more instruction to the routine.
> >
> > Please test on your ppc32 machines.
> >
> > Once we are all happy,
>
> Is this your agenda? Make everybody happy:-):-):-) Good luck:-):-):-)
>
> > it's a matter of adding the core dump at the beginning.
> > Thus you have a fast,
>
> 32*(div latency + mul latency) is fast? If I call BN_bn2dec in loop it
> spins 4 times slower than with current implementation. Well, at least on
> computer I have access to...
>
> > easy to understand, predictable bn_div_words, as
> > opposed to that monster in 0.9.8.
>
> Hostility again? Are you saying that nobody understands current
> implementation and that it produces unpredictable results? I disagree:-)
>
> > Other architectures will benefit if this C function is used in bn_asm.c
>
> How? And which architectures exactly?
> Virtually all 32-bit
> architectures, including PPC32, opt for
> (BN_ULONG)(((((BN_ULLONG)h)<<BN_BITS2)|l)/(BN_ULLONG)d).
>
> A.
> ______________________________________________________________________
> OpenSSL Project                                 http://www.openssl.org
> Development Mailing List                       openssl-dev@openssl.org
> Automated List Manager                           [EMAIL PROTECTED]
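
For anyone following the thread, the expression Andy quotes can be read as a standalone C function. What follows is only a rough sketch under stated assumptions: the typedefs are stand-ins for a 32-bit build rather than copies of the OpenSSL headers, and both function names are hypothetical. The second function is a generic shift-and-subtract loop, included purely as a cross-check and to show what "one step per quotient bit" looks like; it is not the routine discussed above, which is not reproduced in this message. Both assume d != 0 and h < d, so that the quotient fits in one word.

```c
#include <stdint.h>

/* Stand-ins for a 32-bit build (assumed here, not taken from the headers). */
typedef uint32_t BN_ULONG;   /* one word */
typedef uint64_t BN_ULLONG;  /* double-width word */
#define BN_BITS2 32          /* bits per BN_ULONG */

/* Hypothetical name. Divides the two-word value (h:l) by d using a single
 * double-width division, as in the quoted expression.
 * Precondition: d != 0 and h < d. */
static BN_ULONG div_words_wide(BN_ULONG h, BN_ULONG l, BN_ULONG d)
{
    return (BN_ULONG)(((((BN_ULLONG)h) << BN_BITS2) | l) / (BN_ULLONG)d);
}

/* Hypothetical name. Generic restoring division, one quotient bit per
 * iteration (32 iterations). Same precondition as above. */
static BN_ULONG div_words_bitwise(BN_ULONG h, BN_ULONG l, BN_ULONG d)
{
    BN_ULONG q = 0;
    BN_ULLONG r = h;  /* running remainder; needs 33 bits after the shift */
    int i;

    for (i = BN_BITS2 - 1; i >= 0; i--) {
        r = (r << 1) | ((l >> i) & 1);  /* bring down the next dividend bit */
        q <<= 1;
        if (r >= d) {                   /* divisor fits: subtract, set bit */
            r -= d;
            q |= 1;
        }
    }
    return q;
}
```

Whether the one-line form is actually faster depends on how the compiler and the target implement the 64-by-32 division, so timing it on the machine in question is the only reliable answer.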