On December 1, 2003 05:53 pm, Lev Walkin wrote:
> Geoff Thorpe wrote:
> > As part of the general hackathon/audit we're doing in crypto/bn/ I
> > once again came across the curious zeroing code in bn_expand2, only
> > this time I figured it was high time for me to actually ask you about
> > it. :-)
> >
> > I understand the desire to cater for CPU pipelining with the 8-wise
> > loop unrolling, but is this a better solution than just using
> > memset() and letting the compiler take care of the same sort of
> > thing?
>
> It's #ifdef'ed already, so you may've as well tried it out and check
> performance with profiler.

Um, not in bn_expand2() it's not - though there are "#if 1" casings around 
similar code for BN_copy() et al. I'm looking at CVS by the way, so if 
you have stable releases or older snapshots, you may be seeing different 
code than I am.

> Even if profiling would show you that memset() is faster, it may be by
> order of magnitude slower on the machine very next to you (which has
> different version of C library or different compiler settings).

Well this of course is the kind of question that packagers must grapple 
with all the time. Do I build with "-march=<something-decent>" or do I 
maintain support for toasters? This isn't really what interests me in 
this case, though it may of course have interested whoever coded this.

> Bottom line: try it out.

Well I'm more interested in code maintainability than clock-tick 
comparisons of memset-vs-direct-assignments. The thing is, if there's a 
CPU out there that has (or will have) instruction-level support to memset 
a region of memory, then our direct assignments will buy us nothing. 
Conversely, if someone's building without optimisation for lcd-cpus, then 
calling memset() may suck in the most atrocious ways. That's an SEP. 
From a code maintenance point of view (something we are actively trying 
to improve inside crypto/bn/ right now), we already use memset()s, 
memcpy()s, and memmove()s in many places. So basically I'm curious to 
know if there is some real-world issue that justifies these bn-specific 
loops, or whether "it seemed like a good idea at the time".
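
For concreteness, the two styles being compared look roughly like the 
sketch below. This is not the actual bn_expand2() code: word_t is a 
hypothetical stand-in for the bignum word type, and both function names 
are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the bignum word type (assumption). */
typedef unsigned long word_t;

/* Hand-unrolled zeroing: 8 words per iteration, then the remainder. */
static void zero_words_unrolled(word_t *a, size_t n)
{
    size_t i;

    for (i = n >> 3; i > 0; i--, a += 8) {
        a[0] = 0; a[1] = 0; a[2] = 0; a[3] = 0;
        a[4] = 0; a[5] = 0; a[6] = 0; a[7] = 0;
    }
    /* Clear the 0..7 words left over after the unrolled loop. */
    for (i = n & 7; i > 0; i--)
        *a++ = 0;
}

/* The same effect, left to the C library and the compiler. */
static void zero_words_memset(word_t *a, size_t n)
{
    memset(a, 0, n * sizeof(*a));
}
```

Both leave the n words zeroed; the difference is only who gets to pick 
the instruction sequence, which is exactly the maintainability trade-off 
I'm asking about.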

Cheers,
Geoff

-- 
Geoff Thorpe
[EMAIL PROTECTED]
http://www.geoffthorpe.net/

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]
