Oh.  I'm sorry.  Someone needs to use a keyword 'volatile'.

Bingo.  Problem solved on the improper optimization issue.

Can we commit the patch so that we don't have to keep getting hit by 2
or 3 threads about valgrind complaining about reachable pointers at
the end of program execution every month?  This /is/ an OpenSSL
library bug, and it /does/ need to be addressed -- regardless of "it's
only 36 bytes", relying on an operating system cleaning up after you
is sloppy programming at best and bug-laden at worst.

Remember, not all architectures have memory management units, either.

-Kyle H

On 3/27/07, David Schwartz <[EMAIL PROTECTED]> wrote:

> > For POSIX threads, the result of reading a variable in one
> > thread while it
> > might be modified in another thread is undefined. Line 1289 and
> > 1290 should
> > be removed.

> Not this old chestnut again.

Like it or not, it's a fact.

> I can't name a CPU in which an aligned load/store of a natural/primitive
> data size between overlapping threads actually has "undefined" behavior.
>   The underlying assembly instructions generated by the C compiler have
> a defined behavior on multi-CPU systems.

Of course, that's because those CPUs don't exist yet. But I expect code that
I write and compile today to work on future CPUs because I don't rely on
guarantees I don't actually ahve.

> Due to this defined behavior of the underlying CPU at a lower level than
>   the scope of the POSIX specification is designed to cover you should
> treat this behavior as defined.  This would not contradict POSIX which
> is effectively saying "out of the scope of the POSIX spec" by using the
> term "undefined".

Umm, what CPU? *Every* CPU OpenSSL does or might ever run on? And that would
have to include every compiler OpenSSL does or might support, because the
compiler can change things around too.

> The function entrance, the "return" and the "CRYPTO_w_lock()" calls all
> act as the correct memory barriers with regards to compiler
> optimizations and load/store rescheduling concerns.  Even if the
> CRYPTO_w_lock() is a macro or inline, since the writer would have been
> sure to insert the correct barriers if he chose to implement
> CRYPTO_w_lock() as a macro/inline.

Where is that documented or guaranteed? You mean that they *happen* to be
barriers on the platforms you have used. You have no idea what the next
version of GCC might do. We are talking about saving *NOTHING* here.

> > Alternatively, lines 1288, 1289, 1290, and 1297 should be
> > removed. It's hard
> > to understand any reasonable scenario in which one thread might
> > be calling
> > SSL_free_comp_methods while another thread is or might be accessing the
> > ssl_comp_methods variable or any of the registered compression methods.

> I would agree with your overall view, but I would on the basis that this
> function is not performance critical to deserve any special lock
> contention avoidance treatment.

Exactly. This is the worst kind of premature optimization.

> But.... there is a case for locking to not be necessary at all.  If the
> SSL_free_comp_methods() is only designed to be used during the cleanup
> phase.  During a clean shutdown sequence of the OpenSSL library
> I'm pretty sure you are meant to have reverted the OpenSSL call
> execution path to single threaded mode.

Right. You can't let one thread unconditionally shut down a library while
another thread is or might be using it. However, it's possible this function
was also intended to 'unregister' the compression functions without harming
the library. In that case, the lock is needed and the optimization is
broken.

> Meaning while you are executing any cleanup functions it is not safe to
> be calling any other OpenSSL functions at the same time.  By virtue of
> this requirement locking should not be necessary (but should do more
> good than harm if implemented).  If your application needs to repeatedly
> init and cleanup, init and cleanup, then your application needs to also
> manage the state and the serialization of the init/cleanup calls (since
> they may not be safe to be called out-of-order, called twice,
> interleved, etc...).

It will do more good than harm if the locks haven't been cleaned up!

> > If another thread modified 'ssl_comp_methods' in between the
> > read and the
> > cache flush, that modification would be lost. I, for one, don't like
> > needlessly making code potentially disastrous on future CPUs
> > and compilers.
> > (To save what?!)
> >
> > ... the evils of premature optimization.

> No as I said by virtue of it being a function it already forces the
> compiler to perform the correct barriers.  Maybe you can name a CPU
> target which does not automatically flush pending writes for "return
> from function" and "call a function" instructions, I can't.

The problem occurs after the beginning of the function. If the compare is
done on a cached copy in a register. Look at this example:

if (variable!=NULL)
{
 free(variable);
 variable=NULL;
}

This could easily be optimized to (cached_copy is a register or other fast
place to store data):

int cached_copy=variable;
if(cached_copy!=NULL)
{
 acquire_lock();
 cached_copy=variable;
 free(cached_copy);
 cached_copy=NULL;
 variable=cached_copy;
 release_lock();
}
else variable=cached_copy;

If you think this cannot happen, I defy you to explain to me why. What
standard or guarantee is being violated? How can POSIX-compliant or
C-compliant code break with this optimization? How can you say it can never
be faster?

> The example you cite is not valid for line 1289 and 1290.  The only
> write operation to 'ssl_comp_methods' occurs inside the locked region
> and after the text condition has been re-evaluated.

So what? The compiler may not be smart enough to realize that deserves to
undo the optimization and may instead refresh before the lock acquisition.

> So your pseudo-code serves a useful purpose to illustrate to others on
> the list about the concern you raise, but the concern itself is not
> relevant to the example given.

Nonsense. You are assuming that future compilers won't do things that are
perfectly legal for them to do just because it doesn't make sense today.
This type of code is not permitted (except as a platform-specific
optimization) at any of the places I have worked as a professional
programmer and violates most professional coding standards.

DS


______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           [EMAIL PROTECTED]



--

-Kyle H
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           [EMAIL PROTECTED]

Reply via email to