Re: X509 reference counts

1999-12-19 Thread Bodo Moeller

Geoff Thorpe [EMAIL PROTECTED]:

 The idea is that you hand those BIOs over to the SSL library, you
 usually don't keep pointers of your own.  SSL_free(ssl) will call
 BIO_free for each of them, but just once if bio_read == bio_write, so
 usually everything works as intended.  Obviously this is not the
 cleanest interface conceivable ...

 Well as I mentioned repeatedly on the SSL_SESSION discussion a while back,
 "the idea is that ..." seems to be common justification for deliberate and
 lazy inconsistency, both from the point of view of the library design, and
 from the point of view of library-usage.

No, there are also inconsistencies that cannot be justified this way :-)

(SSL_CTX_set_tmp_dh does not increase a reference count or copy the
structure, while SSL_CTX_set_tmp_rsa does.  There is an explanation
for this, but I would not call it a "justification".)

  In this case I think the
 programmer(s) just wanted to "throw the BIOs into the SSL and forget about
 them" ... ie. why free your reference when you can just pass it over to
 another structure and forget about your references? Whilst this "works" it
 makes a mockery of the so-called "reference counting mechanism" and leaves
 programmers constantly in a quandry as how to correctly manage references
 - and occasionally how to avoid race-conditions too!

Which is why, one day, we should provide variants of all those "set"s
and "get"s that somehow encode in there names whether reference counts
are increased or not.

[...]
 "handing over" references and such cutting of corners leads to these
 inflexibilities in the way these objects/structures can be used.

Agreed.

[...]  If you
 keep only the bio_ssl pointer, BIO_free_all will free everything
 (bio_read will be freed twice because it's also at the next_bio
 pointer, so its reference count goes down to zero).

 But I don't want to "keep only the bio_ssl pointer" ... for one thing,
 once that handshake is complete I want to examine the session negotiated,
 grab the peer certificate (both of which reveal further reference counting
 anomalies but that's another soap-box), and possibly other SSL things that
 have nothing to do with the BIO which is simply an IO device.

If you want to do those SSL things, then the easiest thing is not to
use a bio_ssl in the first place.  Many programs work directly with
SSL pointers without adding the extra BIO layer; probably this is why
the reference-counting problems inherent in bio_ssl's don't seem to
disturb too many people.  (Or, they avoid bio_ssl's *because* they
look strange.)  Similarly, using memory BIOs like ssl_test.s does
always looked like an ugly kludge to me, presumably because it is --
thus usually the read and write BIOs are the same and the reference
counting problems in this area are not visible to the library user.

[...]
 That's why the comment in bio.h says "WARNING WARNING ... make sure
 you are doing a BIO_free_all()" since SSLeay 0.9.0b.  BIO_free_all
 follows the next_bio chain, BIO_free does not; I'm not sure if
 BIO_free has any use at all except when you are sure that that
 next_bio == NULL in which case both functions do the same anyway.

 Hardly bodes well when one has to sift through the .h file to discover API
 design comments that themselves admit to being a tad obscure and, even
 after finding and reading, leaves us in some doubt as to how things work
 and what the point is.

Finding out how things work is usually possible given that the source
is available; I'm not always sure what the point is though.
__
OpenSSL Project http://www.openssl.org
Development Mailing List   [EMAIL PROTECTED]
Automated List Manager   [EMAIL PROTECTED]



Re: X509 reference counts

1999-11-10 Thread Dr Stephen Henson

Ramsay, Ron wrote:
 
 When you speak of breaking existing applications, I guess you mean
 applications based on the s_server code (or those that use sockets). As
 someone who has written code based on BIOs and not sockets, I am somewhat
 concerned that BIO reference counts are broken. Apparently I cannot make my
 code robust!
 
 We're still not at release 1.0. Please, can something be done?
 

Well the problem is this it doesn't just apply to BIOs and SSL it
applies to lots of other things as well.

There are two types of function that obtain and set one structure in
another. There is the XXX_get_YYY() and XXX_set_YYY().

The problem in this case is with the XXX_set_YYY(). Some of these up the
reference count of the passed object and some do not. The general logic
is that those that do are the ones where it makes sense to use the
passed object later and those that do not are where the object is
effectively "taken over". It could be argued that once you've set a BIO
to be used for SSL you aren't going to do anything with it later.

Unfortunately the logic isn't consistent and isn't documented on a
function by function basis.

Anyway the problem for existing code is this. If you have a function
that "takes over" an object then you'll have something like this:

x = X_new();
XXX_set_YYY(y);
... do something with x ...
X_free(x);

Whereas a reference count function will do this:

x = X_new();
XXX_set_YYY(y);
Y_free(y);
... do something with x ...
X_free(x);

If you change the logic then any code using a "take over" function will
need to be converted or it will have a memory leak. For a long running
mutlithreaded SSL server if this is not done it will grow until it eats
all available memory.

A "take over" function can't use the "reference count" version because
it will end up using freed memory and crash...

I think the best long term solution is to include a log of various "top
level" functions that keeps account of which function allocates what in
which thread. Then a thread can call thread_cleanup() or whatever and
automatically have its unfreed memory released or at least have it log
things in a more readable way. That way the logic can be changed and the
cleanup function can handle the mess.

Steve.
-- 
Dr Stephen N. Henson.   http://www.drh-consultancy.demon.co.uk/
Personal Email: [EMAIL PROTECTED] 
Senior crypto engineer, Celo Communications: http://www.celocom.com/
Core developer of the   OpenSSL project: http://www.openssl.org/
Business Email: [EMAIL PROTECTED] PGP key: via homepage.


__
OpenSSL Project http://www.openssl.org
Development Mailing List   [EMAIL PROTECTED]
Automated List Manager   [EMAIL PROTECTED]



Re: X509 reference counts

1999-11-10 Thread Geoff Thorpe

Hi,

On Tue, 9 Nov 1999, Dr Stephen Henson wrote:

 Well I don't know the reason myself but from looking at the code I'd
 guess that those functions that return an object with an upped reference
 count tend to be those where its likely the application will want to
 keep the object around. This things like X509_get_version() will return
 an internal pointer because you are more likely to just check the value
 and do nothing else whereas with other examples you might want to keep
 the object afterwards. 

I'd reached a similar guess ... perhaps this could be made at least mildly
sane by restricting this behaviour to "internal" functions (ones beginning
with a lower case name, eg. ssl_get_whatever rather than any
SSL_get_zombied_pointer) but it would still be a bit ugly.

Another way to perhaps legitimise when reference counting can be safely
ignored would be as follows; if it is contained within a function
operating on an object which already has a reference to the thing being
"get"'d. Eg. Inside a function operating on an SSL* object, an internal
function may examine the peer certificate without incrementing the
reference count - the SSL object has already obtained (upped) a reference
in the X509 object so the function could temporarily "get" and use an
X509* pointer without upping the reference count and all would be ok.

However, this really shouldn't be allowed to happen across function
boundaries, and should be restricted to those small internal managings
where it's provably OK not to grab another reference and would otherwise
drag down concurrency by having excess locking going on. Anything less
than that however, and the reference counting can't be trusted by the
programmer and all code will have to carefully hack around what OpenSSL
does (or doesn't do) in each particular case.

 Of course having said that this isn't consistent and I changed the
 X509_get_pubkey() behaviour myself to up the public key reference count.
 Tracing and fixing all the memory leaks this caused was not an easy task
 :-(

I've had to do exactly the same thing with what I believe is the worst
culprit I've found yet; SSL_get_session(). It returns an SSL_SESSION*
pointer that is, from the moment you receive it, unsafe. Even if you
immediately make a manual call to up its reference count you could still
get caught out by a context-switch race. If you don't even try this, and
instead just hold onto the pointer for any length of time then you really
invite trouble. I was using this for a client utility where I didn't want
a cache, just a record of the last successfully used session ... so upon
SSL_is_init_finished(), I would call SSL_get_session and store whatever
was retrieved. Then next time I start a handshake, I'd check that the
session was still within reasonable timeout limits before applying it to a
new client SSL. The problem is that if the session had expired and been
destroyed during that time, the code that stored the SSL_SESSION* pointer
will never know until it segfaults trying to do something with the
SSL_SESSION* pointer.

I'm using the following patch, anyone have any objections to making this
permanent?

--- clip here -
--- openssl-0-9-5-dev/ssl/ssl_sess.cThu Nov  4 21:21:02 1999
+++ openssl-0-9-5-geoff/ssl/ssl_sess.c  Wed Nov 10 15:39:34 1999
@@ -69,7 +69,16 @@
 
 SSL_SESSION *SSL_get_session(SSL *ssl)
{
-   return(ssl-session);
+   SSL_SESSION *sess;
+   /* Need to lock this all up rather than just use CRYPTO_add so that
+* somebody doesn't free ssl-session between when we check it's
+* non-null and when we up the reference count. */
+   CRYPTO_r_lock(CRYPTO_LOCK_SSL_SESSION);
+   sess = ssl-session;
+   if(sess)
+   sess-references++;
+   CRYPTO_r_unlock(CRYPTO_LOCK_SSL_SESSION);
+   return(sess);
}
 
 int SSL_SESSION_get_ex_new_index(long argl, char *argp, int (*new_func)(),
--- clip here -

 Changing the logic is problematical not just because of the effort
 involved. Many programs will expect the old behaviour. In the example of
 a multithreaded server its memory requirements will grow steadily if
 they don't free things up: this is one case where a multithreaded server
 is at a disadvantage compared to a fork() + exit() kind.

Which is probably why secure web-servers haven't blown the lid off OpenSSL
yet. (a) the session caching is generally overriden by a more carefully
managed shared-memory cache, and (b) the child processes can be allowed to
die off after n requests and thus bury any memory leaks that might have
amassed during that time.

I think the best way to attempt to keep the memory under control would be
to get the reference counting consistent and understandable, then fix the
kludged code that used it and is thus temporarily broken, and then we'd
have some hope of being able to keep OpenSSL's memory management stable
and usable without fork()s and exit()s (or ugly code).


Re: X509 reference counts

1999-11-09 Thread Geoff Thorpe

Hi,

On Tue, 9 Nov 1999, Dr Stephen Henson wrote:

 Yes horrible isn't it?
 
 I've also noticed wildly inconsistent behaviour. When you call something
 like XXX_get_YYY() you might get something which will last only as long
 as the parent and it shouldn't be freed at all or something which will
 persist by virtue of the upped reference count.

sure is horrible ... I played around with this quite a bit yesterday and
the further in I looked the weirder it got. I'd ask why it is this way but
I'm not sure I want to know. :-)

 Also the way reference counts are upped is not friendly. I suppose we
 should have functions like _up_reference_count(x); for use at an
 application level rather than the current stuff which messes round with
 structure internals.
 
 I'm currently looking into rewriting the verify code. I'm hoping to
 retain compatability with the old behavior except where it does things
 that are really silly. More consistency with the reference counts is
 certainly one issue...
 
 However if it works the way I hope it will then it will support things
 like proper chain verify and certificate trust settings so hopefully no
 one would *want* to do things the old way.

I suppose the problem is that the more code that is built on top of (and
around the skirting of) the existing code - the more there is that will
break if the underlying machinery is fixed up. Adding code that *does*
work logically will be (almost by definition) inconsistent with the
existing API.

I find reference counts extremely useful, but they do more harm than good
when implemented inconsistently. If I have an X509 object, and a call to
"add it to something" returns successfully, I expect that I can hang on to
my reference for as long or short a period of time as I choose before
freeing it and things will "just work". In my case I wanted to maintain my
pointers to ssl, bio_ssl, bio_read, and bio_write and free them all when I
was done with them (not at the same time I might add). One look at the
reference counts and I knew it was not going to be that straightforward.

The most obvious case is that by setting the read and write BIOs on an
SSL, one should be able to free the BIOs afterward and the SSL will still
have references to them. That is not the case - freeing the BIOs causes
the SSL to go bang (ssl/ssl_lib.c:SSL_set_bio). I also tracked down that
weirdness about connecting a BIO_SSL with an SSL, namely it only grabs a
reference to the SSL's read bio (ssl/bio_ssl.c:ssl_ctrl with
BIO_C_SET_SSL). It's described in a comment in crypto/bio/bio.h about line
415. I can't understand why it's done this way and I know if I carelessly
wade in there trying to change it I will unleash the dominos. The fact
that this jumps around between ssl/ and crypto/bio so much is enough of a
deterrent in itself ...

Cheers,
Geoff


--
Geoff ThorpeEmail: [EMAIL PROTECTED]
Cryptographic Software Engineer, C2Net Europehttp://www.int.c2.net
--
May I just take this opportunity to say that of all the people I have
EVER emailed, you are definitely one of them.

__
OpenSSL Project http://www.openssl.org
Development Mailing List   [EMAIL PROTECTED]
Automated List Manager   [EMAIL PROTECTED]



Re: X509 reference counts

1999-11-09 Thread Dr Stephen Henson

Geoff Thorpe wrote:
 
 Hi,
 
 On Tue, 9 Nov 1999, Dr Stephen Henson wrote:
 
  Yes horrible isn't it?
 
  I've also noticed wildly inconsistent behaviour. When you call something
  like XXX_get_YYY() you might get something which will last only as long
  as the parent and it shouldn't be freed at all or something which will
  persist by virtue of the upped reference count.
 
 sure is horrible ... I played around with this quite a bit yesterday and
 the further in I looked the weirder it got. I'd ask why it is this way but
 I'm not sure I want to know. :-)
 

Well I don't know the reason myself but from looking at the code I'd
guess that those functions that return an object with an upped reference
count tend to be those where its likely the application will want to
keep the object around. This things like X509_get_version() will return
an internal pointer because you are more likely to just check the value
and do nothing else whereas with other examples you might want to keep
the object afterwards. 

Of course having said that this isn't consistent and I changed the
X509_get_pubkey() behaviour myself to up the public key reference count.
Tracing and fixing all the memory leaks this caused was not an easy task
:-(

Changing the logic is problematical not just because of the effort
involved. Many programs will expect the old behaviour. In the example of
a multithreaded server its memory requirements will grow steadily if
they don't free things up: this is one case where a multithreaded server
is at a disadvantage compared to a fork() + exit() kind.

Steve.

-- 
Dr Stephen N. Henson.   http://www.drh-consultancy.demon.co.uk/
Personal Email: [EMAIL PROTECTED] 
Senior crypto engineer, Celo Communications: http://www.celocom.com/
Core developer of the   OpenSSL project: http://www.openssl.org/
Business Email: [EMAIL PROTECTED] PGP key: via homepage.

__
OpenSSL Project http://www.openssl.org
Development Mailing List   [EMAIL PROTECTED]
Automated List Manager   [EMAIL PROTECTED]



RE: X509 reference counts

1999-11-09 Thread Ramsay, Ron

When you speak of breaking existing applications, I guess you mean
applications based on the s_server code (or those that use sockets). As
someone who has written code based on BIOs and not sockets, I am somewhat
concerned that BIO reference counts are broken. Apparently I cannot make my
code robust!

We're still not at release 1.0. Please, can something be done?

 -Original Message-
 From: Dr Stephen Henson [SMTP:[EMAIL PROTECTED]]
 Sent: Wednesday, 10 November 1999 8:19
 To:   [EMAIL PROTECTED]
 Subject:  Re: X509 reference counts
 
 Geoff Thorpe wrote:
  
  Hi,
  
  On Tue, 9 Nov 1999, Dr Stephen Henson wrote:
  
   Yes horrible isn't it?
  
   I've also noticed wildly inconsistent behaviour. When you call
 something
   like XXX_get_YYY() you might get something which will last only as
 long
   as the parent and it shouldn't be freed at all or something which will
   persist by virtue of the upped reference count.
  
  sure is horrible ... I played around with this quite a bit yesterday and
  the further in I looked the weirder it got. I'd ask why it is this way
 but
  I'm not sure I want to know. :-)
  
 
 Well I don't know the reason myself but from looking at the code I'd
 guess that those functions that return an object with an upped reference
 count tend to be those where its likely the application will want to
 keep the object around. This things like X509_get_version() will return
 an internal pointer because you are more likely to just check the value
 and do nothing else whereas with other examples you might want to keep
 the object afterwards. 
 
 Of course having said that this isn't consistent and I changed the
 X509_get_pubkey() behaviour myself to up the public key reference count.
 Tracing and fixing all the memory leaks this caused was not an easy task
 :-(
 
 Changing the logic is problematical not just because of the effort
 involved. Many programs will expect the old behaviour. In the example of
 a multithreaded server its memory requirements will grow steadily if
 they don't free things up: this is one case where a multithreaded server
 is at a disadvantage compared to a fork() + exit() kind.
 
 Steve.
 
 -- 
 Dr Stephen N. Henson.   http://www.drh-consultancy.demon.co.uk/
 Personal Email: [EMAIL PROTECTED] 
 Senior crypto engineer, Celo Communications: http://www.celocom.com/
 Core developer of the   OpenSSL project: http://www.openssl.org/
 Business Email: [EMAIL PROTECTED] PGP key: via homepage.
 
 __
 OpenSSL Project http://www.openssl.org
 Development Mailing List   [EMAIL PROTECTED]
 Automated List Manager   [EMAIL PROTECTED]
__
OpenSSL Project http://www.openssl.org
Development Mailing List   [EMAIL PROTECTED]
Automated List Manager   [EMAIL PROTECTED]



X509 reference counts

1999-11-08 Thread Geoff Thorpe

Hi there,

I've been checking up on some memory management issues and have a couple
of questions. If anyone has any thoughts or views I'd very much appreciate
them.

X509_STORE_CTX_get_current_cert() and a few others do not up the reference
count on the X509 object returned. I was carefully X509_free()ing
everything I used and found this anomaly courtesy of some seg-faults. Some
of the functions in x509_vfy.c seem to modify the reference count so I'm
not sure if this is a bug or intentional. It's easy to workaround just by
not freeing, but I'd prefer to know if this should be changed rather than
missing it and having my programs develop leaks.

I'm getting extremely weird reference counts on SSLs and BIOs as below -
this is happening with 0_9_4 and with a recent snapshot of 0_9_5.

(1) I create two BIOs, a read (bio_read) and write (bio_write) that are
both BIO_s_mem()s, an SSL bio (bio_ssl), and an SSL object (ssl). At this
point they all have reference count 1. Ie. in crude ASCII-art form;

   +-+  --- bio_read
bio_ssl ---  | ssl |
   +-+  --- bio_write

(2) I add set the read and write BIOs to "ssl" with
SSL_set_bio(ssl, bio_read, bio_write). This does not change the
reference count on anything (everything remains at 1). I'd have expected
this to up the reference count on bio_read and bio_write.

(3) When I call SSL_free(ssl) later, it causes the reference counts for
all of ssl, bio_read and bio_write to be decremented. This is inconsistent
with (2) but seems the right way to operate.

(4) When I call BIO_set_ssl(bio_ssl, ssl, ...) it cause an increment in
ONLY the reference count of bio_read - this is the weirdest of all. I
would expect this to cause ssl, bio_read AND bio_write to all get
incremented but it only increments bio_read.

(5) Calling BIO_free(bio_ssl) decrements the reference counts for all of
ssl, bio_read, and bio_write. This is reasonable but inconsistent with (4)
and pretty much everything else.

I'm not quite sure what I should be doing with these and what the intended
behaviour of the reference mechanisms are - it would certainly appear that
they don't function correctly whatever the intention. Any thoughts?

Thanks in advance,
Geoff


--
Geoff ThorpeEmail: [EMAIL PROTECTED]
Cryptographic Software Engineer, C2Net Europehttp://www.int.c2.net
--
May I just take this opportunity to say that of all the people I have
EVER emailed, you are definitely one of them.

__
OpenSSL Project http://www.openssl.org
Development Mailing List   [EMAIL PROTECTED]
Automated List Manager   [EMAIL PROTECTED]



Re: X509 reference counts

1999-11-08 Thread Dr Stephen Henson

Geoff Thorpe wrote:
 
 Hi there,
 
 I've been checking up on some memory management issues and have a couple
 of questions. If anyone has any thoughts or views I'd very much appreciate
 them.
 
 X509_STORE_CTX_get_current_cert() and a few others do not up the reference
 count on the X509 object returned. I was carefully X509_free()ing
 everything I used and found this anomaly courtesy of some seg-faults. Some
 of the functions in x509_vfy.c seem to modify the reference count so I'm
 not sure if this is a bug or intentional. It's easy to workaround just by
 not freeing, but I'd prefer to know if this should be changed rather than
 missing it and having my programs develop leaks.
 
[ other examples deleted ]

Yes horrible isn't it?

I've also noticed wildly inconsistent behaviour. When you call something
like XXX_get_YYY() you might get something which will last only as long
as the parent and it shouldn't be freed at all or something which will
persist by virtue of the upped reference count.

Also the way reference counts are upped is not friendly. I suppose we
should have functions like _up_reference_count(x); for use at an
application level rather than the current stuff which messes round with
structure internals.

I'm currently looking into rewriting the verify code. I'm hoping to
retain compatability with the old behavior except where it does things
that are really silly. More consistency with the reference counts is
certainly one issue...

However if it works the way I hope it will then it will support things
like proper chain verify and certificate trust settings so hopefully no
one would *want* to do things the old way.

Steve.
-- 
Dr Stephen N. Henson.   http://www.drh-consultancy.demon.co.uk/
Personal Email: [EMAIL PROTECTED] 
Senior crypto engineer, Celo Communications: http://www.celocom.com/
Core developer of the   OpenSSL project: http://www.openssl.org/
Business Email: [EMAIL PROTECTED] PGP key: via homepage.

__
OpenSSL Project http://www.openssl.org
Development Mailing List   [EMAIL PROTECTED]
Automated List Manager   [EMAIL PROTECTED]