Re: c verses c++

Terrell Larson Thu, 23 Dec 1999 17:30:36 -0800
I'll go out on a limb and give an answer to this question that I know some will
disagree with.  The answer is both yes and no.  I'll warn you that this email
drags on a bit and some of my ideas may be incorrect.  If so - will those who
come across these errors please advise me.  Unlike some - when faced with new
information I will modify my opinions.  :-)


First off.  Since OpenSSL is written in C and not C++, you can link it easily
into a web server regardless of whether the server is written in C or C++.  This
means language is not really an issue.  On the other hand, if OpenSSL were
written in C++, then people like me would have to change the file extension to
.cpp rather than .c when we run it through the GNU gcc compiler on systems such
as Linux, and this will increase the code by about 50K (I tested it).  If you are
on an NT server it probably doesn't matter, because M$ has probably not bothered
to support both C++ and the C language subset and the VCC compiler _probably_
spits out the same code regardless.  Now - I am guessing on this because I do not
use VCC on NT.  On NT I use Inprise (Borland) Professional Builder 4.0 and
everything I compile on the NT platform is C++ regardless.  And at this point I
don't really care anyway, so I won't be digging into the bowels of Borland's
compiler.
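
As a rough illustration of why a C library links into either kind of server,
here is a minimal sketch of the usual header guard.  The name demo_checksum is
made up for this example and is not an OpenSSL function; the point is only that
a C++ caller sees the declarations as extern "C" while a C caller sees plain C.

```c
#include <stddef.h>

/* Header part (hypothetical demo_lib.h): the guard makes the same
 * declarations usable from C and from C++ translation units. */
#ifdef __cplusplus
extern "C" {
#endif

unsigned long demo_checksum(const unsigned char *data, size_t len);

#ifdef __cplusplus
}
#endif

/* Implementation part (hypothetical demo_lib.c): plain C, compiled once
 * as C and linked into a server written in either language. */
unsigned long demo_checksum(const unsigned char *data, size_t len)
{
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < len; i++)
        sum += data[i];          /* simple byte sum, illustration only */
    return sum;
}
```

A C++ server would include the header and call demo_checksum() like any other
function; no change of file extension is needed on the library side.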

OK - language issues aside.  

The design of the webserver makes a considerable difference.  I'll talk
exclusively about the Linux platform, firstly because I know something about it
and secondly because I have no intention of working on the NT server side of
affairs.  Some may disagree but IMHO Linux in the server market is head and
shoulders better than NT.  And in my case I own the company and make the
decisions so that is the end of it.

There are basically three _workable_ methods I know of which can be used to implement 
a server:

1) Threads

2) Forks with a separate process for each connection

3) Forks with multiple connections handled by each process.

I have been advised that threads are viable.  I personally was not able to find
enough documentation on threads and I dropped the idea of researching any further
because I found I could do everything I wanted to do with a fork - and a fork is
nice and easy and relatively well implemented in Linux.

Ok - I'll talk about implementation methods 2 and 3.  Apache uses #2 from what I
can tell.  This means that a separate process gets dispatched to handle each
connection and once that connection is made - that process is "busy" until the
response has been satisfied.  Now - people will step forth and say:  the web
server is stateless... once the document has been shipped out the nic - the
process can handle any other request whether related or not.  This is true - but
it is also not what I'm talking about.

Let's say that the Apache web request needs a file to be read from the hard drive
(it could just as easily be a database request via PHP3 for instance).  Well -
the process sees the request and issues a read from the hard drive.  As soon as
this happens the operating system blocks the process and the process will remain
blocked until the I/O is satisfied, at which point it can fire the data out
through the nic and advise Apache that it is ready to handle another request.

So with this design - the present design of OpenSSL works ok.  OpenSSL is linked
into the Apache server via an Apache module and it handles the connection and
never needs to worry about another request once it starts its connection, because
the Apache process was never intended to do more than one thing at a time anyway.
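
A minimal sketch of method #2, assuming one forked child per connection.  This
is not Apache's actual code; serve_in_child is a hypothetical helper and the
fixed "OK" reply stands in for a real response.  The child is free to block in
read() and write(), because only that one connection waits on it.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that serves exactly one connection and then exits.
 * Returns the child's pid to the parent, or -1 if fork() failed. */
pid_t serve_in_child(int conn_fd)
{
    pid_t pid = fork();

    if (pid == 0) {                              /* child */
        char buf[256];
        ssize_t n = read(conn_fd, buf, sizeof buf);   /* may block */

        if (n > 0) {
            const char *reply = "OK\n";          /* placeholder response */
            if (write(conn_fd, reply, strlen(reply)) < 0)
                _exit(1);
        }
        close(conn_fd);
        _exit(0);
    }

    /* Parent (or fork failure): drop our copy and go back to accepting. */
    close(conn_fd);
    return pid;
}
```

The parent would call this once per accepted socket and reap finished children
with waitpid().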

Ok - let's look at #3

In this case we have a server design where the webserver forks just like apache does - 
BUT - each process listens on and 
handles many clients.  This means that when a request comes in from a client - the 
process must dispatch an I/O in a fashion 
such that it DOES NOT BLOCK.  If it were to block - even for the few milliseconds that 
an I/O was taking place - then it would 
NOT be in a position to handle requests during this time.  Such a web server might 
have several disk I/O's running 
simultaneously.
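
Here is a rough sketch of the non-blocking pattern for sockets and pipes, using
fcntl() and select().  The helper names set_nonblocking and wait_readable are
made up for this example.  Note that select() reports ordinary disk files as
always ready, so disk reads need a different mechanism; this only covers the
client-connection side.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Switch a descriptor to non-blocking mode so read()/write() return
 * at once (with -1 and EAGAIN) instead of stalling the process. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Wait up to timeout_ms for any of the n descriptors to become readable.
 * Sets ready[i] to 1 for each readable fds[i] and returns how many are
 * ready, 0 on timeout, or -1 on error. */
int wait_readable(const int *fds, int n, int timeout_ms, int *ready)
{
    fd_set rset;
    struct timeval tv;
    int i, rc, maxfd = -1;

    FD_ZERO(&rset);
    for (i = 0; i < n; i++) {
        FD_SET(fds[i], &rset);
        if (fds[i] > maxfd)
            maxfd = fds[i];
        ready[i] = 0;
    }
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(maxfd + 1, &rset, NULL, NULL, &tv);
    if (rc <= 0)
        return rc;
    for (i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &rset))
            ready[i] = 1;
    return rc;
}
```

The mainline of such a server calls wait_readable() in a loop and only touches
the descriptors that are flagged, so one slow client never holds up the rest.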

I wrote a server like this just a short while ago and I'm scratching my head over
how to integrate OpenSSL into it.  The problem is that a great deal of CPU time
needs to be spent in the crypto engine and during this time the process would
effectively be blocked, because it certainly cannot issue a select() in the
mainline if the function that was called to do crypto has not returned yet.  This
is further complicated by the fact that the BIO routines want to do I/O way down
in the bowels of OpenSSL and they _might_ be talking to a client who starts a
connection just as his modem disconnects and it has to time out.

You can see just how ugly this might be.

Another factor is that a heavily loaded web server might want to have a group of
support CPUs to grind through the crypto.  The numbers I have been able to get
seem to indicate that the MAX crypto throughput on a HIGH end Pentium is about
the same as a T1 line will carry - and at this point the CPU will be saturated.
So - if the crypto is performed in the web server and the web site involves say a
database like Oracle - AND we want to use something like PHP3 to interface to the
database - then that machine is going to saturate long before T1 is achieved and
I'll bet that it will saturate at probably 1/4 of T1 or less.

So - if you are anticipating very high loads - like a few million hits per day - then 
you ain't gonna be able to support it with a 
single CPU running OpenSSL even if you have a really fast server designed along the 
lines of option #3.

What you really need to be able to do is receive the messages into the main
server and then dispatch them to a backend CPU to run the crypto, and as that CPU
finishes and prepares a response - pass it back to the main server so it can be
relayed to the client.  In this way - 2, 3, 4 or even 100 or more cheap CPUs
could be harnessed to support the server.
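
To make the dispatch idea concrete, here is a small sketch under some loud
assumptions: the "backend" is a forked local process reached over a
socketpair() rather than a separate machine, and fake_crypto is a one-byte XOR
placeholder, NOT real cryptography and not an OpenSSL call.  The names
start_worker and fake_crypto are invented for this example.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Placeholder for the expensive step.  NOT real cryptography. */
static void fake_crypto(unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        buf[i] ^= 0x5A;
}

/* Fork a worker that reads a job, transforms it and writes the result
 * back.  Stores the parent's end of the channel in *chan and returns
 * the worker's pid, or -1 on failure. */
pid_t start_worker(int *chan)
{
    int sv[2];
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {                          /* worker process */
        unsigned char buf[4096];
        ssize_t n;

        close(sv[0]);
        while ((n = read(sv[1], buf, sizeof buf)) > 0) {
            fake_crypto(buf, (size_t)n);
            if (write(sv[1], buf, (size_t)n) != n)
                _exit(1);
        }
        _exit(0);                            /* parent closed the channel */
    }
    close(sv[1]);
    *chan = sv[0];
    return pid;
}
```

The main server would write a job to the channel, add the channel to its
select() set next to the client sockets, and read the result when it shows up as
readable.  With a TCP socket in place of the socketpair the same shape would
reach a cruncher on another machine.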

Now - why do it like this?  Answer: economics.  Let's say you set up a server
with say 100 GB RAID 5 and a nice fast CPU and loads of RAM.  Let's also say that
during a 24 hour period the crypto engine will burn up say 12 hours of CPU time.
Well - the common contention is that the machine will time share and multitask
and blah blah so that you'll be able to get really good performance.  The real
answer to this is that you PROBABLY WON'T.  Let's say that machine can handle oh
say 100 disk I/O's per second.  It can only do this if the CPU is able to run the
code that schedules the I/O - if that CPU is tied up running crypto - then the
I/O may not get scheduled and the consequence is that the I/O capacity of that
machine may be cut by 50% or even more.  It is for reasons like this that some
machines come with 16 CPUs or more.

Now look what happens if your hit rate were to increase 10x.  In this example you
needed 12 hours of CPU for crypto per 24 hour period and with 10x the workload
you are going to have to go buy 5 more machines... and this would mean that 100%
of the CPU on all 6 machines would be allocated to crypto and there is no way
that anything else would get done.  Ok - so you go buy say ANOTHER 6 machines and
now you figure you can handle the workload.  On each machine you'll probably have
to throw in 100 GB of RAID mass storage too - unless you can network to a
database server.  But I used Oracle as an example here and, according to Sun,
SQL*NET will generate a separate TCP/IP packet for each field transferred to or
from the RDBMS, so your network will go nuts with the traffic and your CPUs will
be wasting GOBS of cycles generating the overhead.

Any way you look at it - I expect that you'll be buying quite a few 100 GB RAID
subsystems and they are going to set you back a LOT of bucks.

--------------------

Now - contrast this to a redesign of OpenSSL where a few daemon processes are put
in a fast processor that can in fact be diskless and boot off the network.  The
cost of one of these crunchers is less than $1000 and if a person wanted they
could probably use say a Celeron 450 on a motherboard with a nic and say 64 MB
RAM and do the whole thing for maybe even less than $500.00.

If you need 10, it'll set you back $5,000 which is a drop in the bucket compared
to what you'll be looking at if you try to clone your webservers ten times over.
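
For anyone who wants to redo the arithmetic with their own numbers, here is a
small sketch.  It assumes each machine has a single CPU that contributes at most
24 CPU-hours per day with no headroom; the function names are invented for this
example.  With the figures above, 12 CPU-hours a day times 10 is 120 CPU-hours a
day, which needs at least five such machines running flat out on crypto alone,
and ten crunchers at $500 come to $5,000.

```c
/* CPU-hours of crypto per day -> whole machines needed, assuming one CPU
 * per machine and therefore at most 24 CPU-hours per machine per day. */
unsigned machines_needed(unsigned cpu_hours_per_day)
{
    return (cpu_hours_per_day + 23u) / 24u;   /* round up */
}

/* Total hardware cost of n identical crunchers at unit_cost each. */
unsigned long cruncher_cost(unsigned n, unsigned long unit_cost)
{
    return (unsigned long)n * unit_cost;
}
```

This counts CPU time only; it leaves out the I/O scheduling losses described
earlier, which would push the full-server count higher.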

--------------------

I hope this answers your question.




On Thu, 23 Dec 1999 20:13:05 GMT, Niels Heyvaert wrote:

>Calm down. I think we are missing the point here. (I don't want to sound 
>rude)
>My original question still hasn't been answered yet. So please, I'm asking 
>you, this is really important to me. I'm an ICT engineer and I graduate this 
>year. This question is part of my thesis. I think you understand what that 
>means...
>
>Now, not looking at the language used, is it possible to "painlessly" migrate 
>the openssl project into a dedicated webserver that is already up and 
>running?
>
>The code is available at www.goahead.com if anyone is interested...
>
>Be kind and stick to the point, please.
>
>Thank you.
>
>Niels.
>
>>From: "Terrell Larson" <[EMAIL PROTECTED]>
>>Reply-To: [EMAIL PROTECTED]
>>To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
>>Subject: c verses c++
>>Date: Thu, 23 Dec 99 07:45:23 -0600
>>
>>So???  what are you saying?  In general any good design and implementation 
>>is better than a bad one regardless of the
>>choice of the implementation language.  It appears to me that you are 
>>accusing the OpenSSL developers of producing a
>>"hack"...  Or did I interpret what you said properly?
>>
>>On Thu, 23 Dec 1999 09:56:13 +0100, Rene G. Eberhard wrote:
>>
>> >
>> >> Why would you transform it to C++.  It adds about 50K to the
>> >> executable on linux GCC and runs slower.  I can't see much reason
>> >> to use C++ for a library like OpenSSL
>> >
>> >Your statement is not generally applicable! A C++ binary may be
>> >a bit larger than a C binary. Whether it runs slower depends
>> >on the design. A proper OO design in C++ is (in general) faster
>> >and more stable than a C hack without a design.
>> >
>> >Regards Rene
>
>______________________________________________________________________
>OpenSSL Project                                 http://www.openssl.org
>Development Mailing List                       [EMAIL PROTECTED]
>Automated List Manager                           [EMAIL PROTECTED]

