On Jan 28, 2007, at 9:58 PM, Steve wrote:
> Well, I appreciate the feedback on the overall design model, and I
> agree: polling a listener instead of setting the listener to blocking
> and running it in its own thread is probably a bad idea. It's on my
> list to fix soon.
> The rest of the arguments against threading it this way don't
> particularly apply as I see it, since the server app is supposed to
> be running on a multicore or SMP setup. I'd really like to maximize
> CPU utilization as well as minimize the amount of bandwidth.
The memory overhead of kernel threads applies no matter how many CPUs
you have in your system. Furthermore, internet chat is not a
fundamentally compute-intensive problem. There's nothing to compute
at all until data arrives, and the only processing involved is to do
some minimal interpretation of the incoming data and send it right
back out again. Worrying about multiple cores is serious premature
optimization, and premature optimization almost never optimizes the
real bottlenecks.
If you continue in your quest to create a number of kernel threads
that scales linearly with the number of active chat sessions, your
architecture WILL NOT SCALE past a certain level, and that level will
be far lower than if you used a single thread on a single CPU.
If you absolutely MUST utilize multiple processors (and I sincerely
doubt it would ever become necessary unless you completely blundered
the design or somehow convinced a remarkable number of people to chat
on your server) it would be more appropriate to have approximately
the same number of kernel threads as CPU cores. You could easily
extend an event-driven select()-style server to this kind of
architecture by having the events dispatched to worker threads, each
of which would handle multiple communication channels. You might
also consider an IRC-like model, where you can scale through the use
of multiple distinct servers.
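To make the worker-thread idea concrete, here is a rough sketch of a
single select() loop handing ready descriptors to a small fixed pool
of pthreads through a locked queue. The names (work_queue,
worker_main, dispatch_loop) are mine, all socket setup and error
handling is omitted, and it's a shape to aim at rather than something
to paste in:

#include <pthread.h>
#include <sys/select.h>
#include <unistd.h>
#include <deque>
#include <vector>

// Descriptors with data waiting, shared between dispatcher and workers.
static std::deque<int> work_queue;
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  queue_cond = PTHREAD_COND_INITIALIZER;

// Each worker services whichever descriptor comes off the queue; a
// handful of these can cover thousands of connections.
static void *worker_main(void *)
{
    for (;;) {
        pthread_mutex_lock(&queue_lock);
        while (work_queue.empty())
            pthread_cond_wait(&queue_cond, &queue_lock);
        int fd = work_queue.front();
        work_queue.pop_front();
        pthread_mutex_unlock(&queue_lock);

        char buf[4096];
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            // ...interpret the message and hand it to the dispatcher
            // for delivery to the interested clients...
        }
    }
    return 0;
}

// Start roughly one worker per core before entering the loop below.
void start_workers(int count)
{
    for (int i = 0; i < count; ++i) {
        pthread_t tid;
        pthread_create(&tid, 0, worker_main, 0);
    }
}

// One dispatcher thread blocks in select() and queues whatever became
// readable; no sleeping, no polling interval.
void dispatch_loop(const std::vector<int> &clients)
{
    for (;;) {
        fd_set readable;
        FD_ZERO(&readable);
        int maxfd = 0;
        for (size_t i = 0; i < clients.size(); ++i) {
            FD_SET(clients[i], &readable);
            if (clients[i] > maxfd) maxfd = clients[i];
        }
        if (select(maxfd + 1, &readable, 0, 0, 0) <= 0)
            continue;

        pthread_mutex_lock(&queue_lock);
        for (size_t i = 0; i < clients.size(); ++i)
            if (FD_ISSET(clients[i], &readable))
                work_queue.push_back(clients[i]);
        pthread_cond_broadcast(&queue_cond);
        pthread_mutex_unlock(&queue_lock);
    }
}

A real version would also take a descriptor out of the select() set
while a worker is servicing it, but the shape is the point: the number
of threads tracks the number of cores, not the number of chatters.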
> The traditional method of the server iterating through a large list
> of clients just doesn't seem particularly efficient to me, especially
> if you have different clients interested in different data.
Perhaps you're thinking of something different than what I thought
you were. I thought you meant that you would create lists of clients
who were interested in certain kinds of messages, so that when one of
those came in, the event would be dispatched to everyone on that list.
If you instead meant, as you seem to now, that there is a single
global list of clients and you must iterate through them all to
discover where to send messages... well, yes, that's not very
optimal. Don't do that, but don't do what you're planning on, either.
Don't poll. Register interested clients with the message dispatcher,
so a message can be dispatched precisely where it is supposed to go
as soon as it comes in. You should do it this way whether you use a
lot of threads or not.
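A minimal sketch of what I mean, with made-up names (MessageDispatcher,
Client) and std::string message types standing in for whatever your
real message categories are:

#include <iostream>
#include <map>
#include <set>
#include <string>

// Stand-in for a real connection; delivery here just prints.
struct Client {
    std::string name;
    void deliver(const std::string &msg) {
        std::cout << name << " <- " << msg << std::endl;
    }
};

class MessageDispatcher {
public:
    // A client registers once for each kind of message it cares about.
    void subscribe(const std::string &type, Client *c) {
        subscribers_[type].insert(c);
    }
    void unsubscribe(const std::string &type, Client *c) {
        subscribers_[type].erase(c);
    }
    // Delivery touches only the clients that asked for this kind of
    // message; no timers, no sweep over a global client list.
    void dispatch(const std::string &type, const std::string &msg) {
        std::set<Client*> &subs = subscribers_[type];
        for (std::set<Client*>::iterator it = subs.begin();
             it != subs.end(); ++it)
            (*it)->deliver(msg);
    }
private:
    std::map<std::string, std::set<Client*> > subscribers_;
};

int main() {
    Client alice = { "alice" };
    Client bob   = { "bob" };
    MessageDispatcher d;
    d.subscribe("#utah", &alice);
    d.subscribe("#utah", &bob);
    d.subscribe("admin", &alice);
    d.dispatch("#utah", "hello, channel");    // goes to alice and bob
    d.dispatch("admin", "server restarting"); // goes to alice only
    return 0;
}

Subscription happens once, when a client declares its interests; after
that a message goes straight to its subscribers the moment it arrives,
which is exactly the property polling can never give you.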
> Or, as in this case, you pay once for a server upgrade but
> continuously for bandwidth. The 250 ms sleep time is meant for times
> when lots of activity needs to be reported to the client; the default
> could be much longer.
You're talking about these time intervals and hardware as though you
have an idea of how your proposed system will actually perform vs.
other architectures as it scales. Unless you have done some sort of
simulation, you really don't, and you're prematurely complicating
things.
> But I want to thank everyone for the feedback on the pthreads
> question, which was the crux of my issue. I believe I've now found a
> more compact way of doing what I was doing.
> Basically we have the runObject function, but instead of sleeping and
> calling runObject(data) again, we call a member function run() of the
> object handed to runObject, and the run() function is self-scheduling.
> So no more thunking after the first time :D
> Here's a question, though: could runObject be handled as a template
> instead? It seems to me a template would allow it to run pretty much
> anything handed to it, without needing to change the cast for each
> object type.
> Thanks again for the advice!
> P.S. Is there a better threading library than pthreads?
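On the template question above (before I get to the P.S.): probably,
yes. I have to guess at what runObject looks like, but assuming it
currently takes a void* and casts it before kicking off a thread,
something like the sketch below pushes the one unavoidable cast into a
single templated trampoline, so any object with a run() method works
without a per-type recast. start_routine and the Chatter type are made
up for illustration:

#include <pthread.h>
#include <iostream>

// pthread_create() only accepts a void*, so the template does the one
// cast, in one place, always back to the right type.
template <typename T>
void *start_routine(void *arg)
{
    static_cast<T*>(arg)->run();   // T only has to provide run()
    return 0;
}

// Works for any object with a run() member; no per-type thunks.
template <typename T>
pthread_t runObject(T &obj)
{
    pthread_t tid;
    pthread_create(&tid, 0, &start_routine<T>, &obj);
    return tid;
}

// Example object type, purely for illustration.
struct Chatter {
    void run() { std::cout << "chatting...\n"; }
};

int main()
{
    Chatter c;
    pthread_t t = runObject(c);
    pthread_join(t, 0);
    return 0;
}

The cast from void* still happens, of course; the template just
guarantees it happens exactly once and always to the type the object
actually has.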
Pthreads is actually very good, for what it is, which is a fairly low-
level standardized interface to basic multithreading primitives.
It's just that multithreaded programming with only the basic
multithreading primitives is very hard, and I wouldn't recommend that
someone who doesn't have a lot of experience with it try to use it
to build software that is supposed to be robust and scalable.
There may be higher-level threading libraries and common practices
for multithreaded C++ that alleviate some of the difficulties, but a
C++ expert would have to fill you in on those. For my part, I
recommend you stay away from threads or make do with a very small
number of threads with very tightly controlled points of
communication. Such programs are usually faster, more reliable, and
scale better anyway.
--Levi