Greetings,

>> hmm, this could be locking overhead as well. One thing that you did early
>> in v5 (I don't think it made it into v4) was to allow the UDP receiver to
>> insert multiple messages into the queue at once. That made a huge
>> difference.
>
> No, I think that was something I did to both versions. At some time, I did
> optimizations to both v4 and v5, things like reducing copies, reducing malloc
> calls and so on. I am pretty sure submission batching was among them.

I agree with David, actually.  While multiple TCP threads on the input
side would certainly help, I believe locking overhead is the more
likely culprit behind the inability to fully utilize a multi-core
machine with a single instance of rsyslog.  In my experience the input
thread was relatively busy, but the thread itself wasn't hitting a CPU
bottleneck.  Reducing the latencies around queuing and context
switching is probably the best place to spend time if the goal is
improved performance.  The earlier investigations into lockless queues,
combined with some batching, may help address these.  As it stands, I
don't regularly see specific threads hitting CPU bottlenecks (assuming
top -H is accurate).
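
To make the batching point concrete, here is a rough sketch in Python.
It is not rsyslog code: the MessageQueue class, the submit_* helpers,
and the batch size of 64 are all made up for illustration.  It only
counts how often the enqueue path has to take the queue mutex, which
is the cost that batching is meant to amortize.

```python
import threading
from collections import deque


class MessageQueue:
    """Toy queue guarded by a single mutex (hypothetical, not rsyslog's queue)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = deque()
        # Number of times an enqueue call acquired the lock.
        self.lock_acquisitions = 0

    def enqueue(self, msg):
        # Per-message submission: one lock round trip for every message.
        with self._lock:
            self.lock_acquisitions += 1
            self._items.append(msg)

    def enqueue_batch(self, msgs):
        # Batched submission: one lock round trip for the whole batch.
        with self._lock:
            self.lock_acquisitions += 1
            self._items.extend(msgs)

    def __len__(self):
        with self._lock:
            return len(self._items)


def submit_individually(queue, msgs):
    for msg in msgs:
        queue.enqueue(msg)


def submit_batched(queue, msgs, batch_size):
    # Slice the input into chunks of at most batch_size messages.
    for start in range(0, len(msgs), batch_size):
        queue.enqueue_batch(msgs[start:start + batch_size])


messages = [f"msg {i}" for i in range(1000)]

single = MessageQueue()
submit_individually(single, messages)

batched = MessageQueue()
submit_batched(batched, messages, batch_size=64)

print(single.lock_acquisitions, batched.lock_acquisitions)  # prints "1000 16"
```

Both queues end up holding the same 1000 messages, but the batched
path takes the lock 16 times instead of 1000.  This says nothing about
how much wall-clock time that saves in a real multi-threaded rsyslog;
it only shows where the reduction in lock traffic would come from.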

Also, if that is the problem (queues and context switching), adding
further division of work into imtcp may actually make the problem
worse.  That said, I'm not against reducing possible bottlenecks to
get into the 1-10 gig input levels (at which this would probably
become an issue) - but I think the queues should be more closely
examined first.

-Aaron
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
