On Fri, 28 Aug 2009, Rainer Gerhards wrote:

>> -----Original Message-----
>> From: [email protected] [mailto:rsyslog-
>> [email protected]] On Behalf Of Rainer Gerhards
>>
>>>
>>> that would be hard to do for a couple of reasons
>>>
>>> at 5-10 times slower the system may not be able to keep up (even with
>>> the 'slower' afternoon traffic)
>>>
>>> this is running on a very hardened production server, getting valgrind
>>> installed there would require permission from the SVP level.
>>>
>>
>> understood. So let me see what else I can come up with :)
>
> I tried a lab yesterday where I sent roughly 1.5 billion messages (based on
> what I saw in the debug logs). Unfortunately, no abort happened. However, my
> traffic pattern was continuous traffic of the same message.
>
> So I am now going to create some new tooling that permits me to mimic your
> traffic pattern much better. That will probably take until early next
> week. To make this really work, it would be really useful if you could send
> me some complete messages from your environment. I suggest forwarding them
> via private mail. I hope this is possible.
>
> Also, it would be good if you could build with --enable-rtinst --enable-debug and try
> out that version on your machine. I am a bit concerned about the speed of the
> resulting executable, it may be too slow. You do not need to run it in debug
> mode itself. These options (especially --enable-debug) will activate in-depth
> runtime checks (assert, will abort when something wrong happens) and my hope
> is that they will catch the bug closer to the root cause. If so, I would need
> the gdb abort info (actually enabling debug output would be an option some
> time later).
>
> Please let me know what would be OK with you.

I will give this a try.

I was going to suggest that, since the message is getting corrupted, it
may make sense to make a temporary branch that has multiple message
buffers and, at various points in the message processing, copies the
message into one of them. When the system crashes I will be able
to look at the core and see where the message is getting corrupted.
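
To make the idea concrete, here is a rough sketch of what I have in mind.
This is not rsyslog code: the names (snap_msg, snap_ring, SNAP_SLOTS,
SNAP_MAXLEN) and the stage labels are made up for illustration, and it
assumes a single thread writes to the ring. A real patch would need one
ring per worker thread or a lock.

```c
#include <assert.h>
#include <string.h>

#define SNAP_SLOTS  8      /* hypothetical: snapshots kept before wrapping */
#define SNAP_MAXLEN 2048   /* hypothetical: max bytes copied per snapshot */

struct msg_snapshot {
    const char *stage;          /* label of the processing stage */
    size_t      len;            /* number of bytes actually copied */
    char        data[SNAP_MAXLEN];
};

/* Static storage, so the snapshots end up in the core file and can be
 * inspected from gdb after a crash. */
static struct msg_snapshot snap_ring[SNAP_SLOTS];
static unsigned snap_next;      /* total snapshots taken so far */

/* Copy the current state of the message into the next ring slot.
 * NOT thread-safe as written. */
static void snap_msg(const char *stage, const char *msg, size_t len)
{
    struct msg_snapshot *s = &snap_ring[snap_next % SNAP_SLOTS];

    assert(msg != NULL);
    if (len >= SNAP_MAXLEN)     /* truncate rather than overflow */
        len = SNAP_MAXLEN - 1;
    s->stage = stage;
    s->len = len;
    memcpy(s->data, msg, len);
    s->data[len] = '\0';
    snap_next++;
}
```

A call such as snap_msg("after parse", buf, buflen) would go at each point
of interest. After a crash, "print snap_ring" in gdb on the core should show
the last few copies, and the first stage whose copy looks wrong narrows down
where the corruption happens.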

I will see about doing a tcpdump at the time that I do this and send it to 
you (I'll need to check with management, but since we have a contract in 
place for other reasons I think we can do this)

I can't do this late on a Friday, but I should be able to do this Monday
afternoon.

David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com