Status: New
Owner: ----
Labels: Component-Diameter Type-Enhancement Priority-Medium Version-1.4.0 Release-Type-FINAL Roadmap-Fix

New issue 2446 by [email protected]: Unknown realm name [null] error when sending diameter response under high concurrency
http://code.google.com/p/mobicents/issues/detail?id=2446

What steps will reproduce the problem?
1. Use a Diameter client to send a large number of concurrent requests (e.g. 10 threads, 10000 requests per thread) to the SLEE Diameter RA.
2. After a while, an error is logged indicating that the response message could not be sent because the realm name is null.


What is the expected output? What do you see instead?
The realm name is known to the stack, so this error should not occur.

What version of the product are you using? On what operating system?
jdiameter-impl-1.5.4.1-build415.jar on Solaris

Please provide any additional information below.
An investigation shows that this is caused by the logic in RouterImpl and the way it looks up the destination host and realm for a response by hop-by-hop ID. The IDs are stored in a Map when a request comes in and retrieved when a response is sent. The problem is that the code prunes the first half of the map when its size reaches 10240.
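
To make the failure mode concrete, here is a minimal sketch of the behaviour described above. It is a simplified model, not the actual RouterImpl code: the class name, method names and the use of an insertion-ordered LinkedHashMap are assumptions made for illustration. Only the two numbers (10240 and 5120) come from the observation above.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the described lookup table; NOT the real RouterImpl.
class HopByHopTableModel {
    static final int MAX_SIZE = 10240;   // size at which pruning triggers (from the report)
    static final int PRUNE_COUNT = 5120; // oldest entries dropped per prune (from the report)

    // Insertion-ordered, so iteration starts at the oldest request.
    private final Map<Long, String[]> requests = new LinkedHashMap<>();

    // Called when a request arrives: remember where its answer must go.
    synchronized void onRequest(long hopByHopId, String host, String realm) {
        if (requests.size() >= MAX_SIZE) {
            // Drop the oldest half, whether or not those requests were answered.
            Iterator<Long> it = requests.keySet().iterator();
            for (int i = 0; i < PRUNE_COUNT && it.hasNext(); i++) {
                it.next();
                it.remove();
            }
        }
        requests.put(hopByHopId, new String[] {host, realm});
    }

    // Called when the answer is sent: returns the realm, or null if the entry was pruned.
    synchronized String realmForAnswer(long hopByHopId) {
        String[] entry = requests.get(hopByHopId);
        return entry == null ? null : entry[1];
    }
}
```

In this model, once request number 10241 arrives, realmForAnswer returns null for any of the first 5120 requests that has not been answered yet, which would surface as the "Unknown realm name [null]" error.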

Suppose that, under high load on a powerful server, 1000 requests per second come into the stack. After about 10s the map reaches 10240 entries and the first 5120 hop-by-hop IDs are removed. Responses to requests received in the first 5s that took longer than 5s to process then fail to be sent. Although this is not common, it is undesirable to lose messages ungracefully like that under load. I recommend changing the pruning logic to remove the first 10% of the current size of the Map when it exceeds 20*1024 entries (as opposed to the current 5120 entries when it reaches 10240). This still keeps the map from growing without bound and gives a larger window under high load.
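
The two policies can be compared with a small helper such as the one below. This is only a sketch: PruningPolicy and its methods are hypothetical names, "exceeds" is treated as "reaches" for simplicity, and the retention figure assumes a steady arrival rate with no entries leaving the map between prunes. Under those assumptions, at 1000 requests per second the current policy guarantees an entry about 5.1s in the map, while the proposed one guarantees about 18.4s.

```java
// Hypothetical helper for comparing pruning policies; names are illustrative only.
class PruningPolicy {
    final int threshold;    // map size that triggers a prune
    final int prunePercent; // share of the current entries removed per prune

    PruningPolicy(int threshold, int prunePercent) {
        this.threshold = threshold;
        this.prunePercent = prunePercent;
    }

    // Number of oldest entries to drop for the given map size (0 below the threshold).
    int entriesToRemove(int currentSize) {
        return currentSize >= threshold ? currentSize * prunePercent / 100 : 0;
    }

    // Age in seconds of the youngest entry removed by a prune, assuming a steady
    // arrival rate and that nothing else removes entries in the meantime.
    double minRetentionSeconds(double requestsPerSecond) {
        int removed = threshold * prunePercent / 100;
        return (threshold - removed) / requestsPerSecond;
    }
}
```

For example, new PruningPolicy(10240, 50) models the current behaviour and new PruningPolicy(20 * 1024, 10) models the proposal. The threshold and percentage could also be made configurable so that deployments with slow back ends can widen the window further.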


