Hi BIRD team! We found a case where the BMP code tries to connect to the BMP collector service with sk_open() and CPU utilization increases as a result. To reproduce it, you just need the following:
1. The server machine to which the BMP PDU packets will be sent must be
reachable (i.e. it responds to ping).
2. The BMP collector service itself must not be running on this server.
3. Run BIRD with the BMP protocol enabled.
After that, you should see that the BIRD process has significantly increased
CPU utilization. This seems to be related to the "BIRD socket", because when I
capture network traffic on the host machine (where BIRD is running), I see a
massive number of TCP packets exchanged between the BIRD host machine and the
BMP collector machine. The socket type currently used for the BMP connection is
SK_TCP_ACTIVE.
Do you have any idea what is going wrong, or how the BIRD socket should be used
properly?
As a temporary fix, I have attached a patch that avoids this issue, but it is a
very ugly hack: it frees the BIRD socket outside of the IO code (sk_free()) and
initializes the socket again every time an ECONNREFUSED error is passed to the
err_hook callback.
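
A minimal sketch of one possible alternative to re-creating the socket
immediately on every ECONNREFUSED: space the reconnect attempts out with a
capped exponential backoff. This is plain C and does not use any BIRD API. The
function name bmp_retry_delay_ms and the 1 s / 60 s limits are assumptions made
for illustration; in BIRD the returned delay would presumably be handed to a
timer that starts the next connect attempt.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limits, chosen only for illustration. */
#define BMP_RETRY_MIN_MS 1000u
#define BMP_RETRY_MAX_MS 60000u

/*
 * Return the delay before reconnect attempt number `attempt`
 * (0 = first retry). The delay doubles after each failed attempt
 * and is capped at BMP_RETRY_MAX_MS.
 */
uint32_t
bmp_retry_delay_ms(unsigned attempt)
{
  uint32_t delay = BMP_RETRY_MIN_MS;

  /* Stop doubling once the cap is reached, so the value cannot overflow. */
  while (attempt-- > 0 && delay < BMP_RETRY_MAX_MS)
    delay *= 2;

  return (delay > BMP_RETRY_MAX_MS) ? BMP_RETRY_MAX_MS : delay;
}
```

With these limits the retries would happen after 1 s, 2 s, 4 s and so on, up to
one attempt per minute, instead of back to back.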
I would also appreciate a tip on whether there is a way to get a notification
from the BIRD socket when the connection to the BMP collector service is lost.
One option is to check whether sk_send() failed, but what about the situation
where there are no updates to send for a long time? I would like to be notified
as soon as possible when the connection is lost. Is this possible with the
current BIRD implementation, or should I add a timer callback that somehow
checks whether the BMP collector service is alive? I need this mechanism to
synchronize/re-send all BMP data to the collector.
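
For reference, at the plain socket level an idle TCP connection can be probed
by the kernel with TCP keepalive, so a dead peer surfaces as a socket error
even when nothing is being sent. The sketch below works on a raw file
descriptor; it does not assume that BIRD exposes these options on its own
sockets. The helper name enable_tcp_keepalive is hypothetical, and
TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are not part of POSIX (they are
available on Linux).

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Enable TCP keepalive on an existing TCP socket `fd`.
 *   idle_s   seconds of silence before the first probe is sent
 *   intvl_s  seconds between consecutive probes
 *   cnt      unanswered probes before the connection is reported dead
 * Returns 0 on success, -1 on failure (errno is set by setsockopt()).
 */
int
enable_tcp_keepalive(int fd, int idle_s, int intvl_s, int cnt)
{
  int on = 1;

  assert(fd >= 0);

  if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
    return -1;
  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
    return -1;
  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) < 0)
    return -1;
  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) < 0)
    return -1;

  return 0;
}
```

With, for example, enable_tcp_keepalive(fd, 10, 5, 3), an unreachable peer
would be detected roughly 25 seconds after the last traffic (10 s idle plus
three probes 5 s apart).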
We have currently switched to the BMP code provided on the bmp branch of the
BIRD GitLab repo.
Additionally, I have a question about the code below. Can I free the list node
and the node data itself when sk_send() returns a value greater than or equal
to 0 (>= 0), as in the following code?
WALK_LIST_DELSAFE(tx_data, tx_data_next, p->tx_queue)
{
  ...
  rv = sk_send(p->sk, data_size);
  if (rv < 0) {
    return;
  }
  mb_free(tx_data->data);
  rem_node((node *) tx_data);
  mb_free(tx_data);
  if (rv == 0) {
    return;
  }
  ...
Or should I do that only if sk_send() returns a value greater than 0 (> 0)? My
goal is to send all the data from the list if there was only a "temporary"
problem with sk_send().
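
To make the question concrete, here is a self-contained sketch of the
"free on >= 0" variant using stand-in types rather than BIRD's. It assumes, as
I understand the BIRD socket documentation, that sk_send() returns 1 when the
data was sent immediately, 0 when it was queued in the socket's transmit buffer
(with tx_hook called once it is flushed), and a negative value on error. All
names here (tx_item, mock_sock, mock_send, tx_push, tx_drain) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the BMP tx queue and the socket. */
struct tx_item {
  struct tx_item *next;
  char *data;
  size_t len;
};

struct mock_sock {
  int budget;    /* number of sends that complete immediately */
  int fail;      /* nonzero: every send fails */
  int accepted;  /* number of buffers the socket has taken over */
};

/* Assumed contract: 1 = sent, 0 = accepted but still pending, -1 = error. */
int
mock_send(struct mock_sock *s, const char *data, size_t len)
{
  (void) data;
  (void) len;
  if (s->fail)
    return -1;
  s->accepted++;
  return (s->budget-- > 0) ? 1 : 0;
}

/* Append a copy of `data` to the queue; returns 0 on success, -1 on OOM. */
int
tx_push(struct tx_item **head, const char *data, size_t len)
{
  struct tx_item *it = malloc(sizeof(*it));
  if (!it)
    return -1;
  it->data = malloc(len);
  if (!it->data) {
    free(it);
    return -1;
  }
  memcpy(it->data, data, len);
  it->len = len;
  it->next = NULL;
  while (*head)
    head = &(*head)->next;
  *head = it;
  return 0;
}

/* Send queued items in order; returns the number of items still queued. */
size_t
tx_drain(struct mock_sock *s, struct tx_item **head)
{
  size_t left = 0;

  assert(s && head);

  while (*head) {
    struct tx_item *it = *head;
    int rv = mock_send(s, it->data, it->len);

    if (rv < 0)
      break;              /* error: keep the item so it can be retried */

    *head = it->next;     /* rv >= 0: the socket holds the data now */
    free(it->data);
    free(it);

    if (rv == 0)
      break;              /* socket busy: continue from the tx-ready callback */
  }

  for (struct tx_item *it = *head; it; it = it->next)
    left++;
  return left;
}
```

Under that assumed contract, an item is released as soon as the socket has
accepted it (rv >= 0), the loop stops on rv == 0, and the remaining items
would be sent by calling tx_drain() again once the socket reports that it can
transmit.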
Thanks,
----
Pawel Maslanka
Senior Software Engineer
Office: +1.617.444.1234
Cell: +1.617.444.1234
Akamai Technologies
150 Broadway
Cambridge, MA 02142
Attachment: bmp_connect_failed.patch
