Hi Frank, thanks for the quick reply.
So if the agent sends back 3 identical REPORTs for example at the same time.
Assume Snmp class is processing them concurrently by using a
MultithreadMessageDispatcher, then the following could happen:
Suppose the following threads reach line 1314 at the same time:
Thread 1 reaches line 1314 if (request.requestStatus <
request.maxRequestStatus)
Thread 2 reaches line 1314 and also checks request.requestStatus < 2
Thread 3 reaches line 1314 and also checks request.requestStatus < 2
Thread 1: Since Request.requestStatus < 2 => thread 1 goes through and sets
resend=true, then goes through until line 1340 and and sends out the message.
Thread 2: Checks Request.requestStatus < 2 => since the value of requestStatus
was not incremented atomically, thread 2 will see the value as 0, and set
resend = true. Then when it reaches line 1340, it will resend the GET request.
Thread 3: Checks Request.requestStatus < 2 => since the value of requestStatus
was not incremented atomically, thread 2 will see the value as 0, and set
resend = true. Then when it reaches line 1340, it will resend the GET request.
By the logic above, if an agent is buggy or sends N REPORTs with the same
handle, then this code might be susceptible to sending / retrying back N number
of requests where N is > request.maxRequestStatus.
This may be a theoretical case, but I only theorized this because we saw a
wireshark trace where 60 identical GETs were issued in 20 milliseconds.
Here is a snippet of the wireshark trace:
No. Time Source Destination Protocol Length
Info
3605 25.115655 10.194.33.80 10.194.30.193 SNMP 173
get-request SNMPv2-SMI::enterprises.9.9.305.1.1.2.0
Frame 3605: 173 bytes on wire (1384 bits), 173 bytes captured (1384 bits)
WTAP_ENCAP: 1
Arrival Time: Jan 16, 2013 12:23:29.392196000 Eastern Standard Time
[Time shift for this packet: 0.000000000 seconds]
Epoch Time: 1358357009.392196000 seconds
[Time delta from previous captured frame: 0.000010000 seconds]
[Time delta from previous displayed frame: 0.000010000 seconds]
[Time since reference or first frame: 25.115655000 seconds]
Frame Number: 3605
Frame Length: 173 bytes (1384 bits)
Capture Length: 173 bytes (1384 bits)
[Frame is marked: True]
[Frame is ignored: False]
[Protocols in frame: eth:ip:udp:snmp]
[Coloring Rule Name: UDP]
[Coloring Rule String: udp]
Ethernet II, Src: Cisco_1e:6c:40 (00:19:a9:1e:6c:40), Dst: Cisco_79:df:41
(00:22:55:79:df:41)
Destination: Cisco_79:df:41 (00:22:55:79:df:41)
Address: Cisco_79:df:41 (00:22:55:79:df:41)
.... ..0. .... .... .... .... = LG bit: Globally unique address
(factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: Cisco_1e:6c:40 (00:19:a9:1e:6c:40)
Address: Cisco_1e:6c:40 (00:19:a9:1e:6c:40)
.... ..0. .... .... .... .... = LG bit: Globally unique address
(factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Type: IP (0x0800)
Internet Protocol Version 4, Src: 10.194.33.80 (10.194.33.80), Dst:
10.194.30.193 (10.194.30.193)
User Datagram Protocol, Src Port: 39417 (39417), Dst Port: snmp (161)
Source port: 39417 (39417)
Destination port: snmp (161)
Length: 139
Checksum: 0xd96c [validation disabled]
[Good Checksum: False]
[Bad Checksum: False]
Simple Network Management Protocol
msgVersion: snmpv3 (3)
msgGlobalData
msgID: 745338668
msgMaxSize: 65535
msgFlags: 05
.... .1.. = Reportable: Set
.... ..0. = Encrypted: Not set
.... ...1 = Authenticated: Set
msgSecurityModel: USM (3)
msgAuthoritativeEngineID: 8000000903c0626bae0880
1... .... = Engine ID Conformance: RFC3411 (SNMPv3)
Engine Enterprise ID: ciscoSystems (9)
Engine ID Format: MAC address (3)
Engine ID Data: MAC address: Cisco_ae:08:80 (c0:62:6b:ae:08:80)
msgAuthoritativeEngineBoots: 14
msgAuthoritativeEngineTime: 5971516
msgUserName: blah
msgAuthenticationParameters: 5913acc5dc2c77534ec09eb8
msgPrivacyParameters: <MISSING>
msgData: plaintext (0)
plaintext
contextEngineID: 8000000903c0626bae0880
1... .... = Engine ID Conformance: RFC3411 (SNMPv3)
Engine Enterprise ID: ciscoSystems (9)
Engine ID Format: MAC address (3)
Engine ID Data: MAC address: Cisco_ae:08:80 (c0:62:6b:ae:08:80)
contextName:
data: get-request (0)
get-request
request-id: 87745250
error-status: noError (0)
error-index: 50
variable-bindings: 1 item
SNMPv2-SMI::enterprises.9.9.305.1.1.2.0
(1.3.6.1.4.1.9.9.305.1.1.2.0): Value (Null)
Object Name: 1.3.6.1.4.1.9.9.305.1.1.2.0
(SNMPv2-SMI::enterprises.9.9.305.1.1.2.0)
Value (Null)
No. Time Source Destination Protocol Length
Info
3606 25.115950 10.194.33.80 10.194.30.193 SNMP 173
get-request SNMPv2-SMI::enterprises.9.9.305.1.1.2.0
Frame 3606: 173 bytes on wire (1384 bits), 173 bytes captured (1384 bits)
WTAP_ENCAP: 1
Arrival Time: Jan 16, 2013 12:23:29.392491000 Eastern Standard Time
[Time shift for this packet: 0.000000000 seconds]
Epoch Time: 1358357009.392491000 seconds
[Time delta from previous captured frame: 0.000295000 seconds]
[Time delta from previous displayed frame: 0.000295000 seconds]
[Time since reference or first frame: 25.115950000 seconds]
Frame Number: 3606
Frame Length: 173 bytes (1384 bits)
Capture Length: 173 bytes (1384 bits)
[Frame is marked: True]
[Frame is ignored: False]
[Protocols in frame: eth:ip:udp:snmp]
[Coloring Rule Name: UDP]
[Coloring Rule String: udp]
Ethernet II, Src: Cisco_79:df:41 (00:22:55:79:df:41), Dst: Cisco_1e:6c:40
(00:19:a9:1e:6c:40)
Destination: Cisco_1e:6c:40 (00:19:a9:1e:6c:40)
Address: Cisco_1e:6c:40 (00:19:a9:1e:6c:40)
.... ..0. .... .... .... .... = LG bit: Globally unique address
(factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: Cisco_79:df:41 (00:22:55:79:df:41)
Address: Cisco_79:df:41 (00:22:55:79:df:41)
.... ..0. .... .... .... .... = LG bit: Globally unique address
(factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Type: IP (0x0800)
Internet Protocol Version 4, Src: 10.194.33.80 (10.194.33.80), Dst:
10.194.30.193 (10.194.30.193)
User Datagram Protocol, Src Port: 39417 (39417), Dst Port: snmp (161)
Source port: 39417 (39417)
Destination port: snmp (161)
Length: 139
Checksum: 0xd96c [validation disabled]
[Good Checksum: False]
[Bad Checksum: False]
Simple Network Management Protocol
msgVersion: snmpv3 (3)
msgGlobalData
msgID: 745338668
msgMaxSize: 65535
msgFlags: 05
.... .1.. = Reportable: Set
.... ..0. = Encrypted: Not set
.... ...1 = Authenticated: Set
msgSecurityModel: USM (3)
msgAuthoritativeEngineID: 8000000903c0626bae0880
1... .... = Engine ID Conformance: RFC3411 (SNMPv3)
Engine Enterprise ID: ciscoSystems (9)
Engine ID Format: MAC address (3)
Engine ID Data: MAC address: Cisco_ae:08:80 (c0:62:6b:ae:08:80)
msgAuthoritativeEngineBoots: 14
msgAuthoritativeEngineTime: 5971516
msgUserName: blah
msgAuthenticationParameters: 5913acc5dc2c77534ec09eb8
msgPrivacyParameters: <MISSING>
msgData: plaintext (0)
plaintext
contextEngineID: 8000000903c0626bae0880
1... .... = Engine ID Conformance: RFC3411 (SNMPv3)
Engine Enterprise ID: ciscoSystems (9)
Engine ID Format: MAC address (3)
Engine ID Data: MAC address: Cisco_ae:08:80 (c0:62:6b:ae:08:80)
contextName:
data: get-request (0)
get-request
request-id: 87745250
error-status: noError (0)
error-index: 50
variable-bindings: 1 item
SNMPv2-SMI::enterprises.9.9.305.1.1.2.0
(1.3.6.1.4.1.9.9.305.1.1.2.0): Value (Null)
Object Name: 1.3.6.1.4.1.9.9.305.1.1.2.0
(SNMPv2-SMI::enterprises.9.9.305.1.1.2.0)
Value (Null)
No. Time Source Destination Protocol Length
Info
3607 25.115960 10.194.33.80 10.194.30.193 SNMP 173
get-request SNMPv2-SMI::enterprises.9.9.305.1.1.2.0
Frame 3607: 173 bytes on wire (1384 bits), 173 bytes captured (1384 bits)
WTAP_ENCAP: 1
Arrival Time: Jan 16, 2013 12:23:29.392501000 Eastern Standard Time
[Time shift for this packet: 0.000000000 seconds]
Epoch Time: 1358357009.392501000 seconds
[Time delta from previous captured frame: 0.000010000 seconds]
[Time delta from previous displayed frame: 0.000010000 seconds]
[Time since reference or first frame: 25.115960000 seconds]
Frame Number: 3607
Frame Length: 173 bytes (1384 bits)
Capture Length: 173 bytes (1384 bits)
[Frame is marked: True]
[Frame is ignored: False]
[Protocols in frame: eth:ip:udp:snmp]
[Coloring Rule Name: UDP]
[Coloring Rule String: udp]
Ethernet II, Src: Cisco_1e:6c:40 (00:19:a9:1e:6c:40), Dst: Cisco_79:df:41
(00:22:55:79:df:41)
Destination: Cisco_79:df:41 (00:22:55:79:df:41)
Address: Cisco_79:df:41 (00:22:55:79:df:41)
.... ..0. .... .... .... .... = LG bit: Globally unique address
(factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: Cisco_1e:6c:40 (00:19:a9:1e:6c:40)
Address: Cisco_1e:6c:40 (00:19:a9:1e:6c:40)
.... ..0. .... .... .... .... = LG bit: Globally unique address
(factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Type: IP (0x0800)
Internet Protocol Version 4, Src: 10.194.33.80 (10.194.33.80), Dst:
10.194.30.193 (10.194.30.193)
User Datagram Protocol, Src Port: 39417 (39417), Dst Port: snmp (161)
Source port: 39417 (39417)
Destination port: snmp (161)
Length: 139
Checksum: 0xd96c [validation disabled]
[Good Checksum: False]
[Bad Checksum: False]
Simple Network Management Protocol
msgVersion: snmpv3 (3)
msgGlobalData
msgID: 745338668
msgMaxSize: 65535
msgFlags: 05
.... .1.. = Reportable: Set
.... ..0. = Encrypted: Not set
.... ...1 = Authenticated: Set
msgSecurityModel: USM (3)
msgAuthoritativeEngineID: 8000000903c0626bae0880
1... .... = Engine ID Conformance: RFC3411 (SNMPv3)
Engine Enterprise ID: ciscoSystems (9)
Engine ID Format: MAC address (3)
Engine ID Data: MAC address: Cisco_ae:08:80 (c0:62:6b:ae:08:80)
msgAuthoritativeEngineBoots: 14
msgAuthoritativeEngineTime: 5971516
msgUserName: blah
msgAuthenticationParameters: 5913acc5dc2c77534ec09eb8
msgPrivacyParameters: <MISSING>
msgData: plaintext (0)
plaintext
contextEngineID: 8000000903c0626bae0880
1... .... = Engine ID Conformance: RFC3411 (SNMPv3)
Engine Enterprise ID: ciscoSystems (9)
Engine ID Format: MAC address (3)
Engine ID Data: MAC address: Cisco_ae:08:80 (c0:62:6b:ae:08:80)
contextName:
data: get-request (0)
get-request
request-id: 87745250
error-status: noError (0)
error-index: 50
variable-bindings: 1 item
SNMPv2-SMI::enterprises.9.9.305.1.1.2.0
(1.3.6.1.4.1.9.9.305.1.1.2.0): Value (Null)
Object Name: 1.3.6.1.4.1.9.9.305.1.1.2.0
(SNMPv2-SMI::enterprises.9.9.305.1.1.2.0)
Value (Null)
No. Time Source Destination Protocol Length
Info
3608 25.116117 10.194.33.80 10.194.30.193 SNMP 173
get-request SNMPv2-SMI::enterprises.9.9.305.1.1.2.0
Frame 3608: 173 bytes on wire (1384 bits), 173 bytes captured (1384 bits)
WTAP_ENCAP: 1
Arrival Time: Jan 16, 2013 12:23:29.392658000 Eastern Standard Time
[Time shift for this packet: 0.000000000 seconds]
Epoch Time: 1358357009.392658000 seconds
[Time delta from previous captured frame: 0.000157000 seconds]
[Time delta from previous displayed frame: 0.000157000 seconds]
[Time since reference or first frame: 25.116117000 seconds]
Frame Number: 3608
Frame Length: 173 bytes (1384 bits)
Capture Length: 173 bytes (1384 bits)
[Frame is marked: True]
[Frame is ignored: False]
[Protocols in frame: eth:ip:udp:snmp]
[Coloring Rule Name: UDP]
[Coloring Rule String: udp]
Ethernet II, Src: Cisco_79:df:41 (00:22:55:79:df:41), Dst: Cisco_1e:6c:40
(00:19:a9:1e:6c:40)
Destination: Cisco_1e:6c:40 (00:19:a9:1e:6c:40)
Address: Cisco_1e:6c:40 (00:19:a9:1e:6c:40)
.... ..0. .... .... .... .... = LG bit: Globally unique address
(factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: Cisco_79:df:41 (00:22:55:79:df:41)
Address: Cisco_79:df:41 (00:22:55:79:df:41)
.... ..0. .... .... .... .... = LG bit: Globally unique address
(factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Type: IP (0x0800)
Internet Protocol Version 4, Src: 10.194.33.80 (10.194.33.80), Dst:
10.194.30.193 (10.194.30.193)
User Datagram Protocol, Src Port: 39417 (39417), Dst Port: snmp (161)
Source port: 39417 (39417)
Destination port: snmp (161)
Length: 139
Checksum: 0xd96c [validation disabled]
[Good Checksum: False]
[Bad Checksum: False]
Simple Network Management Protocol
msgVersion: snmpv3 (3)
msgGlobalData
msgID: 745338668
msgMaxSize: 65535
msgFlags: 05
.... .1.. = Reportable: Set
.... ..0. = Encrypted: Not set
.... ...1 = Authenticated: Set
msgSecurityModel: USM (3)
msgAuthoritativeEngineID: 8000000903c0626bae0880
1... .... = Engine ID Conformance: RFC3411 (SNMPv3)
Engine Enterprise ID: ciscoSystems (9)
Engine ID Format: MAC address (3)
Engine ID Data: MAC address: Cisco_ae:08:80 (c0:62:6b:ae:08:80)
msgAuthoritativeEngineBoots: 14
msgAuthoritativeEngineTime: 5971516
msgUserName: blah
msgAuthenticationParameters: 5913acc5dc2c77534ec09eb8
msgPrivacyParameters: <MISSING>
msgData: plaintext (0)
plaintext
contextEngineID: 8000000903c0626bae0880
1... .... = Engine ID Conformance: RFC3411 (SNMPv3)
Engine Enterprise ID: ciscoSystems (9)
Engine ID Format: MAC address (3)
Engine ID Data: MAC address: Cisco_ae:08:80 (c0:62:6b:ae:08:80)
contextName:
data: get-request (0)
get-request
request-id: 87745250
error-status: noError (0)
error-index: 50
variable-bindings: 1 item
SNMPv2-SMI::enterprises.9.9.305.1.1.2.0
(1.3.6.1.4.1.9.9.305.1.1.2.0): Value (Null)
Object Name: 1.3.6.1.4.1.9.9.305.1.1.2.0
(SNMPv2-SMI::enterprises.9.9.305.1.1.2.0)
Value (Null)
No. Time Source Destination Protocol Length
Info
3609 25.116126 10.194.33.80 10.194.30.193 SNMP 173
get-request SNMPv2-SMI::enterprises.9.9.305.1.1.2.0
Frame 3609: 173 bytes on wire (1384 bits), 173 bytes captured (1384 bits)
WTAP_ENCAP: 1
Arrival Time: Jan 16, 2013 12:23:29.392667000 Eastern Standard Time
[Time shift for this packet: 0.000000000 seconds]
Epoch Time: 1358357009.392667000 seconds
[Time delta from previous captured frame: 0.000009000 seconds]
[Time delta from previous displayed frame: 0.000009000 seconds]
[Time since reference or first frame: 25.116126000 seconds]
Frame Number: 3609
Frame Length: 173 bytes (1384 bits)
Capture Length: 173 bytes (1384 bits)
[Frame is marked: True]
[Frame is ignored: False]
[Protocols in frame: eth:ip:udp:snmp]
[Coloring Rule Name: UDP]
[Coloring Rule String: udp]
Ethernet II, Src: Cisco_1e:6c:40 (00:19:a9:1e:6c:40), Dst: Cisco_79:df:41
(00:22:55:79:df:41)
Destination: Cisco_79:df:41 (00:22:55:79:df:41)
Address: Cisco_79:df:41 (00:22:55:79:df:41)
.... ..0. .... .... .... .... = LG bit: Globally unique address
(factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: Cisco_1e:6c:40 (00:19:a9:1e:6c:40)
Address: Cisco_1e:6c:40 (00:19:a9:1e:6c:40)
.... ..0. .... .... .... .... = LG bit: Globally unique address
(factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Type: IP (0x0800)
Internet Protocol Version 4, Src: 10.194.33.80 (10.194.33.80), Dst:
10.194.30.193 (10.194.30.193)
User Datagram Protocol, Src Port: 39417 (39417), Dst Port: snmp (161)
Source port: 39417 (39417)
Destination port: snmp (161)
Length: 139
Checksum: 0xd96c [validation disabled]
[Good Checksum: False]
[Bad Checksum: False]
Simple Network Management Protocol
msgVersion: snmpv3 (3)
msgGlobalData
msgID: 745338668
msgMaxSize: 65535
msgFlags: 05
.... .1.. = Reportable: Set
.... ..0. = Encrypted: Not set
.... ...1 = Authenticated: Set
msgSecurityModel: USM (3)
msgAuthoritativeEngineID: 8000000903c0626bae0880
1... .... = Engine ID Conformance: RFC3411 (SNMPv3)
Engine Enterprise ID: ciscoSystems (9)
Engine ID Format: MAC address (3)
Engine ID Data: MAC address: Cisco_ae:08:80 (c0:62:6b:ae:08:80)
msgAuthoritativeEngineBoots: 14
msgAuthoritativeEngineTime: 5971516
msgUserName: blah
msgAuthenticationParameters: 5913acc5dc2c77534ec09eb8
msgPrivacyParameters: <MISSING>
msgData: plaintext (0)
plaintext
contextEngineID: 8000000903c0626bae0880
1... .... = Engine ID Conformance: RFC3411 (SNMPv3)
Engine Enterprise ID: ciscoSystems (9)
Engine ID Format: MAC address (3)
Engine ID Data: MAC address: Cisco_ae:08:80 (c0:62:6b:ae:08:80)
contextName:
data: get-request (0)
get-request
request-id: 87745250
error-status: noError (0)
error-index: 50
variable-bindings: 1 item
SNMPv2-SMI::enterprises.9.9.305.1.1.2.0
(1.3.6.1.4.1.9.9.305.1.1.2.0): Value (Null)
Object Name: 1.3.6.1.4.1.9.9.305.1.1.2.0
(SNMPv2-SMI::enterprises.9.9.305.1.1.2.0)
Value (Null)
Syed F. Ali
CA Technologies
Sr Software Engineer
Tel: +1 603 334 2111
[email protected]
-----Original Message-----
From: Ali, Syed F
Sent: Friday, February 01, 2013 7:04 PM
To: [email protected]
Subject: help - possible thread-unsafe code in snmp4j
Hi,
We use snmp4j in one of our applications and while tracking down an issue
reported by our customer, I came across some code in snmp4j that I believe may
be thread-unsafe and that MIGHT occasionally be causing a device to be flooded
with too many GET requests.
Code snippet as follows
In org.snmp4j.Snmp.java, org.snmp4j.Snmp.ReportProcessor class
1294 public void processReport(PduHandle handle, CommandResponderEvent e) {
1295 PDU pdu = e.getPDU();
1296 logger.debug("Searching pending request with handle" + handle); ..
..
...
...
...
..
1313 boolean resend = false;
1314 if (request.requestStatus < request.maxRequestStatus) {
1315 switch (request.requestStatus) {
1316 case 0:
1317 if (SnmpConstants.usmStatsUnknownEngineIDs.equals(firstOID)) {
1318 resend = true;
1319 }
1320 else if (SnmpConstants.usmStatsNotInTimeWindows.equals(firstOID)) {
1321 request.requestStatus++;
1322 resend = true;
1323 }
1324 break;
1325 case 1:
1326 if (SnmpConstants.usmStatsNotInTimeWindows.equals(firstOID)) {
1327 resend = true;
1328 }
1329 break;
1330 }
1331 }
1332 // if legal report PDU received, then resend request
1333 if (resend) {
1334 logger.debug("Send new request after report.");
1335 request.requestStatus++;
1336 try {
1337 // We need no callback here because we already have an equivalent
1338 // handle registered.
1339 PduHandle resentHandle =
1340 sendMessage(request.pdu, request.target, e.getTransportMapping(),
1341 null);
1342 // make sure reference to handle is hold until request is finished,
1343 // because otherwise cache information may get lost (WeakHashMap)
1344 request.key = resentHandle;
1345 }
1346 catch (IOException iox) {
1347 logger.error("Failed to send message to " + request.target + ": " +
1348 iox.getMessage());
1349 return;
1350 }
1351 }
1352 else {
..
....
The value of the resend boolean is based on the value of requestStatus. But the
use of request.requestStatus++ is not threadsafe in this method above - because
a) the ++ operation is not atomic on a primitive integer and
b) the value is read on one line and modified on another.
If multiple threads enter the processReport() method in order to process a
report for the same request object, they may end up seeing the cached value of
0 over and over again and resending the request several times until the
request.requestStatus object value finally hits a memory barrier and the real
value is flushed.
Also, since the value of requestStatus is read first on line 1314, and then
incremented either on line 1335, line 1321, if multiple threads got in here for
the same request object, you wouldn't get atomic behavior.
The customer reported a scenario (and provided a wireshark capture to prove it)
that showed the exact same SNMP GET request (with the same requestID) going out
to a particular snmp device approximately 180 times in ~9 seconds. We are using
timeout=3000 and retries = 2 - so we expect to see 3 requests in 9 seconds with
the same request ID. So in the span of approximately 20 milliseconds, we see
about 60 identical GET requests are being issued with the same requestID. I can
provide the pcap file showing this behavior if necessary. Also, these GETs are
interspersed with many REPORTs coming in from the device so this situation may
have been induced by a bug on the agent - however I'm writing to check whether
the code should protect itself from such agent bugs. Perhaps I am wrong and
this code is synchronized somewhere at a higher layer so that a REPORT for a
request with a particular handle can only be processed on a single thread at a
time.
Thanks for any assistance / clarification in advance!
Syed F. Ali
_______________________________________________
SNMP4J mailing list
[email protected]
http://lists.agentpp.org/mailman/listinfo/snmp4j