Public bug reported:

Description of problem:
When ironic (on the undercloud) cannot get a reverse DNS entry for the IP 
assigned to br-ctlplane (it does not even receive an NXDOMAIN response in time, 
e.g. because the DNS server is misconfigured or there are connectivity issues), 
all ironic commands take very long to execute (the DNS lookup times out, but 
the commands still succeed).

[undercloud]: $ time ironic node-list
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
...
real    0m55.383s
user    0m0.248s
sys     0m0.043s

Version-Release number of selected component (if applicable):
Tested on OSP director 8

How reproducible (example with IP 10.100.100.1):
[undercloud]: $ ip a
...
7: br-ctlplane: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether <macaddr> brd ff:ff:ff:ff:ff:ff
    inet 10.100.100.1/24 brd 10.100.100.255 scope global br-ctlplane
       valid_lft forever preferred_lft forever
...

Configure your DNS server to not respond (even with NXDOMAIN) for
10.100.100.1:

[undercloud]: $ time host 10.100.100.1
;; connection timed out; no servers could be reached
real    0m14.005s
user    0m0.003s
sys     0m0.003s

[undercloud]: $ time dig -x 10.100.100.1
...
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 20304
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
...
;; connection timed out; no servers could be reached
real    0m21.007s
user    0m0.003s
sys     0m0.004s

[undercloud]: $ time nslookup 10.100.100.1
;; connection timed out; trying next origin
;; connection timed out; trying next origin
;; Got SERVFAIL reply from XYZ, trying next server
;; connection timed out; trying next origin
;; connection timed out; trying next origin
;; connection timed out; no servers could be reached
real    0m50.008s
user    0m0.002s
sys     0m0.009s
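
The same delay can be measured from Python. This is only a sketch: it assumes
the slowdown comes from a blocking reverse lookup such as
socket.gethostbyaddr(), which this report does not confirm, and the helper
name time_reverse_lookup is made up for illustration.

```python
import socket
import time


def time_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (hostname or None, elapsed seconds) for a reverse lookup of ip.

    `resolver` is injectable so the helper can be exercised without DNS.
    """
    start = time.monotonic()
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        hostname = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        hostname = None
    return hostname, time.monotonic() - start
```

Calling time_reverse_lookup("10.100.100.1") on the undercloud should show
whether the system resolver alone accounts for the delay.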

Actual results:
Each ironic command can take 20-60 seconds in this case.

Expected results:
Ironic should have a mechanism to deal with this; commands should take 
milliseconds rather than tens of seconds:
[undercloud]: $ time ironic node-list
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
...
real    0m0.393s
user    0m0.244s
sys     0m0.041s

Originally created: https://bugzilla.redhat.com/show_bug.cgi?id=1328143

** Affects: ironic
     Importance: Undecided
         Status: New

** Affects: network-manager (Ubuntu)
     Importance: Undecided
         Status: Invalid

-- 
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to network-manager in Ubuntu.
https://bugs.launchpad.net/bugs/1572201

Title:
  Long ironic timeouts because of ServFail DNS error

Status in Ironic:
  New
Status in network-manager package in Ubuntu:
  Invalid
