On 5/19/2017 1:40 PM, Chris Friesen wrote:
Recently we noticed failures in Newton when we attempted to live-migrate an instance with 16 vifs. We tracked them down to an RPC timeout in nova while waiting for the 'refresh_cache-%s' lock in get_instance_nw_info(). This led to a few other discoveries.

First, we have no fair locking in OpenStack. The live-migration code path was blocked waiting for the lock, but the code processing the incoming "network-changed" events kept acquiring it first, even though those events arrived after the live-migration code had already started waiting.
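
For illustration, here's a minimal sketch of what a fair (FIFO) lock could look like in plain Python threading. FairLock is a made-up name and this is not how the existing locks are implemented; it's only meant to show the "no barging" behaviour that's missing today:

    import collections
    import threading

    class FairLock(object):
        """Toy FIFO lock: waiters get the lock strictly in arrival order,
        so a late caller cannot barge ahead of threads already waiting."""

        def __init__(self):
            self._mutex = threading.Lock()        # protects internal state
            self._waiters = collections.deque()   # tickets, oldest first
            self._locked = False

        def acquire(self):
            with self._mutex:
                if not self._locked:
                    self._locked = True
                    return
                # Get in line behind everyone who arrived before us.
                ticket = threading.Event()
                self._waiters.append(ticket)
            # release() hands the lock directly to the oldest waiter, so
            # there is no window for a newcomer to steal it.
            ticket.wait()

        def release(self):
            with self._mutex:
                if self._waiters:
                    self._waiters.popleft().set()
                else:
                    self._locked = False

        def __enter__(self):
            self.acquire()
            return self

        def __exit__(self, *exc):
            self.release()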

I'm told that etcd gives us a DLM, which is unicorns and rainbows; would that help us here?
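
If we did go the DLM route, tooz is the usual abstraction layer; a rough sketch (the etcd3 endpoint, member id and lock name below are all made up, and a DLM by itself says nothing about fairness):

    from tooz import coordination

    # Hypothetical endpoint and member id; these would come from config.
    coordinator = coordination.get_coordinator('etcd3://127.0.0.1:2379',
                                               b'compute-0')
    coordinator.start(start_heart=True)

    # A distributed lock named per instance, analogous to the local
    # 'refresh_cache-%s' lock.
    lock = coordinator.get_lock(b'refresh_cache-<instance-uuid>')
    with lock:
        # ... refresh the instance network info cache ...
        pass

    coordinator.stop()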


Second, it turns out the cost of processing the "network-changed" events is astronomical.

1) In Newton, nova commit 5de902a was merged to fix evacuate bugs, but it means both the source and dest compute nodes now get the "network-changed" events. This doubled the number of neutron API calls during a live migration.

As you noted below, that change was made specifically for evacuate. With the migration object we know the type of migration and could scope this behavior to just evacuate. However, I'm sort of confused by that change - why are we sending external events to the source compute during an evacuation? Isn't the source compute down, and thus unable to receive and process the event?


2) A "network-changed" event is sent from neutron each time something changes. There are multiple of these events for each vif during a live-migration. In the current upstream code the only information passed with the event is the instance id, so nova will loop over all the ports in the instance and build up all the information about subnets/floatingIP/fixedIP/etc. for that instance. This results in O(N^2) neutron API calls where N is the number of vifs in the instance.

While working on the patches you reference in #3, I was also looking at whether we can do some bulk queries to Neutron:

https://review.openstack.org/#/c/465792/

It looks like that's not working, though. Kevin Benton seemed to think at the time (it was late the other night) that passing a list of filter parameters would get turned into an OR in the database query, but I'm not sure that's happening (note that Tempest failed on that patch). I don't have a devstack handy, but it seems we could prove this with simple curl requests, for example along the lines of the sketch below.
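
Something like this against a devstack should settle it - if multiple values for the same filter key really are OR'd together, one request should return both ports (the endpoint, token and port uuids below are placeholders):

    import requests

    NEUTRON = 'http://127.0.0.1:9696'         # placeholder endpoint
    TOKEN = '<keystone-token>'                # placeholder token

    # Repeating the same query key is how a list filter goes on the wire.
    resp = requests.get(
        NEUTRON + '/v2.0/ports',
        headers={'X-Auth-Token': TOKEN},
        params=[('id', '<port-uuid-1>'), ('id', '<port-uuid-2>')],
    )
    print(resp.status_code, resp.json())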


3) mriedem has proposed a patch series (https://review.openstack.org/#/c/465783 and https://review.openstack.org/#/c/465787) that would change neutron to include the port ID in the event, allowing nova to update just that port. This reduces the cost to O(N), but it's still significant.
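
With the port id carried in the event, the same toy model from above drops to O(N) - again just a sketch, not the actual patch:

    def handle_network_changed_scoped(neutron, instance_uuid, port_id):
        # Only the changed port is refreshed in the cache.
        neutron.port_details(port_id)

    neutron = FakeNeutron()                    # reuses the stub from above
    for i in range(16):
        handle_network_changed_scoped(neutron, 'instance-uuid',
                                      'port-%d' % i)

    print(neutron.calls)                       # 16 calls instead of 272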

In a hardware lab with 4 compute nodes I created 4 boot-from-volume instances, each with 16 vifs. I then live-migrated them all in parallel. (The one on compute-0 was migrated to compute-1, the one on compute-1 was migrated to compute-2, etc.) The aggregate CPU usage for a few critical components on the controller node is shown below. Note in particular the CPU usage for neutron--it's using most of 10 CPUs for ~10 seconds, spiking to 13 CPUs. This seems like an absurd amount of work to do just to update the cache in nova.


Labels (occ = CPU occupancy, where 100% = one fully busy CPU):
   L0: neutron-server
   L1: nova-conductor
   L2: beam.smp
   L3: postgres
-          -                 -      L0      L1      L2      L3
date       time             dt     occ     occ     occ     occ
yyyy/mm/dd hh:mm:ss.dec    (s)     (%)     (%)     (%)     (%)
2017-05-19 17:51:38.710  2.173   19.75    1.28    2.85    1.96
2017-05-19 17:51:40.012  1.302    1.02    1.75    3.80    5.07
2017-05-19 17:51:41.334  1.322    2.34    2.66    5.25    1.76
2017-05-19 17:51:42.681  1.347   91.79    3.31    5.27    5.64
2017-05-19 17:51:44.035  1.354   40.78    7.27    3.48    7.34
2017-05-19 17:51:45.406  1.371    7.12   21.35    8.66   19.58
2017-05-19 17:51:46.784  1.378   16.71  196.29    6.87   15.93
2017-05-19 17:51:48.133  1.349   18.51  362.46    8.57   25.70
2017-05-19 17:51:49.508  1.375  284.16  199.30    4.58   18.49
2017-05-19 17:51:50.919  1.411  512.88   17.61    7.47   42.88
2017-05-19 17:51:52.322  1.403  412.34    8.90    9.15   19.24
2017-05-19 17:51:53.734  1.411  320.24    5.20   10.59    9.08
2017-05-19 17:51:55.129  1.396  304.92    2.27   10.65   10.29
2017-05-19 17:51:56.551  1.422  556.09   14.56   10.74   18.85
2017-05-19 17:51:57.977  1.426  979.63   43.41   14.17   21.32
2017-05-19 17:51:59.382  1.405  902.56   48.31   13.69   18.59
2017-05-19 17:52:00.808  1.425 1140.99   74.28   15.12   17.18
2017-05-19 17:52:02.238  1.430 1013.91   69.77   16.46   21.19
2017-05-19 17:52:03.647  1.409  964.94  175.09   15.81   27.23
2017-05-19 17:52:05.077  1.430  838.15  109.13   15.70   34.12
2017-05-19 17:52:06.502  1.425  525.88   79.09   14.42   11.09
2017-05-19 17:52:07.954  1.452  614.58   38.38   12.20   17.89
2017-05-19 17:52:09.380  1.426  763.25   68.40   12.36   16.08
2017-05-19 17:52:10.825  1.445  901.57   73.59   15.90   41.12
2017-05-19 17:52:12.252  1.427  966.15   42.97   16.76   23.07
2017-05-19 17:52:13.702  1.450  902.40   70.98   19.66   17.50
2017-05-19 17:52:15.173  1.471 1023.33   59.71   19.78   18.91
2017-05-19 17:52:16.605  1.432 1127.04   64.19   16.41   26.80
2017-05-19 17:52:18.046  1.442 1300.56   68.22   16.29   24.39
2017-05-19 17:52:19.517  1.471 1055.60   71.74   14.39   17.09
2017-05-19 17:52:20.983  1.465  845.30   61.48   15.24   22.86
2017-05-19 17:52:22.447  1.464 1027.33   65.53   15.94   26.85
2017-05-19 17:52:23.919  1.472 1003.08   56.97   14.39   28.93
2017-05-19 17:52:25.367  1.448  702.50   45.42   11.78   20.53
2017-05-19 17:52:26.814  1.448  558.63   66.48   13.22   29.64
2017-05-19 17:52:28.276  1.462  620.34  206.63   14.58   17.17
2017-05-19 17:52:29.749  1.473  555.62  110.37   10.95   13.27
2017-05-19 17:52:31.228  1.479  436.66   33.65    9.00   21.55
2017-05-19 17:52:32.685  1.456  417.12   87.44   13.44   12.27
2017-05-19 17:52:34.128  1.443  368.31   87.08   11.95   14.70
2017-05-19 17:52:35.558  1.430  171.66   11.67    9.28   13.36
2017-05-19 17:52:36.976  1.417  231.82   10.57    7.03   14.08
2017-05-19 17:52:38.413  1.438  241.14   77.78    6.86   15.34
2017-05-19 17:52:39.827  1.413   85.01   63.72    5.85   14.01
2017-05-19 17:52:41.200  1.373    3.31    3.43    7.18    1.78
2017-05-19 17:52:42.556  1.357   60.68    2.94    6.51    6.16
2017-05-19 17:52:44.019  1.463   24.23    5.94    3.45    3.15
2017-05-19 17:52:45.376  1.356    0.93    3.91    5.13    0.83
2017-05-19 17:52:46.699  1.323    7.68    4.12    5.43    0.45
2017-05-19 17:52:48.033  1.334    5.85    1.70    6.00    1.91
2017-05-19 17:52:49.373  1.341   66.28    2.37    4.49   16.40
2017-05-19 17:52:50.715  1.342   31.67    3.03    3.66    6.91
2017-05-19 17:52:52.023  1.308    2.80    2.35    3.30   10.76
2017-05-19 17:52:53.330  1.307    6.94    5.78    3.25    2.30
2017-05-19 17:52:54.699  1.368    3.11    2.67    8.34    1.01
2017-05-19 17:52:56.049  1.351   23.14    2.28    2.83    5.30
2017-05-19 17:52:57.434  1.384   46.86    5.02    6.27   11.93
2017-05-19 17:52:58.803  1.370    3.78   10.26    3.08    2.08
2017-05-19 17:53:00.206  1.403   66.09    8.20    4.07    1.27
2017-05-19 17:53:01.542  1.336   63.71    9.70    3.17    4.89
2017-05-19 17:53:02.855  1.312   21.53    3.99    4.33    5.03


It seems like it should be possible to reduce the amount of work involved here. One possibility would be to partially revert nova commit 5de902a for the live-migration case, since that path seemed to work fine in Mitaka and earlier. Another possibility would be to include additional information about what changed in the "network-changed" event, which would reduce the number of queries that nova needs to do.
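
For illustration, a richer "network-changed" payload might look something like this (values are placeholders; the os-server-external-events API already has a 'tag' field that could carry the changed port):

    event = {
        'name': 'network-changed',
        'server_uuid': '<instance-uuid>',   # placeholder
        'tag': '<port-uuid>',               # placeholder: the changed port
    }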

As noted above, I'm curious how sending events to the source compute during an evacuation, when the source compute is supposed to be down, helps anything. Maybe artom can help clarify.

As for the live-migration case, looking at related changes from around that time, it appears what was really needed was for the network-vif-(un)plugged events to go to both hosts; as a result, though, we're now sending all events to both, including network-changed (which is emitted whenever a port is updated, e.g. when it's bound to or unbound from a host during the migration). So we could also consider scoping which types of events go to both hosts, but that gets a bit hairy, with the API having to decide what can or should be ignored - it seems like that would open us up to other bugs later.


In a larger cloud this is going to cause major issues for NFV-type workloads, where instances with many vifs will be relatively common.

Chris



--

Thanks,

Matt

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
