Hi Eddie,
Looking at your nova database after the delete, it looks correct to me.

| created_at          | updated_at          | deleted_at          | deleted | id |
| 2017-06-21 00:56:06 | 2017-07-07 02:27:16 | NULL                | 0       | 2  |
| 2017-07-07 01:42:48 | 2017-07-07 02:13:14 | 2017-07-07 02:13:42 | 9       | 9  |

Note that the second row has a deleted_at timestamp and a non-zero deleted value (the id of the row).
Nova does a soft delete, which just marks the row as deleted but does not actually remove it from the nova pci_devices table. See [1] and [2].

There is a bug in pci_devices in a scenario where an allocated PCI device can be deleted, e.g. if pci.passthrough_whitelist is changed. Commit [3] tries to resolve it.

[1] - https://github.com/openstack/oslo.db/blob/master/oslo_db/sqlalchemy/models.py#L142-L150
[2] - https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/models.py#L1411
[3] - https://review.openstack.org/#/c/426243/

From: Eddie Yen [mailto:missile0...@gmail.com]
Sent: Tuesday, July 11, 2017 3:18 AM
To: Jay Pipes <jaypi...@gmail.com>
Cc: openstack@lists.openstack.org
Subject: Re: [Openstack] [nova] Database not delete PCI info after device is removed from host and nova.conf

Roger that,
I am going to report this bug on the OpenStack Compute (Nova) Launchpad and see what happens.
Anyway, thanks for your help, I really appreciate it.

Eddie.

2017-07-11 8:12 GMT+08:00 Jay Pipes <jaypi...@gmail.com>:

Unfortunately, Eddie, I'm not entirely sure what is going on with your situation. According to the code, the non-existing PCI device should be removed from the pci_devices table when the PCI manager notices the PCI device is no longer on the local host...

On 07/09/2017 08:36 PM, Eddie Yen wrote:

Hi there,
Is the information I sent enough, or do you need additional items?

Thanks,
Eddie.

2017-07-07 10:49 GMT+08:00 Eddie Yen <missile0...@gmail.com>:

Sorry, here is the updated nova-compute log after removing "1002:68c8" and restarting nova-compute.
http://paste.openstack.org/show/qUCOX09jyeMydoYHc8Oz/

2017-07-07 10:37 GMT+08:00 Eddie Yen <missile0...@gmail.com>:

Hi Jay,
Below are a few logs and some information you may want to check.

I wrote the GPU information into nova.conf like this.

pci_passthrough_whitelist = [{ "product_id":"0ff3", "vendor_id":"10de"}, { "product_id":"68c8", "vendor_id":"1002"}]
pci_alias = [{ "product_id":"0ff3", "vendor_id":"10de", "device_type":"type-PCI", "name":"k420"}, { "product_id":"68c8", "vendor_id":"1002", "device_type":"type-PCI", "name":"v4800"}]

Then I restarted the services.

nova-compute log after adding the new GPU device info to nova.conf and restarting the service:
http://paste.openstack.org/show/z015rYGXaxYhVoafKdbx/

The strange thing is that the log shows the resource tracker only collecting information for the newly set up GPU, not the old one.
But if I perform some action on the instance that has the old GPU, the tracker reports both GPUs.
http://paste.openstack.org/show/614658/

The Nova database shows correct information for both GPUs:
http://paste.openstack.org/show/8JS0i6BMitjeBVRJTkRo/

Now I remove ID "1002:68c8" from nova.conf and from the compute node, and restart the services.

pci_passthrough_whitelist and pci_alias keep only the "10de:0ff3" GPU info.

pci_passthrough_whitelist = { "product_id":"0ff3", "vendor_id":"10de" }
pci_alias = { "product_id":"0ff3", "vendor_id":"10de", "device_type":"type-PCI", "name":"k420" }

The nova-compute log shows the resource tracker reporting that the node only has the "10de:0ff3" PCI resource:
http://paste.openstack.org/show/VjLinsipne5nM8o0TYcJ/

But in the Nova database, "1002:68c8" still exists and stays in "Available" status, even though its "deleted" value is non-zero.
http://paste.openstack.org/show/SnJ8AzJYD6wCo7jslIc2/

Many thanks,
Eddie.

2017-07-07 9:05 GMT+08:00 Eddie Yen <missile0...@gmail.com>:

Uh wait,

Is it possible that it still shows as available because a PCI device still exists at the same address?

When I removed the GPU card, I replaced it with an SFP+ network card in the same slot.
So when I run lspci, the SFP+ card sits at the same address.

But it still doesn't make sense, because these two cards definitely do not have the same VID:PID, and I set the information as VID:PID in nova.conf.

I'll try to reproduce this issue and post a log to this list.

Thanks,

2017-07-07 9:01 GMT+08:00 Jay Pipes <jaypi...@gmail.com>:

Hmm, very odd indeed. Any way you can save the nova-compute logs from when you removed the GPU and restarted the nova-compute service and paste those logs to paste.openstack.org? Would be useful in tracking down this buggy behaviour...

Best,
-jay

On 07/06/2017 08:54 PM, Eddie Yen wrote:

Hi Jay,

The status of the "removed" GPU still shows as "Available" in the pci_devices table.

2017-07-07 8:34 GMT+08:00 Jay Pipes <jaypi...@gmail.com>:

Hi again, Eddie :) Answer inline...

On 07/06/2017 08:14 PM, Eddie Yen wrote:

Hi everyone,

I'm using OpenStack Mitaka (deployed from Fuel 9.2).

At present I have two different models of GPU card installed, and I wrote their information into pci_alias and pci_passthrough_whitelist in nova.conf on the Controller and on the Compute node (the node with the GPUs installed).
Then I restarted nova-api, nova-scheduler, and nova-compute.

When I checked the database, both GPUs were registered in the pci_devices table.

Now I have removed one of the GPUs from the compute node, removed its information from nova.conf, and restarted the services.

But when I check the database again, the information for the removed card still exists in the pci_devices table.

What can I do to fix this problem?

So, when you removed the GPU from the compute node and restarted the nova-compute service, it *should* have noticed you had removed the GPU and marked that PCI device as deleted.
At least, according to this code in the PCI manager:

https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L168-L183

Question for you: what is the value of the status field in the pci_devices table for the GPU that you removed?

Best,
-jay

p.s.
If you really want to get rid of that device, simply remove that record from the pci_devices table. But, again, it *should* be removed automatically...
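
For illustration, the soft-delete behaviour discussed in this thread, and the manual cleanup suggested in the p.s. above, can be sketched roughly as follows. This is only a minimal sketch against an in-memory SQLite table, not nova's code or its real schema: the table carries just a handful of the pci_devices columns, the helper names (soft_delete, live_devices, purge_soft_deleted) are made up for this example, and the "available" status strings are illustrative. The only behaviour taken from the thread is that a soft delete sets deleted to the row id and fills in deleted_at.

```python
import sqlite3
from datetime import datetime, timezone

# Simplified stand-in for nova's pci_devices table (assumption: the real
# table has many more columns and normally lives in MySQL, not SQLite).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE pci_devices (
        id         INTEGER PRIMARY KEY,
        created_at TEXT,
        deleted_at TEXT,
        deleted    INTEGER NOT NULL DEFAULT 0,
        vendor_id  TEXT,
        product_id TEXT,
        status     TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO pci_devices (id, created_at, vendor_id, product_id, status) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (2, "2017-06-21 00:56:06", "10de", "0ff3", "available"),
        (9, "2017-07-07 01:42:48", "1002", "68c8", "available"),
    ],
)


def soft_delete(conn, device_id):
    """Mark a row as deleted without removing it (hypothetical helper)."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    conn.execute(
        "UPDATE pci_devices SET deleted = id, deleted_at = ? "
        "WHERE id = ? AND deleted = 0",
        (now, device_id),
    )


def live_devices(conn):
    """Return only rows that have not been soft-deleted."""
    return conn.execute(
        "SELECT id, vendor_id, product_id FROM pci_devices "
        "WHERE deleted = 0 ORDER BY id"
    ).fetchall()


def purge_soft_deleted(conn):
    """Physically remove soft-deleted rows; returns how many were removed."""
    return conn.execute("DELETE FROM pci_devices WHERE deleted != 0").rowcount


soft_delete(conn, 9)  # the removed "1002:68c8" card
live = live_devices(conn)
all_rows = conn.execute(
    "SELECT id, deleted, status FROM pci_devices ORDER BY id"
).fetchall()
purged = purge_soft_deleted(conn)
remaining = conn.execute("SELECT COUNT(*) FROM pci_devices").fetchone()[0]
```

In this sketch the soft-deleted row keeps its old status value while its deleted column becomes non-zero, which is consistent with what Eddie saw; anything that filters on deleted = 0 no longer sees the row. On a real deployment, take a database backup before hard-deleting rows by hand.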
_______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack@lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack