Paul Cassella wrote: > On Fri, 2 Jun 2023, Ira Weiny wrote: > > Paul Cassella wrote: > > > On Sat, 3 Dec 2022, Ira Weiny wrote: > > > > On Sat, Dec 03, 2022 at 09:58:58AM +0000, Yongqiang Liu wrote: > > > > > > We should always call dax_region_put() whenever devm_create_dev_dax() > > > > > succeed or fail to avoid refcount leak of dax_region. Move the return > > > > > value check after dax_region_put(). > > > > > I think dax_region_put is called from dax_region_unregister() > > > > automatically on > > > > tear down. > > > > Note the reference dax_region_unregister() will be putting is the one > > > devm_create_dev_dax() takes by kref_get(&dax_region->kref). I think > > > dax_hmem_probe() needs to put its reference in the error case, as in the > > > successful case. > > > Looking at this again I'm inclined to agree that something is wrong. But > > I'm not sure this patch fixes it. anything. > > > When you say: > > > > > ... I think > > > dax_hmem_probe() needs to put its reference in the error case, as in the > > > successful case. > > > > ... which kref_get() is dax_hmem_probe() letting go? > > Isn't it letting go of the initial kref_init() reference from > alloc_dax_region()?
Oh wow! I did not realize that about kref_init()... :-( > > Sorry, I had gone through the code a little carelessly yesterday. Now I > think that kref_init() reference is the one that dax_hmem_probe() is > dropping in the success case, and which still needs to be dropped in the > error case. yea ok... > > (If so, I think the alloc_dax_region() path that returns NULL on > devm_add_action_or_reset() failure, releasing the kref_get reference, will > leak the kref_init reference and the region.) I see now. The ref is taken prior to the add action or reset to ensure the dax_region does not go away should the platform device go away... However, in all the call paths currently the device passed to alloc_dax_region() can't go away prior to the dev_dax taking a reference. So I don't think this extra reference is required. :-/ As you say the ref counting right now is not correct on error flows. But I'm torn on the correct solution. I think a small series broken out so it can be backported if needed would be best. Ira