Agree, existing retire logic in dupreq_finish will not be changed. In addition to this, stale drc objects will be handled in the timeout function. Stale drc objects are drcs which aren't referenced in a while (maintain last_used timestamp in drc and update it on every ref). This stale drc can have upto 1000 dupreq objects.
static void handle_stale_drcs(drc_t *drc) { lock_drc; while (entry_exists_in_list(dupreq_q)) { dv = TAILQ_REMOVE(dupreq_q); dv->next = prev; prev = dv; } unlock_drc; Now the dupreq entries are in the list pointed to by dv; /* At this point only other references to dv are threads which are actively using the dv. * These threads will just do entry_put and not add the dv back to dupreq_q list. * So, we arrested all the double frees. */ while (entry_exists_in_list(dv)) { hk = dv->hk; lock_partition(hk); remove_from_rb_tree(dv); entry_put(dv); unlock_partition(hk); } put_drc(drc); } This is what I had in mind. I could possibly have missed some race. Thanks, Satya. On Thu, May 4, 2017 at 8:07 AM, Malahal Naineni <mala...@gmail.com> wrote: > Matt, you are correct. We lose some memory (drc and dupreqs) for a client > that never reconnects. Doing solely time based strategy is not scalable as > well unless we fork multiple threads for doing this. My understanding is > that there will be one time based strategy (hopefully, the time is long > enough that it does not interfere with current strategy) in __addition__ to > the current retiring strategy. > > Regards, Malahal. > > On Thu, May 4, 2017 at 3:56 AM, Matt Benjamin <mbenja...@redhat.com> wrote: >> >> Hi Guys, >> >> To get on the record here, the current retire strategy using new requests >> to retire old ones is an intrinsic good, particularly with TCP and related >> cots-ord transports where requests are totally ordered. I don't think >> moving to a strictly time-based strategy is preferable. Apparently the >> actually observed or theorized issue has to do with not disposing of >> requests in invalidated DRCs? That seems to be a special case, no? >> >> Matt >> >> ----- Original Message ----- >> > From: "Malahal Naineni" <mala...@gmail.com> >> > To: "Satya Prakash GS" <g.satyaprak...@gmail.com> >> > Cc: "Matt Benjamin" <mbenja...@redhat.com>, >> > nfs-ganesha-devel@lists.sourceforge.net >> > Sent: Tuesday, May 2, 2017 2:21:48 AM >> > Subject: Re: [Nfs-ganesha-devel] drc refcnt >> > >> > Sorry, every cacheable request holds a ref on its DRC as well as its >> > DUPREQ. The ref on DUPREQ should be released when the request goes away >> > (via nfs_dupreq_rele). The ref on DRC will be released when the >> > corresponding DUPREQ request gets released. Since we release DUPREQs >> > while >> > processing other requests, you are right that the DRC won't be freed if >> > there are no more requests that would use the same DRC. >> > >> > I think we should be freeing dupreq periodically using a timed function, >> > something like that drc_free_expired. >> > >> > Regards, Malahal. >> > >> > >> > >> > On Tue, May 2, 2017 at 10:38 AM, Satya Prakash GS >> > <g.satyaprak...@gmail.com> >> > wrote: >> > >> > > > On Tue, May 2, 2017 at 7:58 AM, Malahal Naineni <mala...@gmail.com> >> > > wrote: >> > > > A dupreq will place a refcount on its DRC when it calls xxx_get_drc, >> > > > so >> > > we >> > > > will release that DRC refcount when we free the dupreq. >> > > >> > > Ok, so every dupreq holds a ref on the drc. In case of drc cache hit, >> > > a dupreq entry can ref the >> > > drc more than once. This is still fine because unless the dupreq entry >> > > ref goes to zero the drc isn't freed. >> > > >> > > > nfs_dupreq_finish() shouldn't free its own dupreq. When it does free >> > > > some >> > > > other dupreq, we will release DRC refcount corresponding to that >> > > > dupreq. >> > > >> > > > When we free all dupreqs that belong to a DRC >> > > >> > > In the case of a disconnected client when are all the dupreqs freed ? >> > > >> > > When all the filesystem operations subside from a client (mount point >> > > is no longer in use), >> > > nfs_dupreq_finish doesn't get called anymore. This is the only place >> > > where dupreq entries are removed from >> > > the drc. If the entries aren't removed from drc, drc refcnt doesn't go >> > > to >> > > 0. >> > > >> > > >, its refcount should go to >> > > > zero (maybe another ref is held by the socket itself, so the socket >> > > > has >> > > to >> > > > be closed as well). >> > > > >> > > > >> > > > In fact, if we release DRC refcount without freeing the dupreq, that >> > > would >> > > > be a bug! >> > > > >> > > > Regards, Malahal. >> > > > >> > > Thanks, >> > > Satya. >> > > >> > >> >> -- >> Matt Benjamin >> Red Hat, Inc. >> 315 West Huron Street, Suite 140A >> Ann Arbor, Michigan 48103 >> >> http://www.redhat.com/en/technologies/storage >> >> tel. 734-821-5101 >> fax. 734-769-8938 >> cel. 734-216-5309 > > ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot _______________________________________________ Nfs-ganesha-devel mailing list Nfs-ganesha-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel