To conclude and ensure I understood both: - I've sent a patch to remove extra_free parameter (and the associated gf_memdup() in 2 places) - https://review.gluster.org/#/c/glusterfs/+/23999/ (waiting for CI to finish working on it). - I'll send a patch to remove extra_stdfree - but this one is slightly bigger: 1. There are more places it's set (18). 2. Somehow, we need to release that memory. I kinda assume the sooner the better? So perhaps I can just replace: dict->extra_stdfree = cli_req.dict.dict_val; with: free(cli_req.dict.dict_val) ? 3. In some places, there was actually a call to GF_FREE, which is kind of confusing. For example, in __server_get_snap_info() : if (snap_info_rsp.dict.dict_val) { GF_FREE(snap_info_rsp.dict.dict_val); }
I think I should remove that and stick to freeing right after unserialization? Y. On Thu, Jan 9, 2020 at 12:42 PM Xavi Hernandez <jaher...@redhat.com> wrote: > On Thu, Jan 9, 2020 at 11:11 AM Yaniv Kaul <yk...@redhat.com> wrote: > >> >> >> On Thu, Jan 9, 2020 at 11:35 AM Xavi Hernandez <jaher...@redhat.com> >> wrote: >> >>> On Thu, Jan 9, 2020 at 10:22 AM Amar Tumballi <ama...@gmail.com> wrote: >>> >>>> >>>> >>>> On Thu, Jan 9, 2020 at 2:33 PM Xavi Hernandez <jaher...@redhat.com> >>>> wrote: >>>> >>>>> On Thu, Jan 9, 2020 at 9:44 AM Amar Tumballi <ama...@gmail.com> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Thu, Jan 9, 2020 at 1:38 PM Xavi Hernandez <jaher...@redhat.com> >>>>>> wrote: >>>>>> >>>>>>> On Sun, Dec 22, 2019 at 4:56 PM Yaniv Kaul <yk...@redhat.com> wrote: >>>>>>> >>>>>>>> I could not find a relevant use for them. Can anyone enlighten me? >>>>>>>> >>>>>>> >>>>>>> I'm not sure why they are needed. They seem to be used to keep the >>>>>>> unserialized version of a dict around until the dict is destroyed. I >>>>>>> thought this could be because we were using pointers to the unserialized >>>>>>> data inside dict, but that's not the case currently. However, checking >>>>>>> very >>>>>>> old versions (pre 3.2), I see that dict values were not allocated, but a >>>>>>> pointer to the unserialized data was used. >>>>>>> >>>>>> >>>>>> Xavi, >>>>>> >>>>>> While you are right about the intent, it is used still, at least when >>>>>> I grepped latest repo to keep a reference in protocol layer. >>>>>> >>>>>> This is done to reduce a copy after the dictionary's binary content >>>>>> is received from RPC. The 'extra_free' flag is used when we use a >>>>>> GF_*ALLOC()'d buffer in protocol to receive dictionary, and extra_stdfree >>>>>> is used when RPC itself allocates the buffer and hence uses 'free()' to >>>>>> free the buffer. >>>>>> >>>>> >>>>> I don't see it. When dict_unserialize() is called, key and value are >>>>> allocated and copied, so why do we need to keep the raw data after that ? >>>>> >>>>> In 3.1 the value was simply a pointer to the unserialized data, but >>>>> starting with 3.2, value is memdup'ed. Key is always copied. I don't see >>>>> any other reference to the unserialized data right now. I think that >>>>> instead of assigning the raw data to extra_(std)free, we should simply >>>>> release that memory and remove those fields. >>>>> >>>>> Am I missing something else ? >>>>> >>>> >>>> I did grep on 'extra_stdfree' and 'extra_free' and saw that many >>>> handshake/ and protocol code seemed to use it. Haven't gone deeper to check >>>> which part. >>>> >>>> [amar@kadalu glusterfs]$ git grep extra_stdfree | wc -l >>>> 40 >>>> [amar@kadalu glusterfs]$ git grep extra_free | wc -l >>>> 5 >>>> >>> >>> Yes, they call dict_unserialize() and then store the unserialized data >>> into those variables. That's what I'm saying it's not necessary. >>> >> >> In at least 2 cases, there's even something stranger I could not >> understand (see in bold - from server_setvolume() function) : >> * params = dict_new();* >> reply = dict_new(); >> ret = xdr_to_generic(req->msg[0], &args, >> (xdrproc_t)xdr_gf_setvolume_req); >> if (ret < 0) { >> // failed to decode msg; >> req->rpc_err = GARBAGE_ARGS; >> goto fail; >> } >> ctx = THIS->ctx; >> >> this = req->svc->xl; >> /* this is to ensure config_params is populated with the first brick >> * details at first place if brick multiplexing is enabled >> */ >> config_params = dict_copy_with_ref(this->options, NULL); >> >> * buf = gf_memdup(args.dict.dict_val, args.dict.dict_len);* >> > > This is probably unnecessary if we can really remove extra_free. Probably > it's here because args.dict.dict_val will be destroyed later. > > >> if (buf == NULL) { >> op_ret = -1; >> op_errno = ENOMEM; >> goto fail; >> } >> >> * ret = dict_unserialize(buf, args.dict.dict_len, ¶ms);* >> if (ret < 0) { >> ret = dict_set_str(reply, "ERROR", >> "Internal error: failed to unserialize " >> "request dictionary"); >> if (ret < 0) >> gf_msg_debug(this->name, 0, >> "failed to set error " >> "msg \"%s\"", >> "Internal error: failed " >> "to unserialize request dictionary"); >> >> op_ret = -1; >> op_errno = EINVAL; >> goto fail; >> } >> >> >> >> * params->extra_free = buf; buf = NULL;* >> > > This is needed to prevent memory pointed by buf from being destroyed once > if has been assigned to extra_free. Also unnecessary if extra_free can be > removed. > > So we could eliminate one memory copy here if those fields are not really > necessary. > > >>> >>>> >>>>> >>>>> >>>>>> >>>>>>> I think this is not needed anymore. Probably we could remove these >>>>>>> fields if that's the only reason. >>>>>>> >>>>>> >>>>>> If keeping them is hard to maintain, we can add few allocation to >>>>>> remove those elements, that shouldn't matter much IMO too. We are not >>>>>> using >>>>>> dictionary itself as protocol now (which we did in 1.x series though). >>>>>> >>>>>> Regards, >>>>>> Amar >>>>>> --- >>>>>> https://kadalu.io >>>>>> >>>>>> >>>>>> >>>>>>> TIA, >>>>>>>> Y. >>>>>>>> _______________________________________________ >>>>>>>> >>>>>>>> Community Meeting Calendar: >>>>>>>> >>>>>>>> APAC Schedule - >>>>>>>> Every 2nd and 4th Tuesday at 11:30 AM IST >>>>>>>> Bridge: https://bluejeans.com/441850968 >>>>>>>> >>>>>>>> >>>>>>>> NA/EMEA Schedule - >>>>>>>> Every 1st and 3rd Tuesday at 01:00 PM EDT >>>>>>>> Bridge: https://bluejeans.com/441850968 >>>>>>>> >>>>>>>> Gluster-devel mailing list >>>>>>>> Gluster-devel@gluster.org >>>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-devel >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>> >>>>>>> Community Meeting Calendar: >>>>>>> >>>>>>> APAC Schedule - >>>>>>> Every 2nd and 4th Tuesday at 11:30 AM IST >>>>>>> Bridge: https://bluejeans.com/441850968 >>>>>>> >>>>>>> >>>>>>> NA/EMEA Schedule - >>>>>>> Every 1st and 3rd Tuesday at 01:00 PM EDT >>>>>>> Bridge: https://bluejeans.com/441850968 >>>>>>> >>>>>>> Gluster-devel mailing list >>>>>>> Gluster-devel@gluster.org >>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-devel >>>>>>> >>>>>>>
_______________________________________________ Community Meeting Calendar: APAC Schedule - Every 2nd and 4th Tuesday at 11:30 AM IST Bridge: https://bluejeans.com/441850968 NA/EMEA Schedule - Every 1st and 3rd Tuesday at 01:00 PM EDT Bridge: https://bluejeans.com/441850968 Gluster-devel mailing list Gluster-devel@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-devel