On 10/02/2016 11:51 PM, Amos Jeffries wrote: > On 3/10/2016 1:03 p.m., Alex Rousskov wrote: >> On 10/02/2016 03:25 PM, Kinkie wrote: >>> On Fri, Sep 30, 2016 at 6:03 PM, Alex Rousskov >>>> Overall, I know of three primary ways to implement c_str(): >>>> >>>> 1. Always 0-terminate SBuf-used storage. Safest implementation possible, >>>> but has serious negative performance overheads, especially for tokenizing.
>>> If this is the decision, might as well drop sbuf altogether in favor >>> of std::string (or possibly std::string + custom mempools-friendly >>> Allocator). >> You are probably right, although std::string may have other funny things >> going that we might want to avoid/control. > Yes. [...] I am dropping this part of the discussion until somebody proposes to do #1. The discussion contains wrong statements IMO but I will ignore them in hope that they remain irrelevant (because nobody will do #1). >>>> 2. Terminate the used storage at the time of c_str(). >>>> >>>> 2a. Increment the used storage size. Current implementation. >>>> Leads to problems discussed in this thread. >>>> >>>> 2b. Do not increment the used storage size. >>>> Leads to other problems discussed in this thread. >>>> >>>> 3. Allocate and store a new c-string as needed, copying from the >>>> storage. Safest implementation possible but has serious negative >>>> performance overheads. These overheads affect different use cases than >>>> those of #1. >>> If we are aware of the underlying issue, there's no reason not to save >>> the temporary to a naked char* and use that, as long as the SBuf is >>> not touched it'll work fine. > I imagined a cached const char* pointer inside SBuf that c_str() > produces if it is unset. The cow() operations freeing that cached pointer. A pointer to what? A pointer to buf() is approach #2. A pointer to specially allocated storage is approach #3. >>>> Going forward, do you think we should: >>>> >>>> i. Keep using #2a and rewrite the known double-c_str() code. > > That is two options. I think we should fix the double-c_str() code > regardless of any other changes. Implementations #1, #2b, and #3 make double-c_str() code perfectly safe. We should not rewrite it. > These are definitely performance bugs > with the 2nd call triggeringd reallocate regardless of other bad side > effects. Implementations #1, #2b, and #3 do not result in poor double-c_str() performance. They make double-c_str() performance very close to a single c_str() performance. >>>> ii. Make double-c_str() code safe (i.e., switch to #2b). >>>> iii. Add code to make the existing #2 implementation safer. >>>> iiii. Switch to a completely different implementation like #1 or #4. > There are very few things we can do will resolve this in such a short > timeframe. Agreed. > Any choice we make to alter the design will take a long time to > check every single code path works okay. Not really -- if the API stays the same or becomes even safer, then the impossible task of checking every single code path becomes unnecessary. > Another option is: > > vi. drop c_str() completely and use SBufToCstring() or SBufToString() as > ways to obtain char*. How will the proposed SBufToString() differ from the existing "c_str"? If you are proposing that SBufToString() will allocate memory that the calling code will have to clean up explicitly, then I am against that idea because it makes the API worse, especially in the presence of exceptions. > Adjusting callers as necessary for those not to leak memory or be too > bad for performance will be easier than a full code audit. I do not think any of the other options require a "full code audit" either so this is not a valid comparison point. Alex. _______________________________________________ squid-dev mailing list squid-dev@lists.squid-cache.org http://lists.squid-cache.org/listinfo/squid-dev