Merged as revision 14210 with some minor added polish (ostream<< for HttpHdrCcType and Http::HdrType).
On Wed, Aug 5, 2015 at 3:58 PM, Kinkie <[email protected]> wrote: > > > On Wed, Aug 5, 2015 at 4:08 AM, Amos Jeffries <[email protected]> > wrote: > >> On 5/08/2015 9:08 a.m., Kinkie wrote: >> > On Tue, Aug 4, 2015 at 2:58 PM, Amos Jeffries wrote: >> > >> >> On 4/08/2015 11:22 p.m., Kinkie wrote: >> >>> Hi all, >> >>> the attached patch is a build- and run-tested merge proposal for >> next >> >>> round of the coverity-fixes branch, currently focusing on more >> effective >> >>> header name -> id lookups. >> >>> >> >>> Here's the status with the current todo checklist: >> >>> >> >>> * DONE 1. shift HDR_BAD_HDR to end of enum >> >>> * DONE 2. shift headers data array to http/RegistredHeaders.cc >> >>> * DONE 3. creatign LookupTable object from teh enum and array >> >>> * (with HDR_BAD_HDR as invalid value) >> >>> * DONE 4. replacing httpHeaderIdByName() uses with the lookup table >> >>> * NOT POSSIBLE 5. merge HDR_BAD_HDR and HDR_ENUM_END into one thing - >> >>> HDR_ENUM_END is overloaded meaning "All" headers in Manglers. >> >>> * DONE 6. replacing httpHeaderNameById with direct array lookups >> >>> * DONE 7. being looking at the other arrays removal >> >>> >> >>> In working on this I found out several instances of enum abuses - >> >> tracking >> >>> those down has been the hardest part of the effort. >> >>> HttpHeader::parse is being used to parse error page templates - thus >> the >> >>> relaxed any_registered_header() checks in some methods, e.g. >> >>> HttpHdeader::addEntry(). >> >>> >> >>> Next steps, if there is consensus: >> >>> - moving LookupTable away from std::map to std::hash with custom >> >>> gperf-derived Hashers for extra boost >> >>> - investigating whether strongly-typed enums can be used instead of >> >> C-style >> >>> enums in more places. >> >>> - moving away from homegrown bitfields (CBIT_TEST etc.) towards >> >>> std::vector<bool> or std::vector<unsigned char>, possibly via a class >> >>> bitfield or somesuch. >> >>> >> >> >> >> audit results: >> >> >> >> * httpHdrCcCleanModule() and httpHdrScCleanModule() can both be fully >> >> deleted now. >> >> >> > >> > Yes. I chose to let them in: they are NOPs, not in the critical path, >> and >> > may be useful in the future. Let me know if you still want them removed. >> > >> >> :-( more 'dead' code. This kind of thing is useful for >> ctpr/dtor/functor/virtual definition. But C functions that are not even >> used as functors is a waste of LOC and also compiler time dealing with >> the symbols. >> Its minor but still technical debt. The long-term plan is also to use >> RegisteredRunners for this type of thing not C functions. >> > > Ok. Removed all empty module cleanup functions - which are only invoked if > LEAKCHECK is enabled anyway. > > >> * LookupTableRecord should be a class. >> >> - custom storage types can then inherit from it, with it as the first >> >> parent >> >> >> > >> > Well, templatization can prevent that need, but sure. Should I implement >> > that? >> > Isn't nowadays a struct == all-public class though? It looks odd, but >> > functionally is the same >> > >> >> Avoid "looks odd" whenever possible :-) as you say, the compiler dont >> care. But this is about us humans quick and easy reading it in 10 years >> time. >> >> I think its okay for useful code optimizations. Bt then the design >> choice needs documenting so nobody breaks or undoes it casually. >> > > Ok. > > >> > >> >> in src/HttpHeader.cc: >> >> >> >> * can you move the headerLookupTable to src/http/RegisteredHeaders.cc >> >> and the Http:: namespace as well please ? >> >> >> > >> > Moved to RegisteredHeaders and renamed HDR_FOO to Http::HdrType::FOO . >> > Moving to a strongly-typed enum is unfortunately not feasible as eCAP >> > requires integer-enum equivalence; it may be that the whole change has >> to >> > be reverted. >> > >> >> I thought "enum class" with first parameter value assigned 0 >> ("HDR_ACCEPT = 0,") should do that. ie. strongly typed to unsigned >> integer values. >> > > No. > strongly-typed == doesn't automatically cast to int, if you use > strongly-typed you have to static_cast, while for old-style enum casting is > automatic. > > in c++11 the full enum syntax (square brackets mean optional) is > enum [ class ] enumName [ : storage_specification ] { > enum_elem_1 [ = elem-1-representation ], > ... > }; > > where storage_specification is the underlying integral data type. > Using the optional "class" specification disables auto-cast-to-int. > Elements can then be referenced by enumName::enum_elem_name , where the > enumName specification is mandatory for "enum class", optional for > old-style enum. > > BTW: I've tried changing the underlying storage to unsigned, and I get > asserts in masks calculation. > > >> >> * any_registered_header() is wrong. >> >> - it matches for HDR_OTHER which by definition is a non-registered >> header >> >> - assert_eid() is equivalent to any_valid_header() >> >> >> > >> > Then we need to change name. We need two semantics: >> > 1) any header which is valid and defined (including OTHER) >> > 2) any header ID which will not go out-of-bounds (same as (1) + >> HDR_BAD_HDR) >> > >> > any suggestion? >> > >> >> (1) is already any_valid_header(). >> >> (2) any_HdrType_enum_value() >> >> But where/why do we need (2) ? >> Any value input or received that is non-(1) needs to be represented as >> BAD_HDR when parsing/validating the input to an enum value. >> > > It's sprinkled all over the place, mostly meant to protect against > int-related abuses which may end up causing out of bounds in arrays, > vectors and masks. Moving to enum class would mean implicit safety in this > regard, but would require accessor functions to ensure consistent checks in > typecasts. Hello again HdrIdFromName and HdrDescFromId. I'm cool with it, > if that's a wise design decision in your opinion. It'd save the need of a > dozen asserts or so. > > >> Are you looking at things HttpHeader.cc line ~524: >> >> AA) >> >> + if (e->id >= Http::HdrType::ENUM_END) { >> debugs(55, DBG_CRITICAL, "BUG: invalid entry (" << e->id << ")... >> >> That should probably be if(!any_valid_header(e->id)) since BAD is also >> invalid entry value for a header. >> > > aha, got it and turned in all places where ENUM_END is used to this end. > > >> If the logic there explicitly requires BAD handling, it should be: >> // some reason for why BAD is accepted as 'valid'. >> if(!any_valid_header(e->id) && e->id != BAD) >> >> Although note that when parsing is fixed BAD will represent *any* >> invalid value. >> >> >> or BB) the header manglers "All" ? >> >> For that I think we need a new enum value like "Other," ("All," ?) >> which is outside registered headers, but valid only as an enum entry. >> Currently END, but that needs a rename and never to be used by >> non-mangler code. >> > > > > >> >> >> >> >> >> - please consider removing the assert(assert(any_valid_header(id))) >> >> it could probably be replaced by: >> >> if (!assert(any_valid_header(id))) >> >> id = HDR_OTHER >> >> >> > >> > Huh? I can't find any nested assert >> > >> >> copy-paste error on my part. It is the assert you are adding three lines >> below what is now "got hdr-id=" in the chunk @1512. >> >> I can't see the function name, but it looks like parsing some input. >> Avoid assert in all parsers. Output BAD or OTHER as appropriate instead. >> > > It's HttpHeader::parse. > That assert is actually useless: it's set by a HeaderLookupTable.lookup. > It's guaranteed to be valid (if BAD, and the BAD case is handled by > reassigning to OTHER. > Removing that assert full stop. > > >> I been thinking a bit more (sorry dangerous). >> >> It may be worth adjusting the src/mk-string-arrays.awk script to also be >> capable of output LookupTable<X>::Record arrays. That script is designed >> to guarantee invariance. Though it does leave us with another .cc. >> Anyhow thats a followup to think about, not in this scope. >> > > I agree on it not being in scope. Also because if we wish to use gperf to > generate the perfect hashes, then gperf could (and thus should) do that for > us (see > http://www.gnu.org/software/gperf/manual/html_node/User_002dsupplied-Struct.html#User_002dsupplied-Struct > ) > > >> second round audit: >> >> in src/HttpHdrCc.cc: >> >> * valid_id in httpHdrCcStatDumper() can probably be a const bool like >> the other. >> > > Yes. Done. > > in src/HttpHdrSc.cc: >> >> * useless include of dlink.h since its done in the .h >> > > Ok. > > >> >> * a duplicate include of HttpHdrSc.h since its pulled in by >> HttpHdrScTarget.h. >> - but in this case I think replece it with: >> //#include "HttpHdrSc.h" // pulled in by HttpHdrScTarget.h >> > > Done. Good idea, it'd look odd otherwise. > > >> >> in src/HttpHdrSc.h: >> >> * please use mem/forward.h instead of mem/AllocatorProxy.h >> > > Ok > > >> >> * please store the class pre-defines alphabetically. >> - Packable above StatHist. >> > > Done. > >> >> >> in src/HttpHdrScTarget.cc: >> >> * useless include of HttpHdrSc.h >> - thanks for fixing the missing HttpHdrScTarget.h >> > > Done. NP :) > > >> in src/HttpHdrScTarget.h: >> >> * now has useless include of mem/forward.h (after above fix), >> SquidString.h, dlink.h, typedefs.h >> >> Ok. > > >> in src/HttpHeader.cc: >> >> * please take the opportunity to remove useless includes overlapping >> with HttpHdrCc.h and HttpHdrSc.h >> > > Done. I've commented them and mentioned the indirect inclusion in a > comment. > > >> * please replace include fof HttpHdrSc.h with HttpHdrScTarget.h in the >> new include nesting. >> - also checking for duplicate includes between HttpHdrScTarget.h and >> HttpHeader.cc >> > > Ok. > > >> * chunk @562, s/NULL/nullptr/ >> - maybe others in the touched lines >> > > Done. > > >> >> * HttpHeaderEntry dtor use of assert(any_valid_header(id)); at the top >> is needless. >> - protect the stats accounting against BAD though >> > > Done. > > >> >> * HttpHeaderEntry ctor accepts OTHER but also "anId != BAD" implies that >> it accepts BAD as well despite the previous >> assert(any_registered_header(anId)). >> - wrap the stats accounting in if (id != BAD). >> > > done > > >> >> - fixing those ctor/dtor asserts will avoid any potential for assertion >> from temporary default- or partially- constructed objects, or bad >> emplacement destructions. >> > > ..which are not there, or we'd assert all over the place. But sure. > > >> >> >> in src/acl/RequestHeaderStrategy.h: >> >> * please take the opportunity to remove that empty line between >> 'template' and 'class' at lines at ~14 >> > > Done > > >> in src/http/RegisteredHeaders.cc: >> >> * please add empty line between namespace '{' and start of comment. >> > > Ok > > >> >> >> in src/http/RegisteredHeaders.h: >> >> * please document on HeaderTable that lookup() will return BAD if the >> header is unknown / unregistered. callers need to check and handle that >> case properly. >> > > Ok. > > >> I think that is all. And all polish :-) >> >> +1 on this so far. Though you/we still need to sort the valid registered >> ID tests before commit. >> > > I think I have fixed those, although it's hard to be sure - layering > violations all over the place. > > >> NOTE: these are about old bugs you are uncovering. I'd like to take the >> opportunity to fix them. But this is already huge, so for now just XXX >> mark it and followup patch. >> > > > Like many other areas, there is (plenty of) room for improvement :) > If not performance-wise, certainly code-wise. > I'd leave them as a follow-up effort. > > >> >> * I'm still not clear on why insertEntry() checks for valid header enum >> values but addEntry() needs to accept BAD. >> - if that need is true its worth a comment to document. otherwise >> update top of addEntry() to match insertEntry(). >> - NP: we should *never* be adding a BAD/invalid header entry to a >> message object. Custom ones yes, but they are OTHER. >> >> >> * like has() all the has/put/get*(ID, ...) methods need to exclude BAD >> and OTHER as a valid inputs. >> - OTHER means we are accessing and using the Entry by string name, not >> by ID value. >> - BAD means we should never have put it into object in the first place. >> - the put*(ID, ...) accepting OTHER may be where addEntry(ID,...) goes >> wrong. >> - sorry these seem to have got caught in cleaning up one of my earlier >> comments which was about another changed use of assert_eid(). >> assert_eid() test was wrong for these get/put/has*(ID,...) to begin with. >> >> >> [[ httpHeaderFieldStatDumper() earning itself a place in the hall of >> horrors :-( >> ..., double val, ... >> const int id = static_cast<int>(val); >> const bool valid_id = id < Http::HdrType::ENUM_END; >> // back away, quietly ... >> ]] >> >> > > -- > Francesco > -- Francesco
_______________________________________________ squid-dev mailing list [email protected] http://lists.squid-cache.org/listinfo/squid-dev
