FWIW, it may be worth pointing out the existence of llvm/include/llvm/Support/ConstantRange.h
Of course that isn't going to deal with symbolic values... - Daniel On Thu, Feb 5, 2009 at 3:55 PM, Ted Kremenek <[email protected]> wrote: > > On Feb 1, 2009, at 12:06 AM, Ben Laurie wrote: > >> Not complete - no self-tests and full of diagnostics, but I'd love to >> get comments about the approach. Also hints on implementing tests (Ted >> told me in IM, but I forgot), and how to preserve/add more debugging >> stuff in an approved way, since I find that this stuff hurts my head >> without it... >> >> Also, I have cases which fail because higher-level stuff seems to be >> doing the wrong thing, but not sure how to go about debugging the rest >> of static analysis... >> <clang.patch>_______________________________________________ >> cfe-commits mailing list >> [email protected] >> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits > > Hi Ben, > > Sorry for the delay. Comments inline: > > Index: lib/Analysis/BasicConstraintManager.cpp > =================================================================== > --- lib/Analysis/BasicConstraintManager.cpp (revision 63488) > +++ lib/Analysis/BasicConstraintManager.cpp (working copy) > @@ -17,29 +17,364 @@ > #include "clang/Analysis/PathSensitive/GRStateTrait.h" > #include "clang/Analysis/PathSensitive/GRTransferFuncs.h" > #include "llvm/Support/Compiler.h" > +#include "llvm/ADT/FoldingSet.h" > #include "llvm/Support/raw_ostream.h" > > +#include <iostream> > > We try and avoid <iostream>. For my reasons why, see the following in > the LLVM coding standards: > > http://llvm.org/docs/CodingStandards.html#ll_iostream > > +#include <set> > > You're probably going to want to use ImmutableSet, since its a > functional data structure (which works nicely when you embed sets in > state objects). It also doesn't have the malloc traffic that > std::set<> does. > > +class Range : public std::pair<llvm::APSInt, llvm::APSInt> { > +public: > + Range(const llvm::APSInt &from, const llvm::APSInt &to) > + : std::pair<llvm::APSInt, llvm::APSInt>(from, to) { > + std::cerr << "Range(" << from.toString(10) << ", " << > to.toString(10) > + << ')' << std::endl; > + assert(from <= to); > + } > + bool Includes(const llvm::APSInt &v) const { > + return first <= v && v <= second; > + } > + const llvm::APSInt &From() const { > + return first; > + } > + const llvm::APSInt &To() const { > + return second; > + } > > Cute. Does a Range represent "[a, b]" or "(a, b)". Please document > this with comments. > > I also take it that ranges are only over concrete values? e.g., [0, > 2]. What about ranges where: > > (1) one of the ends is "symbolic", e.g. [a, 2] > (2) one of the ends is Unknown, e.g. [4, unknown]. > > You don't have to model (1) now, but I think you probably will need to > model (2). Frequently we have a lower bound but no upper bound > (although I imagine you could use maxint, minint, etc.). > > Note that this approach will inherently not cover relationships > between symbolic values, e.g., tracking if 'a > b' when the values of > 'a' and 'b' are symbolic. > > +struct RangeCmp { > + bool operator()(const Range &r1, const Range &r2) { > + if(r1.From() < r2.From()) { > + assert(!r1.Includes(r2.From())); > + assert(!r2.Includes(r1.To())); > + return true; > + } else if(r1.From() > r2.From()) { > + assert(!r1.Includes(r2.To())); > + assert(!r2.Includes(r1.From())); > + return false; > + } else > + assert(!"Ranges should never be equal in the same set"); > + } > +}; > > I don't this will work. The 'else' branch can be easily triggered if > ranges overlap. Also, is there a reason you are using operator() > instead of operator>() and operator<()? > > + > +typedef std::set<Range, RangeCmp> PrimRangeSet; > > I think you will want to use ImmutableSet. > > + > +class RangeSet; > +std::ostream &operator<<(std::ostream &os, const RangeSet &r); > > Don't use std::ostream. Use llvm::raw_ostream&. > > + > +class RangeSet { > + PrimRangeSet ranges; > + bool noValues; // if true, no value is possible > > My meta-level comment about this class is that isn't clear what it > represents. The code afterwards also seems not sure of what > assumptions to make about RangeSet. It seems that RangeSet represents > a set of (non-overlapping) ranges, e.g. which represent the union of > the possible values for some symbolic integer. In other words, it > represents a disjunction: > > (x >= set1_lb && x <= set1_ub) || (x >= set2_lb && x <= set2_ub) || ... > > With this formalism it should be much more straightforward to answer > the implementation questions for the methods below. It will also tell > you what is the representational power of this constraint model. > > Here "lb" and "ub" mean "lower bound" and "upper bound" respectively. > > + > + static const llvm::APSInt Max(const llvm::APSInt &v) { > + return llvm::APSInt::getMaxValue(v.getBitWidth(), v.isUnsigned()); > + } > + static const llvm::APSInt Min(const llvm::APSInt &v) { > + return llvm::APSInt::getMinValue(v.getBitWidth(), v.isUnsigned()); > + } > + static const llvm::APSInt One(const llvm::APSInt &v) { > + return llvm::APSInt(llvm::APInt(v.getBitWidth(), 1), > v.isUnsigned()); > + } > + > +public: > + RangeSet() : noValues(false) { > + } > + void Profile(llvm::FoldingSetNodeID &ID) const { > + for(PrimRangeSet::const_iterator i = ranges.begin() ; i != > ranges.end() ; ++i) { > + i->Profile(ID); > + } > + } > > This kind of Profile implementation won't be necessary if you use > ImmutableSet. It will do it for you. Moreover, with ImmutableSet, two > sets (created using the same ImmutableSet::Factory) will always be > referentially equivalent, meaning they are a quick pointer > comparison. ImmutableSets also use techniques similar to hash-consing > to share data between sets. > > + > + bool CouldBeNE(const llvm::APSInt &ne) const { > + std::cerr << "CouldBeNE(" << ne.toString(10) << ") " << *this << > std::endl; > > Please don't have debugging messages like this enabled by default. > Either comment it out or #ifdef it. > > + // FIXME: do I mean this? empty means all values are possible, > therefore != possible, right? > + if(ranges.empty()) > + return false; > > I think to answer the questions of these FIXMEs you just need to > define what the invariants are for PrimRangeSet and what it represents. > > + // FIXME: if there are no possible values, then can't be !=, > surely? > + if(noValues) > + return true; > + for(PrimRangeSet::const_iterator i = ranges.begin() ; i != > ranges.end() ; ++i) > + if(i->Includes(ne)) > + return false; > + return true; > + } > > + > + bool CouldBeEQ(const llvm::APSInt &eq) const { > + std::cerr << "CouldBeEQ(" << eq.toString(10) << ") " << *this << > std::endl; > + if(ranges.empty()) > + return true; > + // FIXME: if there are no possible values, then can't be ==, > surely? > + if(noValues) > + return true; > + for(PrimRangeSet::const_iterator i = ranges.begin() ; i != > ranges.end() ; ++i) > + if(i->Includes(eq)) > + return true; > + return false; > + } > > Same comments with CouldBeNE, CouldBeLT, etc.. > > .... > > + // Make all existing ranges fall within this new range > + void Restrict(const llvm::APSInt &from, const llvm::APSInt &to) { > > The comment for this method is a little too terse for me to understand > your algorithm. What is it trying to do? It also looks like you are > modifying the RangeSet object in place. In the static analyzer, we > use functional data structures that preserve old data so that each > GRState object (which represents the analysis state at a particular > point within a path) never have their values mutated after the GRState > object is created. This allows us to faithfully keep full path > information. This allows us to report full path diagnostics when > reporting bugs but also (from a correctness standpoint) allows path > caching/merging to work at all. > > Without making comments on the rest of the patch (which looks similar > to the other parts), I think there just needs to be a high-level > comment of what is the algorithmic model for the constraints, how they > are resolved, what is represented, etc. Right now it is difficult for > me to evaluate the approach itself since I don't really understand the > algorithm. Also, I think that the implementation itself should come > in a straightforward manner once a few key methods (e.g., "CouldBeEQ") > get fleshed out. I'm more than happy to help with this process. > > Ted > _______________________________________________ > cfe-commits mailing list > [email protected] > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits > _______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
