On Wed Feb 11 2015 at 2:04:38 PM Greg Clayton <gclay...@apple.com> wrote:
> > > On Feb 11, 2015, at 11:23 AM, Zachary Turner <ztur...@google.com> wrote: > > > > > > > > On Wed Feb 11 2015 at 11:13:22 AM Greg Clayton <gclay...@apple.com> > wrote: > > > > > On Feb 11, 2015, at 10:18 AM, Zachary Turner <ztur...@google.com> > wrote: > > > > > > A patch is up which fixes a number of issues with strings not being > null terminated in LLDB. The issues and error-proneness of using raw c > strings is well documented and understood, so I would like to propose > banning them in LLDB for new code [1]. > > > > > > I realize that's a tall order, and I don't expect all the occurrences > of them to go away overnight, but issues related to strings have been > popping up over and over again, and it would be nice if we could deal with > more interesting problems. > > > > > > LLVM has a very robust string manipulation library, and while it's not > perfect, it would have prevented almost all of these issues from ever > happening in the first place. Here is a quick summary of common C string > types / operations and their corresponding LLVM equivalents: > > > > > > Types: > > > char stack_buffer[n]; // Use llvm::SmallString<n>. > > > > That would be fine. > > > > > const char* foo; // Use llvm::StringRef foo. > > > > This may be fine as long as there is absolutely _no_ string manipulation > on "foo" (like "strcat(...) etc. Use std::string for anything that requires > storage. And we don't want to start using any code that does "let me assign > an llvm::StringRef() with a ConstString("hello").GetCString() so I don't > have to worry about ownership. Adding string to the constant string pool > should be reserved for things that actually need it (any const names like > namespaces, variable names, type names, paths, etc) and also for when we > return "const char *" through the public API. > > > > This also means that we can't trust that "foo" is NULL terminated > (llvm::StringRef objects have a data pointer and a length and aren't > required to be terminated) and it means we must call "foo.str().c_str()" > any time we require NULL termination. So I actually don't like > llvm::StringRef as a replacement "const char *"... > > > We can't assume that const char*s are null terminated either. > > Yes we can, that is was is assumed by const char * right now for every > call we use it with. > Right, but my point is that if we change a const char * to a StringRef(), and that const char* was assumed to be null terminated, then the StringRef is also assumed to be null terminated. You can just call StringRef::data() on it. > > But that isn't what is currently meant by _any_ of the functions that take > a "const char *" with no length parameter, so it means that any switch to > using llvm::StringRef requires we use ".str().c_str()" which I would like > to avoid. > I don't see why this is necessary. Just because StringRef supports non-null terminated strings doesn't mean we have to use non null terminated strings with it. declaring a SmallString<> and passing by SmallVectorImpl& supports null terminating strings. You push_back(0) and then pop_back() before returning. This is no different from what we're already doing by setting foo[n] = '\0' except that it's safer since there's no chance of a buffer overrun. We are already taking care to make sure all of our const char*s are properly null terminated. Therefore when they are passed to functions that accept a StringRef, they will continue to be null terminated. > > > > > > > > > Operations: > > > strcpy, strncpy, etc // Use member functions of SmallString / > StringRef > > > sprintf, snprintf, etc // Use llvm::raw_ostream > > > > We have StringStream, people should be using that. raw_ostream doesn't > support all printf style formats. So all sprintf and snprintf function > calls should switch to using lldb_private::StreamString. > > > I actually disagree here. Printf() has plenty of problems, such as > non-portable and confusing arguments, > > I agree on this one, but we are getting around this with using PRI macros. > > > lack of type safety, > > most of this is taken care of when you have the printf format warnings on > your class. > > > possibility of buffer overruns. > > There are no buffer overruns in the Stream::Printf(). Check the > implementation if you need to. > > > I would actually prefer to use raw_ostream instead of StringStream. > > I don't. I have written tools using streaming style stuff before (like > std::cout) and also written using printf style stuff and I find the printf > style stuff more readable and maintainable. I also don't like streams that > maintain state like the std::ostream does. Not sure if raw_ostream does any > state management, but when they do it causes bugs later for code that > doesn't always explicitly set the format, width, precision and much more: > > std::ostream &os; > os << std::hex << hex_num; > dump(os); > os << hex_num2; > > little did we know the stream was modified by "dump()" to set the format > to decimal so now you must always use: > > os << std::hex << hex_num2; > > (repeat for number width, precision and many other factors). > I agree, that's not a pleasant experience. but as an aside, llvm::raw_ostream does not do this.
_______________________________________________ lldb-dev mailing list lldb-dev@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev