Re: [Lldb-commits] [lldb] r282079 - Make lldb::Regex use StringRef.

2016-09-22 Thread Zachary Turner via lldb-commits
+1. One of the points I outlined on my list was to use llvm streams. So
that also solves the problem
On Thu, Sep 22, 2016 at 4:54 AM Pavel Labath  wrote:

> On 22 September 2016 at 01:00, Greg Clayton via lldb-commits
>  wrote:
> > If you use a StringRef in printf, please use "%*s" and then put the
> count and data pointer into the printf, so the above line would become:
> >
> > +  s->Printf("source regex = \"%*s\", exact_match = %d",
> > +(int)m_regex.GetText().size(), m_regex.GetText().data(),
> m_exact_match);
>
> I think this is a very good argument to use a c++-like api for
> streams/logging:
>
> s << "source regex = \"" << m_regex.GetText() << "\", exact_match = "
> << m_exact_match;
>
> It's shorter and much less error-prone.
>
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


Re: [Lldb-commits] [lldb] r282079 - Make lldb::Regex use StringRef.

2016-09-22 Thread Pavel Labath via lldb-commits
On 22 September 2016 at 01:00, Greg Clayton via lldb-commits
 wrote:
> If you use a StringRef in printf, please use "%*s" and then put the count and 
> data pointer into the printf, so the above line would become:
>
> +  s->Printf("source regex = \"%*s\", exact_match = %d",
> +(int)m_regex.GetText().size(), m_regex.GetText().data(), 
> m_exact_match);

I think this is a very good argument to use a c++-like api for streams/logging:

s << "source regex = \"" << m_regex.GetText() << "\", exact_match = "
<< m_exact_match;

It's shorter and much less error-prone.
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


Re: [Lldb-commits] [lldb] r282079 - Make lldb::Regex use StringRef.

2016-09-21 Thread Zachary Turner via lldb-commits
But the thing is, there were plenty of copies being made before.  People
just didn't see them.  StringRef provides functions like find_first,
substr, and all kinds of stuff.  So before, any time someone wanted to use
one of those functions, what did they do?  Put it in a std::string!
There's your copy.  Even worse, you might think that once you've got a
std::string, at least you don't have to make any more copies, but even
*that* isn't true.  If you want to call std::string::substr(), that
mandates a copy.  Even if you don't put it in a std::string, if you use
strchr or whatever to find the index and calculate your substring c-style,
and you want to pass that to a function taking a const char*, now you have
to put it in a std::string so you can null terminate it, so there's another
copy.

So yes, there's a few more temporary ones while the conversion is in
progress, but there's also some that are removed such as in the above
examples.  I'm not worried too much about the temporary ones because in the
long run the net effect will be a huge reduction in number of copies and
strlens.

On Wed, Sep 21, 2016 at 5:23 PM Greg Clayton  wrote:

>
> > On Sep 21, 2016, at 5:14 PM, Zachary Turner  wrote:
> >
> >
> >
> > On Wed, Sep 21, 2016 at 5:00 PM Greg Clayton  wrote:
> > Please submit a change requests when doing these kinds of things and let
> people comment on the changes before committing such things.
> >
> > We deleted functions that were correctly using "const char *" like:
> >
> >   bool Execute(llvm::StringRef string, Match *match = nullptr) const;
> >   bool Execute(const char *, Match * = nullptr) = delete;
> >
> > Yet inside these functions you do "string.str().c_str()" because these
> need to be null terminated? this indicates that StringRef is NOT the
> correct choice here. You will make heap copy of all strings (if they are
> long enough and regex matching string can be very long) that you compile or
> execute. We already make a copy of the string when we compile, so StringRef
> didn't really save us anything on compile and execute is always making a
> copy just to run the expression on. I am fine with both being there in case
> one might be more efficient, but taking them out just to use a less
> efficient version that uses llvm::StringRef is not the kind of changes we
> should be doing all over.
> >
> > We will make copies of all strings in all of the following changes:
> >
> > - Unneeded copy:
> >
> > -new TypeNameSpecifierImpl(regex->GetText(), true));
> > +new TypeNameSpecifierImpl(regex->GetText().str().c_str(),
> true));
> > All of these copies are only temporary.  As I've said over and over,
> once everything is StringRef all the way down, the copies all disappear.
> It's only a copy because TypeNameSpecifierImpl doesn't take a STringRef,
> and I didn't want to change the whole codebase all at once, but rather
> incrementally.
>
> As part of these changes we absolutely should add StringRef to these
> classes and we should just make it part of the changes. It know these
> strings are temporary, I am not worried about memory, I am worried about
> efficiency of the heap copy of the string that temporarily is created
> during this call. If things were taking "const char *" before, copies were
> not being made. If they were being made, then they were probably being made
> for good reason. I would love to see the places where all of these copies
> were removed, but if they were "const char *" there were no copies going
> in, unless they needed to. If they didn't need to make the copies, then we
> can fix that function.
>
>
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


Re: [Lldb-commits] [lldb] r282079 - Make lldb::Regex use StringRef.

2016-09-21 Thread Greg Clayton via lldb-commits

> On Sep 21, 2016, at 5:14 PM, Zachary Turner  wrote:
> 
> 
> 
> On Wed, Sep 21, 2016 at 5:00 PM Greg Clayton  wrote:
> Please submit a change requests when doing these kinds of things and let 
> people comment on the changes before committing such things.
> 
> We deleted functions that were correctly using "const char *" like:
> 
>   bool Execute(llvm::StringRef string, Match *match = nullptr) const;
>   bool Execute(const char *, Match * = nullptr) = delete;
> 
> Yet inside these functions you do "string.str().c_str()" because these need 
> to be null terminated? this indicates that StringRef is NOT the correct 
> choice here. You will make heap copy of all strings (if they are long enough 
> and regex matching string can be very long) that you compile or execute. We 
> already make a copy of the string when we compile, so StringRef didn't really 
> save us anything on compile and execute is always making a copy just to run 
> the expression on. I am fine with both being there in case one might be more 
> efficient, but taking them out just to use a less efficient version that uses 
> llvm::StringRef is not the kind of changes we should be doing all over.
> 
> We will make copies of all strings in all of the following changes:
> 
> - Unneeded copy:
> 
> -new TypeNameSpecifierImpl(regex->GetText(), true));
> +new TypeNameSpecifierImpl(regex->GetText().str().c_str(), true));
> All of these copies are only temporary.  As I've said over and over, once 
> everything is StringRef all the way down, the copies all disappear.  It's 
> only a copy because TypeNameSpecifierImpl doesn't take a STringRef, and I 
> didn't want to change the whole codebase all at once, but rather 
> incrementally.

As part of these changes we absolutely should add StringRef to these classes 
and we should just make it part of the changes. It know these strings are 
temporary, I am not worried about memory, I am worried about efficiency of the 
heap copy of the string that temporarily is created during this call. If things 
were taking "const char *" before, copies were not being made. If they were 
being made, then they were probably being made for good reason. I would love to 
see the places where all of these copies were removed, but if they were "const 
char *" there were no copies going in, unless they needed to. If they didn't 
need to make the copies, then we can fix that function.

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


Re: [Lldb-commits] [lldb] r282079 - Make lldb::Regex use StringRef.

2016-09-21 Thread Zachary Turner via lldb-commits
On Wed, Sep 21, 2016 at 5:00 PM Greg Clayton  wrote:

> Please submit a change requests when doing these kinds of things and let
> people comment on the changes before committing such things.
>
> We deleted functions that were correctly using "const char *" like:
>
>   bool Execute(llvm::StringRef string, Match *match = nullptr) const;
>   bool Execute(const char *, Match * = nullptr) = delete;
>
> Yet inside these functions you do "string.str().c_str()" because these
> need to be null terminated? this indicates that StringRef is NOT the
> correct choice here. You will make heap copy of all strings (if they are
> long enough and regex matching string can be very long) that you compile or
> execute. We already make a copy of the string when we compile, so StringRef
> didn't really save us anything on compile and execute is always making a
> copy just to run the expression on. I am fine with both being there in case
> one might be more efficient, but taking them out just to use a less
> efficient version that uses llvm::StringRef is not the kind of changes we
> should be doing all over.
>
> We will make copies of all strings in all of the following changes:
>
> - Unneeded copy:
>
> -new TypeNameSpecifierImpl(regex->GetText(), true));
> +new TypeNameSpecifierImpl(regex->GetText().str().c_str(), true));
>
All of these copies are only temporary.  As I've said over and over, once
everything is StringRef all the way down, the copies all disappear.  It's
only a copy because TypeNameSpecifierImpl doesn't take a STringRef, and I
didn't want to change the whole codebase all at once, but rather
incrementally.
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


Re: [Lldb-commits] [lldb] r282079 - Make lldb::Regex use StringRef.

2016-09-21 Thread Greg Clayton via lldb-commits
Please submit a change requests when doing these kinds of things and let people 
comment on the changes before committing such things.

We deleted functions that were correctly using "const char *" like:

  bool Execute(llvm::StringRef string, Match *match = nullptr) const;
  bool Execute(const char *, Match * = nullptr) = delete;

Yet inside these functions you do "string.str().c_str()" because these need to 
be null terminated? this indicates that StringRef is NOT the correct choice 
here. You will make heap copy of all strings (if they are long enough and regex 
matching string can be very long) that you compile or execute. We already make 
a copy of the string when we compile, so StringRef didn't really save us 
anything on compile and execute is always making a copy just to run the 
expression on. I am fine with both being there in case one might be more 
efficient, but taking them out just to use a less efficient version that uses 
llvm::StringRef is not the kind of changes we should be doing all over.

We will make copies of all strings in all of the following changes:

- Unneeded copy:

-new TypeNameSpecifierImpl(regex->GetText(), true));
+new TypeNameSpecifierImpl(regex->GetText().str().c_str(), true));


- Unneeded copy just to print in print (there are tons of these):

void BreakpointResolverFileRegex::GetDescription(Stream *s) {
-  s->Printf("source regex = \"%s\", exact_match = %d", m_regex.GetText(),
-m_exact_match);
+  s->Printf("source regex = \"%s\", exact_match = %d",
+m_regex.GetText().str().c_str(), m_exact_match);
}

If you use a StringRef in printf, please use "%*s" and then put the count and 
data pointer into the printf, so the above line would become:

+  s->Printf("source regex = \"%*s\", exact_match = %d",
+(int)m_regex.GetText().size(), m_regex.GetText().data(), 
m_exact_match);


- Unneeded copy:

lldb::OptionValueSP OptionValueRegex::DeepCopy() const {
-  return OptionValueSP(new OptionValueRegex(m_regex.GetText()));
+  return OptionValueSP(new OptionValueRegex(m_regex.GetText().str().c_str()));
}

So please submit things for review before committing them so we can catch 
things like this.

Things I would like to see fixed:
- convert all Printf functions that are not in "if (log) log->Printf(...)" 
(which eliminates most of the changes you need to make) uses of StringRef over 
to use the "%*s" and be careful that the "*" requires an integer length, not a 
size_t length, so you will need to cast the size of the string to int.
- Revert changing RegularExpression::Execute over to using StringRef as it is 
the wrong thing to do
- Either convert functions over to use StringRef as parameters so we can avoid 
extra copies (anything that calls "str().c_str()" in a parameters to functions) 
or revert back to "const char *" and don't use StringRef

> On Sep 21, 2016, at 9:01 AM, Zachary Turner via lldb-commits 
>  wrote:
> 
> Author: zturner
> Date: Wed Sep 21 11:01:28 2016
> New Revision: 282079
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=282079=rev
> Log:
> Make lldb::Regex use StringRef.
> 
> This updates getters and setters to use StringRef instead of
> const char *.  I tested the build on Linux, Windows, and OSX
> and saw no build or test failures.  I cannot test any BSD
> or Android variants, however I expect the required changes
> to be minimal or non-existant.
> 
> Modified:
>lldb/trunk/include/lldb/Breakpoint/BreakpointResolver.h
>lldb/trunk/include/lldb/Core/RegularExpression.h
>lldb/trunk/include/lldb/Core/Stream.h
>lldb/trunk/include/lldb/DataFormatters/FormattersContainer.h
>lldb/trunk/include/lldb/Interpreter/OptionValueRegex.h
>lldb/trunk/include/lldb/Interpreter/OptionValueString.h
>lldb/trunk/source/API/SBTarget.cpp
>lldb/trunk/source/API/SBTypeCategory.cpp
>lldb/trunk/source/Breakpoint/BreakpointResolver.cpp
>lldb/trunk/source/Breakpoint/BreakpointResolverFileLine.cpp
>lldb/trunk/source/Breakpoint/BreakpointResolverFileRegex.cpp
>lldb/trunk/source/Breakpoint/BreakpointResolverName.cpp
>lldb/trunk/source/Commands/CommandCompletions.cpp
>lldb/trunk/source/Commands/CommandObjectBreakpoint.cpp
>lldb/trunk/source/Commands/CommandObjectFrame.cpp
>lldb/trunk/source/Commands/CommandObjectTarget.cpp
>lldb/trunk/source/Commands/CommandObjectType.cpp
>lldb/trunk/source/Core/AddressResolverName.cpp
>lldb/trunk/source/Core/Disassembler.cpp
>lldb/trunk/source/Core/Module.cpp
>lldb/trunk/source/Core/RegularExpression.cpp
>lldb/trunk/source/Core/SourceManager.cpp
>lldb/trunk/source/DataFormatters/FormatManager.cpp
>lldb/trunk/source/DataFormatters/FormattersHelpers.cpp
>lldb/trunk/source/Host/common/FileSpec.cpp
>lldb/trunk/source/Host/common/Socket.cpp
>lldb/trunk/source/Interpreter/Args.cpp
>lldb/trunk/source/Interpreter/CommandObjectRegexCommand.cpp
>