Adding another thing to my list (thanks to Mehdi and Eric Christopher for
the idea).

Apply libFuzzer to LLDB.  Details are sparse on which parts of LLDB and
how, but I think it would be easy to come up with candidates.
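
As a rough illustration of what a candidate could look like, here is a
minimal sketch of a fuzz target.  LLVMFuzzerTestOneInput is the standard
libFuzzer entry point; splitCommaList is a made-up toy stand-in for whatever
LLDB parsing routine would actually be fuzzed, not a real LLDB function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical stand-in for an LLDB parsing routine (not a real LLDB API):
// splits a comma-separated list into non-empty tokens.
std::vector<std::string> splitCommaList(const std::string &Input) {
  std::vector<std::string> Tokens;
  std::string Current;
  for (char C : Input) {
    if (C == ',') {
      if (!Current.empty())
        Tokens.push_back(Current);
      Current.clear();
    } else {
      Current.push_back(C);
    }
  }
  if (!Current.empty())
    Tokens.push_back(Current);
  return Tokens;
}

// Standard libFuzzer entry point: called repeatedly with mutated inputs.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  std::string Input(reinterpret_cast<const char *>(Data), Size);
  std::vector<std::string> Tokens = splitCommaList(Input);
  // Invariant under test: the parser never yields an empty token.
  for (const std::string &Token : Tokens)
    if (Token.empty())
      std::abort();
  return 0; // libFuzzer expects fuzz targets to return 0
}
```

With a recent clang this kind of file can be built with
-fsanitize=fuzzer,address; the fuzzer then reports any input that crashes
or trips the invariant.
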

On Mon, Sep 19, 2016 at 1:18 PM Zachary Turner <ztur...@google.com> wrote:

> Following up with Kate's post from a few weeks ago, I think the dust has
> settled on the code reformat and it went over pretty smoothly for the most
> part.  So I thought it might be worth throwing out some ideas for where we
> go from here.  I have a large list of ideas (more ideas than time, sadly)
> that I've been collecting over the past few weeks, so I figured I would
> throw them out in the open for discussion.
>
> I’ve grouped the areas for improvement into 3 high level categories.
>
>
>    1. De-inventing the wheel - We should use more code from LLVM, and delete
>       code in LLDB where LLVM provides a solution. In cases where there is an
>       LLVM thing that is *similar* to what we need, we should extend the LLVM
>       thing to support what we need, and then use it. Following are some
>       areas I've identified. This list is by no means complete. For each one,
>       I've given a personal assessment of how likely it is to cause some
>       (temporary) hiccups, how much it would help us in the long run, and how
>       difficult it would be to do. Without further ado:
>
>       a. Use llvm::Regex instead of lldb::Regex
>          1. llvm::Regex doesn’t support enhanced mode.  Could we add support
>             for this to llvm::Regex?
>          2. Risk: 6
>          3. Impact: 3
>          4. Difficulty / Effort: 3 (5 if we have to add enhanced mode
>             support)
>
>       b. Use llvm streams instead of lldb::StreamString
>          1. Supports output re-targeting (stderr, stdout, std::string, etc),
>             printf style formatting, and type-safe streaming operators.
>          2. Interoperates nicely with many existing llvm utility classes
>          3. Risk: 4
>          4. Impact: 5
>          5. Difficulty / Effort: 7
>
>       c. Use llvm::Error instead of lldb::Error
>          1. llvm::Error is an error class that *requires* you to check
>             whether it succeeded or it will assert. In a way, it's similar to
>             a C++ exception, except that it doesn't come with the performance
>             hit associated with exceptions. It's extensible, and can be
>             easily extended to support the various ways LLDB needs to
>             construct errors and error messages.
>          2. Would need to first rename lldb::Error to LLDBError so that the
>             conversion from LLDBError to llvm::Error could be done
>             incrementally.
>          3. Risk: 7
>          4. Impact: 7
>          5. Difficulty / Effort: 8
>
>       d. StringRef instead of const char *, len everywhere
>          1. Can do most common string operations in a way that is guaranteed
>             to be safe.
>          2. Reduces string manipulation algorithm complexity by an order of
>             magnitude.
>          3. Can potentially eliminate tens of thousands of string copies
>             across the codebase.
>          4. Simplifies code.
>          5. Risk: 3
>          6. Impact: 8
>          7. Difficulty / Effort: 7
>
>       e. ArrayRef instead of const void *, len everywhere
>          1. Same analysis as StringRef
>
>       f. MutableArrayRef instead of void *, len everywhere
>          1. Same analysis as StringRef
>
>       g. Delete ConstString, use a modified StringPool that is thread-safe.
>          1. StringPool is a non thread-safe version of ConstString.
>          2. Strings are internally refcounted so they can be cleaned up when
>             they are no longer used.  ConstStrings are a large source of
>             memory usage in LLDB, so ref-counting and removing stale strings
>             has the potential to be a huge savings.
>          3. Risk: 2
>          4. Impact: 9
>          5. Difficulty / Effort: 6
>
>       h. thread_local instead of lldb::ThreadLocal
>          1. This fixes a number of bugs on Windows that cannot be fixed
>             otherwise, as they require compiler support.
>          2. Some other compilers may not support this yet?
>          3. Risk: 2
>          4. Impact: 3
>          5. Difficulty: 3
>
>       i. Use llvm::cl for the command line arguments to the primary lldb
>          executable.
>          1. Risk: 2
>          2. Impact: 3
>          3. Difficulty / Effort: 4
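
On the StringRef item above, here is a small sketch of what moving from
"const char *, len" pairs to a string view buys.  It uses std::string_view
as a stand-in for llvm::StringRef so that it compiles without LLVM headers,
and the function names (hasPrefixRaw, hasPrefix, splitKeyValue) are made up
for the example rather than taken from LLDB.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Old style: pointer and length are passed separately, and the caller has
// to keep the two in sync by hand.
bool hasPrefixRaw(const char *Str, size_t Len, const char *Prefix) {
  size_t PrefixLen = std::strlen(Prefix);
  return Len >= PrefixLen && std::memcmp(Str, Prefix, PrefixLen) == 0;
}

// View style: pointer and length travel together, and nothing is copied.
bool hasPrefix(std::string_view Str, std::string_view Prefix) {
  // substr clamps the count, so a short Str simply compares unequal.
  return Str.substr(0, Prefix.size()) == Prefix;
}

// Splits "key=value" without allocating; both halves point into Input.
struct KeyValue {
  std::string_view Key;
  std::string_view Value;
};

KeyValue splitKeyValue(std::string_view Input) {
  size_t Eq = Input.find('=');
  if (Eq == std::string_view::npos)
    return {Input, std::string_view()};
  return {Input.substr(0, Eq), Input.substr(Eq + 1)};
}
```

The views are only valid while the underlying buffer is alive, which is the
same lifetime rule that applies to StringRef.
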
>
>    2. Testing - Our testing infrastructure is unstable, and our test
>       coverage is lacking. We should take steps to improve this.
>
>       a. Port as much as possible to lit
>          1. Simple tests should be trivial to port to lit today.  If nothing
>             else, this serves as a proof of concept while increasing the
>             speed and stability of the test suite, since lit is a more stable
>             harness.
>
>       b. Separate testing tools
>          1. One question that remains open is how to represent the
>             complicated needs of a debugger in lit tests.  Part a) above
>             covers the trivial cases, but what about the difficult cases?  In
>             https://reviews.llvm.org/D24591 a number of ideas were discussed.
>             We started getting to this idea towards the end, about a separate
>             tool which has an interface independent of the command line
>             interface and which can be used to test.  lldb-mi was mentioned.
>             While I have serious concerns about lldb-mi due to its poorly
>             written and tested codebase, I do agree in principle with the
>             methodology.  In fact, this is the entire philosophy behind lit
>             as used with LLVM, clang, lld, etc.
>
>
> I don’t take full credit for this idea.  I had been toying with a similar
> idea for some time, but it was further cemented in an offline discussion
> with a co-worker.
>
> There are many small, targeted tools in LLVM (e.g. llc, lli, llvm-objdump,
> etc) whose purpose is to be chained together to do interesting things.
> Instead of a command line api as we think of in LLDB where you type
> commands from an interactive prompt, they have a command line api as you
> would expect from any tool which is launched from a shell.
>
> I can imagine many potential candidates for lldb tools of this nature.
> Off the top of my head:
>
>    1. lldb-unwind - A tool for testing the unwinder.  Accepts byte code as
>       input and passes it through to the unwinder, outputting a compressed
>       summary of the steps taken while unwinding, which could be pattern
>       matched in lit.  The output format is entirely controlled by the tool,
>       and not by the unwinder itself, so it would be stable in the face of
>       changes to the underlying unwinder.  Could have various options to
>       enable or disable features of the unwinder in order to force the
>       unwinder into modes that can be tricky to encounter in the wild.
>
>    2. lldb-symbol - A tool for testing symbol resolution.  Could have
>       options for testing things like:
>       a. Determining if a symbol matches an executable
>       b. Looking up a symbol by name in the debug info, and mapping it to an
>          address in the process.
>       c. Displaying candidate symbols when doing name lookup in a particular
>          scope (e.g. while stopped at a breakpoint).
>
>    3. lldb-breakpoint - A tool for testing breakpoints and stepping.
>       Various options could include:
>       a. Set breakpoints and output the addresses and/or symbol names where
>          they were resolved.
>       b. Trigger commands, so that when a breakpoint is hit the tool could
>          automatically continue and try to run to another breakpoint, etc.
>       c. Options to inspect certain useful pieces of state about an
>          inferior, to be matched in lit.
>
>    4. lldb-interpreter - tests the jitter etc.  I don’t know much about
>       this, but I don’t see why this couldn’t be tested in a manner similar
>       to how lli is tested.
>
>    5. lldb-platform - tests lldb local and remote platform interfaces.
>
>    6. lldb-cli - lldb interactive command line.
>
>    7. lldb-format - lldb data formatters etc.
>
>
>       c. Tests NOW, not later.
>          1. I know we’ve been over this a million times and it’s not worth
>             going over the arguments again.  And I know it’s hard to write
>             tests, often requiring the invention of new SB APIs.  Hopefully
>             those issues will be addressed by a) and b) above and writing
>             tests will be easier.  Vedant Kumar ran some analytics on the
>             various codebases and found that LLDB has the lowest test /
>             commit ratio of any LLVM project (he didn’t post numbers for lld,
>             so I’m not sure what it is there).
>             1. lldb: 287 of the past 1000 commits
>             2. llvm: 511 of the past 1000 commits
>             3. clang: 622 of the past 1000 commits
>             4. compiler-rt: 543 of the past 1000 commits
>
> This is an alarming statistic, and I would love to see this number closer
> to 50%.
>
>    3. Code style / development conventions - Aside from just the column
>       limitations and bracing styles, there are other areas where LLDB
>       differs from LLVM on code style. We should continue to adopt more of
>       LLVM's style where it makes sense. I've identified a couple of areas
>       (incomplete list) which I outline below.
>
>       a. Clean up the mess of cyclical dependencies and properly layer the
>          libraries. This is especially important for things like lldb-server
>          that need to link in as little as possible, but regardless it leads
>          to a more robust architecture, faster build and link times, better
>          testability, and is required if we ever want to do a modules build
>          of LLDB.
>
>       b. Use CMake instead of Xcode project (CMake supports Frameworks).
>          CMake supports Apple Frameworks, so the main roadblock to getting
>          this working is just someone doing it. Segmenting the build process
>          by platform doesn't make sense for the upstream, especially when
>          there is a perfectly workable solution. I have no doubt that the
>          resulting Xcode workspace generated automatically by CMake will
>          *not* be as "nice" as one that is maintained by hand. We face this
>          problem with Visual Studio on Windows as well. The solution that
>          most people have adopted is to continue using the IDE for code
>          editing and debugging, but for actually running the build, use
>          CMake with Ninja. A similar workflow should still be possible with
>          an OSX CMake build, but as I do not work every day on a Mac, all I
>          can say is that it's possible; I have no idea how impactful it
>          would be on people's workflows.
>
>       c. Variable naming conventions
>          1. I don’t expect anyone is too fond of LLDB’s naming conventions,
>             but if we’re committed to joining the LLVM ecosystem, then let’s
>             go all the way.
>
>       d. Use more modern C++ and less C
>          1. Old habits die hard, but this isn’t just a matter of style.  It
>             leads to safer, more robust, and less fragile code as well.
>
>       e. Shorter functions and classes with more narrowly targeted
>          responsibilities
>          1. It’s not uncommon to find functions that are hundreds (and in a
>             few cases even 1,000+) of lines long.  We really need to be
>             better about breaking functions and classes down into smaller
>             responsibilities.  This helps not just for someone coming in to
>             read the function, but also for testing.  Smaller functions are
>             easier to unit test.
>
>       f. Convert T foo(X, Y, Error &error) functions to Expected<T> foo(X, Y)
>          style (Depends on 1.c)
>          1. llvm::Expected is based on the llvm::Error class described
>             earlier.  It’s used when a function is supposed to return a
>             value, but it could fail.  By packaging the error with the return
>             value, it’s impossible to have a situation where you use the
>             return value even in case of an error, and because llvm::Error
>             has mandatory checking, it’s also impossible to have a situation
>             where you don’t check the error.  So it’s very safe.
>
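
To make the "value packaged with a must-check error" idea above concrete,
here is a toy sketch.  ToyExpected is a hypothetical class written for this
example and is NOT llvm::Expected; it only mimics the two properties
described: the value cannot be read before testing for success, and an
object that is never tested aborts when it is destroyed.  parseDecimal is
likewise made up, and ignores integer overflow for brevity.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <utility>

// Toy stand-in for llvm::Expected<T> (hypothetical, not the LLVM class).
template <typename T> class ToyExpected {
public:
  ToyExpected(T V) : Value(std::move(V)), HasValue(true) {}

  static ToyExpected failure(std::string Msg) {
    ToyExpected E;
    E.Message = std::move(Msg);
    return E;
  }

  ToyExpected(ToyExpected &&Other)
      : Value(std::move(Other.Value)), Message(std::move(Other.Message)),
        HasValue(Other.HasValue), Checked(Other.Checked) {
    Other.Checked = true; // the moved-from object no longer needs checking
  }
  ToyExpected(const ToyExpected &) = delete;
  ToyExpected &operator=(const ToyExpected &) = delete;

  ~ToyExpected() {
    if (!Checked)
      std::abort(); // nobody tested this result for success
  }

  // Testing for success is what marks the object as checked.
  explicit operator bool() {
    Checked = true;
    return HasValue;
  }

  T &get() {
    assert(Checked && HasValue && "test for success before calling get()");
    return Value;
  }
  const std::string &error() const { return Message; }

private:
  ToyExpected() = default;
  T Value{};
  std::string Message;
  bool HasValue = false;
  bool Checked = false;
};

// "Expected<T> foo(X)" style instead of "T foo(X, Error &error)".
ToyExpected<int> parseDecimal(const std::string &Text) {
  if (Text.empty())
    return ToyExpected<int>::failure("empty input");
  int Result = 0;
  for (char C : Text) {
    if (C < '0' || C > '9')
      return ToyExpected<int>::failure("not a digit: " + std::string(1, C));
    Result = Result * 10 + (C - '0');
  }
  return Result;
}
```

A caller would write something like "if (auto N = parseDecimal(Text))" and
use N.get() inside the branch, reporting N.error() otherwise; dropping the
result on the floor without testing it trips the abort in the destructor.
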
>
> Whew. That was a lot. If you made it this far, thanks for reading!
>
> Obviously if we were to embark on all of the above, it would take many
> months to complete everything. So I'm not proposing anyone stop what
> they're doing to work on this. This is just my own personal wishlist.
>
_______________________________________________
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
