Following up on Kate's post from a few weeks ago, I think the dust has
settled on the code reformat and it went over pretty smoothly for the most
part.  So I thought it might be worth throwing out some ideas for where we
go from here.  I have a large list of ideas (more ideas than time, sadly)
that I've been collecting over the past few weeks, so I figured I would
throw them out in the open for discussion.

I’ve grouped the areas for improvement into 3 high level categories.


   De-inventing the wheel - We should use more code from LLVM, and delete
   code in LLDB where LLVM provides a solution. In cases where there is an
   LLVM thing that is *similar* to what we need, we should extend the LLVM
   thing to support what we need, and then use it. Following are some areas
   I've identified. This list is by no means complete. For each one, I've
   given a personal assessment of how likely it is to cause some (temporary)
   hiccups, how much it would help us in the long run, and how difficult it
   would be to do. Without further ado:

      Use llvm::Regex instead of lldb::Regex

         llvm::Regex doesn’t support enhanced mode.  Could we add support
         for this to llvm::Regex?

         Risk: 6

         Impact: 3

         Difficulty / Effort: 3 (5 if we have to add enhanced mode support)

      Use llvm streams instead of lldb::StreamString

         Supports output re-targeting (stderr, stdout, std::string, etc),
         printf style formatting, and type-safe streaming operators.

         Interoperates nicely with many existing llvm utility classes

         Risk: 4

         Impact: 5

         Difficulty / Effort: 7

      Use llvm::Error instead of lldb::Error

         llvm::Error is an error class that *requires* you to check whether
         it succeeded or it will assert. In a way, it's similar to a C++
         exception, except that it doesn't come with the performance hit
         associated with exceptions. It's extensible, so it can be adapted to
         support the various ways LLDB needs to construct errors and error
         messages.

         Would need to first rename lldb::Error to LLDBError so that the
         conversion from LLDBError to llvm::Error could be done

         Risk: 7

         Impact: 7

         Difficulty / Effort: 8
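
To make the "must be checked" idea concrete, here is a toy sketch in standard C++. It is not llvm::Error: `CheckedError` and `parse_port` are hypothetical names made up for illustration, and the real class is stricter and has a richer interface. The only point shown is that a destructor can detect an error that nobody looked at.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <utility>

// Toy stand-in for the "must-check" idea; NOT llvm::Error.
class CheckedError {
public:
  static CheckedError success() { return CheckedError(false, ""); }
  static CheckedError failure(std::string msg) {
    return CheckedError(true, std::move(msg));
  }

  CheckedError(CheckedError &&other) noexcept
      : failed_(other.failed_), message_(std::move(other.message_)),
        checked_(other.checked_) {
    other.checked_ = true; // the moved-from object no longer owes a check
  }
  CheckedError(const CheckedError &) = delete;
  CheckedError &operator=(const CheckedError &) = delete;

  // An error that was never tested terminates the program.
  ~CheckedError() {
    if (!checked_)
      std::abort();
  }

  // Testing the error marks it as checked. True means failure.
  explicit operator bool() {
    checked_ = true;
    return failed_;
  }

  const std::string &message() const { return message_; }

private:
  CheckedError(bool failed, std::string msg)
      : failed_(failed), message_(std::move(msg)) {}

  bool failed_;
  std::string message_;
  bool checked_ = false;
};

// Hypothetical helper: parses a decimal TCP port into port_out.
CheckedError parse_port(const std::string &text, int &port_out) {
  if (text.empty() || text.size() > 5)
    return CheckedError::failure("bad length");
  int value = 0;
  for (char c : text) {
    if (c < '0' || c > '9')
      return CheckedError::failure("not a number");
    value = value * 10 + (c - '0');
  }
  if (value > 65535)
    return CheckedError::failure("out of range");
  port_out = value;
  return CheckedError::success();
}
```

A caller writes `if (CheckedError err = parse_port(text, port)) { /* handle err.message() */ }`. Dropping the return value on the floor aborts at the point where the temporary is destroyed, which is the behavior the item above is asking for.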

      StringRef instead of const char *, len everywhere

         Can do most common string operations in a way that is guaranteed
         to be safe.

         Reduces string manipulation algorithm complexity by an order of
         magnitude.

         Can potentially eliminate tens of thousands of string copies
         across the codebase.

         Simplifies code.

         Risk: 3

         Impact: 8

         Difficulty / Effort: 7
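
As a rough illustration of the "no copies" point, here is a sketch that uses std::string_view as a stand-in for llvm::StringRef, so that it builds without LLVM. `split_once` is a hypothetical helper written for this example; both returned pieces point into the caller's buffer and nothing is allocated.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <utility>

// std::string_view stands in for llvm::StringRef here (an assumption made
// so the example is self-contained). Splits text at the first occurrence of
// sep. If sep is absent, the whole input is returned as the first piece.
std::pair<std::string_view, std::string_view>
split_once(std::string_view text, char sep) {
  const std::size_t pos = text.find(sep);
  if (pos == std::string_view::npos)
    return {text, std::string_view()};
  // substr on a view only adjusts a pointer and a length; no bytes move.
  return {text.substr(0, pos), text.substr(pos + 1)};
}
```

The equivalent with `const char *, len` pairs needs manual pointer arithmetic and a bounds check at each step, which is where most of the unsafe code comes from.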

      ArrayRef instead of const void *, len everywhere

         Same analysis as StringRef

      MutableArrayRef instead of void *, len everywhere

         Same analysis as StringRef

      Delete ConstString, use a modified StringPool that is thread-safe.

         StringPool is a non thread-safe version of ConstString.

         Strings are internally refcounted so they can be cleaned up when
         they are no longer used.  ConstStrings are a large source of
         memory in LLDB, so ref-counting and removing stale strings has the
         potential to be a huge savings.

         Risk: 2

         Impact: 9

         Difficulty / Effort: 6
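
Here is one way the ref-counting idea could look, as a sketch in standard C++ only. `InternPool` is a hypothetical class and does not reflect the actual interface of llvm::StringPool or ConstString. It assumes a single mutex is acceptable and it stores each string twice (map key plus payload), which a real implementation would avoid.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical sketch of a thread-safe, ref-counted string pool.
class InternPool {
public:
  using Handle = std::shared_ptr<const std::string>;

  // Returns the shared copy of text, creating it if no live copy exists.
  Handle intern(const std::string &text) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = entries_.find(text);
    if (it != entries_.end()) {
      if (Handle live = it->second.lock())
        return live; // someone still holds this string, so share it
    }
    Handle fresh = std::make_shared<const std::string>(text);
    entries_[text] = fresh; // the pool keeps only a weak reference
    return fresh;
  }

  // Drops entries whose last handle has gone away. Returns how many.
  std::size_t purge() {
    std::lock_guard<std::mutex> lock(mutex_);
    std::size_t removed = 0;
    for (auto it = entries_.begin(); it != entries_.end();) {
      if (it->second.expired()) {
        it = entries_.erase(it);
        ++removed;
      } else {
        ++it;
      }
    }
    return removed;
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return entries_.size();
  }

private:
  mutable std::mutex mutex_;
  std::unordered_map<std::string, std::weak_ptr<const std::string>> entries_;
};
```

Two handles for the same text compare equal by pointer, which keeps the cheap equality test that makes ConstString attractive, while `purge()` gives the pool a way to shrink once the handles are gone.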

      thread_local instead of lldb::ThreadLocal

         This fixes a number of bugs on Windows that cannot be fixed
         otherwise, as they require compiler support.

         Some other compilers may not support this yet?

         Risk: 2

         Impact: 3

         Difficulty / Effort: 3
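
For reference, a minimal sketch of the language-level feature. The names `tls_counter` and `count_on_new_thread` are made up for this example. Each thread sees its own copy of the variable, with no library wrapper involved.

```cpp
#include <cassert>
#include <thread>

// One independent copy of this variable exists per thread.
thread_local int tls_counter = 0;

// Increments the new thread's copy of tls_counter and returns its final
// value. The calling thread's copy is left untouched.
int count_on_new_thread(int increments) {
  int result = -1;
  std::thread worker([&] {
    for (int i = 0; i < increments; ++i)
      ++tls_counter;
    result = tls_counter;
  });
  worker.join();
  return result;
}
```

Every new thread starts from the initializer (0 here), regardless of what other threads have stored in their copies.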

      Use llvm::cl for the command line arguments to the primary lldb
      executable

         Risk: 2

         Impact: 3

         Difficulty / Effort: 4

   Testing - Our testing infrastructure is unstable, and our test coverage
   is lacking. We should take steps to improve this.

      Port as much as possible to lit

         Simple tests should be trivial to port to lit today.  If nothing
         else this serves as a proof of concept while increasing the speed and
         stability of the test suite, since lit is a more stable harness.

      Separate testing tools

         One question that remains open is how to represent the complicated
         needs of a debugger in lit tests.  The previous item covers the
         trivial cases, but what about the difficult cases?  A number of
         ideas have been discussed.  Towards the end we started converging
         on the idea of a separate tool which has an interface independent of
         the command line interface and which can be used to test.  lldb-mi
         was mentioned.  While I have serious concerns about lldb-mi due to
         its poorly written and tested codebase, I do agree in principle with
         the methodology.  In fact, this is the entire idea behind lit as
         used with LLVM, clang, lld, etc.

I don’t take full credit for this idea.  I had been toying with a similar
idea for some time, but it was further cemented in an offline discussion
with a co-worker.

There are many small, targeted tools in LLVM (e.g. llc, lli, llvm-objdump, etc)
whose purpose is to be chained together to do interesting things.  Instead
of a command line api as we think of in LLDB where you type commands from
an interactive prompt, they have a command line api as you would expect
from any tool which is launched from a shell.

I can imagine many potential candidates for lldb tools of this nature.  Off
the top of my head:


   lldb-unwind - A tool for testing the unwinder.  Accepts byte code as
   input and passes it through to the unwinder, outputting a compressed
   summary of the steps taken while unwinding, which could be pattern matched
   in lit.  The output format is entirely controlled by the tool, and not by
   the unwinder itself, so it would be stable in the face of changes to the
   underlying unwinder.  Could have various options to enable or disable
   features of the unwinder in order to force the unwinder into modes that can
   be tricky to encounter in the wild.

   lldb-symbol - A tool for testing symbol resolution.  Could have options
   for testing things like:

      Determining if a symbol matches an executable

      Looking up a symbol by name in the debug info, and mapping it to an
      address in the process.

      Displaying candidate symbols when doing name lookup in a particular
      scope (e.g. while stopped at a breakpoint).

   lldb-breakpoint - A tool for testing breakpoints and stepping.  Various
   options could include:

      Set breakpoints and print out the addresses and/or symbol names they
      were resolved to.

      Trigger commands, so that when a breakpoint is hit the tool could
      automatically continue and try to run to another breakpoint, etc.

      Options to inspect certain useful pieces of state about an inferior,
      to be matched in lit.

   lldb-interpreter - tests the jitter etc.  I don’t know much about this,
   but I don’t see why this couldn’t be tested in a manner similar to how lli
   is tested.

   lldb-platform - tests lldb local and remote platform interfaces.

   lldb-cli - lldb interactive command line.

   lldb-format - lldb data formatters etc.


   Tests NOW, not later.

      I know we’ve been over this a million times and it’s not worth going
      over the arguments again.  And I know it’s hard to write tests, often
      requiring the invention of new SB APIs.  Hopefully those issues will be
      addressed by the two items above and writing tests will be easier.
      Vedant Kumar ran some analytics on the various codebases and found that
      LLDB has the lowest test / commit ratio of any LLVM project (he didn’t
      post numbers for lld, so I’m not sure what it is there).

         lldb: 287 of the past 1000 commits

         llvm: 511 of the past 1000 commits

         clang: 622 of the past 1000 commits

         compiler-rt: 543 of the past 1000 commits

This is an alarming statistic, and I would love to see this number closer
to 50%.


   Code style / development conventions - Aside from just the column
   limitations and bracing styles, there are other areas where LLDB differs
   from LLVM on code style. We should continue to adopt more of LLVM's style
   where it makes sense. I've identified a couple of areas (incomplete list)
   which I outline below.

      Clean up the mess of cyclical dependencies and properly layer the
      libraries. This is especially important for things like lldb-server that
      need to link in as little as possible, but regardless it leads to a more
      robust architecture, faster build and link times, better testability, and
      is required if we ever want to do a modules build of LLDB

      Use CMake instead of the Xcode project. CMake supports Apple
      Frameworks, so the main roadblock to getting this working is just
      someone doing it. Segmenting the build process by platform doesn't
      make sense for the upstream, especially when there is a perfectly
      workable solution. I have no doubt that the Xcode workspace generated
      automatically by CMake will *not* be as "nice" as one that is
      maintained by hand. We face this problem with Visual Studio on
      Windows as well. The solution that most people have adopted is to
      continue using the IDE for code editing and debugging, but for
      actually running the build, use CMake with Ninja. A similar workflow
      should still be possible with an OSX CMake build, but as I do not work
      every day on a Mac, all I can say is that it's possible. I have no
      idea how impactful it would be on people's workflows.

      Variable naming conventions

         I don’t expect anyone is too fond of LLDB’s naming conventions,
         but if we’re committed to joining the LLVM ecosystem, then let’s
         go all the way.
      Use more modern C++ and less C

         Old habits die hard, but this isn’t just a matter of style.  It
         leads to safer, more robust, and less fragile code as well.

      Shorter functions and classes with more narrowly targeted
      responsibilities

         It’s not uncommon to find functions that are hundreds (and in a
         few cases even 1,000+) of lines long.  We really need to be better
         about breaking functions and classes down into smaller
         responsibilities.  This helps not just for someone coming in to
         read the function, but also for testing.  Smaller functions are
         easier to unit test.

      Convert T foo(X, Y, Error &error) functions to Expected<T> foo(X, Y)
      style (depends on the llvm::Error item above)

         llvm::Expected is based on the llvm::Error class described
         earlier.  It’s used when a function is supposed to return a value,
         but it could fail.  By packaging the error with the return value,
         it’s impossible to have a situation where you use the return value
         even in case of an error, and because llvm::Error has mandatory
         checking, it’s also impossible to have a situation where you don’t
         check the error.  So it’s very safe.
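
The shape of that conversion can be sketched with a toy "value or error" type built on std::variant. This is not llvm::Expected: `ValueOrError`, `parse_digit_old` and `parse_digit` are hypothetical names for illustration, and unlike the real class this toy does not enforce that the error is checked.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>

// Toy stand-in for the "value or error" idea; NOT llvm::Expected.
template <typename T> class ValueOrError {
  struct Failure {
    std::string message;
  };

public:
  ValueOrError(T value) : storage_(std::move(value)) {}

  static ValueOrError failure(std::string message) {
    return ValueOrError(Failure{std::move(message)});
  }

  // True when a value is present.
  explicit operator bool() const { return std::holds_alternative<T>(storage_); }

  // Throws std::bad_variant_access if called on the wrong alternative, so
  // a value can never be read out of a failed result by accident.
  const T &value() const { return std::get<T>(storage_); }
  const std::string &error() const { return std::get<Failure>(storage_).message; }

private:
  explicit ValueOrError(Failure f) : storage_(std::move(f)) {}
  std::variant<T, Failure> storage_;
};

// Old style: value and error travel separately, so a caller can use the
// returned 0 without ever looking at `error`.
int parse_digit_old(char c, std::string &error) {
  if (c < '0' || c > '9') {
    error = "not a digit";
    return 0;
  }
  return c - '0';
}

// New style: the caller gets either a value or an error, never both.
ValueOrError<int> parse_digit(char c) {
  if (c < '0' || c > '9')
    return ValueOrError<int>::failure("not a digit");
  return c - '0';
}
```

With the old signature, `parse_digit_old('x', err)` quietly returns 0. With the new one, the caller has to go through `if (auto digit = parse_digit(c))` before a value is available at all.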

Whew. That was a lot. If you made it this far, thanks for reading!

Obviously if we were to embark on all of the above, it would take many
months to complete everything. So I'm not proposing anyone stop what
they're doing to work on this. This is just my own personal wishlist
lldb-dev mailing list
