Adding another thing to my list (thanks to Mehdi and Eric Christopher for the idea).
Apply libFuzzer to LLDB. Details are sparse on what part of LLDB and how,
but I think it would be easy to come up with candidates.

On Mon, Sep 19, 2016 at 1:18 PM Zachary Turner <ztur...@google.com> wrote:

> Following up with Kate's post from a few weeks ago, I think the dust has
> settled on the code reformat and it went over pretty smoothly for the most
> part. So I thought it might be worth throwing out some ideas for where we
> go from here. I have a large list of ideas (more ideas than time, sadly)
> that I've been collecting over the past few weeks, so I figured I would
> throw them out in the open for discussion.
>
> I've grouped the areas for improvement into 3 high level categories.
>
> 1. De-inventing the wheel - We should use more code from LLVM, and delete
>    code in LLDB where LLVM provides a solution. In cases where there is
>    an LLVM thing that is *similar* to what we need, we should extend the
>    LLVM thing to support what we need, and then use it. Following are
>    some areas I've identified. This list is by no means complete. For
>    each one, I've given a personal assessment of how likely it is to
>    cause some (temporary) hiccups, how much it would help us in the long
>    run, and how difficult it would be to do. Without further ado:
>
>    a. Use llvm::Regex instead of lldb::Regex
>       i.   llvm::Regex doesn't support enhanced mode. Could we add
>            support for this to llvm::Regex?
>       ii.  Risk: 6
>       iii. Impact: 3
>       iv.  Difficulty / Effort: 3 (5 if we have to add enhanced mode
>            support)
>
>    b. Use llvm streams instead of lldb::StreamString
>       i.   Supports output re-targeting (stderr, stdout, std::string,
>            etc), printf style formatting, and type-safe streaming
>            operators.
>       ii.  Interoperates nicely with many existing llvm utility classes
>       iii. Risk: 4
>       iv.  Impact: 5
>       v.   Difficulty / Effort: 7
>
>    c. Use llvm::Error instead of lldb::Error
>       i.   llvm::Error is an error class that *requires* you to check
>            whether it succeeded or it will assert. In a way, it's
>            similar to a C++ exception, except that it doesn't come with
>            the performance hit associated with exceptions. It's
>            extensible, and can be easily extended to support the various
>            ways LLDB needs to construct errors and error messages.
>       ii.  Would need to first rename lldb::Error to LLDBError so that
>            the conversion from LLDBError to llvm::Error could be done
>            incrementally.
>       iii. Risk: 7
>       iv.  Impact: 7
>       v.   Difficulty / Effort: 8
>
>    d. StringRef instead of const char *, len everywhere
>       i.   Can do most common string operations in a way that is
>            guaranteed to be safe.
>       ii.  Reduces string manipulation algorithm complexity by an order
>            of magnitude.
>       iii. Can potentially eliminate tens of thousands of string copies
>            across the codebase.
>       iv.  Simplifies code.
>       v.   Risk: 3
>       vi.  Impact: 8
>       vii. Difficulty / Effort: 7
>
>    e. ArrayRef instead of const void *, len everywhere
>       i.   Same analysis as StringRef
>
>    f. MutableArrayRef instead of void *, len everywhere
>       i.   Same analysis as StringRef
>
>    g. Delete ConstString, use a modified StringPool that is thread-safe.
>       i.   StringPool is a non-thread-safe version of ConstString.
>       ii.  Strings are internally refcounted so they can be cleaned up
>            when they are no longer used. ConstStrings are a large source
>            of memory usage in LLDB, so ref-counting and removing stale
>            strings has the potential to be a huge savings.
>       iii. Risk: 2
>       iv.  Impact: 9
>       v.   Difficulty / Effort: 6
>
>    h. thread_local instead of lldb::ThreadLocal
>       i.   This fixes a number of bugs on Windows that cannot be fixed
>            otherwise, as they require compiler support.
>       ii.  Some other compilers may not support this yet?
>       iii. Risk: 2
>       iv.  Impact: 3
>       v.   Difficulty / Effort: 3
>
>    i. Use llvm::cl for the command line arguments to the primary lldb
>       executable.
>       i.   Risk: 2
>       ii.  Impact: 3
>       iii. Difficulty / Effort: 4
>
> 2. Testing - Our testing infrastructure is unstable, and our test
>    coverage is lacking. We should take steps to improve this.
>
>    a. Port as much as possible to lit
>       i.   Simple tests should be trivial to port to lit today. If
>            nothing else this serves as a proof of concept while
>            increasing the speed and stability of the test suite, since
>            lit is a more stable harness.
>
>    b. Separate testing tools
>       i.   One question that remains open is how to represent the
>            complicated needs of a debugger in lit tests. Part a) above
>            covers the trivial cases, but what about the difficult cases?
>            In https://reviews.llvm.org/D24591 a number of ideas were
>            discussed. We started getting to this idea towards the end,
>            about a separate tool which has an interface independent of
>            the command line interface and which can be used to test.
>            lldb-mi was mentioned. While I have serious concerns about
>            lldb-mi due to its poorly written and tested codebase, I do
>            agree in principle with the methodology. In fact, this is the
>            entire philosophy behind lit as used with LLVM, clang, lld,
>            etc.
>
>       I don't take full credit for this idea. I had been toying with a
>       similar idea for some time, but it was further cemented in an
>       offline discussion with a co-worker.
>
>       There are many small, targeted tools in LLVM (e.g. llc, lli,
>       llvm-objdump, etc) whose purpose is to be chained together to do
>       interesting things. Instead of a command line API as we think of
>       in LLDB, where you type commands from an interactive prompt, they
>       have a command line API as you would expect from any tool which is
>       launched from a shell.
>
>       I can imagine many potential candidates for lldb tools of this
>       nature. Off the top of my head:
>
>       1. lldb-unwind - A tool for testing the unwinder. Accepts byte
>          code as input and passes it through to the unwinder, outputting
>          a compressed summary of the steps taken while unwinding, which
>          could be pattern matched in lit. The output format is entirely
>          controlled by the tool, and not by the unwinder itself, so it
>          would be stable in the face of changes to the underlying
>          unwinder. Could have various options to enable or disable
>          features of the unwinder in order to force the unwinder into
>          modes that can be tricky to encounter in the wild.
>       2. lldb-symbol - A tool for testing symbol resolution. Could have
>          options for testing things like:
>          i.   Determining if a symbol matches an executable
>          ii.  Looking up a symbol by name in the debug info, and mapping
>               it to an address in the process.
>          iii. Displaying candidate symbols when doing name lookup in a
>               particular scope (e.g. while stopped at a breakpoint).
>       3. lldb-breakpoint - A tool for testing breakpoints and stepping.
>          Various options could include:
>          i.   Set breakpoints and print out the addresses and/or symbol
>               names they were resolved to.
>          ii.  Trigger commands, so that when a breakpoint is hit the
>               tool could automatically continue and try to run to
>               another breakpoint, etc.
>          iii. Options to inspect certain useful pieces of state about an
>               inferior, to be matched in lit.
>       4. lldb-interpreter - tests the jitter etc. I don't know much
>          about this, but I don't see why this couldn't be tested in a
>          manner similar to how lli is tested.
>       5. lldb-platform - tests lldb local and remote platform
>          interfaces.
>       6. lldb-cli - lldb interactive command line.
>       7. lldb-format - lldb data formatters etc.
>
>    c. Tests NOW, not later.
>       i.   I know we've been over this a million times and it's not
>            worth going over the arguments again. And I know it's hard to
>            write tests, often requiring the invention of new SB APIs.
>            Hopefully those issues will be addressed by a) and b) above
>            and writing tests will be easier. Vedant Kumar ran some
>            analytics on the various codebases and found that LLDB has
>            the lowest test / commit ratio of any LLVM project (he didn't
>            post numbers for lld, so I'm not sure what it is there).
>            - lldb: 287 of the past 1000 commits
>            - llvm: 511 of the past 1000 commits
>            - clang: 622 of the past 1000 commits
>            - compiler-rt: 543 of the past 1000 commits
>
>       This is an alarming statistic, and I would love to see this number
>       closer to 50%.
>
> 3. Code style / development conventions - Aside from just the column
>    limitations and bracing styles, there are other areas where LLDB
>    differs from LLVM on code style. We should continue to adopt more of
>    LLVM's style where it makes sense. I've identified a couple of areas
>    (incomplete list) which I outline below.
>
>    a. Clean up the mess of cyclical dependencies and properly layer the
>       libraries. This is especially important for things like
>       lldb-server that need to link in as little as possible, but
>       regardless it leads to a more robust architecture, faster build
>       and link times, better testability, and is required if we ever
>       want to do a modules build of LLDB.
>
>    b. Use CMake instead of the Xcode project (CMake supports
>       Frameworks). CMake supports Apple Frameworks, so the main
>       roadblock to getting this working is just someone doing it.
>       Segmenting the build process by platform doesn't make sense for
>       the upstream, especially when there is a perfectly workable
>       solution. I have no doubt that the resulting Xcode workspace
>       generated automatically by CMake will *not* be as "nice" as one
>       that is maintained by hand. We face this problem with Visual
>       Studio on Windows as well. The solution that most people have
>       adopted is to continue using the IDE for code editing and
>       debugging, but for actually running the build, use CMake with
>       Ninja. A similar workflow should still be possible with an OSX
>       CMake build, but as I do not work every day on a Mac, all I can
>       say is that it's possible; I have no idea how impactful it would
>       be on people's workflows.
>
>    c. Variable naming conventions
>       i.   I don't expect anyone is too fond of LLDB's naming
>            conventions, but if we're committed to joining the LLVM
>            ecosystem, then let's go all the way.
>
>    d. Use more modern C++ and less C
>       i.   Old habits die hard, but this isn't just a matter of style.
>            It leads to safer, more robust, and less fragile code as
>            well.
>
>    e. Shorter functions and classes with more narrowly targeted
>       responsibilities
>       i.   It's not uncommon to find functions that are hundreds (and in
>            a few cases even 1,000+) of lines long. We really need to be
>            better about breaking functions and classes down into smaller
>            responsibilities. This helps not just for someone coming in
>            to read the function, but also for testing. Smaller functions
>            are easier to unit test.
>
>    f. Convert T foo(X, Y, Error &error) functions to
>       Expected<T> foo(X, Y) style (Depends on 1.c)
>       i.   llvm::Expected is based on the llvm::Error class described
>            earlier. It's used when a function is supposed to return a
>            value, but it could fail. By packaging the error with the
>            return value, it's impossible to have a situation where you
>            use the return value even in case of an error, and because
>            llvm::Error has mandatory checking, it's also impossible to
>            have a situation where you don't check the error. So it's
>            very safe.
>
> Whew. That was a lot. If you made it this far, thanks for reading!
>
> Obviously if we were to embark on all of the above, it would take many
> months to complete everything. So I'm not proposing anyone stop what
> they're doing to work on this. This is just my own personal wishlist
>
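
For anyone who hasn't used the Expected<T> style from the last item of the
quoted list, here is a rough, self-contained sketch of the idea. To be
clear, this is NOT LLVM's implementation: CheckedExpected, divide and
divideOr are names I made up for illustration, and the real llvm::Error /
llvm::Expected classes differ in detail. It only shows the two properties
described above: the value travels together with the error, and a result
that is never checked is treated as a bug.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <utility>

// Hypothetical stand-in for llvm::Expected<T>; not LLVM's implementation.
// Holds either a value or an error message, and aborts in its destructor
// if nobody ever checked which of the two it holds.
// For simplicity, T must be default-constructible in this sketch.
template <typename T> class CheckedExpected {
public:
  // Success: wrap a value.
  CheckedExpected(T V) : Value(std::move(V)), HasValue(true) {}

  // Failure: wrap an error message.
  static CheckedExpected failure(std::string Msg) {
    CheckedExpected E;
    E.Message = std::move(Msg);
    return E;
  }

  CheckedExpected(CheckedExpected &&Other) noexcept
      : Value(std::move(Other.Value)), Message(std::move(Other.Message)),
        HasValue(Other.HasValue), Checked(Other.Checked) {
    Other.Checked = true; // the moved-from object no longer owes a check
  }
  CheckedExpected(const CheckedExpected &) = delete;
  CheckedExpected &operator=(const CheckedExpected &) = delete;

  // Mandatory checking: an unchecked result terminates the program.
  ~CheckedExpected() {
    if (!Checked)
      std::abort();
  }

  // Testing the result is what marks it as checked.
  explicit operator bool() {
    Checked = true;
    return HasValue;
  }

  // The value is only reachable after a successful check.
  T &get() {
    assert(Checked && HasValue && "value used without a successful check");
    return Value;
  }

  // The message is only reachable after a failed check.
  const std::string &error() const {
    assert(Checked && !HasValue && "no error to report");
    return Message;
  }

private:
  CheckedExpected() = default;
  T Value{};
  std::string Message;
  bool HasValue = false;
  bool Checked = false;
};

// Hypothetical function in the "Expected<T> foo(X, Y)" style, replacing a
// "T foo(X, Y, Error &error)" signature.
CheckedExpected<int> divide(int Numerator, int Denominator) {
  if (Denominator == 0)
    return CheckedExpected<int>::failure("division by zero");
  return Numerator / Denominator;
}

// Caller: the result must be tested before the value can be used.
int divideOr(int Numerator, int Denominator, int Fallback) {
  auto Result = divide(Numerator, Denominator);
  if (Result)
    return Result.get();
  return Fallback;
}
```

In this sketch, calling divide() and dropping the result without testing it
aborts the process, which is the "mandatory checking" property the quoted
text attributes to llvm::Error.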
_______________________________________________ lldb-dev mailing list lldb-dev@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev