Re: [lldb-dev] [RFC] LLDB Reproducers

2018-09-20 Thread Jim Ingham via lldb-dev


> On Sep 20, 2018, at 4:16 AM, Zachary Turner  wrote:
> 
> For the first, I think 99% of the time the bug is not caused by the sequence 
> of gdb remote packets.  The sequence of gdb remote packets just happens to be 
> the means by which the debugger was put into the state in which it failed.  
> If there is another, stable way of getting the debugger into the same state 
> this part is solvable.

Yes, I don't actually think we are in disagreement.  

When you are trying to provide a way to actually gather the information to 
reproduce a bug that is happening in the wild, you want to be able to ensure 
that you gather whatever you need to do so.  You are trying to avoid a bunch of 
round-trips with the reporter.  How would you know that the problem is all 
above the basic Process model, in which case gathering just higher level 
behaviors will be enough to reproduce the bug?  That seems like a hard problem 
to me.  So instead, you try to be conservative and gather the information at 
the level that you know drives the system; for process-level bugs, that is 
the gdb-packet traffic.  And if you are gathering at that level, the trace 
you've gathered is somewhat fragile.

So if you wanted to generate tests, you would do a post-processing step that 
takes the states generated by the lower level events from the reproducer and 
converts that into an object model level replayer.  That should be doable if 
you had the ability to mock behavior at the level of lldb_private::Process, 
etc.  I'm only saying I think the needs of the Reproducer are such that a 
direct use of its provider's data is unlikely to be a good way to implement 
tests.

Making a converter from "reproducer trace" to "test" is a separate piece of 
work, and not directly related to reproducers.  After all, if done right this 
would be able to observe a user-driven debug session and make a test out of it 
as well.  But I don't think this level of recording is right for the Reproducer.

Another problem with using the reproducer traces to directly produce tests is 
that for good tests you generally want to reduce them to the simplest set of 
steps possible to show just this bug.  What you are going to get from an 
in-the-wild reproducer is unlikely to be that.  That's actually part of the 
point of the reproducer, to relieve bug reporters of the necessity of reducing 
their problem before submitting.  So you'd have to have some way of figuring 
out when the world was still correct, just before going bad, and then only 
include the set of steps leading from good to bad (there may be lots of 
extraneous actions in the trace.)
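
To make the "good to bad" idea concrete, here is a minimal sketch of 
locating the first step after which the state is wrong.  It assumes two things 
the proposal does not provide: an oracle that can say whether the state is 
still correct after replaying a prefix of the trace, and that once the state 
goes bad it stays bad.  The names first_bad_step and is_good_after are 
hypothetical.

```python
def first_bad_step(steps, is_good_after):
    """Return the index of the first step after which the state is bad.

    is_good_after(n) must report whether the state is still correct after
    replaying steps[:n].  Assumes the state is good after 0 steps, bad after
    all of them, and never recovers once it has gone bad.
    """
    lo, hi = 0, len(steps)  # invariant: good after lo steps, bad after hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good_after(mid):
            lo = mid
        else:
            hi = mid
    return hi - 1


# Illustrative trace; the oracle here is faked with a known answer.
steps = ["launch", "break main", "continue", "step", "print x"]
culprit = first_bad_step(steps, lambda n: n <= 3)
print(steps[culprit])
```

This only finds where the state turns bad.  Dropping the extraneous steps 
before that point is a separate reduction problem.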

> 
> The second issue you raised does seem like something that would require human 
> intervention to specify the expected state as part of a test, though.

I also think some automated way to gather "what did you expect" would be 
helpful.  I have very often had to go many rounds with reporters to figure out 
what they are actually reporting as wrong.  But that isn't part of Jonas' 
proposal, and maybe is more work than is justified by saving the few of us who 
work on lldb from the "many rounds" described above.

Jim


> 
> On Wed, Sep 19, 2018 at 11:17 AM Jim Ingham  wrote:
> There are a couple of problems with using these reproducers in the testsuite.
> 
> The first is that we make no commitments that a future lldb will 
> implement the "same" session with the same sequence of gdb-remote packet 
> requests.  We often monkey around with lldb's sequences of requests to make 
> things go faster.  So some future lldb will end up making a request that 
> wasn't in the data from the reproducer, and at that point we won't really 
> know what to do.  The Provider for gdb-remote packets should record the 
> packets it receives - not just the answers it gives - so it can detect this 
> error and not go off the rails.  But I'm pretty sure it isn't worth the 
> effort to try to get lldb to maintain all the old sequences it used in the 
> past in order to support keeping the reproducers alive.  But this does mean 
> that this is an unreliable way to write tests.
> 
> The second is that the reproducers as described have no notion of "expected 
> state".  They are meant to go along with a bug report where the "x was wrong" 
> part is not contained in the reproducer.  That would be an interesting thing 
> to think about adding, but I think the problem space here is complicated 
> enough already...  You can't write a test if you don't know the correct end 
> state.
> 
> Jim
> 
> 
> > On Sep 19, 2018, at 10:59 AM, Zachary Turner via lldb-dev 
> >  wrote:
> > 
> > I assume that reproducing race conditions is out of scope?
> > 
> > Also, will it be possible to incorporate these reproducers into the test 
> > suite somehow?  It would be nice if we could create a tar file similar to a 
> > linkrepro, check in the tar file, and then have a test where you don't have 
> > to write any python code, any Makefile, any source code, or any 

Re: [lldb-dev] [RFC] LLDB Reproducers

2018-09-20 Thread Zachary Turner via lldb-dev
For the first, I think 99% of the time the bug is not caused by the
sequence of gdb remote packets.  The sequence of gdb remote packets just
happens to be the means by which the debugger was put into the state in
which it failed.  If there is another, stable way of getting the debugger
into the same state this part is solvable.

The second issue you raised does seem like something that would require
human intervention to specify the expected state as part of a test, though.

On Wed, Sep 19, 2018 at 11:17 AM Jim Ingham  wrote:

> There are a couple of problems with using these reproducers in the
> testsuite.
>
> The first is that we make no commitments that a future lldb will
> implement the "same" session with the same sequence of gdb-remote packet
> requests.  We often monkey around with lldb's sequences of requests to make
> things go faster.  So some future lldb will end up making a request that
> wasn't in the data from the reproducer, and at that point we won't really
> know what to do.  The Provider for gdb-remote packets should record the
> packets it receives - not just the answers it gives - so it can detect this
> error and not go off the rails.  But I'm pretty sure it isn't worth the
> effort to try to get lldb to maintain all the old sequences it used in the
> past in order to support keeping the reproducers alive.  But this does mean
> that this is an unreliable way to write tests.
>
> The second is that the reproducers as described have no notion of
> "expected state".  They are meant to go along with a bug report where the
> "x was wrong" part is not contained in the reproducer.  That would be an
> interesting thing to think about adding, but I think the problem space here
> is complicated enough already...  You can't write a test if you don't know
> the correct end state.
>
> Jim
>
>
> > On Sep 19, 2018, at 10:59 AM, Zachary Turner via lldb-dev <
> lldb-dev@lists.llvm.org> wrote:
> >
> > I assume that reproducing race conditions is out of scope?
> >
> > Also, will it be possible to incorporate these reproducers into the test
> suite somehow?  It would be nice if we could create a tar file similar to a
> linkrepro, check in the tar file, and then have a test where you don't have
> to write any python code, any Makefile, any source code, or any anything
> for that matter.  It just enumerates all of these repro tar files in a
> certain location and runs that test.
> >
> > On Wed, Sep 19, 2018 at 10:48 AM Leonard Mosescu via lldb-dev <
> lldb-dev@lists.llvm.org> wrote:
> > Great, thanks. This means that the lldb-server issues are not in scope
> for this feature, right?
> >
> > On Wed, Sep 19, 2018 at 10:09 AM, Jonas Devlieghere <
> jdevliegh...@apple.com> wrote:
> >
> >
> >> On Sep 19, 2018, at 6:49 PM, Leonard Mosescu 
> wrote:
> >>
> >> Sounds like a fantastic idea.
> >>
> >> How would this work when the behavior of the debuggee process is
> non-deterministic?
> >
> > All the communication between the debugger and the inferior goes through
> the
> > GDB remote protocol. Because we capture and replay this, we can reproduce
> > without running the executable, which is particularly convenient when
> you were
> > originally debugging something on a different device for example.
> >
> >>
> >> On Wed, Sep 19, 2018 at 6:50 AM, Jonas Devlieghere via lldb-dev <
> lldb-dev@lists.llvm.org> wrote:
> >> Hi everyone,
> >>
> >> We all know how hard it can be to reproduce an issue or crash in LLDB.
> There
> >> are a lot of moving parts and subtle differences can easily add up. We
> want to
> >> make this easier by generating reproducers in LLDB, similar to what
> clang does
> >> today.
> >>
> >> The core idea is as follows: during normal operation we capture whatever
> >> information is needed to recreate the current state of the debugger.
> When
> >> something goes wrong, this becomes available to the user. Someone else
> should
> >> then be able to reproduce the same issue with only this data, for
> example on a
> >> different machine.
> >>
> >> It's important to note that we want to replay the debug session from the
> >> reproducer, rather than just recreating the current state. This ensures
> that we
> >> have access to all the events leading up to the problem, which are
> usually far
> >> more important than the error state itself.
> >>
> >> # High Level Design
> >>
> >> Concretely we want to extend LLDB in two ways:
> >>
> >> 1.  We need to add infrastructure to _generate_ the data necessary for
> >> reproducing.
> >> 2.  We need to add infrastructure to _use_ the data in the reproducer
> to replay
> >> the debugging session.
> >>
> >> Different parts of LLDB will have different definitions of what data
> they need
> >> to reproduce their path to the issue. For example, capturing the
> commands
> >> executed by the user is very different from tracking the dSYM bundles
> on disk.
> >> Therefore, we propose to have each component deal with its needs in a
> localized
> >> way. This has the advan

Re: [lldb-dev] [RFC] LLDB Reproducers

2018-09-19 Thread Jim Ingham via lldb-dev
Yes, I think that would be pretty cool.  It is along the same lines we've been 
talking about with using "ProcessMock", "ThreadMock" etc. plugins.  However, I 
think you need both.  For instance if we bobble a gdb-remote packet, you will 
see that in a bad state of one of these higher level state descriptions, but 
without the actual packet traffic you wouldn't have that much help figuring out 
what actually went wrong.  OTOH, things like packet level recording will likely 
be much less stable than capturing state at a higher level.

Jim


> On Sep 19, 2018, at 11:10 AM, Zachary Turner via lldb-dev 
>  wrote:
> 
> By the way, several weeks / months ago I had an idea for exposing a debugger 
> object model.  That would be one very powerful way to create reproducers, but 
> it would be a large effort.  The idea is that if every important part of your 
> debugger is represented by some component in a debugger object model, and all 
> interactions (including internal interactions) go through the object model, 
> then you can record every state change to the object model and replay it.
> 
> On Wed, Sep 19, 2018 at 10:59 AM Zachary Turner  wrote:
> I assume that reproducing race conditions is out of scope?
> 
> Also, will it be possible to incorporate these reproducers into the test 
> suite somehow?  It would be nice if we could create a tar file similar to a 
> linkrepro, check in the tar file, and then have a test where you don't have 
> to write any python code, any Makefile, any source code, or any anything for 
> that matter.  It just enumerates all of these repro tar files in a certain 
> location and runs that test.
> 
> On Wed, Sep 19, 2018 at 10:48 AM Leonard Mosescu via lldb-dev 
>  wrote:
> Great, thanks. This means that the lldb-server issues are not in scope for 
> this feature, right?
> 
> On Wed, Sep 19, 2018 at 10:09 AM, Jonas Devlieghere  
> wrote:
> 
> 
>> On Sep 19, 2018, at 6:49 PM, Leonard Mosescu  wrote:
>> 
>> Sounds like a fantastic idea. 
>> 
>> How would this work when the behavior of the debuggee process is 
>> non-deterministic?
> 
> All the communication between the debugger and the inferior goes through the
> GDB remote protocol. Because we capture and replay this, we can reproduce
> without running the executable, which is particularly convenient when you were
> originally debugging something on a different device for example. 
> 
>> 
>> On Wed, Sep 19, 2018 at 6:50 AM, Jonas Devlieghere via lldb-dev 
>>  wrote:
>> Hi everyone,
>> 
>> We all know how hard it can be to reproduce an issue or crash in LLDB. There
>> are a lot of moving parts and subtle differences can easily add up. We want 
>> to
>> make this easier by generating reproducers in LLDB, similar to what clang 
>> does
>> today.
>> 
>> The core idea is as follows: during normal operation we capture whatever
>> information is needed to recreate the current state of the debugger. When
>> something goes wrong, this becomes available to the user. Someone else should
>> then be able to reproduce the same issue with only this data, for example on 
>> a
>> different machine.
>> 
>> It's important to note that we want to replay the debug session from the
>> reproducer, rather than just recreating the current state. This ensures that 
>> we
>> have access to all the events leading up to the problem, which are usually 
>> far
>> more important than the error state itself.
>> 
>> # High Level Design
>> 
>> Concretely we want to extend LLDB in two ways:
>> 
>> 1.  We need to add infrastructure to _generate_ the data necessary for
>> reproducing.
>> 2.  We need to add infrastructure to _use_ the data in the reproducer to 
>> replay
>> the debugging session.
>> 
>> Different parts of LLDB will have different definitions of what data they 
>> need
>> to reproduce their path to the issue. For example, capturing the commands
>> executed by the user is very different from tracking the dSYM bundles on 
>> disk.
>> Therefore, we propose to have each component deal with its needs in a 
>> localized
>> way. This has the advantage that the functionality can be developed and 
>> tested
>> independently.
>> 
>> ## Providers
>> 
>> We'll call a combination of (1) and (2) for a given component a `Provider`. 
>> For
>> example, we'd have a provider for user commands and a provider for dSYM 
>> files.
>> A provider will know how to keep track of its information, how to serialize 
>> it
>> as part of the reproducer as well as how to deserialize it again and use it 
>> to
>> recreate the state of the debugger.
>> 
>> With one exception, the lifetime of the provider coincides with that of the
>> `SBDebugger`, because that is the scope of what we consider here to be a 
>> single
>> debug session. The exception would be the provider for the global module 
>> cache,
>> because it is shared between multiple debuggers. Although it would be
>> conceptually straightforward to add a provider for the shared module cache,
>> this significantly increases 

Re: [lldb-dev] [RFC] LLDB Reproducers

2018-09-19 Thread Jim Ingham via lldb-dev
There are a couple of problems with using these reproducers in the testsuite.

The first is that we make no commitments that a future lldb will implement 
the "same" session with the same sequence of gdb-remote packet requests.  We 
often monkey around with lldb's sequences of requests to make things go faster.  
So some future lldb will end up making a request that wasn't in the data from 
the reproducer, and at that point we won't really know what to do.  The 
Provider for gdb-remote packets should record the packets it receives - not 
just the answers it gives - so it can detect this error and not go off the 
rails.  But I'm pretty sure it isn't worth the effort to try to get lldb to 
maintain all the old sequences it used in the past in order to support keeping 
the reproducers alive.  But this does mean that this is an unreliable way to 
write tests.

The second is that the reproducers as described have no notion of "expected 
state".  They are meant to go along with a bug report where the "x was wrong" 
part is not contained in the reproducer.  That would be an interesting thing to 
think about adding, but I think the problem space here is complicated enough 
already...  You can't write a test if you don't know the correct end state.
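
As a rough sketch of the first point: a replaying gdb-remote Provider can 
keep the recorded requests alongside the recorded replies and stop as soon as 
the debugger asks for something else.  PacketReplayer and ReplayDivergence are 
hypothetical names, not part of the proposal, and the packet payloads are only 
illustrative.

```python
class ReplayDivergence(Exception):
    """Raised when the debugger sends a packet the recording does not expect."""


class PacketReplayer:
    """Replays recorded (request, reply) pairs strictly in order.

    Hypothetical stand-in for a gdb-remote Provider in replay mode.
    """

    def __init__(self, recorded):
        self._recorded = list(recorded)
        self._next = 0

    def handle(self, request):
        # Running past the end of the recording is also a divergence.
        if self._next >= len(self._recorded):
            raise ReplayDivergence(f"unexpected extra request {request!r}")
        expected, reply = self._recorded[self._next]
        # Compare against the recorded request instead of blindly answering.
        if request != expected:
            raise ReplayDivergence(
                f"packet {self._next}: expected {expected!r}, got {request!r}")
        self._next += 1
        return reply


recorded = [("qSupported", "PacketSize=1000"), ("?", "S05"), ("g", "00000000")]
replayer = PacketReplayer(recorded)
first_reply = replayer.handle("qSupported")
```

A newer lldb that reorders or drops a request then fails with a clear 
divergence error at the first mismatch rather than replaying answers to 
questions it never asked.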

Jim
 

> On Sep 19, 2018, at 10:59 AM, Zachary Turner via lldb-dev 
>  wrote:
> 
> I assume that reproducing race conditions is out of scope?
> 
> Also, will it be possible to incorporate these reproducers into the test 
> suite somehow?  It would be nice if we could create a tar file similar to a 
> linkrepro, check in the tar file, and then have a test where you don't have 
> to write any python code, any Makefile, any source code, or any anything for 
> that matter.  It just enumerates all of these repro tar files in a certain 
> location and runs that test.
> 
> On Wed, Sep 19, 2018 at 10:48 AM Leonard Mosescu via lldb-dev 
>  wrote:
> Great, thanks. This means that the lldb-server issues are not in scope for 
> this feature, right?
> 
> On Wed, Sep 19, 2018 at 10:09 AM, Jonas Devlieghere  
> wrote:
> 
> 
>> On Sep 19, 2018, at 6:49 PM, Leonard Mosescu  wrote:
>> 
>> Sounds like a fantastic idea. 
>> 
>> How would this work when the behavior of the debuggee process is 
>> non-deterministic?
> 
> All the communication between the debugger and the inferior goes through the
> GDB remote protocol. Because we capture and replay this, we can reproduce
> without running the executable, which is particularly convenient when you were
> originally debugging something on a different device for example. 
> 
>> 
>> On Wed, Sep 19, 2018 at 6:50 AM, Jonas Devlieghere via lldb-dev 
>>  wrote:
>> Hi everyone,
>> 
>> We all know how hard it can be to reproduce an issue or crash in LLDB. There
>> are a lot of moving parts and subtle differences can easily add up. We want 
>> to
>> make this easier by generating reproducers in LLDB, similar to what clang 
>> does
>> today.
>> 
>> The core idea is as follows: during normal operation we capture whatever
>> information is needed to recreate the current state of the debugger. When
>> something goes wrong, this becomes available to the user. Someone else should
>> then be able to reproduce the same issue with only this data, for example on 
>> a
>> different machine.
>> 
>> It's important to note that we want to replay the debug session from the
>> reproducer, rather than just recreating the current state. This ensures that 
>> we
>> have access to all the events leading up to the problem, which are usually 
>> far
>> more important than the error state itself.
>> 
>> # High Level Design
>> 
>> Concretely we want to extend LLDB in two ways:
>> 
>> 1.  We need to add infrastructure to _generate_ the data necessary for
>> reproducing.
>> 2.  We need to add infrastructure to _use_ the data in the reproducer to 
>> replay
>> the debugging session.
>> 
>> Different parts of LLDB will have different definitions of what data they 
>> need
>> to reproduce their path to the issue. For example, capturing the commands
>> executed by the user is very different from tracking the dSYM bundles on 
>> disk.
>> Therefore, we propose to have each component deal with its needs in a 
>> localized
>> way. This has the advantage that the functionality can be developed and 
>> tested
>> independently.
>> 
>> ## Providers
>> 
>> We'll call a combination of (1) and (2) for a given component a `Provider`. 
>> For
>> example, we'd have a provider for user commands and a provider for dSYM 
>> files.
>> A provider will know how to keep track of its information, how to serialize 
>> it
>> as part of the reproducer as well as how to deserialize it again and use it 
>> to
>> recreate the state of the debugger.
>> 
>> With one exception, the lifetime of the provider coincides with that of the
>> `SBDebugger`, because that is the scope of what we consider here to be a 
>> single
>> debug session. The exception would be the provider for the global module 
>> cach

Re: [lldb-dev] [RFC] LLDB Reproducers

2018-09-19 Thread Zachary Turner via lldb-dev
By the way, several weeks / months ago I had an idea for exposing a
debugger object model.  That would be one very powerful way to create
reproducers, but it would be a large effort.  The idea is that if every
important part of your debugger is represented by some component in a
debugger object model, and all interactions (including internal
interactions) go through the object model, then you can record every state
change to the object model and replay it.
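
A toy sketch of that idea, assuming a flat key/value model in which every 
state change goes through a single entry point.  RecordingModel, its keys and 
the replay helper are made up for illustration.  A real debugger object model 
would be far richer.

```python
class RecordingModel:
    """Toy object model: every state change goes through set() and is logged."""

    def __init__(self):
        self.state = {}
        self.log = []

    def set(self, path, value):
        # Record the change before applying it, so the log is a full history.
        self.log.append((path, value))
        self.state[path] = value


def replay(log):
    """Rebuild a model by re-applying recorded state changes in order."""
    model = RecordingModel()
    for path, value in log:
        model.set(path, value)
    return model


session = RecordingModel()
session.set("target.executable", "a.out")
session.set("process.state", "stopped")
session.set("thread[1].pc", 0x1000)
session.set("process.state", "running")

copy = replay(session.log)
```

Because the log keeps every intermediate change and not just the final 
state, replaying a prefix of it recreates the model as it was at any earlier 
point in the session.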

On Wed, Sep 19, 2018 at 10:59 AM Zachary Turner  wrote:

> I assume that reproducing race conditions is out of scope?
>
> Also, will it be possible to incorporate these reproducers into the test
> suite somehow?  It would be nice if we could create a tar file similar to a
> linkrepro, check in the tar file, and then have a test where you don't have
> to write any python code, any Makefile, any source code, or any anything
> for that matter.  It just enumerates all of these repro tar files in a
> certain location and runs that test.
>
> On Wed, Sep 19, 2018 at 10:48 AM Leonard Mosescu via lldb-dev <
> lldb-dev@lists.llvm.org> wrote:
>
>> Great, thanks. This means that the lldb-server issues are not in scope
>> for this feature, right?
>>
>> On Wed, Sep 19, 2018 at 10:09 AM, Jonas Devlieghere <
>> jdevliegh...@apple.com> wrote:
>>
>>>
>>>
>>> On Sep 19, 2018, at 6:49 PM, Leonard Mosescu  wrote:
>>>
>>> Sounds like a fantastic idea.
>>>
>>> How would this work when the behavior of the debuggee process is
>>> non-deterministic?
>>>
>>>
>>> All the communication between the debugger and the inferior goes through
>>> the
>>> GDB remote protocol. Because we capture and replay this, we can reproduce
>>> without running the executable, which is particularly convenient when
>>> you were
>>> originally debugging something on a different device for example.
>>>
>>>
>>> On Wed, Sep 19, 2018 at 6:50 AM, Jonas Devlieghere via lldb-dev <
>>> lldb-dev@lists.llvm.org> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> We all know how hard it can be to reproduce an issue or crash in LLDB.
>>>> There
>>>> are a lot of moving parts and subtle differences can easily add up. We
>>>> want to
>>>> make this easier by generating reproducers in LLDB, similar to what
>>>> clang does
>>>> today.
>>>>
>>>> The core idea is as follows: during normal operation we capture whatever
>>>> information is needed to recreate the current state of the debugger.
>>>> When
>>>> something goes wrong, this becomes available to the user. Someone else
>>>> should
>>>> then be able to reproduce the same issue with only this data, for
>>>> example on a
>>>> different machine.
>>>>
>>>> It's important to note that we want to replay the debug session from the
>>>> reproducer, rather than just recreating the current state. This ensures
>>>> that we
>>>> have access to all the events leading up to the problem, which are
>>>> usually far
>>>> more important than the error state itself.
>>>>
>>>> # High Level Design
>>>>
>>>> Concretely we want to extend LLDB in two ways:
>>>>
>>>> 1.  We need to add infrastructure to _generate_ the data necessary for
>>>> reproducing.
>>>> 2.  We need to add infrastructure to _use_ the data in the reproducer
>>>> to replay
>>>> the debugging session.
>>>>
>>>> Different parts of LLDB will have different definitions of what data
>>>> they need
>>>> to reproduce their path to the issue. For example, capturing the
>>>> commands
>>>> executed by the user is very different from tracking the dSYM bundles
>>>> on disk.
>>>> Therefore, we propose to have each component deal with its needs in a
>>>> localized
>>>> way. This has the advantage that the functionality can be developed and
>>>> tested
>>>> independently.
>>>>
>>>> ## Providers
>>>>
>>>> We'll call a combination of (1) and (2) for a given component a
>>>> `Provider`. For
>>>> example, we'd have a provider for user commands and a provider for
>>>> dSYM files.
>>>> A provider will know how to keep track of its information, how to
>>>> serialize it
>>>> as part of the reproducer as well as how to deserialize it again and
>>>> use it to
>>>> recreate the state of the debugger.
>>>>
>>>> With one exception, the lifetime of the provider coincides with that of
>>>> the
>>>> `SBDebugger`, because that is the scope of what we consider here to be
>>>> a single
>>>> debug session. The exception would be the provider for the global
>>>> module cache,
>>>> because it is shared between multiple debuggers. Although it would be
>>>> conceptually straightforward to add a provider for the shared module
>>>> cache,
>>>> this significantly increases the complexity of the reproducer framework
>>>> because
>>>> of its implication on the lifetime and everything related to that.
>>>>
>>>> For now we will ignore this problem which means we will not replay the
>>>> construction of the shared module cache but rather build it up during
>>>> replaying, as if the current debug session was the first and only one
>>>> using it.
>>>> The impact of doing so 

Re: [lldb-dev] [RFC] LLDB Reproducers

2018-09-19 Thread Zachary Turner via lldb-dev
I assume that reproducing race conditions is out of scope?

Also, will it be possible to incorporate these reproducers into the test
suite somehow?  It would be nice if we could create a tar file similar to a
linkrepro, check in the tar file, and then have a test where you don't have
to write any python code, any Makefile, any source code, or any anything
for that matter.  It just enumerates all of these repro tar files in a
certain location and runs that test.
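
Roughly what I have in mind, as a sketch only.  The replay_one callback 
stands in for whatever would actually load a reproducer into lldb, which 
doesn't exist yet, and discover_repros / run_all are made-up names.

```python
import tarfile
from pathlib import Path


def discover_repros(directory):
    """Return every tar archive in `directory`, sorted for a stable order."""
    return sorted(p for p in Path(directory).iterdir()
                  if p.is_file() and tarfile.is_tarfile(str(p)))


def run_all(directory, replay_one):
    """Call replay_one(path) on each archive; map archive name to pass/fail.

    replay_one is a hypothetical hook: it should raise if the replay fails.
    """
    results = {}
    for path in discover_repros(directory):
        try:
            replay_one(path)
            results[path.name] = True
        except Exception:
            results[path.name] = False
    return results
```

Adding a test is then just a matter of dropping another tar file into the 
directory, with no per-test python, Makefile or source.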

On Wed, Sep 19, 2018 at 10:48 AM Leonard Mosescu via lldb-dev <
lldb-dev@lists.llvm.org> wrote:

> Great, thanks. This means that the lldb-server issues are not in scope for
> this feature, right?
>
> On Wed, Sep 19, 2018 at 10:09 AM, Jonas Devlieghere <
> jdevliegh...@apple.com> wrote:
>
>>
>>
>> On Sep 19, 2018, at 6:49 PM, Leonard Mosescu  wrote:
>>
>> Sounds like a fantastic idea.
>>
>> How would this work when the behavior of the debuggee process is
>> non-deterministic?
>>
>>
>> All the communication between the debugger and the inferior goes through
>> the
>> GDB remote protocol. Because we capture and replay this, we can reproduce
>> without running the executable, which is particularly convenient when you
>> were
>> originally debugging something on a different device for example.
>>
>>
>> On Wed, Sep 19, 2018 at 6:50 AM, Jonas Devlieghere via lldb-dev <
>> lldb-dev@lists.llvm.org> wrote:
>>
>>> Hi everyone,
>>>
>>> We all know how hard it can be to reproduce an issue or crash in LLDB.
>>> There
>>> are a lot of moving parts and subtle differences can easily add up. We
>>> want to
>>> make this easier by generating reproducers in LLDB, similar to what
>>> clang does
>>> today.
>>>
>>> The core idea is as follows: during normal operation we capture whatever
>>> information is needed to recreate the current state of the debugger. When
>>> something goes wrong, this becomes available to the user. Someone else
>>> should
>>> then be able to reproduce the same issue with only this data, for
>>> example on a
>>> different machine.
>>>
>>> It's important to note that we want to replay the debug session from the
>>> reproducer, rather than just recreating the current state. This ensures
>>> that we
>>> have access to all the events leading up to the problem, which are
>>> usually far
>>> more important than the error state itself.
>>>
>>> # High Level Design
>>>
>>> Concretely we want to extend LLDB in two ways:
>>>
>>> 1.  We need to add infrastructure to _generate_ the data necessary for
>>> reproducing.
>>> 2.  We need to add infrastructure to _use_ the data in the reproducer to
>>> replay
>>> the debugging session.
>>>
>>> Different parts of LLDB will have different definitions of what data
>>> they need
>>> to reproduce their path to the issue. For example, capturing the commands
>>> executed by the user is very different from tracking the dSYM bundles on
>>> disk.
>>> Therefore, we propose to have each component deal with its needs in a
>>> localized
>>> way. This has the advantage that the functionality can be developed and
>>> tested
>>> independently.
>>>
>>> ## Providers
>>>
>>> We'll call a combination of (1) and (2) for a given component a
>>> `Provider`. For
>>> example, we'd have a provider for user commands and a provider for dSYM
>>> files.
>>> A provider will know how to keep track of its information, how to
>>> serialize it
>>> as part of the reproducer as well as how to deserialize it again and use
>>> it to
>>> recreate the state of the debugger.
>>>
>>> With one exception, the lifetime of the provider coincides with that of
>>> the
>>> `SBDebugger`, because that is the scope of what we consider here to be a
>>> single
>>> debug session. The exception would be the provider for the global module
>>> cache,
>>> because it is shared between multiple debuggers. Although it would be
>>> conceptually straightforward to add a provider for the shared module
>>> cache,
>>> this significantly increases the complexity of the reproducer framework
>>> because
>>> of its implication on the lifetime and everything related to that.
>>>
>>> For now we will ignore this problem which means we will not replay the
>>> construction of the shared module cache but rather build it up during
>>> replaying, as if the current debug session was the first and only one
>>> using it.
>>> The impact of doing so is significant, as no issue caused by the shared
>>> module
>>> cache will be reproducible, but does not limit reproducing any issue
>>> unrelated
>>> to it.
>>>
>>> ## Reproducer Framework
>>>
>>> To coordinate between the data from different components, we'll need to
>>> introduce a global reproducer infrastructure. We have a component
>>> responsible
>>> for reproducer generation (the `Generator`) and for using the reproducer
>>> (the
>>> `Loader`). They are essentially two ways of looking at the same unit of
>>> replayable work.
>>>
>>> The Generator keeps track of its providers and whether or not we need to
>>> generate a reproducer. When a 

Re: [lldb-dev] [RFC] LLDB Reproducers

2018-09-19 Thread Leonard Mosescu via lldb-dev
Great, thanks. This means that the lldb-server issues are not in scope for
this feature, right?

On Wed, Sep 19, 2018 at 10:09 AM, Jonas Devlieghere 
wrote:

>
>
> On Sep 19, 2018, at 6:49 PM, Leonard Mosescu  wrote:
>
> Sounds like a fantastic idea.
>
> How would this work when the behavior of the debuggee process is
> non-deterministic?
>
>
> All the communication between the debugger and the inferior goes through
> the
> GDB remote protocol. Because we capture and replay this, we can reproduce
> without running the executable, which is particularly convenient when you
> were
> originally debugging something on a different device for example.
>
>
> On Wed, Sep 19, 2018 at 6:50 AM, Jonas Devlieghere via lldb-dev <
> lldb-dev@lists.llvm.org> wrote:
>
>> Hi everyone,
>>
>> We all know how hard it can be to reproduce an issue or crash in LLDB.
>> There
>> are a lot of moving parts and subtle differences can easily add up. We
>> want to
>> make this easier by generating reproducers in LLDB, similar to what clang
>> does
>> today.
>>
>> The core idea is as follows: during normal operation we capture whatever
>> information is needed to recreate the current state of the debugger. When
>> something goes wrong, this becomes available to the user. Someone else
>> should
>> then be able to reproduce the same issue with only this data, for example
>> on a
>> different machine.
>>
>> It's important to note that we want to replay the debug session from the
>> reproducer, rather than just recreating the current state. This ensures
>> that we
>> have access to all the events leading up to the problem, which are
>> usually far
>> more important than the error state itself.
>>
>> # High Level Design
>>
>> Concretely we want to extend LLDB in two ways:
>>
>> 1.  We need to add infrastructure to _generate_ the data necessary for
>> reproducing.
>> 2.  We need to add infrastructure to _use_ the data in the reproducer to
>> replay
>> the debugging session.
>>
>> Different parts of LLDB will have different definitions of what data they
>> need
>> to reproduce their path to the issue. For example, capturing the commands
>> executed by the user is very different from tracking the dSYM bundles on
>> disk.
>> Therefore, we propose to have each component deal with its needs in a
>> localized
>> way. This has the advantage that the functionality can be developed and
>> tested
>> independently.
>>
>> ## Providers
>>
>> We'll call a combination of (1) and (2) for a given component a
>> `Provider`. For
>> example, we'd have a provider for user commands and a provider for dSYM
>> files.
>> A provider will know how to keep track of its information, how to
>> serialize it
>> as part of the reproducer as well as how to deserialize it again and use
>> it to
>> recreate the state of the debugger.
>>
>> With one exception, the lifetime of the provider coincides with that of
>> the
>> `SBDebugger`, because that is the scope of what we consider here to be a
>> single
>> debug session. The exception would be the provider for the global module
>> cache,
>> because it is shared between multiple debuggers. Although it would be
>> conceptually straightforward to add a provider for the shared module
>> cache,
>> this significantly increases the complexity of the reproducer framework
>> because
>> of its implication on the lifetime and everything related to that.
>>
>> For now we will ignore this problem which means we will not replay the
>> construction of the shared module cache but rather build it up during
>> replaying, as if the current debug session was the first and only one
>> using it.
>> The impact of doing so is significant, as no issue caused by the shared
>> module
>> cache will be reproducible, but it does not limit reproducing any issue
>> unrelated
>> to it.
>>
>> ## Reproducer Framework
>>
>> To coordinate between the data from different components, we'll need to
>> introduce a global reproducer infrastructure. We have a component
>> responsible
>> for reproducer generation (the `Generator`) and for using the reproducer
>> (the
>> `Loader`). They are essentially two ways of looking at the same unit of
>> replayable work.
>>
>> The Generator keeps track of its providers and whether or not we need to
>> generate a reproducer. When a problem occurs, LLDB will request the
>> Generator
>> to generate a reproducer. When LLDB finishes successfully, the Generator
>> cleans
>> up anything it might have created during the session. Additionally, the
>> Generator populates an index, which is part of the reproducer, and used
>> by the
>> Loader to discover what information is available.
>>
>> When a reproducer is passed to LLDB, we want to use its data to replay the
>> debug session. This is coordinated by the Loader. Through the index
>> created by
>> the Generator, different components know what data (Providers) are
>> available,
>> and how to use them.
>>
>> It's important to note that in order to create a complete reproducer, we
>> will
>>

Re: [lldb-dev] [RFC] LLDB Reproducers

2018-09-19 Thread Jonas Devlieghere via lldb-dev


> On Sep 19, 2018, at 6:49 PM, Leonard Mosescu  wrote:
> 
> Sounds like a fantastic idea. 
> 
> How would this work when the behavior of the debuggee process is
> non-deterministic?

All the communication between the debugger and the inferior goes through the
GDB remote protocol. Because we capture and replay this, we can reproduce
without running the executable, which is particularly convenient when you were
originally debugging something on a different device for example. 
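
To make the capture-and-replay idea concrete, here is a minimal Python sketch. It is not LLDB code: `RecordingConnection`, `ReplayConnection`, and `fake_inferior` are hypothetical names, and the packet replies are made up for the example. It only illustrates that once every request/reply pair is logged, the same replies can be served later without the inferior.

```python
# Hypothetical sketch: none of these names are LLDB APIs.
from collections import deque


class RecordingConnection:
    """Wraps a live connection and logs every packet exchanged."""

    def __init__(self, send_and_receive):
        self._send_and_receive = send_and_receive  # callable: request -> reply
        self.log = []  # (request, reply) pairs in the order they occurred

    def exchange(self, request):
        reply = self._send_and_receive(request)
        self.log.append((request, reply))
        return reply


class ReplayConnection:
    """Answers requests from a recorded log; no inferior is needed."""

    def __init__(self, log):
        self._pending = deque(log)

    def exchange(self, request):
        if not self._pending:
            raise RuntimeError("replay log exhausted")
        expected, reply = self._pending.popleft()
        if request != expected:
            raise RuntimeError(
                f"replay diverged: expected {expected!r}, got {request!r}"
            )
        return reply


def fake_inferior(request):
    # Stand-in for a real debug server; the replies are invented.
    return {"qHostInfo": "ostype:linux;", "?": "T05"}.get(request, "")


# Capture a short session against the stand-in server.
recorder = RecordingConnection(fake_inferior)
live = [recorder.exchange(p) for p in ("qHostInfo", "?")]

# Replay it later from the log alone.
replayer = ReplayConnection(recorder.log)
replayed = [replayer.exchange(p) for p in ("qHostInfo", "?")]
```

The strict comparison in `ReplayConnection.exchange` is a design choice of this sketch: it fails loudly as soon as the debugger sends a request that differs from the recorded one, rather than serving a reply that no longer matches.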

> 
> On Wed, Sep 19, 2018 at 6:50 AM, Jonas Devlieghere via lldb-dev
> <lldb-dev@lists.llvm.org> wrote:
> Hi everyone,
> 
> We all know how hard it can be to reproduce an issue or crash in LLDB. There
> are a lot of moving parts and subtle differences can easily add up. We want to
> make this easier by generating reproducers in LLDB, similar to what clang does
> today.
> 
> The core idea is as follows: during normal operation we capture whatever
> information is needed to recreate the current state of the debugger. When
> something goes wrong, this becomes available to the user. Someone else should
> then be able to reproduce the same issue with only this data, for example on a
> different machine.
> 
> It's important to note that we want to replay the debug session from the
> reproducer, rather than just recreating the current state. This ensures that 
> we
> have access to all the events leading up to the problem, which are usually far
> more important than the error state itself.
> 
> # High Level Design
> 
> Concretely we want to extend LLDB in two ways:
> 
> 1.  We need to add infrastructure to _generate_ the data necessary for
> reproducing.
> 2.  We need to add infrastructure to _use_ the data in the reproducer to 
> replay
> the debugging session.
> 
> Different parts of LLDB will have different definitions of what data they need
> to reproduce their path to the issue. For example, capturing the commands
> executed by the user is very different from tracking the dSYM bundles on disk.
> Therefore, we propose to have each component deal with its needs in a 
> localized
> way. This has the advantage that the functionality can be developed and tested
> independently.
> 
> ## Providers
> 
> We'll call a combination of (1) and (2) for a given component a `Provider`. 
> For
> example, we'd have a provider for user commands and a provider for dSYM
> files.
> A provider will know how to keep track of its information, how to serialize it
> as part of the reproducer as well as how to deserialize it again and use it to
> recreate the state of the debugger.
> 
> With one exception, the lifetime of the provider coincides with that of the
> `SBDebugger`, because that is the scope of what we consider here to be a 
> single
> debug session. The exception would be the provider for the global module 
> cache,
> because it is shared between multiple debuggers. Although it would be
> conceptually straightforward to add a provider for the shared module cache,
> this significantly increases the complexity of the reproducer framework 
> because
> of its implication on the lifetime and everything related to that.
> 
> For now we will ignore this problem which means we will not replay the
> construction of the shared module cache but rather build it up during
> replaying, as if the current debug session was the first and only one using 
> it.
> The impact of doing so is significant, as no issue caused by the shared module
> cache will be reproducible, but it does not limit reproducing any issue unrelated
> to it.
> 
> ## Reproducer Framework
> 
> To coordinate between the data from different components, we'll need to
> introduce a global reproducer infrastructure. We have a component responsible
> for reproducer generation (the `Generator`) and for using the reproducer (the
> `Loader`). They are essentially two ways of looking at the same unit of
> replayable work.
> 
> The Generator keeps track of its providers and whether or not we need to
> generate a reproducer. When a problem occurs, LLDB will request the Generator
> to generate a reproducer. When LLDB finishes successfully, the Generator 
> cleans
> up anything it might have created during the session. Additionally, the
> Generator populates an index, which is part of the reproducer, and used by the
> Loader to discover what information is available.
> 
> When a reproducer is passed to LLDB, we want to use its data to replay the
> debug session. This is coordinated by the Loader. Through the index created by
> the Generator, different components know what data (Providers) are available,
> and how to use them.
> 
> It's important to note that in order to create a complete reproducer, we will
> require data from our dependencies (llvm, clang, swift) as well. This means
> that either (a) the infrastructure needs to be accessible from our 
> dependencies
> or (b) that an API is provided that allows us to query this. We plan to 
> address
> this issue when it arises for the respective 

Re: [lldb-dev] [RFC] LLDB Reproducers

2018-09-19 Thread Leonard Mosescu via lldb-dev
Sounds like a fantastic idea.

How would this work when the behavior of the debuggee process is
non-deterministic?

On Wed, Sep 19, 2018 at 6:50 AM, Jonas Devlieghere via lldb-dev <
lldb-dev@lists.llvm.org> wrote:

> Hi everyone,
>
> We all know how hard it can be to reproduce an issue or crash in LLDB.
> There
> are a lot of moving parts and subtle differences can easily add up. We
> want to
> make this easier by generating reproducers in LLDB, similar to what clang
> does
> today.
>
> The core idea is as follows: during normal operation we capture whatever
> information is needed to recreate the current state of the debugger. When
> something goes wrong, this becomes available to the user. Someone else
> should
> then be able to reproduce the same issue with only this data, for example
> on a
> different machine.
>
> It's important to note that we want to replay the debug session from the
> reproducer, rather than just recreating the current state. This ensures
> that we
> have access to all the events leading up to the problem, which are usually
> far
> more important than the error state itself.
>
> # High Level Design
>
> Concretely we want to extend LLDB in two ways:
>
> 1.  We need to add infrastructure to _generate_ the data necessary for
> reproducing.
> 2.  We need to add infrastructure to _use_ the data in the reproducer to
> replay
> the debugging session.
>
> Different parts of LLDB will have different definitions of what data they
> need
> to reproduce their path to the issue. For example, capturing the commands
> executed by the user is very different from tracking the dSYM bundles on
> disk.
> Therefore, we propose to have each component deal with its needs in a
> localized
> way. This has the advantage that the functionality can be developed and
> tested
> independently.
>
> ## Providers
>
> We'll call a combination of (1) and (2) for a given component a
> `Provider`. For
> example, we'd have a provider for user commands and a provider for dSYM
> files.
> A provider will know how to keep track of its information, how to
> serialize it
> as part of the reproducer as well as how to deserialize it again and use
> it to
> recreate the state of the debugger.
>
> With one exception, the lifetime of the provider coincides with that of the
> `SBDebugger`, because that is the scope of what we consider here to be a
> single
> debug session. The exception would be the provider for the global module
> cache,
> because it is shared between multiple debuggers. Although it would be
> conceptually straightforward to add a provider for the shared module cache,
> this significantly increases the complexity of the reproducer framework
> because
> of its implication on the lifetime and everything related to that.
>
> For now we will ignore this problem which means we will not replay the
> construction of the shared module cache but rather build it up during
> replaying, as if the current debug session was the first and only one
> using it.
> The impact of doing so is significant, as no issue caused by the shared
> module
> cache will be reproducible, but it does not limit reproducing any issue
> unrelated
> to it.
>
> ## Reproducer Framework
>
> To coordinate between the data from different components, we'll need to
> introduce a global reproducer infrastructure. We have a component
> responsible
> for reproducer generation (the `Generator`) and for using the reproducer
> (the
> `Loader`). They are essentially two ways of looking at the same unit of
> replayable work.
>
> The Generator keeps track of its providers and whether or not we need to
> generate a reproducer. When a problem occurs, LLDB will request the
> Generator
> to generate a reproducer. When LLDB finishes successfully, the Generator
> cleans
> up anything it might have created during the session. Additionally, the
> Generator populates an index, which is part of the reproducer, and used by
> the
> Loader to discover what information is available.
>
> When a reproducer is passed to LLDB, we want to use its data to replay the
> debug session. This is coordinated by the Loader. Through the index
> created by
> the Generator, different components know what data (Providers) are
> available,
> and how to use them.
>
> It's important to note that in order to create a complete reproducer, we
> will
> require data from our dependencies (llvm, clang, swift) as well. This means
> that either (a) the infrastructure needs to be accessible from our
> dependencies
> or (b) that an API is provided that allows us to query this. We plan to
> address
> this issue when it arises for the respective Generator.
>
> # Components
>
> We have identified a list of minimal components needed to make reproducing
> possible. We've divided those into two groups: explicit and implicit
> inputs.
>
> Explicit inputs are inputs from the user to the debugger.
>
> -   Command line arguments
> -   Settings
> -   User commands
> -   Scripting Bridge API
>
> In addition to the com

[lldb-dev] [RFC] LLDB Reproducers

2018-09-19 Thread Jonas Devlieghere via lldb-dev
Hi everyone,

We all know how hard it can be to reproduce an issue or crash in LLDB. There
are a lot of moving parts and subtle differences can easily add up. We want to
make this easier by generating reproducers in LLDB, similar to what clang does
today.

The core idea is as follows: during normal operation we capture whatever
information is needed to recreate the current state of the debugger. When
something goes wrong, this becomes available to the user. Someone else should
then be able to reproduce the same issue with only this data, for example on a
different machine.

It's important to note that we want to replay the debug session from the
reproducer, rather than just recreating the current state. This ensures that we
have access to all the events leading up to the problem, which are usually far
more important than the error state itself.

# High Level Design

Concretely we want to extend LLDB in two ways:

1.  We need to add infrastructure to _generate_ the data necessary for
reproducing.
2.  We need to add infrastructure to _use_ the data in the reproducer to replay
the debugging session.

Different parts of LLDB will have different definitions of what data they need
to reproduce their path to the issue. For example, capturing the commands
executed by the user is very different from tracking the dSYM bundles on disk.
Therefore, we propose to have each component deal with its needs in a localized
way. This has the advantage that the functionality can be developed and tested
independently.

## Providers

We'll call a combination of (1) and (2) for a given component a `Provider`. For
example, we'd have a provider for user commands and a provider for dSYM files.
A provider will know how to keep track of its information, how to serialize it
as part of the reproducer as well as how to deserialize it again and use it to
recreate the state of the debugger.
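
As a rough illustration of what such a provider could look like, here is a minimal Python sketch. The names (`Provider`, `CommandProvider`, `record`, `serialize`, `deserialize`) and the JSON format are assumptions made for the example, not a proposed API; the actual implementation would live in LLDB's C++ code.

```python
# Hypothetical sketch of the provider idea; not a proposed LLDB interface.
import json
from abc import ABC, abstractmethod


class Provider(ABC):
    """One component's capture (1) and replay (2) logic."""

    name = "provider"

    @abstractmethod
    def serialize(self):
        """Return this provider's captured data as a string."""

    @abstractmethod
    def deserialize(self, data):
        """Restore this provider's state from serialized data."""


class CommandProvider(Provider):
    """Tracks the commands the user typed, in order."""

    name = "commands"

    def __init__(self):
        self.commands = []

    def record(self, command):
        self.commands.append(command)

    def serialize(self):
        return json.dumps(self.commands)

    def deserialize(self, data):
        self.commands = json.loads(data)


# Capture side: record commands during the original session.
captured = CommandProvider()
captured.record("breakpoint set --name main")
captured.record("run")
payload = captured.serialize()

# Replay side: a fresh provider restores the same commands.
restored = CommandProvider()
restored.deserialize(payload)
```

A provider for dSYM files would implement the same two methods but would copy or reference files instead of storing strings, which is why each component keeps its own logic.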

With one exception, the lifetime of the provider coincides with that of the
`SBDebugger`, because that is the scope of what we consider here to be a single
debug session. The exception would be the provider for the global module cache,
because it is shared between multiple debuggers. Although it would be
conceptually straightforward to add a provider for the shared module cache,
this significantly increases the complexity of the reproducer framework because
of its implication on the lifetime and everything related to that.

For now we will ignore this problem which means we will not replay the
construction of the shared module cache but rather build it up during
replaying, as if the current debug session was the first and only one using it.
The impact of doing so is significant, as no issue caused by the shared module
cache will be reproducible, but it does not limit reproducing any issue unrelated
to it.

## Reproducer Framework

To coordinate between the data from different components, we'll need to
introduce a global reproducer infrastructure. We have a component responsible
for reproducer generation (the `Generator`) and for using the reproducer (the
`Loader`). They are essentially two ways of looking at the same unit of
replayable work.

The Generator keeps track of its providers and whether or not we need to
generate a reproducer. When a problem occurs, LLDB will request the Generator
to generate a reproducer. When LLDB finishes successfully, the Generator cleans
up anything it might have created during the session. Additionally, the
Generator populates an index, which is part of the reproducer, and used by the
Loader to discover what information is available.

When a reproducer is passed to LLDB, we want to use its data to replay the
debug session. This is coordinated by the Loader. Through the index created by
the Generator, different components know what data (Providers) are available,
and how to use them.
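
The interplay between the Generator, the index, and the Loader can be sketched as follows. This is a minimal Python illustration under stated assumptions: the method names (`register`, `keep`, `discard`, `available`, `load`), the `index.json` file name, and the one-file-per-provider layout are all hypothetical and only meant to show the flow described above.

```python
# Hypothetical sketch: file layout and method names are assumptions.
import json
import shutil
import tempfile
from pathlib import Path


class Generator:
    """Collects provider data and decides whether it is kept."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self._providers = {}  # provider name -> captured data

    def register(self, name, data):
        self._providers[name] = data

    def keep(self):
        """A problem occurred: write each provider's data and the index."""
        index = {}
        for name, data in self._providers.items():
            filename = f"{name}.json"
            (self.root / filename).write_text(json.dumps(data))
            index[name] = filename
        (self.root / "index.json").write_text(json.dumps(index))

    def discard(self):
        """LLDB finished successfully: remove anything that was created."""
        shutil.rmtree(self.root, ignore_errors=True)


class Loader:
    """Reads the index to discover which provider data is available."""

    def __init__(self, root):
        self.root = Path(root)
        self.index = json.loads((self.root / "index.json").read_text())

    def available(self):
        return sorted(self.index)

    def load(self, name):
        return json.loads((self.root / self.index[name]).read_text())


workdir = Path(tempfile.mkdtemp())

# A session that hit a problem: the reproducer is kept.
failed = Generator(workdir / "failed-session")
failed.register("commands", ["breakpoint set --name main", "run"])
failed.register("settings", {"target.run-args": ["--verbose"]})
failed.keep()
loader = Loader(workdir / "failed-session")

# A session that finished cleanly: the Generator cleans up after itself.
clean = Generator(workdir / "clean-session")
clean.register("commands", ["version"])
clean.discard()
```

Because the Loader only consults the index, a component whose provider is missing from a given reproducer can detect that and fall back to its normal behavior.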

It's important to note that in order to create a complete reproducer, we will
require data from our dependencies (llvm, clang, swift) as well. This means
that either (a) the infrastructure needs to be accessible from our dependencies
or (b) that an API is provided that allows us to query this. We plan to address
this issue when it arises for the respective Generator.

# Components

We have identified a list of minimal components needed to make reproducing
possible. We've divided those into two groups: explicit and implicit inputs.

Explicit inputs are inputs from the user to the debugger.

-   Command line arguments
-   Settings
-   User commands
-   Scripting Bridge API

In addition to the components listed above, LLDB has a bunch of inputs that are
not passed explicitly. It's often these that make reproducing an issue complex.

-   GDB Remote Packets
-   Files containing debug information (object files, dSYM bundles)
-   Clang headers
-   Swift modules

Every component would have its own provider and is free to implement it as it
sees fit. For example, as we expect to have a large number of GDB remote
packets, the provider might choos