Re: [lldb-dev] Multiple platforms with the same name

2022-01-28 Thread Pavel Labath via lldb-dev
I'm sorry for the slow response. I had to attend to some other things 
first. It sounds like there's agreement to support multiple platform 
instances, so I'll try to move things in that direction.


Further responses inline

On 20/01/2022 01:19, Greg Clayton wrote:




On Jan 19, 2022, at 4:28 AM, Pavel Labath  wrote:

On 19/01/2022 00:38, Greg Clayton wrote:

Platforms can contain connection-specific settings and data. You might want to create two 
different "remote-linux" platforms and connect each one to a different remote 
linux machine. Each target which uses this platform would be able to fetch files, 
resolve symbol files, get the OS version/build string/kernel info, and get/set the working 
directory from the remote server it is attached to. Since each platform tends to belong to a target, 
and since you might want to create two different targets and have each one connected to a 
different remote machine, I believe it is fine to have multiple instances.
I would vote to almost always create a new instance unless it is the host 
platform. Though it should be possible to create two targets and possibly set 
the platform on one target using the platform from another that might already 
be connected.
I am open to suggestions if anyone has any objections.
Greg


I agree that permitting multiple platforms would be a more principled position, 
but it was not clear to me if that was ever planned to be the case.


This code definitely evolved as time went on. Then we added the remote 
capabilities. As Jim said, there are two parts for the platform that _could_ be 
separated: PlatformLocal and PlatformRemote. Horrible names that can be 
improved upon, I am sure, but just names I quickly came up with.

PlatformLocal would be "what can I do for a platform that only involves finding 
things on this machine for supporting debugging on a remote platform". This would 
involve things like:
- where are remote files cached on the local machine for easy access
- where can I locate SDK/NDK stuff that might help me for this platform
- what architectures/triples are supported by this platform so it can be 
selected
- how to start a debug session for a given binary (which might use parts of 
PlatformRemote) as platforms like "iOS-simulator" do not require any remote 
connections to be able to start a process. Same could happen for VM based debugging on a 
local machine.

PlatformRemote
- get/put files
- get/set working directory
- install executable so OS can see/launch it
- create/delete directories

So as things evolved, everything got thrown into the Platform case and we just 
made things work as we went. I am sure this can be improved.
I actually have a branch where I've tried to separate the local and 
remote cases, and remove the if(IsHost()) checks everywhere, but I 
haven't yet found the time to clean it up and send an RFC.
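
For illustration, here is a rough sketch of how such a split could look. The 
names are Greg's placeholders, and the types are simplified stand-ins (plain 
strings rather than the actual lldb_private classes):

#include <string>
#include <vector>

// Things lldb can do locally on behalf of a (possibly remote) platform.
struct PlatformLocal {
  // Where remote files are cached on the local machine for easy access.
  virtual std::string GetLocalCacheDirectory() = 0;
  // Which triples this platform supports, so it can be auto-selected.
  virtual std::vector<std::string> GetSupportedTriples() = 0;
  virtual ~PlatformLocal() = default;
};

// Things that require talking to the remote side.
struct PlatformRemote {
  virtual bool GetFile(const std::string &remote, const std::string &local) = 0;
  virtual bool PutFile(const std::string &local, const std::string &remote) = 0;
  virtual bool SetWorkingDirectory(const std::string &path) = 0;
  virtual bool MakeDirectory(const std::string &path) = 0;
  virtual ~PlatformRemote() = default;
};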






If it was (or if we want it to be), then I think we need to start making bigger distinctions 
between the platform plugins (classes) and the actual instantiations of those classes. Currently 
there is no way to refer to "older" instances of the platforms, as they all share the same 
name (the name of the plugin). You can enumerate them through 
SBDebugger.GetPlatformAtIndex(), but that's about the only thing you can do, as all the interfaces 
(including the SB ones) take a platform _name_ as an argument. This gets particularly confusing as 
in some circumstances we end up choosing the newer one (e.g. if it's the "current" 
platform) and sometimes the older.

If we want to do that, then this is what I'd propose:
a) Each platform plugin and each platform instance gets a name. We enforce the 
uniqueness of these names (within their category).


Maybe it would be better to maintain the name, but implement an instance 
identifier for each platform instance?
I'm not sure what you mean by that. Or, if you mean what I think you 
mean, then we're actually in agreement. Each platform plugin (class) 
gets a name (or identifier, or whatever we want to call it), and each 
instance of that class gets a name as well.


Practically speaking, I think we could reuse the existing GetPluginName 
and GetName (currently hardwired to return GetPluginName()). The former 
would return the plugin name, and the latter would give the "instance 
identifier".





b) "platform list" outputs two block -- the list of available plugins and the 
list of plugin instances


If we added an instance identifier, then we could just show the available 
plug-in names followed by their instances?
Yes, that would just be a different (and probably better) way of 
displaying the same information. We can definitely do that.





c) a new "platform create" command to create a platform
  - e.g. "platform create my-arm-test-machine --plugin remote-linux"


Now we are assuming you want to connect to a remote machine when we create the platform? "platform 
connect" can be used currently if we want to actually connect to a remote platform, but 

Re: [lldb-dev] Is GetLogIf**All**CategoriesSet useful?

2022-01-20 Thread Pavel Labath via lldb-dev

On 20/01/2022 00:30, Greg Clayton wrote:

I also vote to remove and simplify.


Sounds like it's settled then. I'll fire up my sed scripts.

On 20/01/2022 01:38, Greg Clayton wrote:



On Jan 19, 2022, at 6:40 AM, Pavel Labath  wrote: 
If we got rid of this, we could simplify the logging calls even 
further and have something like:

Log *log = GetLog(LLDBLog::Process);


Can a template function deduce the log type from an argument? 
Wouldn't this have to be:


Log *log = GetLog<LLDBLog>(LLDBLog::Process);

That is why I was hinting at whether we want to just use the enum class 
itself:


Log *log = LLDBLog::GetLog(LLDBLog::Process);

The template class in your second patch seems cool, but I don't 
understand how it worked without going and reading up on templates

in C++ and spending 20 minutes trying to wrap my brain around it.

Template functions have always been able to deduce template arguments.
Pretty much the entire C++ standard library is made of template
functions, but you don't see <> spelled out everywhere. Class templates
were not able to deduce template arguments until C++17, and I
am still not really clear on how that works.

The way that patch works is that you have one template function
`LogChannelFor`, which ties the enum to a specific channel class, and
then another one (GetLogIfAny), which returns the actual log object (and
uses the first one to obtain the channel).

But none of this is fundamentally tied to templates. One could achieve
the same thing by overloading the GetLogIfAny function (one overload for
each type). The template just saves a bit of repetition. This way, the
only thing you need to do when defining a new log channel, is to provide
the LogChannelFor function.
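
Here is a condensed, self-contained sketch of that mechanism (heavily
simplified relative to the actual patch -- the channel is reduced to a
bare bitmask):

#include <cstdint>
#include <cstdio>

struct Log {};

// One channel object per category enum; holds the enabled-category mask.
struct LogChannel {
  uint64_t enabled_mask = 0;
  Log log;
};

enum class LLDBLog : uint64_t { Process = 1u << 0, Thread = 1u << 1 };

LogChannel g_lldb_log_channel;

// The one thing a new log channel must provide: the mapping from its
// category enum to its channel object.
LogChannel &LogChannelFor(LLDBLog) { return g_lldb_log_channel; }

// Deduces the enum type (and through it, the channel) from the argument.
template <typename Cat> Log *GetLogIfAny(Cat mask) {
  LogChannel &channel = LogChannelFor(mask);
  return (channel.enabled_mask & static_cast<uint64_t>(mask)) ? &channel.log
                                                              : nullptr;
}

int main() {
  g_lldb_log_channel.enabled_mask = static_cast<uint64_t>(LLDBLog::Process);
  if (GetLogIfAny(LLDBLog::Process)) // no <> needed at the call site
    std::puts("process logging enabled");
}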



Or do we just switch to a dedicated log class with unique methods:

class LLDBLog : public Log {
  Log *Process() { return GetLog(1u << 0); }
  Log *Thread() { return GetLog(1u << 1); }
};

and avoid all the enums? Then we can't ever feed a bad enum or
#define into the wrong log class.


That could work too, and would definitely have some advantages -- for
instance we could prefix each message with the log channel it was going
to. The downside is that we would lose the ability to send one message 
to multiple log channels at once, and I believe that some people (Jim?) 
value that functionality.


pl


[lldb-dev] Is GetLogIf**All**CategoriesSet useful?

2022-01-19 Thread Pavel Labath via lldb-dev

Hi all,

In case you haven't noticed, I'd like to draw your attention to the 
in-flight patches (https://reviews.llvm.org/D117382, 
https://reviews.llvm.org/D117490), whose goal is to clean up/improve/streamline 
the logging infrastructure.


I don't want to go into technical details here (they're on the patches), 
but the general idea is to replace statements like 
GetLogIf(Any/All)CategoriesSet(LIBLLDB_LOG_CAT1 | LIBLLDB_LOG_CAT2)

with
GetLogIf(Any/All)(LLDBLog::Cat1 | LLDBLog::Cat2)
i.e., drop macros and make use of templates to make the function calls 
shorter and safer.


The reason I'm writing this email is to ask about the "All" versions of 
these logging functions. Do you find them useful in practice?


I'm asking that because I've never used this functionality. While I 
can't find anything wrong with the concept in theory, practically I 
think it's just confusing to have some log message appear only for some 
combination of enabled channels. It might have made some sense when we 
had a "verbose" logging channel, but that one is long gone (we still 
have a verbose logging *flag*).


In fact, out of all our GetLogIf calls (1203), less than 1% (11*) uses 
the GetLogIfAll form with more than one category. Of those, three are in 
tests, one is definitely a bug (it combines the category with 
LLDB_LOG_OPTION_VERBOSE), and the others (7) are of questionable 
usefulness (to me anyway).


If we got rid of this, we could simplify the logging calls even further 
and have something like:

Log *log = GetLog(LLDBLog::Process);
everywhere.

cheers,
pl

(*) I used this command to count:
$ git grep -e LogIfAll -A 1 | fgrep -e '|' | wc -l


Re: [lldb-dev] Multiple platforms with the same name

2022-01-19 Thread Pavel Labath via lldb-dev

On 19/01/2022 00:38, Greg Clayton wrote:

Platforms can contain connection-specific settings and data. You might want to create two 
different "remote-linux" platforms and connect each one to a different remote 
linux machine. Each target which uses this platform would be able to fetch files, 
resolve symbol files, get the OS version/build string/kernel info, and get/set the working 
directory from the remote server it is attached to. Since each platform tends to belong to a target, 
and since you might want to create two different targets and have each one connected to a 
different remote machine, I believe it is fine to have multiple instances.

I would vote to almost always create a new instance unless it is the host 
platform. Though it should be possible to create two targets and possibly set 
the platform on one target using the platform from another that might already 
be connected.

I am open to suggestions if anyone has any objections.

Greg


I agree that permitting multiple platforms would be a more principled 
position, but it was not clear to me if that was ever planned to be the 
case.


If it was (or if we want it to be), then I think we need to start making 
bigger distinctions between the platform plugins (classes) and the 
actual instantiations of those classes. Currently there is no way to 
refer to "older" instances of the platforms, as they all share the same 
name (the name of the plugin). You can enumerate them through 
SBDebugger.GetPlatformAtIndex(), but that's about the only thing you can 
do, as all the interfaces (including the SB ones) take a platform _name_ 
as an argument. This gets particularly confusing as in some 
circumstances we end up choosing the newer one (e.g. if it's the 
"current" platform) and sometimes the older.


If we want to do that, then this is what I'd propose:
a) Each platform plugin and each platform instance gets a name. We 
enforce the uniqueness of these names (within their category).
b) "platform list" outputs two block -- the list of available plugins 
and the list of plugin instances

c) a new "platform create" command to create a platform
  - e.g. "platform create my-arm-test-machine --plugin remote-linux"
d) "platform select" selects the platform with the given /instance/ name
  - for convenience and compatibility if the name does not refer to any 
existing platform instance, but it *does* refer to a platform plugin, it 
would create a platform instance with the same name as the class. (So 
the first "platform select remote-linux" would create a new instance 
(also called remote-linux) and all subsequent selects would switch to 
that one -- a change to existing behavior)

e) SBPlatform gets a static factory function taking two string arguments
f) existing SBPlatform constructor (taking one string) creates a new 
platform instance with a name selected by us (remote-linux, 
remote-linux-2, etc.), but its use is discouraged/deprecated.
g) all other existing APIs (command line and SB) remain unchanged but 
any "platform name" argument is taken to mean the platform instance 
name, and it has the "platform select" semantics (select if it exists, 
create if it doesn't)
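
Put together, a session under this proposal might look something like this 
(hypothetical commands and output -- c) through g) above do not exist yet):

(lldb) platform create my-arm-test-machine --plugin remote-linux
(lldb) platform connect connect://my-arm-test-machine:1234
(lldb) platform select remote-linux
(lldb) platform list
Available platform plugins: host, remote-linux, remote-android, ...
Platform instances: my-arm-test-machine (remote-linux), remote-linux (remote-linux)

(Per d), the "platform select remote-linux" above creates a second instance, 
itself named remote-linux, since no instance of that name existed yet.)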


I think this would strike a good balance between a consistent interface 
and preserving existing semantics. The open questions are:
- is it worth it? While nice in theory, personally I have never actually 
needed to connect to more than one machine at the same time.
- what to do about platform-specific settings. The functionality has 
existed for a long time, but there was only one plugin 
(PlatformDarwinKernel) using it. I've now added a bunch of settings to 
the qemu-user platform on the assumption that there will only be one 
instance of the class. These are global, but they would really make more 
sense on a per-instance basis. We could either leave it be (I don't need 
multiple instances now), or come up with a way to have per-platform 
settings, similar to what we do for targets. We could also do something 
with the "platform settings" command, which currently only sets the 
working directory.


Let me know what you think,
Pavel


[lldb-dev] Multiple platforms with the same name

2022-01-17 Thread Pavel Labath via lldb-dev

Hello all,

currently our code treats the platform name more-or-less as a unique 
identifier (e.g. Platform::Find returns at most one platform instance -- 
the first one it finds).


This is why I was surprised that the "platform select" CLI command 
always creates a new instance of the given platform, even if the 
platform of a given name already exists. This is because 
Platform::Create does not search the existing platform list before 
creating a new one. This might sound reasonable at first, but for 
example the Platform::Create overload which takes an ArchSpec first 
tries to look for a compatible platform among the existing ones before 
creating a new one.


For this reason, I am tempted to call this a bug and fix the name-taking 
Create overload. This change passes the test suite, except for a single 
test, which now gets confused because some information gets leaked from 
one test to another. (although our coverage of the Platform class in the 
tests is fairly weak)


However, this test got me thinking. It happens to use the SB way of 
manipulating platforms, and "creates" a new instance as 
lldb.SBPlatform("remote-linux"). For this kind of a command, it would be 
reasonable/expected to create a new instance, were it not for the fact 
that this platform would be very tricky to access from the command line, 
and even through some APIs -- SBDebugger::CreateTarget takes a platform 
_name_.


So, which one is it? Should we always have at most one instance of each 
platform, or are multiple instances ok?


cheers,
pl

PS: In case you're wondering how I ran into this, I was trying to 
create a pre-configured platform instance in (e.g.) an lldbinit file, 
without making it the default. That way it would get automatically 
selected when the user opens an executable of the appropriate type. This 
actually works, *except* for the case when the user selects the platform 
manually. That's because in that case, we would create an 
empty/unpopulated platform, and it would be the one being selected 
because it was the /current/ platform.



Re: [lldb-dev] Source-level stepping with emulated instructions

2022-01-16 Thread Pavel Labath via lldb-dev

Hi Kjell,

if you say these instructions are similar to function calls, then it 
sounds to me like the best option would be to get lldb to treat them 
like function calls. I think (Jim can correct me if I'm wrong) this 
consists of two things:
- make sure lldb recognizes that these instructions can alter control 
flow (Disassembler::GetIndexOfNextBranchInstruction). You may have done 
this already.
- make sure lldb can unwind out of these "functions" when it winds up 
inside them. This will ensure the user does not stop in these functions 
when he does a "step over". This means providing it the correct unwind 
info so it knows where the functions will return. (As the functions know 
how to return to the original instructions, this information has to be 
somewhere, and is hopefully accessible to the debugger.) Probably the 
cleanest way to do that would be to create a new Module object, which 
would contain the implementations of all these functions, and all of 
their debug info. Then you could provide the unwind info through the 
usual channels (e.g. .debug_frame), and it has the advantage that you 
can also include any other information about these functions (names, 
line numbers, whatever...)


pl

On 15/01/2022 07:49, Kjell Winblad via lldb-dev wrote:

Hi!

I'm implementing LLDB support for a new processor architecture that
the company I'm working for has created. The processor architecture
has a few emulated instructions. An emulated instruction works by
jumping to a specific address that contains the start of a block of
instructions that emulates the instruction. The emulated
instructions execute with interrupts turned off so that the programmer
can treat them as atomic. So an emulated instruction is similar to a
function call. However, the address that the instruction jumps to is
implicit and not specified by the programmer.

I'm facing a problem with the emulated instructions when implementing
source-level stepping (the LLDB next and step commands) for C code in
LLDB. LLDB uses hardware stepping to step through the address range
that makes up a source-level statement. This algorithm works fine
until the PC jumps to the start of the block that implements an
emulated instruction. Then LLDB stops because the PC exited the
address range for the source-level statement. This behavior is not
what we want. Instead, LLDB should ideally step through the emulation
instructions and continue until the current source-level statement has
been completed.

My questions are:

1. Is there currently any LLDB plugin functionality or special DWARF
debug information to handle the kind of emulated instructions that I
have described? All the code for the emulated instructions is within
the same address range that does not contain any other code.
2. If the answer to question 1 is no, do you have suggestions for
extending LLVM to support this kind of emulated instructions?

Best regards,
Kjell Winblad


Re: [lldb-dev] RFC: siginfo reading/writing support

2022-01-12 Thread Pavel Labath via lldb-dev
I kinda like the cleanliness (of the design, not the implementation) of 
a $siginfo variable, but you're right that implementing it would be 
tricky (I guess we'd have to write the struct into the process memory 
somewhere and then read it back when the expression completes).


I don't expect that users will frequently want to modify the siginfo 
structure. I think the typical use case would be to inspect the struct 
fields (maybe in a script -- we have one user wanting to do that) to 
understand more about the nature of the stop/crash.


With that in mind, I don't have a problem with a separate command, but I 
don't think that the "platform" subtree is a good fit for this. I mean, 
I am sure the (internal) Platform class will be involved in interpreting 
the data, but all of the platform _commands_ have something to do with 
the system as a whole (moving files around, listing processes, etc.) and 
not a specific process. I think this would belong under the "thread" 
subtree, since the signal is tied to a specific thread.


Due to the scripting use case, I am also interested in being able to 
inspect the siginfo struct through the SB API -- the expression approach 
would (kinda) make that possible, while a brand new command doesn't 
(without extra work). So, I started thinking whether this could be exposed 
there. We already kinda expose the si_signo field via 
GetStopReasonDataAtIndex(0) (and it even happens to be the first siginfo 
field), but I don't think we would want to expose all fields in that manner.


This then led me to SBThread::GetStopReasonExtendedInfoAsJSON. What is 
this meant to contain? Could we put the signal info there? If yes, then 
the natural command-line way of retrieving this would be the "thread 
info" command, and we would not need to add any new commands.


This wouldn't solve the problem of writing to the siginfo struct, but I 
am not sure if this is a use case Michał is actually trying to solve 
right now (?) If it is then, maybe this could be done through a separate 
command, as we currently lack the ability to resume a process/thread 
with a specific signal ("process signal" does something slightly 
different). It could either be brand new command, or integrated into the 
existing process/thread continue commands. (thread continue --signal 
SIGFOO => "continue with SIGFOO"; thread continue --siginfo $47 => 
continue with siginfo in $47 ???)


pl

On 12/01/2022 01:07, Jim Ingham via lldb-dev wrote:

I would not do this with the expression parser.

First off, the expression parser doesn’t know how to do anything except JIT code that 
will run directly in the target.  So if:

(lldb) expr $siginfo.some_field = 10

doesn’t resolve to some $siginfo structure in real memory with a real type such 
that clang can calculate the offset of the field “some_field” and write to it 
to make the change, then this wouldn’t be a natural fit in the current 
expression parser.  I’m guessing this is not the case, since you fetch this 
field through ptrace calls in the stub.

And the expression parser is enough of a beast already that we don’t want to 
add complexity to it without good reason.

We also don’t have any other instances of lldb-injected $variables that we use 
for various purposes.  I’m not in favor of introducing them as they end up 
being pretty undiscoverable….

Why not something like:

(lldb) platform siginfo read [--field field_name]

Without the field name it would print the full siginfo, or you can list fields 
one by one with the --field argument.

And then make the write a raw command like:

(lldb) platform siginfo write --field name expression

The platform is a natural place for this: it is the agent that knows about all 
the details of the system your target is running on, so it would know what 
access you have to siginfo for the target system.

Having the argument write use expressions to produce the new value for the 
field would get you most of the value of introducing a virtual variable into 
the expression parser, since:

(lldb) pl si w -f some_field 

Is the same as you would get with the proposed $siginfo:

(lldb) expr $siginfo.some_field = 
  
You could also implement the write command as a raw command like:


(lldb) platform siginfo write --field some_field 

Which has the up side that people wouldn’t need to quote their expressions, but 
the down side that you could only change one field at a time.

This would also mean “apropos siginfo” would turn up the commands, as would a 
casual scan through the command tree.  So the feature would be pretty 
discoverable.

The only things this would make inconvenient are if you wanted to pass the 
value of a signinfo field to some function call or do something like:

$siginfo.some_field += 5

These don’t seem very common operations, and if you needed to you could always do 
this with scripting, since the result from “platform siginfo read --field name” 
would be the value, so you could write a little script to grab the value and 
insert it into 

Re: [lldb-dev] Adding support for FreeBSD kernel coredumps (and live memory lookup)

2021-12-14 Thread Pavel Labath via lldb-dev

On 10/12/2021 11:12, Michał Górny wrote:

On Mon, 2021-12-06 at 14:28 +0100, Pavel Labath wrote:

The live kernel debugging sounds... scary. Can you explain how would
this actually work? Like, what would be the supported operations? I
presume you won't be able to actually "stop" the kernel, but what will
you actually be able to do?



Yes, it is scary.  No, the system doesn't stop -- it's just a racy way
to read and write kernel memory.  I don't think it's used often but I've
been told that sometimes it can be very helpful in debugging annoying
non-crash bugs, especially if they're hard to reproduce.



Interesting.

So how would this be represented in lldb? Would there be any threads or 
registers? Just a process with a bunch of modules?


pl


Re: [lldb-dev] Adding support for FreeBSD kernel coredumps (and live memory lookup)

2021-12-06 Thread Pavel Labath via lldb-dev

On 30/11/2021 14:49, Michał Górny via lldb-dev wrote:

Hi,

I'm working on a FreeBSD-sponsored project aiming at improving LLDB's
support for debugging the FreeBSD kernel to achieve feature parity with
KGDB.  As a part of that, I'd like to improve LLDB's ability to work
with kernel coredumps ("vmcores"), plus add the ability to read kernel
memory via the special character device /dev/mem.


The FreeBSD kernel supports two coredump formats that are of interest to
us:

1. The (older) "full memory" coredumps that use an ELF container.

2. The (newer) minidumps that dump only the active memory and use
a custom format.

At this point, LLDB recognizes the ELF files but doesn't handle them
correctly, and outright rejects the FreeBSD minidump format.  In both
cases some additional logic is required.  This is because kernel
coredumps contain physical contents of memory, and for user convenience
the debugger needs to be able to read memory maps from the physical
memory and use them to translate virtual addresses to physical
addresses.

Unless I'm mistaken, the rationale for using this format is that
coredumps are -- after all -- usually created when something goes wrong
with the kernel.  In that case, we want the process for dumping core to
be as simple as possible, and coredumps need to be small enough to fit
in swap space (that's where they're being usually written).
The complexity of memory translation should then naturally fall into
userspace processes used to debug them.

FreeBSD (following Solaris and other BSDs) provides a helper libkvm
library that can be used by userspace programs to access both coredumps
and running kernel memory.  Additionally, we have split the routines
related to coredumps and made them portable to other operating systems
via libfbsdvmcore [1].  We have also included a program that can convert
minidump into a debugger-compatible ELF core file.


We'd like to discuss the possible approaches to integrating this
additional functionality to LLDB.  At this point, our goal is to make it
possible for LLDB to correctly read memory from coredumps and live
system.


Plan A: new FreeBSDKernel plugin
================================

I think the preferable approach is to write a new plugin that would
enable out-of-the-box support for the new functions in LLDB.  The plugin
would be based on using both libraries.  When available, libfbsdvmcore
will be used as the primary provider for vmcore support on all operating
systems.  Additionally, libkvm will be usable on FreeBSD as a fallback
provider for coredump support, and as the provider of live memory
support.


The two main challenges with this approach are:

1) "Full memory" vmcores are currently recognized by LLDB's elf-core
plugin.  I haven't investigated LLDB's plugin architecture in detail yet
but I think the cleanest solution here would be to teach elf-core to
distinguish and reject FreeBSD vmcores, in order to have the new plugin
handle them.

2) How to integrate "live kernel" support into the current user
interface?  I don't think we should make major UI modifications to
support this specific case but I'd also like to avoid gross hacks.
My initial thought is to allow specifying "/dev/mem" as the core path, which
would match how libkvm handles it.

Nevertheless, I think this is the cleanest approach and I think we
should go with it if possible.


Plan B: GDB Remote Protocol-based wrapper
=========================================
If we cannot integrate FreeBSD vmcore support into LLDB directly,
I think the next best approach is to create a minimal GDB Remote
Protocol server for it.  The rough idea is that the server implements
the minimal subset of the protocol necessary for LLDB to connect,
and implements memory read operations via the aforementioned libraries.

The advantage of this solution is that it is still relatively clean
and can be implemented outside LLDB.  It still provides quite good
performance but probably requires more work than the alternatives
and does not provide out-of-the-box support in LLDB.


Plan C: converting vmcores
==========================
Our final option, one that's practically implemented already is to
require the user to explicitly convert vmcore into an ELF core
understood by LLDB.  This is the simplest solution but it has a few
drawbacks:

1. it is limited to minidumps right now

2. it requires storing a converted coredump which means that at least
temporarily it doubles the disk space use

3. there is no possibility of cleanly supporting live kernel memory
operations, and therefore no way of reaching KGDB feature parity

We could create a wrapper to avoid having users convert coredumps
explicitly but well, we think other options are better.


WDYT?


[1] https://github.com/Moritz-Systems/libfbsdvmcore



Having a new plugin for opening these kinds of core files seems 
reasonable to me. The extra dependency 

Re: [lldb-dev] [RFC] lldb integration with (user mode) qemu

2021-11-24 Thread Pavel Labath via lldb-dev
For anyone following along, I have now posted the first patch for this 
feature here: <https://reviews.llvm.org/D114509>.


pl

On 08/11/2021 11:03, David Spickett wrote:

I actually did consider this, but it was not clear to me how this would tie in 
to the rest of lldb.
The "run qemu and connect to it" part could be reused, of course, but what else?


That part seems like a good start. I'm sure a lot of other things
would break/not work like you said but if I was shipping a modified
lldb anyway maybe I'd put the effort in to make it work nicely.

Again not something this work needs to consider. Just me relating the
idea to something I have more experience with and has some parallels
with the qemu-user idea.

On Fri, 5 Nov 2021 at 14:08, Pavel Labath via lldb-dev
 wrote:


On 04/11/2021 22:46, Jessica Clarke via lldb-dev wrote:

On Fri, Oct 29, 2021 at 05:55:02AM +, David Spickett via lldb-dev wrote:

I don't think it does. Or at least I'm not sure how you propose to solve them (who is 
"you" in the paragraph above?).


I tend to use "you" meaning "you or I" in hypotheticals. Same thing as
"if I had" but for whatever reason I phrase it like that to include
the other person, and it does have its ambiguities.

What I was proposing is, if I was correct (which I wasn't) then having
the user "platform select qemu-user" would solve things. (which it
doesn't)


What currently happens is that when you open a non-native (say, linux) 
executable, the appropriate remote platform gets selected automatically.


...because of this. I see where the blocker is now. I thought remote
platforms had to be selected before they could claim.


If we do have a prompt, then this may not be so critical, though I expect that 
most users would still prefer it if we automatically selected qemu.


Seems reasonable to put qemu-user above remote-linux. Only claiming if
qemu-user has been configured sufficiently. I guess architecture would
be the minimum setting, given we can't find the qemu binary without
it.

Is this similar in any way to how the different OS remote platforms
work? For example there is a remote-linux and a remote-netbsd, is
there enough information in the program file itself to pick just one
or is there an implicit default there too?
(I see that platform CreateInstance gets an ArchSpec but having
trouble finding where that comes from)


Please make sure you don't forget that bsd-user also exists (and after
living in a fork for many years for various boring reasons is in the
middle of being upstreamed), so don't tie it entirely to remote-linux.



I am. In fact, one of the reasons I haven't started putting up patches
yet is because I'm trying to figure out the best way to handle this. :)

My understanding (let me know if I'm wrong) is that user-mode qemu
can emulate a different architecture, but not a different OS. So, the
idea is that the "qemu" platform would forward all operations that don't
need special handling to the "host" platform. That would mean you get
FreeBSD behavior when running on FreeBSD, etc.

pl


Re: [lldb-dev] [RFC] lldb integration with (user mode) qemu

2021-11-05 Thread Pavel Labath via lldb-dev

On 04/11/2021 22:46, Jessica Clarke via lldb-dev wrote:

On Fri, Oct 29, 2021 at 05:55:02AM +, David Spickett via lldb-dev wrote:

I don't think it does. Or at least I'm not sure how you propose to solve them (who is 
"you" in the paragraph above?).


I tend to use "you" meaning "you or I" in hypotheticals. Same thing as
"if I had" but for whatever reason I phrase it like that to include
the other person, and it does have its ambiguities.

What I was proposing is, if I was correct (which I wasn't) then having
the user "platform select qemu-user" would solve things. (which it
doesn't)


What currently happens is that when you open a non-native (say, linux) 
executable, the appropriate remote platform gets selected automatically.


...because of this. I see where the blocker is now. I thought remote
platforms had to be selected before they could claim.


If we do have a prompt, then this may not be so critical, though I expect that 
most users would still prefer it if we automatically selected qemu.


Seems reasonable to put qemu-user above remote-linux. Only claiming if
qemu-user has been configured sufficiently. I guess architecture would
be the minimum setting, given we can't find the qemu binary without
it.

Is this similar in any way to how the different OS remote platforms
work? For example there is a remote-linux and a remote-netbsd, is
there enough information in the program file itself to pick just one
or is there an implicit default there too?
(I see that platform CreateInstance gets an ArchSpec but having
trouble finding where that comes from)


Please make sure you don't forget that bsd-user also exists (and after
living in a fork for many years for various boring reasons is in the
middle of being upstreamed), so don't tie it entirely to remote-linux.



I am. In fact, one of the reasons I haven't started putting up patches 
yet is because I'm trying to figure out the best way to handle this. :)


My understanding (let me know if I'm wrong) is that user-mode qemu 
can emulate a different architecture, but not a different OS. So, the 
idea is that the "qemu" platform would forward all operations that don't 
need special handling to the "host" platform. That would mean you get 
FreeBSD behavior when running on FreeBSD, etc.


pl


Re: [lldb-dev] [RFC] lldb integration with (user mode) qemu

2021-11-05 Thread Pavel Labath via lldb-dev

On 03/11/2021 14:53, David Spickett wrote:

Yeah, I think we can start with that.


No need to consider this now but it could easily be adapted to
qemu-system as well. Spinning up qemu-system for Cortex-M debug might
be a future use case. Once you've got a "run this program and connect
to this port" platform you can sub in almost anything that talks GDB.



I actually did consider this, but it was not clear to me how this would 
tie in to the rest of lldb. The "run qemu and connect to it" part could 
be reused, of course, but what else? What would be the "executable" that 
we "run" in system mode. Is it the kernel image? Disk image?


I have a feeling there wouldn't be much added value in this "platform" 
over say a python command which implements the start-up dance. OTOH, a 
proper user-mode platform enables one to hook into all the usual lldb 
goodies like specifying the application's command line arguments, 
environment variables, can help with locating shared libraries, etc.


pl


Re: [lldb-dev] [RFC] lldb integration with (user mode) qemu

2021-11-03 Thread Pavel Labath via lldb-dev

On 29/10/2021 14:55, David Spickett wrote:

I don't think it does. Or at least I'm not sure how you propose to solve them (who is 
"you" in the paragraph above?).


I tend to use "you" meaning "you or I" in hypotheticals. Same thing as
"if I had" but for whatever reason I phrase it like that to include
the other person, and it does have its ambiguities.

What I was proposing is, if I was correct (which I wasn't) then having
the user "platform select qemu-user" would solve things. (which it
doesn't)

Great, thanks for clarifying.


If we do have a prompt, then this may not be so critical, though I expect that 
most users would still prefer it we automatically selected qemu.


Seems reasonable to put qemu-user above remote-linux. Only claiming if
qemu-user has been configured sufficiently. I guess architecture would
be the minimum setting, given we can't find the qemu binary without
it.

Yeah, I think we can start with that.


Is this similar in any way to how the different OS remote platforms
work? For example there is a remote-linux and a remote-netbsd, is
there enough information in the program file itself to pick just one
or is there an implicit default there too?
This is actually one of the pain points in lldb. The overall design 
assumes that you can precisely identify the platform (triple) that the 
file is meant to be run on by looking at the object file. This is 
definitely true on Apple platforms (where lldb originated) as even the 
"simulator" binaries have their own triples.


The situation is more fuzzy in the ELF world. The *BSD OSes have (and 
use) an ELFOSABI_ constant to identify the binary. Linux uses 
ELFOSABI_NONE even though there is a dedicated constant it could use 
(there's probably an interesting story in there). This makes it hard to 
positively identify a file as a linux binary, but we can mostly get away 
with it because there's just one OS like that. Having some mechanism to 
resolve ambiguities might also help with that.


I'm also not sure how much do the OSes actually validate the contents of 
the elf headers. I wouldn't be surprised if one could create "polyglot" 
elf binaries that can run on multiple operating systems.



(I see that platform CreateInstance gets an ArchSpec but having
trouble finding where that comes from)
It gets called from 
TargetList::CreateTargetInternal->Platform::CreateTargetForArchitecture->Platform::Create. 
There may be other callers, but I think this is the relevant one.


pl


Re: [lldb-dev] The two PDB plugins in LLDB

2021-11-03 Thread Pavel Labath via lldb-dev

[+ aleksandr]

On 03/11/2021 09:18, Martin Storsjö via lldb-dev wrote:

On Tue, 2 Nov 2021, Raphael Isemann via lldb-dev wrote:


Unless removing the non-native PDB plugin has some negative impact on
users (e.g., missing features in the native plugin that work with the
non-native plugin), I would propose we delete it and only keep the
native PDB plugin in LLDB, which seems far less work to maintain.


As far as I know, this is clearly the intended direction, but my 
understanding is also that the native PDB plugin isn't quite on the same 
functionality level yet.


I've been meaning to dive into it, try it out and see if there's 
something I could do help it along, but I haven't gotten to it, and I 
can't realistically take it on right now...




While that has been the intended direction all along, there has been 
very little progress made on this front in the last couple of years 
(D110172 is the only recent change in this direction). At this point, 
I'm not sure if anyone even knows what the missing functionality is. 
Obviously, this is not a good situation to be in.


So, even if it is possible for Raphael to make progress here without 
actually deleting the DIA plugin, I'd still say we should at least start 
the process of its deprecation/removal.


pl


Re: [lldb-dev] [RFC] lldb integration with (user mode) qemu

2021-10-29 Thread Pavel Labath via lldb-dev

On 29/10/2021 14:00, Pavel Labath via lldb-dev wrote:

On 29/10/2021 12:39, David Spickett wrote:
So there wouldn't be a three-way tie, but if you actually wanted to 
debug a native executable under qemu, you would have to explicitly 
select the qemu platform. This is the same thing that already happens 
when you want to debug a native executable remotely, but there it's 
kind of expected because you need to connect to the remote machine 
anyway.


Since we already have the host vs remote with native arch situation,
is it any different to ask users to do "platform select qemu-user" if
they really want qemu-user? Preferring host to qemu-user seems
logical.

It does. I am perfectly fine with preferring host over qemu-user.


For non native it would come up when you're currently connected to a
remote but want qemu-user on the host. So again you explicitly select
qemu-user.

Does that solve all the ambiguous situations?
I don't think it does. Or at least I'm not sure how you propose to 
solve them (who is "you" in the paragraph above?).


What currently happens is that when you open a non-native (say, linux) 
executable, the appropriate remote platform gets selected automatically.

$ lldb aarch64/bin/lldb
(lldb) target create "aarch64/bin/lldb"
Current executable set to 'aarch64/bin/lldb' (aarch64).
(lldb) platform status
   Platform: remote-linux
  Connected: no

That happens because the remote-linux platform unconditionally claims 
the non-native executables (well.. it claims all of them, but it is 
overridden by the host platform for native ones). It does not check 
whether it is connected or anything like that.


And I think that behavior is fine, because for a lot of actions you 
don't actually need to connect to anything. For example, you usually 
don't connect anywhere when inspecting core files (though you can do 
that, and it would mean lldb can download relevant shared libraries). 
And you can always connect at a later time, if needed.


Now the question is what should the new platform do. If it followed the 
remote-linux pattern, it would also claim those executables 
unconditionally, and we would always have a conflict (*).


I meant to add an explanation for this asterisk. I was going to say that 
in the current setup, I believe we would just choose whichever platform 
comes first (which is the first platform to get initialized), but that 
is not that great -- ideally, our behavior should not depend on the 
initialization order.




Or, it can try to be a bit less greedy and claim an executable only when 
it is configured. That would mean that in a clean state, everything 
would behave as it does now. However, the conflict would reappear as soon as the 
platform is configured (which will be always, for our users). The idea 
behind this (sub)feature was that there would be a way to configure lldb 
so that the qemu plugin comes out on top (of remote-linux, not host).


If we do have a prompt, then this may not be so critical, though I 
expect that most users would still prefer it if we automatically selected 
qemu.


I also realized that implementing the prompt for the case where the 
executable is specified on the command line will be a bit tricky, 
because at that point lldb hasn't gone interactive yet. I don't think there's 
any reason why it shouldn't prompt a user in this case, but doing it may 
require refactoring some of our startup code.







Do you mean like, each platform would advertise its kind 
(host/emulator/remote), and the relative kind priorities would be 
hardcoded in lldb?


Yes. Though I think that opens more issues than it solves. Host being
higher priority than everything else seems ok. Then you have to think
about how many emulation/connection hops each one has, but sometimes
that's not the metric that matters. E.g. an armv7 file on a Mac would
make more sense going to an Apple Watch simulator than qemu-user.

Yes, those were my thoughts as well, but I am unsure how often that 
would occur in practice (I'm pretty sure I'll need to care for only 
one arch for my use case).


Seems like starting with a single "qemu-user" platform is the way to
go for now. When it's not configured it just won't be able to claim
anything.

The hypothetical I had was shipping a development kit that included
qemu-arch1 and qemu-arch2. Would you rather ship one init file that
can set all those settings at once (since each one has its own
namespace) or symlink lldb-arch1 to be "lldb -s ". However anyone who's looking at shipping lldb has control
of the sources so they could make their own platform entries. Or
choose a command line based on an IDE setting.


Yes, that's the hypothetical I had in mind too. I don't think we will be 
doing it, but I can imagine _somebody_ wanting to do it.



pl



Re: [lldb-dev] [RFC] lldb integration with (user mode) qemu

2021-10-29 Thread Pavel Labath via lldb-dev

On 29/10/2021 12:39, David Spickett wrote:

So there wouldn't be a three-way tie, but if you actually wanted to debug a 
native executable under qemu, you would have to explicitly select the qemu 
platform. This is the same thing that already happens when you want to debug a 
native executable remotely, but there it's kind of expected because you need to 
connect to the remote machine anyway.


Since we already have the host vs remote with native arch situation,
is it any different to ask users to do "platform select qemu-user" if
they really want qemu-user? Preferring host to qemu-user seems
logical.

It does. I am perfectly fine with preferring host over qemu-user.


For non native it would come up when you're currently connected to a
remote but want qemu-user on the host. So again you explicitly select
qemu-user.

Does that solve all the ambiguous situations?
I don't think it does. Or at least I'm not sure how you propose to 
solve them (who is "you" in the paragraph above?).


What currently happens is that when you open a non-native (say, linux) 
executable, the appropriate remote platform gets selected automatically.

$ lldb aarch64/bin/lldb
(lldb) target create "aarch64/bin/lldb"
Current executable set to 'aarch64/bin/lldb' (aarch64).
(lldb) platform status
  Platform: remote-linux
 Connected: no

That happens because the remote-linux platform unconditionally claims 
the non-native executables (well.. it claims all of them, but it is 
overridden by the host platform for native ones). It does not check 
whether it is connected or anything like that.


And I think that behavior is fine, because for a lot of actions you 
don't actually need to connect to anything. For example, you usually 
don't connect anywhere when inspecting core files (though you can do 
that, and it would mean lldb can download relevant shared libraries). 
And you can always connect at a later time, if needed.


Now the question is what should the new platform do. If it followed the 
remote-linux pattern, it would also claim those executables 
unconditionally, and we would always have a conflict (*).


Or, it can try to be a bit less greedy and claim an executable only when 
it is configured. That would mean that in a clean state, everything 
would behave as it does now. However, the conflict would reappear as soon as the 
platform is configured (which will be always, for our users). The idea 
behind this (sub)feature was that there would be a way to configure lldb 
so that the qemu plugin comes out on top (of remote-linux, not host).


If we do have a prompt, then this may not be so critical, though I 
expect that most users would still prefer it if we automatically selected 
qemu.






Do you mean like, each platform would advertise its kind 
(host/emulator/remote), and the relative kind priorities would be hardcoded in 
lldb?


Yes. Though I think that opens more issues than it solves. Host being
higher priority than everything else seems ok. Then you have to think
about how many emulation/connection hops each one has, but sometimes
that's not the metric that matters. E.g. an armv7 file on a Mac would
make more sense going to an Apple Watch simulator than qemu-user.


Yes, those were my thoughts as well, but I am unsure how often that would occur 
in practice (I'm pretty sure I'll need to care for only one arch for my use 
case).


Seems like starting with a single "qemu-user" platform is the way to
go for now. When it's not configured it just won't be able to claim
anything.

The hypothetical I had was shipping a development kit that included
qemu-arch1 and qemu-arch2. Would you rather ship one init file that
can set all those settings at once (since each one has its own
namespace) or symlink lldb-arch1 to be "lldb -s ". However anyone who's looking at shipping lldb has control
of the sources so they could make their own platform entries. Or
choose a command line based on an IDE setting.


Yes, that's the hypothetical I had in mind too. I don't think we will be 
doing it, but I can imagine _somebody_ wanting to do it.



pl


Re: [lldb-dev] [RFC] lldb integration with (user mode) qemu

2021-10-29 Thread Pavel Labath via lldb-dev

Thanks for reading this. Responses inline.

On 28/10/2021 16:28, David Spickett wrote:

Glad to hear the gdb server in qemu plays nicely with lldb. Perhaps
some of that is the compatibility work that has been going on.


The introduction of a qemu platform would introduce such an ambiguity, since (when 
running on a linux host) a linux executable would be claimed by both the qemu plugin and 
the existing remote-linux platform. This would prevent "target create 
arm-linux.exe" from working out-of-the-box.


I assume you wouldn't get a 3 way tie here because in connecting to a
remote-linux you've "disconnected" the host platform, right?
IIUC, the host platform is not consulted at this step. It can only 
claim an executable when it is selected as the "current" platform, 
because the current platform is consulted first. (And this is what 
happens in most "normal" debug sessions.)


So there wouldn't be a three-way tie, but if you actually wanted to 
debug a native executable under qemu, you would have to explicitly 
select the qemu platform. This is the same thing that already happens 
when you want to debug a native executable remotely, but there it's kind 
of expected because you need to connect to the remote machine anyway.





To resolve this, I'd like to create some kind of a mechanism to give preference 
to some plugin.


This choosing of plugin, does it mostly take place automatically at
the moment or is there a good spot where we could say "X and Y could
load this file, please choose one/resolve the tie"?
This currently happens in TargetList::CreateTargetInternal, and one 
cannot create a prompt there, as that code is also used by the 
non-interactive paths (SBDebugger::CreateTarget, for instance). But I 
like the idea, and it may not be too difficult to refactor this to make 
that work. (I am imagining changing this code to use llvm::Error, and 
then creating a special AmbiguousPlatformError type, which could get 
caught by the command line code and transformed into a prompt.)
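
For concreteness, a rough sketch of what such an error type could look like 
(hypothetical -- nothing like this is in the tree yet):

#include "llvm/Support/Error.h"
#include <string>
#include <vector>

// Carried out of target creation when several platforms claim the file;
// interactive callers can catch it and turn it into a prompt.
class AmbiguousPlatformError
    : public llvm::ErrorInfo<AmbiguousPlatformError> {
public:
  static char ID;
  std::vector<std::string> candidates;

  explicit AmbiguousPlatformError(std::vector<std::string> c)
      : candidates(std::move(c)) {}

  void log(llvm::raw_ostream &OS) const override {
    OS << "multiple platforms match this executable:";
    for (const std::string &name : candidates)
      OS << " " << name;
  }
  std::error_code convertToErrorCode() const override {
    return llvm::inconvertibleErrorCode();
  }
};
char AmbiguousPlatformError::ID;

The command line code would then use llvm::handleErrors() to catch this 
particular error type and present the candidate list to the user, while 
SBDebugger::CreateTarget would just report it as a plain error.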




My first thought for automatic resolve is a native/emulator/remote
sort of hierarchy if you were going to order them. (with some nice
message "preferring X to Y because..." when it starts up)
Do you mean like, each platform would advertise its kind 
(host/emulator/remote), and the relative kind priorities would be 
hardcoded in lldb?





a) have just a single set of settings, effectively limiting the user to 
emulating just a single architecture per session. While it would most likely be 
enough for most use cases, this kind of limitation seems artificial.


One aspect here is the way you configure them if you want to use many
architectures of qemu-user.

If I have only one platform, I set qemu-user.foo to some Arm focused
value. Then if I want to work on AArch64 I edit my lldbinit to switch
it. (or have many init files)
If there's one platform per arch I can set qemu-arm.foo and qemu-aarch64.foo.
Yes, those were my thoughts as well, but I am unsure how often would 
that occur in practice (I'm pretty sure I'll need to care for only one 
arch for my use case).




Not much between them without having a specific use case for it. You
could work around either in various ways.

Wouldn't most of the platform entries just be subclasses of some
generic qemu-user-platform? So code wise it wouldn't be that much
extra to add them.
Yeah, it's possible they wouldn't even be actual classes, just different 
instances of the same class.



You could say it's bad to list qemu-xyz-platform when that isn't
installed, but then again, lldb lists a "local Mac OSX user platform
plug in" even on Linux. So not a big deal.
Yeah, I don't think it's a big deal either. The reason I'm asking this 
is to try to create a consistent experience. For example, we have a 
bunch of PlatformApple{Watch,TV,...}{Remote,Simulator} platforms (only 
available on apple hosts). These don't differ in architectures, but they 
do differ in the environment part of the triples, so you (almost) have a 
one-to-one mapping between triples and architectures.


However, they're also automatically configured (based on the xcode 
installation), and they don't create ambiguities (simulators have 
separate triples), so I'm not sure what kind of parallels to draw from that.


pl


[lldb-dev] [RFC] lldb integration with (user mode) qemu

2021-10-28 Thread Pavel Labath via lldb-dev

Hello everyone,

I'd like to propose a new plugin for better lldb+qemu integration.

As you're probably aware qemu has an integrated gdb stub. Lldb is able
to communicate with it, but currently this process is somewhat tedious.
One has to manually start qemu, giving it a port number, and then
separately start lldb, and have it connect to that port.

The chief purpose of this feature would be to automate this behavior,
ideally to the point where one can just point lldb to an executable,
type "run", and everything would just work. It would take the form of a
platform plugin (PlatformQemuUser, perhaps). This would be a non-host,
always-connected plugin, and its heart would be the DebugProcess
method, which would ensure the emulator gets started when the user wants
to start debugging. It would operate the same way as our host platforms
do, except that it would start qemu instead of debug/lldb-server. Most
of the other methods would be implemented by delegating to the host
platform (as the process will be running on the host), possibly with
some minor adjustments like prepending sysroot to the paths, etc. (My
initial proof-of-concept implementation was 200 LOC.)

The plugin would be configured via multiple settings, which would let
the user specify the path to the emulator, the kind of cpu it should
emulate, the path to the system libraries, and any other arguments
that the user wishes to pass to the emulator. The user could then
configure it in their lldbinit file to match their system setup.
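
For illustration, the configuration might end up looking something like
this in one's lldbinit file (the setting names here are invented for the
sake of the example; the RFC deliberately does not fix them):

  settings set platform.plugin.qemu-user.emulator-path /usr/bin/qemu-arm
  settings set platform.plugin.qemu-user.architecture arm
  settings set platform.plugin.qemu-user.sysroot /usr/arm-linux-gnueabihf
  settings set platform.plugin.qemu-user.emulator-args "-cpu cortex-a15"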

The needs of this plugin should match the existing Platform abstraction
fairly well, so I don't anticipate (*) the need to add new entry points
or modify existing ones. There is one tricky aspect which I see, and it
relates to platform selection. Our current platform selection code gives
each platform instance (while preferring the current platform) a chance
to "claim" an executable, and aborts if the choice is ambiguous. The
introduction of a qemu platform would introduce such an ambiguity, since
(when running on a linux host) a linux executable would be claimed by
both the qemu plugin and the existing remote-linux platform. This would
prevent "target create arm-linux.exe" from working out-of-the-box.

To resolve this, I'd like to create some kind of a mechanism to give
preference to some plugin. This could either be something internal,
where a plugin indicates "strong" preference for an executable (the qemu
platform could e.g. do this when the user sets the emulator path, the
remote platform when it is connected), or some external mechanism like a
global setting giving the preferred platform order. I'd very much like 
to hear your thoughts on this.

I'm also not sure how to handle the case of multiple emulated
architectures. Qemu can emulate any processor architecture (of those
that lldb supports, anyway), but the path to the emulator, sysroot, and
probably other settings as well are going to be different. I see two
possible ways to go about this:

a) have just a single set of settings, effectively limiting the user to
emulating just a single architecture per session. While it would most
likely be enough for most use cases, this kind of limitation seems
artificial. It would also likely require the introduction of another
setting, which would specify which architecture the plugin should
actually emulate (and return from GetSupportedArchitectureAtIndex,
etc.). On the flip side, this would be consistent with how our 
remote plugins work, although there it is given by the need to connect 
to something, and the supported architecture is then determined by the
remote machine.

b) have multiple platform instances, one for each architecture. This 
would be a more general solution, but it would mean that our "platform 
list" output would double, and half of it would consist of qemu platforms.

As far as testing is concerned I'm planning to reuse parts of our
gdb-client test suite for this. Namely, I want to write a small python
script which would act as a fake emulator. It would be sending out
pre-programmed gdb-remote responses, much like our client test suite
does. Since the main purpose of this is to validate that the emulator
was started with the correct arguments, I don't expect the need for
emulating any complex behavior -- the existing client classes should
completely suffice.
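
To sketch what such a fake emulator could look like (the "-g <port>"
flag and the file name are just placeholders for whatever interface the
real test ends up using):

  #!/usr/bin/env python3
  import socket
  import sys

  def main():
      # Record the command line, so the test can assert that lldb
      # started the "emulator" with the right arguments.
      with open("fake-emulator-args.txt", "w") as f:
          f.write(" ".join(sys.argv[1:]))

      # Listen on the port given via a qemu-style "-g <port>" argument.
      port = int(sys.argv[sys.argv.index("-g") + 1])
      with socket.create_server(("127.0.0.1", port)) as server:
          conn, _ = server.accept()
          with conn:
              while data := conn.recv(4096):
                  # Ack each packet and send the canned empty response,
                  # i.e. "unsupported" in gdb-remote terms.
                  if b"$" in data:
                      conn.sendall(b"+$#00")

  if __name__ == "__main__":
      main()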

If you got all the way here, I want to thank you for taking the time to 
read this, and urge you to let me know what you think.

regards,
pavel

(*) There is one refactoring of the Platform class implementations that 
I'd like to do first, but it (a) is not strictly necessary for this; and 
(b) is valuable independently of this RFC; so I am leaving it for a 
separate discussion.



[lldb-dev] Semantics of SBValue::CreateChildAtOffset

2021-10-22 Thread Pavel Labath via lldb-dev

Hello Jim, everyone,

I recently got a question/bug report about python pretty printers 
(synthetic child providers) that I couldn't answer.


The actual script is of course more complicated, but the essence boils 
down to this.


There's a class, something like:
struct S {
  ...
  T member;
};

The pretty printer tries to print this type, and it does something like:
     def get_child_at_index(self, index):
         if index == 0:
             child = self.sbvalue.GetChildMemberWithName("member")
             return child.CreateChildAtOffset("[0]", 0, T2)


Now here comes the interesting part. The exact behaviour here depends on 
the type T. If T (and of course, in the real example this is a template) 
is a plain type, then this behaves like a bitcast, so the synthetic 
child is essentially *reinterpret_cast<T2 *>(&s.member).


*However*, if T is a pointer, then lldb will *dereference* it before 
performing the cast, giving something like

   *reinterpret_cast<T2 *>(s.member) // no &
as a result.
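
To make the two behaviours concrete (a sketch; member_val stands for the
SBValue of s.member and t2 for an SBType of T2 -- both names are mine,
not the reporter's):

  # The very same call...
  synth = member_val.CreateChildAtOffset("[0]", 0, t2)
  # ...acts as *reinterpret_cast<T2 *>(&s.member) when T is a plain
  # type, but as *reinterpret_cast<T2 *>(s.member) -- note the missing
  # "&" -- when T is a pointer.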

The first question that comes to mind is: Is this behavior intentional 
or a bug?


At first it seemed like this is too subtle to be a bug, but the more I 
thought about it, the less sure I was about the CreateChildAtOffset 
function as a whole.


What I mean is, this pretty printer is essentially creating a child for 
a value that it is not printing. That seems like a bad idea in general, 
although I wasn't able to observe any ill effects (e.g. when I print 
s.member directly, I don't see any bonus children). Then I looked at 
some of the in-tree pretty-printers, and I did find this pattern in at 
least two libc++ printers (libcxx.py:125 and :614), although they don't 
suffer from this ambiguity, because the values they are printing are 
always pointers.


However, that means I absolutely don't know what the expected 
behavior here is:
- Are pretty printers allowed to call CreateChildAtOffset on values they 
are not printing?

- Is CreateChildAtOffset supposed to behave differently for pointer types?

I'd appreciate any insight,
Pavel


Re: [lldb-dev] Serial port support in LLDB

2021-10-08 Thread Pavel Labath via lldb-dev

On 08/10/2021 11:06, Pavel Labath via lldb-dev wrote:

On 06/10/2021 14:59, Michał Górny wrote:

On Wed, 2021-10-06 at 14:32 +0200, Pavel Labath wrote:

Let me try to make a counterproposal.

Since the serial parameters are a property of a specific connection, and
one could theoretically be debugging multiple processes with
different connection parameters, having a (global) setting for them does
not seem ideal. And since lldb already has a history of using made up
urls (unix-connect://, tcp://), I am wondering if we couldn't/shouldn't
invent a new url scheme for this. So like instead of file:///dev/ttyS0,
one would use a new scheme (say serial:// to connect), and we would
somehow encode the connection parameters into the url. For example, this
could look something like
    serial://[PARODD,STOP1,B19200]/dev/ttyS0
This connection string could be passed to both lldb and lldb-server
without any new arguments. Implementation-wise this url could be
detected at a fairly high level, and would cause us to instantiate a new
class which would handle the serial connection.


I don't have a strong opinion either way.  I suppose this would be
a little easier to implement, though I'd prefer using something more
classically URL-ish, i.e.:

   serial:///dev/ttyS0?baud=115200&parity=odd...

or:

   serial:///dev/ttyS0,baud=115200,parity=odd...

I suppose it would prevent weird paths but I don't think we need to
account for weird paths for ttys (and if we did, we should probably
urlencode them anyway).



I like the full-url version (with ? and &). I'm sure you've noticed this 
already, but for the sake of others, let me just mention that we already 
have a TODO to support a syntax like this.
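
Just to illustrate, the query-style form would also be trivial to parse
with standard tooling (a sketch; the parameter names are only examples):

  from urllib.parse import urlparse, parse_qs

  def parse_serial_url(url):
      # e.g. serial:///dev/ttyS0?baud=115200&parity=odd
      parsed = urlparse(url)
      assert parsed.scheme == "serial"
      options = {k: v[-1] for k, v in parse_qs(parsed.query).items()}
      return parsed.path, options

  # ("/dev/ttyS0", {"baud": "115200", "parity": "odd"})
  print(parse_serial_url("serial:///dev/ttyS0?baud=115200&parity=odd"))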




Re: [lldb-dev] Serial port support in LLDB

2021-10-06 Thread Pavel Labath via lldb-dev

Thanks for the nice summary Michał. I've found it very helpful.

The thing I am missing from this proposal is how would those settings 
translate into actual termios calls? Like, who would be responsible for 
reading those settings and acting on them? Currently we have some tty 
code in ConnectionFileDescriptorPosix (in the Host library), but I think 
it is a bit too low-level for that (I don't think the host library knows 
anything about "settings"). I am also not sure if a class called 
"ConnectionFileDescriptor" is really the best place for this.


On 05/10/2021 11:21, Michał Górny via lldb-dev wrote:

Hi, everyone.

I'm working on improving LLDB's feature parity with GDB.  As part of
this, I'm working on bettering LLDB's serial port support.  Since serial
ports are not that common these days, I've been asked to explain a bit
what I'd like to do.


At this point, LLDB (client) has minimal support for serial port
debugging.  You can use a command like:

 process connect file:///dev/ttyS0

to connect via the GDB Remote Protocol over a serial port.  However,
the client hardcodes serial transmission parameters (I'll explain
below).  I haven't been able to find an option to bind lldb-server to a
serial port.

I'd like to fix the both limitations, i.e. make it possible to configure
serial port parameters and add support for serial port in lldb-server.


The RS-232 standard is quite open ended, so I'm going to focus on 8250-
compatible serial port with DB-9 connector below (i.e. the kind found
in home PCs).  However, I'm going to skip the gory details and just
focus on the high-level problem.

The exact protocol used to transmit data over the serial port is
configurable to some degree.  However, there is no support for
autoconfiguration, so both ends have to be configured the same.
The synchronization support is also minimal.

The important hardware parameters that can be configured are:

- baud rate, i.e. data transmission speed that implies the sampling
   rate.  The higher the baud rate, the shorter individual bits are
   in the transmitted signal.  If baud rate is missynced, then
   the receiver will get wrong bit sequences.

- number of data bits (5-8) in the frame, lower values meaning that
   the characters sent are being truncated.  For binary data transfers,
   8 data bits must be used.
I believe gdb-remote protocol is compatible with 7-bit connections, 
though we would need to make sure lldb refrains from using some packets. 
Should I take it this is not an avenue you wish to pursue?




- parity used to verify frame correctness.  The parity bit is optional,
   and can be configured to use odd or even parity.  Additionally, Linux
   supports sending constant 0 or 1 as parity bit.

- number of stop bits (1 or 1.5/2) in the frame.  The use of more than
   one stop bit is apparently a relic that was supposed to give
   the receiver more time for processing.  I think this one isn't
   strictly necessary nowadays.

Gotta love those half-bits.



- flow control (none, software, hardware).  This is basically used by
   the receiver to inform the sender that it's got its buffer full
   and the sender must stop transmitting.  Software flow control uses
   in-band signaling, so it's not suitable for binary protocols.
   Hardware flow control uses control lines.

Of course, there is more to serial ports than just that but for LLDB's
purposes, this should be sufficient.


The POSIX and win32 API for serial ports are quite similar in design.
In the POSIX API, you have to open a character device corresponding to
the serial port, while in the win32 API you open a special path \\.\COMn.  In both
cases, reads and writes effect the transmission.  Both systems also have
a dedicated API to configure the serial transmission parameters
(ioctl/termios on POSIX [1], DCB on win32 [2]).  Note that I haven't
tried the win32 API, just looked it up.

The POSIX serial ports are teletype (tty) devices just like the virtual
consoles used by terminal emulators.  This makes it easy to use a serial
port as a remote terminal for other system.  This also adds a bunch of
configuration options related to input/output processing and special
behavior.  When a serial port is going to be used for non-console
purposes, these flags have to be disabled (i.e. the tty is set to 'raw'
mode).


The rough idea is that after opening the serial port, we need to set its
parameters to match the other end.  For this to work, I need to replace
LLDB's hardwired parameters with some way of specifying this.  I think
the cleanest way of doing this (and following GDB's example) would be to
add a new set of configuration variables to specify:

a. the baud rate used

b. the parity kind used

c. the number of stop bits

d. whether to use hardware flow control

I'm thinking of creating a new setting group for this, maybe
'host.serial'.  When connecting to a serial port, LLDB would set its
parameters based on the settings from this group.
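
For reference, acting on such settings on a POSIX host could look
roughly like this (a sketch using Python's termios module; the mapping
from settings to flags is purely illustrative):

  import termios

  def configure_serial(fd, baud=termios.B115200, parity="none",
                       stop_bits=1, hw_flow=False):
      iflag, oflag, cflag, lflag, ispeed, ospeed, cc = termios.tcgetattr(fd)
      iflag = oflag = lflag = 0          # raw mode: no tty processing
      cflag &= ~(termios.CSIZE | termios.PARENB | termios.PARODD |
                 termios.CSTOPB | termios.CRTSCTS)
      cflag |= termios.CS8 | termios.CREAD | termios.CLOCAL
      if parity != "none":
          cflag |= termios.PARENB        # enable the parity bit
          if parity == "odd":
              cflag |= termios.PARODD
      if stop_bits == 2:
          cflag |= termios.CSTOPB        # two stop bits instead of one
      if hw_flow:
          cflag |= termios.CRTSCTS       # RTS/CTS hardware flow control
      termios.tcsetattr(fd, termios.TCSANOW,
                        [iflag, oflag, cflag, lflag, baud, baud, cc])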

That said, I can't think of a really clean 

Re: [lldb-dev] who is maintaining lldb-x86_64-debian

2021-09-06 Thread Pavel Labath via lldb-dev

On 03/09/2021 07:00, Omair Javaid via lldb-dev wrote:

Hi Jan,
On Thu, 2 Sept 2021 at 17:29, Jan Kratochvil wrote:


On Thu, 02 Sep 2021 12:42:37 +0200, Raphael “Teemperor” Isemann via
lldb-dev wrote:
 > If this is about the TestGuiBasicDebug failures, then that test is also
 > failing on other bots. I personally would vote to disable the test as
 > that part of the test is flakey since its introduction (which was
 > during this GSoC).

Yes, in the last month or so there have been some random failures on 
this otherwise very stable buildbot, so I was wondering if someone is 
looking after and taking care of these issues or not.


There are multiple racy testcases. I have set up an alert only if all of
the last 5 builds failed. That is a reliable enough long-term threshold.

For https://lab.llvm.org/staging/#/builders/16 = lldb-x86_64-fedora:
https://lab.llvm.org/staging/api/v2/builders/16/builds?limit=5&order=-number


         "state_string": "build successful",

This is helpful. Thanks!



Jan



I haven't been very (or at all) active for the last few months, but that 
should change soon (as soon as I regain my bearings). And then I'll see 
what I can do about this.


pl


Re: [lldb-dev] Duplicate use of "sp" register name on x86 targets

2021-09-06 Thread Pavel Labath via lldb-dev

On 25/08/2021 21:13, Michał Górny via lldb-dev wrote:

Hi,

While working on improving gdbserver compatibility, I've noticed that
"sp" is used twice:

1. as an alt_name for esp/rsp register (giving full 32/64-bit stack
pointer),

2. and as the name of sp pseudo-register (giving ESP/RSP truncated to 16
bits).

FWICS the current lookup logic (at least for LLGS targets) means that 1.
takes precedence, i.e. 'register read sp' and 'p $sp' will both resolve
to RSP.  The 16-bit SP is only visible via 'register read --all'.

However, I'm wondering whether this is actually desirable.
In particular, should the 'sp' generic name take precedence over an actual
'sp' (pseudo-)register?



Given that accessing the lower 16 bits of the stack pointer is pretty 
useless, and I can think of several use cases for accessing the (full) 
stack pointer in an architecture-agnostic way, I think that the current 
precedence is right.


Which is not to say that the situation is ideal, but I'm not sure what 
we can (short of travelling back in time) do about it. Maybe we can 
provide some alias (sp_16, sp_86, ???), so that the register remains 
addressable?


pl


Re: [lldb-dev] [RFC] Improving protocol-level compatibility between LLDB and GDB

2021-04-27 Thread Pavel Labath via lldb-dev

On 25/04/2021 22:26, Jason Molenda wrote:

I was looking at lldb-platform and I noticed I implemented the A packet in it, 
and I was worried I might have the same radix error as lldb in there, but this 
code I wrote made me laugh:

 const char *p = pkt.c_str() + 1;   // skip the 'A'
 std::vector<std::string> packet_contents =
     get_fields_from_delimited_string (p, ',');
 std::vector<std::string> inferior_arguments;
 std::string executable_filename;

 if (packet_contents.size() % 3 != 0)
 {
     log_error ("A packet received with fields that are not a multiple of 3: %s\n",
                pkt.c_str());
 }

 unsigned long tuples = packet_contents.size() / 3;
 for (int i = 0; i < tuples; i++)
 {
     std::string length_of_argument_str = packet_contents[i * 3];
     std::string argument_number_str = packet_contents[(i * 3) + 1];
     std::string argument = decode_asciihex (packet_contents[(i * 3) + 2].c_str());

     int len_of_argument;
     if (ascii_to_i (length_of_argument_str, 16, len_of_argument) == false)
         log_error ("Unable to parse length-of-argument field of A packet: %s in full packet %s\n",
                    length_of_argument_str.c_str(), pkt.c_str());

     int argument_number;
     if (ascii_to_i (argument_number_str, 16, argument_number) == false)
         log_error ("Unable to parse argument-number field of A packet: %s in full packet %s\n",
                    argument_number_str.c_str(), pkt.c_str());

     if (argument_number == 0)
     {
         executable_filename = argument;
     }
     inferior_arguments.push_back (argument);
 }


These A packet fields give you the name of the binary and the arguments to pass 
on the cmdline.  My guess is at some point in the past the arguments were not 
asciihex encoded, so you genuinely needed to know the length of each argument.  
But now, of course, you could write a perfectly fine client that mostly 
ignores argnum and arglen altogether.


That's quite clever, actually. I like it. :)




I wrote a fix for the A packet for debugserver using an 'a-packet-base16' 
feature in qSupported to activate it, and tested it by hand, works correctly.  
If we're all agreed that this is how we'll request/indicate these protocol 
fixes, I can put up a phab etc and get this started.



I think that's fine, though possibly changing the servers to just ignore 
the length fields, like you did above, might be even better, as then 
they will work fine regardless of which client they are talking to. They 
still should advertise their non-brokenness so that the client can form 
the right packet, but this will be just a formality to satisfy protocol 
purists (or pickier servers), and not make a functional difference.


pl


Re: [lldb-dev] [RFC] Improving protocol-level compatibility between LLDB and GDB

2021-04-21 Thread Pavel Labath via lldb-dev

I am very happy to see this effort and I fully encourage it.

On 20/04/2021 09:13, Michał Górny via lldb-dev wrote:

On Mon, 2021-04-19 at 16:29 -0700, Greg Clayton wrote:

I think the first blocker towards this project are existing
implementation bugs in LLDB. For example, the vFile implementation is
documented as using incorrect data encoding and open flags. This is not
something that can be trivially fixed without breaking compatibility
between different versions of LLDB.


We should just fix this bug in LLDB in both LLDB's logic and lldb-server IMHO. We typically 
distribute both "lldb" and "lldb-server" together so this shouldn't be a huge 
problem.


Hmm, I've focused on this because I recall hearing that OSX users
sometimes run new client against system server... but now I realized
this isn't relevant to LLGS ;-).  Still, I'm happy to do things
the right way if people feel like it's needed, or the easy way if it's
not.


The vFile packets are used in the "platform" mode of the connection 
(which, btw, is also something that gdb does not have), and that is 
implemented by lldb-server on all hosts (although I think apple may have 
some custom platform implementations as well). In any case though, 
changing flag values on the client will affect all servers that it 
communicates with, regardless of the platform.


At one point, Jason cared enough about this to add a warning about not 
changing these constants to the code. I'd suggest checking with him 
whether this is still relevant.


Or just going with your proposed solution, which sounds perfectly 
reasonable to me.





The other main issue LLDB has when using other GDB servers is the dynamic register information is 
not enough for debuggers to live on unless there is some hard coded support in the debugger that 
can help fill in register numberings. The GDB server has its own numbers, and that is great, but in 
order to truly be dynamic, we need to know the compiler register number (such as the reg numbers 
used for .eh_frame) and the DWARF register numbers for debug info that uses registers numbers 
(these are usually the same as the compiler register numbers, but they do sometimes differ (like 
x86)). LLDB also likes to know "generic" register numbers like which register is the PC 
(RIP for x86_64, EIP for x86, etc), SP, FP and a few more. lldb-server has extensions for this so 
that the dynamic register info it emits is enough for LLDB. We have added extra key/value pairs to 
the XML that is retrieved via "target.xml" so that it can be complete. See the function 
in lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp:

bool ParseRegisters(XMLNode feature_node, GdbServerTargetInfo &target_info,
                    GDBRemoteDynamicRegisterInfo &dyn_reg_info, ABISP abi_sp,
                    uint32_t &reg_num_remote, uint32_t &reg_num_local);

There are many keys we added: "encoding", "format", "gcc_regnum", "ehframe_regnum", "dwarf_regnum", 
"generic", "value_regnums", "invalidate_regnums", "dynamic_size_dwarf_expr_bytes"



Yes, this is probably going to be the hardest part.  While working
on plugins, I've found LLDB register implementation very hard to figure
out, especially that the plugins seem to be a mix of new, old and older
solutions to the same problem.

We will probably need more ground-level design changes too.  IIRC lldb
sends YMM registers as a whole (i.e. with duplication with XMM
registers) while GDB sends them split like in XSAVE.  I'm not yet sure
how to handle this best -- if we don't want to push the extra complexity
on plugins, it might make sense to decouple the packet format from
the data passed to plugins.



Yes, this is definitely going to be the trickiest part, and probably 
deserves its own RFC. However, I want to note that in the past 
discussions, the consensus (between Jason and me, at least) has been to 
move away from this "rich" register information transfer. For one, 
because we have this information coded into the client anyway (as people 
want to communicate with gdb-like stubs).


So, the first, and hopefully not too hard, step towards that could be to 
get lldb-server to stop sending these extra fields (and fix anything 
that breaks as a result).


pl


Re: [lldb-dev] RFC: packet to identify a standalone aka firmware binary UUID / location

2021-03-30 Thread Pavel Labath via lldb-dev

On 23/03/2021 07:01, Jason Molenda wrote:

Hi, I'm working with an Apple team that has a gdb RSP server for JTAG 
debugging, and we're working to add the ability for it to tell lldb about the 
UUID and possibly address of a no-dynamic-linker standalone binary, or firmware 
binary.  Discovery of these today is ad-hoc and each different processor has a 
different way of locating the main binary (and possibly sliding it to the 
correct load address).

We have two main ways of asking the remote stub about binary images today:  
jGetLoadedDynamicLibrariesInfos on Darwin systems with debugserver, and 
qXfer:libraries-svr4: on Linux.

  jGetLoadedDynamicLibrariesInfos has two modes: "tell me about all libraries" and 
"tell me about libraries at these load addresses" (we get notified about libraries being 
loaded/unloaded as a list of load addresses of the binary images; binaries are loaded in waves on a 
Darwin system).  The returned JSON packet is heavily tailored to include everything lldb needs to 
know about the binary image so it can match a file it finds on the local disk to the description 
and not read any memory at debug time -- we get the mach-o header, the UUID, the deployment target 
OS version, the load address of all the segments.  The packets lldb sends to debugserver look like
jGetLoadedDynamicLibrariesInfos:{"fetch_all_solibs":true}
or
jGetLoadedDynamicLibrariesInfos:{"solib_addresses":[4294967296,140733735313408,..]}


qXfer:libraries-svr4: returns an XML description of all binary images loaded, 
tailored towards an ELF view of binaries from a brief skim of ProcessGDBRemote. 
 I chose not to use this because we'd have an entirely different set of values 
returned in our xml reply for Mach-O binaries and to eliminate extraneous read 
packets from lldb, plus we needed a way of asking for a subset of all binary 
images.  A rich UI app these days can link to five hundred binary images, so 
fetching the full list when only a couple of binaries were just loaded would be 
unfortunate.


I'm trying to decide whether to (1) add a new qStandaloneBinaryInfo packet which returns the simple gdb 
RSP style "uuid:;address:0xADDR;" response, or (2) if we add a third mode to 
jGetLoadedDynamicLibrariesInfos 
(jGetLoadedDynamicLibrariesInfos:{"standalone_binary_image_info":true}) or (3) have the JTAG 
stub support a qXfer XML request (I wouldn't want to reuse the libraries-svr4 name and return an XML 
completely different, but it could have a qXfer:standalone-binary-image-info: or whatever).


I figured folks might have opinions on this so I wanted to see if anyone cares 
before I pick one and get everyone to implement it.  For me, I'm inclined 
towards adding a qStandaloneBinaryInfo packet - the jtag stub already knows how 
to construct these traditional gdb RSP style responses - but it would be 
trivially easy for the stub to also assemble a fake XML response as raw text 
with the two fields.



J



Hello Jason, everyone,

It sounds to me like, if the idea is to send a UUID through the link, 
that (re)using qXfer:libraries-svr4 for this purpose will not help with 
anything, as this packet knows nothing about UUIDs. qXfer:libraries 
(without svr4) would be slightly better, as it does not encode details of the 
posix dynamic linkers, but it still contains no mention of the UUID, and 
it is actually not supposed to return the main executable (just the 
proper libraries).


To retrieve the main executable name, gdb uses `qXfer:exec-file:read`, 
but this also does not include the UUID, so it's not useful on its own. 
One could maybe complement it with something like qXfer:exec-uuid:read, 
but I'm not sure whether I actually like that idea.


As for new packet vs. another mode to jGetLoadedDynamicLibrariesInfos -- 
I don't know. If this is supposed to be used on more systems, then I'd 
probably go with a new packet, as the existing one is pretty mach-o 
specific. If this is going to be an Apple thing, then maybe it does not 
matter so much..


pl


Re: [lldb-dev] Removing linux mips support

2021-03-30 Thread Pavel Labath via lldb-dev

On 16/03/2021 00:37, Ed Maste via lldb-dev wrote:

On Mon, 15 Mar 2021 at 16:00, Ed Maste  wrote:


A brief note on the FreeBSD mips support - the FreeBSD Foundation is
sponsoring Michał to do this work, but our primary non-x86 focus is
amd64.


Oops, that doesn't make a lot of sense. I meant arm64 (AArch64) here.
Thanks to a couple of folks who pointed this out to me.


Thanks for the clarification, Ed. If FreeBSD mips code ends up 
bitrotting, it may indeed also suffer the same fate. However, this all 
depends on how much of it gets in the way of other things. Right now, 
the linux mips implementation stands in the way of adding the freebsd 
mips implementation cleanly (and a few other things), and this is the 
reason I'm proposing to remove it. Right now, I don't see any 
controversial bits in the new mips patch.


PSA: I'm going to proceed with the deletion later today.

thanks,
Pavel


[lldb-dev] Removing linux mips support

2021-03-09 Thread Pavel Labath via lldb-dev

Hi all,

I propose to remove support for linux mips debugging. This basically 
amounts to deleting 
source/Plugins/Process/Linux/NativeRegisterContextLinux_mips64.{cpp,h}. 
My reasons for doing that are:


- This code is unmaintained (last non-mechanical change was in 2017) and 
untested (no public buildbots), so we don't know if even basic 
functionality works, or if it indeed builds.


- At the same, it is carrying a lot of technical debt, which is leaking 
out of the mips-specific files, and interfering with other development 
efforts. The last instance of this is D96766, which is adding FreeBSD 
mips support, but needs to work around linux specific knowledge leaking 
into supposedly generic code. This one should be fixable relatively 
easily (these days we already have precedents for similar things in x86 
and arm code), but it needs someone who is willing to do that.


But that is not all. To support mips, we introduced two new fields into 
the RegisterInfo struct (dynamic_size_dwarf_{expr_bytes,len}). These are 
introducing a lot of clutter in all our RegisterInfo definitions (which 
we have **a lot** of) and are not really consistent with the long term 
vision of the gdb-remote protocol usage in lldb. These days, we have a 
different mechanism for this (added to support a similar feature in 
arm), it would be better to implement this feature in terms of that. I 
would tout this (removal of these fields) as the main benefit of 
dropping mips support.


So, unless someone is willing to address these issues (I'm happy to provide 
support where I can), I propose we drop mips support. Generic mips 
support will remain (and hopefully be better tested) thanks to the 
FreeBSD mips port, so re-adding mips support should be a matter of 
reimplementing the linux bits.


regards,
Pavel


Re: [lldb-dev] Help fixing deadlock in DWARF symbol preloading

2021-02-09 Thread Pavel Labath via lldb-dev

On 05/02/2021 00:38, Jorge Gorbe Moya wrote:
Wouldn't it be preferable to try_lock in GetDescription (which is the 
one currently acquiring the mutex) instead?


I'd be uncomfortable with a function like GetDescription "randomly" 
doing nothing (returning an empty string, or whatever). OTOH, while it's 
definitely not ideal, I think I could live with ReportError not 
reporting anything (or skipping the description, or something) under 
some circumstances.


That said, I like both Greg's and Raphael's ideas on how to fix this 
(assuming they work)...


pl


Re: [lldb-dev] Help fixing deadlock in DWARF symbol preloading

2021-02-04 Thread Pavel Labath via lldb-dev
Please have a look at the earlier "Deadlock loading DWARF symbols" 
thread, which is the last time this came up.



One quick'n'dirty solution would be to have `Module::ReportError` _try_ 
to get the module lock, and if it fails, just bail out. That obviously 
means you won't get to see the error message which triggerred the 
deadlock (though we could also play around with that and try printing 
the error message without the module description or something), but it 
will at least get you past that point...


pl

On 04/02/2021 21:04, Jorge Gorbe Moya via lldb-dev wrote:

Hi,

I've found a deadlock in lldb (see attached test case, you can build it 
with just `clang -o test test.s`), but I'm a total newbie and I have no 
idea what's the right way to fix it.


The problem happens when an error is found during DIE extraction when 
preloading symbols. As far as I can tell, it goes like this:


1. Module::PreloadSymbols locks Module::m_mutex
2. A few layers below it, we end up in ManualDWARFIndex::Index, which 
dispatches DIE extractions to a thread pool:


for (size_t i = 0; i < units_to_index.size(); ++i)
  pool.async(extract_fn, i);
pool.wait();


3. extract_fn in the snippet above ends up executing 
DWARFDebugInfoEntry::Extract and when there's an error during 
extraction, Module::GetDescription is called while generating the error 
message.
4. Module::GetDescription tries to acquire Module::m_mutex from a 
different thread, while the main thread has the mutex already locked and 
it's waiting for DIE extraction to end, causing a deadlock.


If we make Module::GetDescription not lock, the problem disappears, so 
the diagnosis looks correct, but I don't know what would be the right 
way to fix it. Module::GetDescription looks more or less safe to call 
without locking: it just prints m_arch, m_file, and m_object_name to a 
string, and those look like fields that wouldn't change after the Module 
is initialized, so maybe it's okay? But I feel like there must be a 
better solution anyway. Any advice?


Best,
Jorge













Re: [lldb-dev] Remote connection expansion?

2020-11-18 Thread Pavel Labath via lldb-dev

On 11/11/2020 20:11, Mike Mestnik via lldb-dev wrote:

On Mon, Nov 9, 2020 at 5:37 PM Greg Clayton  wrote:





On Nov 4, 2020, at 1:28 PM, Mike Mestnik via lldb-dev  
wrote:

I'm looking for support running lldb over ssh.  I can forward the
originating connection, but the run command is attempting to use
random ports on localhost to attain another connection.  This fails as
the localhost's are not the same.


When you say you want to run lldb over ssh, do you mean run "lldb-server"

Is there really an issue with saying these are both lldb?  Seems like
my statements were unambiguous without noting a distinction.


"Remote debugging" can mean different things to different people. Please 
assume good faith. I'm sure Greg asked this question because he was 
genuinely not sure what you meant, and not just to annoy you.




As lldb is not, the obvious path forward is to re-implement the lldb
IPC so it's more friendly to ssh.


I've been wanting to do something like that for a while, since the 
current design has a very 1970 (the decade FTP was invented) feel to it. 
However, the issue never came up on the projects that I worked on, so I 
couldn't find time to do that.


The way this currently works is that lldb sends a packet like 
"qSpawnGdbServer", which causes lldb-server platform to spawn a gdb 
server (either lldb-server gdbserver, or debugserver) and return the 
port number it is listening on. One way to change that would be to have 
lldb open *another* connection to the same lldb-server, and then issue 
something like "qExecGdbserver" (a new command). This command would 
cause the platform to exec (without forking) the debug server and pass 
the already established connection to it (something which we already 
support).


Then there would be no need for two ports, as both connections would be 
established through the same one.
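
At the packet level, the difference would be something like the
following (a sketch; qExecGdbserver is the hypothetical new packet, and
frame_packet merely applies the standard gdb-remote framing):

  def frame_packet(payload: str) -> str:
      # $<payload>#<checksum>, where the checksum is the modulo-256 sum
      # of the payload bytes, printed as two hex digits.
      return "${}#{:02x}".format(payload, sum(payload.encode()) % 256)

  # Today: ask the platform to spawn a debug server, get back a port
  # number, and open a second connection to that port.
  frame_packet("qSpawnGdbServer")

  # Proposed: open another connection to the same platform port and turn
  # it into a debug server connection in place.
  frame_packet("qExecGdbserver")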



Now I'm attempting forward error correction by guessing where this
topic could lead.  I would be willing to expand the network code to
include domain sockets, to replace the whole idea of using, IMHO
barbaric, port numbers.  This work could potentially include direct
support for ssh.  I understand that this would likely be a breaking
change, is there version negotiation?


Direct support for ssh might be interesting as well, though I am not 
sure what exactly that would mean. As for version negotiation, the way 
that's generally handled is by making a new gdb-remote packet or a 
"feature" (see 
https://sourceware.org/gdb/current/onlinedocs/gdb/General-Query-Packets.html#qSupported) 
and then checking for that.


So, for example, in order to implement my idea, we could have the 
lldb-server platform send qSpawnGdbServer+ in its qSupported response, 
and then have lldb key off of that.


cheers,
pavel


Re: [lldb-dev] [RFC] Segmented Address Space Support in LLDB

2020-11-02 Thread Pavel Labath via lldb-dev

On 22/10/2020 10:25, Jason Molenda wrote:

Hi Greg, Pavel.

I think it's worth saying that this is very early in this project.  We know 
we're going to need the ability to track segments on addresses, but honestly a 
lot of the finer details aren't clear yet.  It's such a fundamental change that 
we wanted to start a discussion, even though I know it's hard to have detailed 
discussions still.

In the envisioned environment, there will be a default segment, and most 
addresses will be in the default segment.  DWARF, user input (lldb cmdline), SB 
API, and clang expressions are going to be the places where segments are 
specified --- Dump methods and ProcessGDBRemote will be the main place where 
the segments are displayed/used.  There will be modifications to the memory 
read/write gdb RSP packets to include these.

This early in the project, it's hard to tell what will be upstreamed to the 
llvm.org monorepo, or when.  My personal opinion is that we don't actually want 
to add segment support to llvm.org lldb at this point.  We'd be initializing 
every address object with LLDB_INVALID_SEGMENT or LLDB_DEFAULT_SEGMENT, and 
then testing that each object is initialized this way?  I don't see this 
actually being useful.

However, changing lldb's target addresses to be strictly handled in terms of 
objects will allow us to add a segment discriminator ivar to Address and 
ProcessAddress on our local branch while this is in development, and minimize 
the places where we're diverging from the llvm.org sources.  We'll need to have 
local modifications at the places where a segment is input (DWARF, cmdline, SB 
API, compiler type) or output (Dump, ProcesssGDBRemote) and, hopefully, the 
vast majority of lldb can be unmodified.

The proposal was written in terms of what we need to accomplish based on our 
current understanding for this project, but I think there will be a lot of 
details figured out as we get more concrete experience of how this all works.  
And when it's appropriate to upstream to llvm.org, we'll be better prepared to 
discuss the tradeoffs of the approaches we took in extending 
Address/ProcessAddress to incorporate a segment.

My hope is that this generic OO'ification of target addresses will not change 
lldb beyond moving off of addr_t for now.

I think that wrapping addr_t inside a class would be a nice change, even 
without the subsequent segmentification -- I'm hoping that this would 
add some type safety to the way we work with addresses (as we have 
various kinds of addresses that are all just plain ints). I'd like to 
see a concrete proposal for this class's interface though. (And I still 
remain mildly sceptical about automating this transition.)



To be honest, we haven't thought about the UI side of this very much yet.  I 
think there will be ABI or ArchSpec style information that maps segment numbers 
to human-understandable names.


The details of this are pretty interesting for the Wasm use case, as it 
does not have a fixed number of segments/address spaces -- every module 
gets its own address space. I suppose the Wasm ArchSpec could just say 
it has UINT32_MAX address spaces, and then the dynamic loader would just 
assign modules into address spaces based on some key.


The interesting aspect here would be that the DWARF does *not* contain 
address space information here (as it's all in the same address space), 
so there may need to be a way for it to say "I don't actually know my 
address space -- I'll go wherever the dynamic loader puts me".


Still pretty early to determine that, but I'm mentioning this as it is 
the last use case of someone needing address space support in lldb (even 
though it's a slightly stranger form of address spaces).


pl


Re: [lldb-dev] [RFC] Segmented Address Space Support in LLDB

2020-10-20 Thread Pavel Labath via lldb-dev
There's a lot of things that are unclear to me about this proposal. The 
mechanics of representing a segmented address are one thing, but I 
think that the really interesting part will be the interaction with the 
rest of lldb. Like
- What's going to be the source of this address space information? Is it 
going to be statically baked into lldb (a function of the target 
architecture?), or dynamically retrieved from the target or platform 
we're debugging? How would that work?
- How is this going to interact with Object/SymbolFile classes? Are you 
expecting to use existing object and symbol formats for address space 
information, or some custom ones? AFAIK, none of the existing formats 
actually support encoding address space information (though that hasn't 
stopped people from trying).


Without understanding the bigger picture it's hard for me to say whether 
the proposed large scale refactoring is a good idea. Nonetheless, I am 
doubtful of the viability of that approach. Some of my reasons for that are:
- not all addr_ts represent an actual address -- sometimes that is a 
difference between two addresses, which still uses addr_t, as that's 
guaranteed to fit.
- relatedly to that, there is a difference (I'd expect) between the 
operations supported by the two types. addr_t supports all integral 
operations (though I hope we don't use all of them), but I wouldn't 
expect to be able to do the same with a SegmentedAddress. For one, I'd 
expect it wouldn't be possible to add two SegmentedAddresses together 
(which is possible for addr_t). OTOH, adding a SegmentedAddress and an 
addr_t would probably be fine? Would subtracting two SegmentedAddresses 
should result in an addr_t? But only if they have matching address 
spaces (and assert otherwise)?
- I'd also be worried about over-generalizing specialized code which can 
afford to work with plain addresses, and where the added address space 
would be a nuisance (or a source of bugs). E.g. ELF has no notion of 
address space, so I don't think I'd find it helpful to replace all plain 
integer calculations in elf parsing code with something more complex. 
(I'm aware that some people are using elf to encode address space 
information, but this is a pretty nonstandard extension, and it'd take 
more than type substitution to support anything like that.)

- large scale refactorings are very much not the norm in llvm
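
Here is the sketch mentioned above, illustrating the operation set I
would expect (Python used as pseudo-code for what would be a C++ class):

  from dataclasses import dataclass

  @dataclass(frozen=True)
  class SegmentedAddress:
      space: int   # address-space discriminator
      value: int   # the plain addr_t part

      def __add__(self, offset: int) -> "SegmentedAddress":
          # address + plain offset stays within one address space
          return SegmentedAddress(self.space, self.value + offset)

      def __sub__(self, other: "SegmentedAddress") -> int:
          # address - address yields a plain offset, but only when both
          # operands live in the same address space
          assert self.space == other.space, "mismatched address spaces"
          return self.value - other.value

  # Adding two SegmentedAddresses together is deliberately not defined,
  # unlike for plain addr_t.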



On 19/10/2020 23:56, Jonas Devlieghere via lldb-dev wrote:
We want to support segmented address spaces in LLDB. Currently, all of 
LLDB’s external API, command line interface, and internals assume that 
an address in memory can be addressed unambiguously as an addr_t (aka 
uint64_t). To support a segmented address space we’d need to extend 
addr_t with a discriminator (an aspace_t) to uniquely identify a 
location in memory. This RFC outlines what would need to change and how 
we propose to do that.


### Addresses in LLDB

Currently, LLDB has two ways of representing an address:

  - Address object. Mostly represents addresses as Section+offset for a 
binary image loaded in the Target. An Address in this form can persist 
across executions, e.g. an address breakpoint in a binary image that 
loads at a different address every execution. An Address object can 
represent memory not mapped to a binary image. Heap, stack, jitted 
items, will all be represented as the uint64_t load address of the 
object, and cannot persist across multiple executions. You must have the 
Target object available to get the current load address of an Address 
object in the current process run. Some parts of lldb do not have a 
Target available to them, so they require that the Address can be 
devolved to an addr_t (aka uint64_t) and passed in.
  - The addr_t (aka uint64_t) type. Primarily used when receiving input 
(e.g. from a user on the command line) or when interacting with the 
inferior (reading/writing memory) for addresses that need not persist 
across runs. Also used when reading DWARF and in our symbol tables to 
represent file offset addresses, where the size of an Address object 
would be objectionable.


## Proposal




### Address + ProcessAddress

  - The Address object gains a segment discriminator member variable. 
Everything that creates an Address will need to provide this segment 
discriminator.
  - A ProcessAddress object which is a uint64_t and a segment 
discriminator as a replacement for addr_t. ProcessAddress objects would 
not persist across multiple executions. Similar to how you can create an 
addr_t from an Address+Target today, you can create a ProcessAddress 
given an Address+Target. When we pass around addr_ts today, they would 
be replaced with ProcessAddress, with the exception of symbol tables 
where the added space would be significant, and we do not believe we 
need segment discriminators today.


I'm strongly in favor of the first approach. The reason for that is that 
we have a lot of code that can only reasonably deal with one kind 

Re: [lldb-dev] lldb 11.0.0-rc2 different behavior then gdb.

2020-10-07 Thread Pavel Labath via lldb-dev

On 07/10/2020 21:01, Jim Ingham wrote:



On Oct 7, 2020, at 11:44 AM, Pavel Labath wrote:


On 07/10/2020 20:42, Jim Ingham via lldb-dev wrote:
There isn’t a built-in summary formatter for two dimensional arrays 
of chars, but the type is matching the regex for the one-dimensional 
StringSummaryFormat, but that doesn’t actually know how to format two 
dimensional arrays of chars.  The type regex for StringSummaryFormat:

char [[0-9]+]
We should refine this regex so it doesn’t catch two-dimensional 
strings.  We could also write a formatter for two-dimensional strings.
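
To make the overmatch concrete, a quick sketch with Python's re
(anchoring the pattern is one possible refinement):

  import re
  pattern = r"char [[0-9]+]"
  re.search(pattern, "char [5]")            # matches, as intended
  re.search(pattern, "char [2][4]")         # also matches: the overmatch
  re.search(pattern + "$", "char [2][4]")   # anchored variant: no match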


Do we need a special formatter for two-dimensional strings? What about 3D?

I'd hope that this could be handled by a combination of the simple 
string formatter and the generic array dumping code...


That works as expected, for instance if you do:

(lldb) frame var z.i
(char [2][4]) z.i = {
   [0] = "FOO"
   [1] = "BAR"
}

The thing that isn’t working is when the array doesn’t get auto-expanded 
by lldb, then you see the summary instead,


Ah, interesting. I didn't realize that.

which is what you are seeing 
with:


(lldb) frame var z
(b) z = (i = char [2][4] @ 0x7ffeefbff5f0)

You can fix this either by having a summary string for char [][] or by 
telling lldb to expand more pointer like children for you:


(lldb) frame var -P2 z
(b) z = {
   i = {
     [0] = "FOO"
     [1] = "BAR"
   }
}

  I’m hesitant to up the default pointer depth, I have gotten lots of 
complaints already about lldb disclosing too many subfields when 
printing structures.


Yeah, I don't think we'd want to increase that.



We could also try to be smarter about what constitutes a “pointer” so 
the arrays don’t count against the pointer depth? Not sure how workable 
that would be.


This sounds workable. I mean, an array struct member is not really a 
pointer (it only decays to a pointer) and does not suffer from the 
issues that pointers do -- infinite recursion with recursive data 
structures, etc.


pl




Re: [lldb-dev] Deadlock loading DWARF symbols

2020-10-05 Thread Pavel Labath via lldb-dev

On 02/10/2020 23:13, Greg Clayton wrote:

Yes this is bad, and GetDescription() is used as a convenience to print out the 
module path (which might be a .o file within a .a file) and optionally 
architecture of the module. It probably shouldn't be taking the module lock as 
the only member variables that GetDescription accesses are:

Module::m_arch
Module::m_file
Module::m_object_name

I would almost vote to take out the mutex lock in GetDescription() as the arch, 
file and name don't change after the module has been created. I am going to CC 
a few extra folks for discussion.

Anyone else have any objections to removing the mutex in GetDescription? Seems 
like this deadlock is easy to trigger if you have DWARF with errors or warnings 
inside of it.



That sounds reasonable to me. All of the above fields can change during 
the early stages of Module construction (while the ObjectFile is being 
parsed and such), but I would certainly hope they remain stable after 
that. And this early construction process should be single-threaded.


So, I am fine with saying any subsequent modification is a bug.

pl


Re: [lldb-dev] RFC: Processor Trace Support in LLDB

2020-10-02 Thread Pavel Labath via lldb-dev
On 01/10/2020 22:32, Walter wrote:
> After a chat with Greg, we agreed on this set of commands
> 
> 
> trace load /path/to/json
> process trace start/stop
> process trace save /path/to/json
> thread trace start/stop
> thread trace dump [instructions | functions]
> 

Thanks. The new commands look good to me.


The multi-process trace concept is interesting. I don't question its
usefulness -- I am sure it can be useful for various kinds of analysis
(though I've never used that myself). I am wondering though about how to
represent this thing in lldb, as we don't really have anything close to
the concept of "debugging" all processes on a given system.

The only thing that comes close is probably the kernel-level debugging.
One idea (which has just occurred to me, so it may not be good) might be
to make these traces behave similarly to that. I.e., create a single
target/process with one "thread" per physical cpu, and then have a
special "os plugin" like thing which would present individual
process/threads.

That would have the advantage of maintaining the one trace-one target
invariant and also would preserve the information about relative timings
of individual "processes". I think that wuold be an interesting way to
view these things, but I don't know if it would be the best one...

pl


Re: [lldb-dev] RFC: Processor Trace Support in LLDB

2020-10-01 Thread Pavel Labath via lldb-dev
Thank you for writing this Walter. I think this document will be a
useful reference both now and in the future.

The part that's not clear to me is what is the story with multi-process
traces. The file format enables those, but it's not clear how are they
going be created or used. Can you elaborate more on what you intend to
use those for?

The main reason I am asking that is because I am thinking about the
proposed command structure. I'm wondering if it would not be better to
fit this into the existing target/process/thread commands instead of
adding a new top-level command. For example, one could imagine the
following set of commands:

- "process trace start" + "thread trace start" instead of "thread trace
[tid]". That would be similar to "process continue" + "thread continue".
- "thread trace dump [tid]" instead of "trace dump [-t tid]". That would
be similar to "thread continue" and other thread control commands.
- "target create --trace" instead of "trace load". (analogous to target
create --core).
- "process trace save" instead of "trace save" -- (mostly) analogous to
"process save-core"

I am thinking this composition may fit in better into the existing lldb
command landscape, though I also see the appeal in grouping everything
trace-related under a single top-level command. What do you think?

The main place where this idea breaks down is the multi-process traces.
While we could certainly make "target create --trace" create multiple
targets, that would be fairly unusual. OTOH, the whole concept of having
multiple targets share something is a pretty unusual thing for lldb.
That's why I'd like to hear more about where you want to go with this idea.


On 21/09/2020 22:17, Walter via lldb-dev wrote:
> Thanks for your feedback Fangrui, I've just been checking Cap'n Proto
> and it looks really good. I'll keep it in mind in the design and see how
> it can optimize the overall data transfer.

I'm not sure how Cap'n Proto comes into play here. The way I understand
it, the real data is contained in a separate file in the specialized
intel format and the json is just for the metadata. I'd expect the
metadata file to be small even for enormous traces, so I'm not sure
what's to be gained by optimizing it.

pl



Re: [lldb-dev] LLDB got SIGCHLD on hitting the breakpoint

2020-09-29 Thread Pavel Labath via lldb-dev
When you say "execute binary code", where exactly is this code being
executed? Is it executed by launching another process?

Lldb will not automatically debug all child processes spawned by your
process -- they will run freely.

The SIGCHLDs are not coming from lldb -- they are signals which all
processes receive (from the operating system) when one of their children
dies. They just mean that your process has executed some subprocess and
that subprocess has terminated.

pl

On 21/09/2020 11:48, le wang via lldb-dev wrote:
> Hello,everyone:
> I've got a problem, when debugging my process with lldb tool on linux
> OS(CentOS7).While I use lldb command to set breakpoints, and launch my
> process, my process will execute a binary code which contains debug
> information, but when my process launched, all breakpoints can not be
> hit, The debug process and several received informations like below:  
> (lldb)target create  /home/out/lib/linux64/Debug/appEngine 
> Current executable set to '/home/out/lib/linux64/Debug/appEngine'
>  (x86_64)     
> (lldb)br s -f
> /home/out/lib/linux64/Debug/configDB/TestFunctionProcess.cpp -l1
> Breakpoint 1: no locations(pending).
> WARNING :  Unable to resolve breakpoint to any actual locations.
> (lldb)br s -f
> /home/out/lib/linux64/Debug/configDB/TestFunctionProcess.cpp -l2
> Breakpoint 2: no locations(pending).
> WARNING :  Unable to resolve breakpoint to any actual locations.
> (lldb)r
> Process 4256 launched: '/home/out/lib/linux64/Debug/appEngine'  (x86_64)
>     
> Process 4256 stopped and restarted: thread1 received signal:   SIGCHLD
> Process 4256 stopped and restarted: thread1 received signal:   SIGCHLD
> Process 4256 stopped and restarted: thread1 received signal:   SIGCHLD
> Process 4256 stopped and restarted: thread1 received signal:   SIGCHLD
> Process 4256 stopped and restarted: thread1 received signal:   SIGCHLD
> Process 4256 stopped and restarted: thread2 received signal:   SIGCHLD
> 
> Process stopped and restarted: thread 2 received signal: SIGCHLD
> It seems these repeated restart notifications come from lldb, and at
> last although my process is executed,  it is meaningless. I have checked
> that debug information in IR is correct. I have no idea the reason. Can
> anyone tell me the reason and how to fix this problem. My lldb version
> is 5.0.0, which got from http://www.llvm.org/ with llvm5.0.0
> 
> 
> Thanks,
> le wang
> 
> ___
> lldb-dev mailing list
> lldb-dev@lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
> 

___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] Deprecating Python2 and adding type-annotations to the python API

2020-08-11 Thread Pavel Labath via lldb-dev
On 04/08/2020 02:37, Jonas Devlieghere via lldb-dev wrote:
> Hi Nathan,
> 
> Thanks for bringing this up. I've been expecting this question for a
> while now. 
> 
> Python 2 is end-of-life and we should move to Python 3. I'm pretty sure
> nobody here disagrees with that. Unfortunately though, we still have
> consumers, both internally and externally, that still rely on it. We're
> actively making an effort to change that, but we're not quite there yet. 
> 
> That said, I think we should continue moving in that direction. In line
> with the rest of LLVM moving to Python 3 by the end of the year, we've
> already made it the default. All our bots on GreenDragon are also
> building against Python 3.
> 
> As a first step, for the next release, I propose we remove the fallback
> to Python 2 and make it the only supported configuration. At the same
> time we can convert any scripts and tools (I'm thinking of the lit
> configurations, the lldb-dotest and lldb-repro wrappers, etc) to be
> Python 3 only. During this time however, we'd ask that the bindings and
> the test suite remain compatible with Python 2. Given that Python 3 is
> the only supported configuration for developers, we'd take on the burden
> of maintaining Python 2 compatibility in the test suite and correcting
> (accidental) incompatibilities.
> 
> When the 12.0 release is cut, we can reconsider the situation. If we're
> still not ready by then to drop Python 2 support, I  propose another
> intermediate step where we remove Python 2 support from the upstream
> repository, but ask the community to not actively modernize the test
> suite and the bindings. In this situation we'd be dealing with the merge
> conflicts in our downstream fork and this would avoid an endless number
> of conflicts in the test suite. 
> 
> Finally, presumably after the 13.0 release, we'd drop that last
> requirement. 
> 
> Please let me know if you think that sounds like a reasonable timeline.
> 
> Thanks,
> Jonas
> 



This sounds like a good plan to me. If lldb is supposed to remain
buildable with python2, then I also don't see a problem with keeping the
cmake bits which enable that. We could make it harder to build with
python2 accidentally by requiring the user to set a variable similar to
the LLVM_TEMPORARILY_ALLOW_OLD_TOOLCHAIN thingy.
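
I.e., building against python2 would then require an explicit opt-in on
the cmake command line, along the lines of (the variable name is made
up):

  cmake -DLLDB_TEMPORARILY_ALLOW_PYTHON2=ON ...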

pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] RFC: -flimit-debug-info + frame variable

2020-07-23 Thread Pavel Labath via lldb-dev
On 22/07/2020 01:31, Jim Ingham wrote:
> 
> 
>> On Jul 21, 2020, at 9:27 AM, Pavel Labath wrote:
>> I do see the attractiveness of constructing a full compiler type. The
>> reason I am hesitant to go that way is that it seems to me that this
>> would negate the two main benefits of the frame variable command over
>> the expression evaluator: a) it's fast; b) it's less likely to crash.
>>
>> And while I don't think it will be as slow or as crashy as the
>> expression evaluator, the usage of the AST importer will force a lot
>> more types to be parsed than are strictly needed for this functionality.
>> And the insertion of all potentially conflicting types from different
>> modules into a single AST context is also somewhat worrying.
> 
> Importation should be incremental as well, so this shouldn’t make things
> that much slower.  And you shouldn’t ever be looking things up by name
> in this AST so you wouldn’t be led astray that way.  You also are going
> to have to do pretty much the same job for “expr”, right?  So you
> wouldn’t be opening new dangerous pathways.

The import is not as incremental as we might want, and it actually sort
of depends on the state of the source AST. Let's say the source AST has
types A and B, and A depends on B in some way (say, as a method
argument). Let's say that A is complete (parsed) and B isn't. While
importing A, the AST importer will import the method which has the B
argument, but it will not descend into B (and cause us to parse it). If,
however, B happens to be already parsed, then the importer will import B
and all of its base classes (but not its fields and methods).

On top of that we also have our own additions -- whenever we encounter a
method returning a pointer, we import the pointer target type (this has
to do with covariant return types). These things compound and so even a
simple import can end up importing quite a lot.
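
As a minimal illustration of the scenario (hypothetical types, not lldb
code):

struct B {
  int member;         // if B is already parsed in the source context,
};                    // importing A drags in B (and its bases) as well

struct A {
  void method(B arg); // importing A imports this signature, but it does
                      // not, by itself, force B to be completed
};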

I actually tried making the AST importer more lazy -- I have a proof of
concept, but it required adding more explicit lookups into clang's Sema,
which is why I haven't pursued it further yet.

I could also try to disable some of these things for these frame
variable imports (they don't need methods at all), but then I would be
opening new dangerous pathways...


> 
> OTOH, the AST’s are complex beasts, so I am not unmoved by your worries...

Yeah... :)

>> The dlclose issue is an interesting one. Presumably, we could ensure
>> that the module does not go away by storing a module shared (or weak?)
>> pointer somewhere inside the value object. BTW, how does this work with
>> ValueObject casts right now? If I cast a ValueObject to a CompilerType
>> belonging to a different module, does anything ensure this module does
>> not go away? Or when dereferencing a pointer to a type which is not
>> complete in the current module?
> 
> I don’t think at present we do anything smart about this.  It’s just
> always bugged me at the back of my brain that we could get into trouble
> with this, and so I don’t want to do something that would make it worse,
> especially in a systemic way.

Is there a reason we don't store a pointer to the module where the
TypeSystem came from? We could either do that for all ValueObjects, or
just when the type system changes (casts, dereferences of incomplete
types, and now -flimit-debug-info)?

> 
>>
>> I'm hoping that this stuff won't be "hard work". I haven't prototyped
>> the code yet, but I am hoping to keep this lookup code in under 200 LOC.
>> And as Greg points out, there are ways to put this stuff into the type
>> system -- I'm just not sure whether that is needed given that the
>> ValueObject class is the only user of the GetIndexOfChildMemberWithName
>> interface. The whole function is pretty clearly designed with
>> ValueObject::GetChildMemberWithName in mind.
> 
> It seems fine to me to proceed along the lines you propose.  If it ends
> up being smooth sailing, I can’t see any reason not to do it this way.
>  When/If you end up having lots of corner cases to manage, would be the
> time to consider cutting back to using the real type system to back
> these computations.

Ok, sounds good. Let me create a prototype for this, and we'll see how
it goes from there. It may take a while because I'm now entangled in
some line table stuff.


On 21/07/2020 23:23, Greg Clayton wrote:
>> On Jul 21, 2020, at 9:27 AM, Pavel Labath  wrote:
>> The dlclose issue is an interesting one. Presumably, we could ensure
>> that the module does not go away by storing a module shared (or weak?)
>> pointer somewhere inside the value object. BTW, how does this work with
>> ValueObject casts right now? If I cast a ValueObject to a CompilerType
>> belonging to a different module, does anything ensure this module does
>> not go away? Or when dereferencing a pointer to a type which is not
>> complete in the current module?
>
> I am not sure dlclose is a problem, the module won't usually be
> cleaned up. And that shared 

Re: [lldb-dev] Break setting aliases...

2020-07-23 Thread Pavel Labath via lldb-dev
On 22/07/2020 19:50, Jim Ingham wrote:
>> On Jul 22, 2020, at 12:34 AM, Pavel Labath  wrote:
>>
>> The "--" is slightly unfortunate, but it's at least consistent with our
>> other commands taking raw input. We could avoid that by making the
>> command not take raw input. I think most of the "modes" of the "b"
>> command wouldn't need quoting in most circumstances -- source regex and
>> "lib`func" modes being exceptions.
> 
> If anybody wants to work on this, I think Jonas is right, the first step 
> would be to convert it to an actual command not a regex command.  The 
> _regexp-break command is already hard enough to comprehend.
> 
> You could still do the actual specifier parsing with a series of regex’s if 
> that seems best, though there might be easier ways to do it.  I also don’t 
> think this would need to be a raw command, rather it could be a parsed 
> command with one argument which was the breakpoint specifier and then all the 
> other breakpoint flags.  
> 
> All the specifications you can currently pass to b are single words w/o 
> spaces in them, or if they do have spaces they are the things you are used to 
> having to quote in lldb: like file names with spaces.  

The lib`func notation contains a backtick, which is used for expression
substitution in the command interpreter. Currently we seem to be just
dropping an unmatched backtick, which would break that. We could change
it so that the unmatched backtick is kept, though I would actually
prefer to make that an error.


>> "br set" help starts with a long list command switches, which are
>> supposed to show which options can be used together. I think this sort
>> of listing is nice when the command has a couple of modes and a few
>> switches, but it really misses the mark when it ends up listing 11 modes
>> with approximately 20 switches in each one.
>>
>> This is then followed by descriptions of the 20 or so switches. This
>> list is alphabetical, which means the most commonly used options end up
>> buried between the switches I've never even used.
> 
> Yes.  I’ve said many times before that making “break set” the master command 
> for breakpoint setting was a mistake.  ...


Restructuring the commands is one thing. It might be a good idea but
there are less invasive things we could do to make this better. Just
restructuring the help output to make the most common use cases easier
to find would help a lot IMO. We could drop or simplify the "synopsis"
section, maybe replacing it with a couple of examples of the most useful
kinds of breakpoints.

Then we could group the options to keep the similar ones together and
make them easier to find/skip. Maybe with groups like:
- options specifying where to set a breakpoint: --file, --line; --name; etc.
- options restricting the reported breakpoint hits:
--thread-id,--thread-name,--condition, etc.
- various modifiers: --disable, --hardware, --one-shot, etc.
- others (?)

The division may not be perfect (like, is --ignore-count a "modifier" or
does it "restrict breakpoint hits"?), but even so I think this would
make that list a lot easier to navigate. But we digress...

On 22/07/2020 20:20, Greg Clayton wrote:
> BTW: to see what things expand to after each regex alias, just set
> this setting first:
>
> (lldb) settings set interpreter.expand-regex-aliases true

That is cool. I wonder if that should be the default...
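
For example, with the setting on, running "b main.c:12" (file name made
up) should echo something like the command it expands to:

(lldb) b main.c:12
breakpoint set --file 'main.c' --line 12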

pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] Break setting aliases...

2020-07-22 Thread Pavel Labath via lldb-dev
I use "b" for file+line breakpoints and "br set -n" for function
breakpoints, mainly because I couldn't be bothered to figure out how
that works with the "b" command. Now that I have studied the command
once again, I may try using it for function breakpoints as well...

I don't have any hard objections to new aliases (if "b" for name
breakpoints doesn't pan out, I may start using "bn" as a shorthand for
"br set -n"), though I do find the idea of beefing up the "b" command
more appealing.
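
FWIW, such a shorthand can already be defined as a user alias:

(lldb) command alias bn breakpoint set -n %1
(lldb) bn main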

On 22/07/2020 00:03, Jim Ingham via lldb-dev wrote:
> 
> 
>> On Jul 21, 2020, at 2:54 PM, Greg Clayton wrote:
>>
>>
>>
>>> On Jul 21, 2020, at 10:22 AM, Jim Ingham via lldb-dev wrote:
>>>
>>> When we were first devising commands for lldb, we tried to be really
>>> parsimonious with the one & two letter unique command strings that
>>> lldb ships with by default.  I was trying to leave us as much
>>> flexibility as possible as we evolved, and I also wanted to make sure
>>> we weren’t taking up all the convenient short commands, leaving a
>>> cramped space for user aliases.
>>>
>>> The _regex_break command was added (and aliased by default to ‘b’) as
>>> a way to allow quick access for various common breakpoint setting
>>> options.  However it suffers from the problem that you can only
>>> provide the options that are recognized by the _regexp_break command
>>> aliases.  For instance, you can’t add the -h option to make a
>>> hardware breakpoint.  Because the “_regex_break command works by
>>> passing the command through a series of regex’s stopping at the first
>>> match, trying to extend the regular expressions to also include
>>> “anything else” while not causing one regex to claim a command that
>>> was really meant for a regex further on in the series is really tricky.
>>>
>>> That makes it kind of a wall for people.  As soon as you need to do
>>> anything it doesn’t support you have to go to a command that is not
>>> known to you (since “b” isn’t related to “break set” in any way that
>>> a normal user can actually see.)
>>>
>>> However, lldb has been around for a while and we only have two unique
>>> commands of the form “b[A-Za-z]” in the current lldb command set (br
>>> and bt).  So I think it would be okay for us to take up a few more
>>> second letter commands to make setting breakpoints more convenient.
>>>  I think adding:
>>>
>>> bs (break source) -> break set -y
>>
>> Is -y a new option you would add? I don't see it. We have --file and
>> --line
> 
> Added it yesterday.
> 
>>
>>> ba (break address) -> break set -a
>>> bn (break name) -> break set -n
>>>
>>> would provide a convenient way to set the most common classes of
>>> breakpoints while not precluding access to all the other options
>>> available to “break set”.  We could still leave “b” by itself for the
>>> _regex_break command - people who’ve figured out it’s intricacies
>>> shouldn’t lose their investment.  This would be purely additive.
>>>
>>> What do people think?
>>
>> Can we modify the _regex_break to accept options at the start or end
>> of the command somehow? 
>>
> 
> When the principle of so much of the rest of the lldb command line is
> that this sort of positional ordering is NOT necessary, doing this would
> be a shame.  At that point, I think Jonas suggestion of having a command
>  “break break-spec-set” or whatever, that took the breakpoint modify
> option group and then a specifier as an argument(s) which get parsed in
> the same way that “_regexp_break” does would be a better long-term
> supportable option.


Couldn't we have "b" command work the same way as the "expr" command? If
the user passes no arguments then he can just do "b whatever". And if he
also wants to add any parameters then he can do "b --hardware -- whatever".

The "--" is slightly unfortunate, but it's at least consistent with our
other commands taking raw input. We could avoid that by making the
command not take raw input. I think most of the "modes" of the "b"
command wouldn't need quoting in most circumstances -- source regex and
"lib`func" modes being exceptions.

On 21/07/2020 20:13, Jonas Devlieghere via lldb-dev wrote:
>  Furthermore, with a first-class command we can do a better job on the
help front which is really underwhelming for _regexp_break command aliases.

FWIW, this is the first time that I looked at the help for the "b"
command, and I have to say I found it more understandable than the "br
set" command. :P

"br set" help starts with a long list command switches, which are
supposed to show which options can be used together. I think this sort
of listing is nice when the command has a couple of modes and a few
switches, but it really misses the mark when it ends up listing 11 modes
with approximately 20 switches in each one.

This is then followed by descriptions of the 20 or so switches. This
list is alphabetical, which means the most commonly used options end up
buried between 

Re: [lldb-dev] RFC: -flimit-debug-info + frame variable

2020-07-21 Thread Pavel Labath via lldb-dev
On 20/07/2020 23:25, Jim Ingham wrote:
> It seems like you are having to work hard in the ValueObject system because 
> you don’t want to use single AST Type for the ValueObject’s type.  Seems like 
> it be much simpler if you could cons up a complete type in the 
> ScratchASTContext, and then use the underlying TypeSystem to do the layout 
> computation.
> 
> Preserving the full type in the scratch context also avoids other problems.  
> For instance, suppose module A has a class that has an opaque reference to a 
> type B.  There is a full definition of B in modules B and C.  If you make up 
> a ValueObject for an object of type A resolving the full type to the one in 
> Module B you can get into trouble.  Suppose the next user step is over the 
> dlclose of module B.  When the local variable goes to see if it has changed, 
> it will stumble across a type reference to a module that’s no longer present 
> in the program.  And if somebody calls RemoveOrphanedModules it won’t even be 
> in the shared module cache.
> 
> You can try to finesse this by saying you can choose the type from the 
> defining module so it can’t go away.  But a) I don’t think you can know that 
> for non-virtual classes in C++ and I don’t think you guarantee you can know 
> how to do that for any given language.
> 
> I wonder if it wouldn’t be a better approach to build up a full compiler-type 
> by importing the types you find into the scratch AST context.  That way you 
> know they can’t go away.   And since you still have a full CompilerType for 
> the variable, you can let the languages tell you where to find children based 
> on their knowledge of the types.
> 

I do see the attractiveness of constructing a full compiler type. The
reason I am hesitant to go that way is that it seems to me that this
would negate the two main benefits of the frame variable command over
the expression evaluator: a) it's fast; b) it's less likely to crash.

And while I don't think it will be as slow or as crashy as the
expression evaluator, the usage of the AST importer will force a lot
more types to be parsed than are strictly needed for this functionality.
And the insertion of all potentially conflicting types from different
modules into a single AST context is also somewhat worrying.

The dlclose issue is an interesting one. Presumably, we could ensure
that the module does not go away by storing a module shared (or weak?)
pointer somewhere inside the value object. BTW, how does this work with
ValueObject casts right now? If I cast a ValueObject to a CompilerType
belonging to a different module, does anything ensure this module does
not go away? Or when dereferencing a pointer to a type which is not
complete in the current module?

I'm hoping that this stuff won't be "hard work". I haven't prototyped
the code yet, but I am hoping to keep this lookup code in under 200 LOC.
And as Greg points out, there are ways to put this stuff into the type
system -- I'm just not sure whether that is needed given that the
ValueObject class is the only user of the GetIndexOfChildMemberWithName
interface. The whole function is pretty clearly designed with
ValueObject::GetChildMemberWithName in mind.

Another thing I like about this approach is that it will mostly use the
same code path for the limit and no-limit debug info scenarios. OTOH,
I'm pretty sure we would want to use the scratch context thingy only for
types that are really not complete in their own modules, which would
leave the scratch context method as a fairly complex, but rarely
exercised path.

pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


[lldb-dev] RFC: -flimit-debug-info + frame variable

2020-07-20 Thread Pavel Labath via lldb-dev
Hello all,

With the expression evaluation part of -flimit-debug-info nearly
completed, I started looking at doing the same for the "frame variable"
command.

I thought this would be simpler than the expression evaluator, since we
control most of that code. However, I have hit one snag, hence this RFC.

The problem centers around how to implement
ValueObject::GetChildMemberWithName, which is the engine of the
subobject resolution in the "frame variable" command. Currently, this
function delegates most of the work to
CompilerType::GetIndexOfChildMemberWithName, which returns a list of (!)
indexes needed to access the relevant subobject. The list aspect is
important, because the desired object can be in a base class or in a
C11-style anonymous struct member.
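
For example, given something like this (anonymous struct members are a
compiler extension in C++):

struct Base { int x; };
struct S : Base {
  struct {   // anonymous struct member
    int y;
  };
};

looking up "x" on a value of type S yields (roughly) the index path
{0, 0} -- the Base subobject, then its first member -- while looking up
"y" yields {1, 0} -- the anonymous member, then the field inside it.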

The CompilerType instance in question belongs to the type system of the
module from which we retrieved the original variable. Therein lies the
problem -- this type system does not have complete information about the
contents of the base class subobjects.

Now, my question is what to do about it. At the moment, it seems to me
that the easiest solution to this problem would be to replace
CompilerType::GetIndexOfChildMemberWithName with two new interfaces:
- Get(IndexOf)**Direct**ChildMemberWithName -- return any direct
children with the given name
- IsTransparent -- whether to descend into the type during name lookups
(i.e., is this an anonymous struct member)

The idea is that these two functions (in conjunction with existing
methods) can provide their answers even in a -flimit-debug-info setting,
and they also provide enough information for the caller to perform the
full name lookup itself. It would first check for direct members, and
if no matches are found, (recursively) proceed to look in all the
transparent members and base classes, switching type systems if the
current one does not contain the full type definition.

The downside of that is that this would hardcode a specific, C++-based
algorithm, which may not be suitable for all languages. Swift has a
fairly simple inheritance model, so I don't think this should be a
problem there, but, for example, Python uses a slightly different method
to resolve ambiguities. The second downside is that a faithful
implementation of the C++ model, including virtual inheritance
dominance, is going to be fairly complicated.

The first issue could be solved by moving this logic into the clang
plugin, but making it independent of any specific type system instance.
The second issue is unavoidable, except by creating a unified view of
the full type in some scratch ast context, as we do for expression
evaluation.

That said, it's not clear to me how faithful we need the "frame
variable" algorithm to be. The frame variable syntax does not precisely
follow C++ semantics anyway, and a simple "recurse into subclasses"
algorithm is going to be understandable and "close enough" under nearly
all circumstances. Virtual inheritance is used very rarely, and
shadowing of members defined in a base class is even rarer.

While analysing this code I've found much more serious bugs (e.g.,
accessing a transparent member fetches a random other value if the class
it is in also has base classes; fetching a transparent member in a base
class does not work at all), which seem to have existed for quite some
time without being discovered.

For that reason I am tempted to just implement a basic "recurse into
subclasses" algorithm inside ValueObject::GetChildMemberWithName, and
not bother with the virtual inheritance details, nor with being able to
customize this algorithm to the needs of a specific language.
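
In pseudo-C++, the algorithm I have in mind would look something like
this (toy types, not the real lldb interfaces):

#include <cstddef>
#include <string>
#include <vector>

struct Type {
  struct Child {
    std::string name;
    const Type *type;
    bool descend; // base class or anonymous member: look inside it
  };
  std::vector<Child> children;
};

// Direct members win; otherwise recurse into bases and anonymous
// members, recording the index path as we go. Deliberately no attempt
// to model virtual inheritance dominance.
static bool FindMemberPath(const Type &t, const std::string &name,
                           std::vector<std::size_t> &path) {
  for (std::size_t i = 0; i < t.children.size(); ++i) {
    if (t.children[i].name == name) {
      path.push_back(i);
      return true;
    }
  }
  for (std::size_t i = 0; i < t.children.size(); ++i) {
    if (t.children[i].descend) {
      path.push_back(i);
      if (FindMemberPath(*t.children[i].type, name, path))
        return true;
      path.pop_back();
    }
  }
  return false;
}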

What do you think?

regards,
Pavel
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] Pre-merge lldb testing

2020-05-19 Thread Pavel Labath via lldb-dev
I believe pre-merge testing would be very useful.

The thing which I would find very useful (let's call it a feature
request) is to be able to get some sort of an overview/dashboard of all
the runs on this infrastructure and their results. It doesn't have to be
super detailed -- the question I want an answer to is "which builds have
an lldb test failing", and the reason is to check for possible flakiness.

The reason I am bringing this up is that lldb tests tend to be flakier
than most llvm tests (for various reasons). I've been trying to hunt
down all the causes of this, and I believe we're in pretty good shape
right now (I haven't seen a flaky build on linux for at least three
weeks), but it is not uncommon for new test setups to uncover new
sources of flakiness.

In particular, I am interested in the behavior of tests exercising the
hardware debug features of cpus (e.g. hardware watchpoints), as I know
that a lot of virtualization environments don't virtualize these
properly/reliably (and IIUC, this infrastructure uses at least one level
of virtualization). If these tests are not reliable there, we should
avoid running them on this infrastructure.

I don't think this needs to block anything, but I want to make sure
everyone is aware of the possible issues.

pl

On 17/05/2020 03:08, Eric Christopher via lldb-dev wrote:
> 
> 
> On Sat, May 16, 2020 at 12:18 PM Greg Clayton wrote:
> 
> 
> 
>> On May 15, 2020, at 7:04 PM, Eric Christopher via lldb-dev wrote:
>>
>> Hi All,
>>
>> We've been testing[1] a number of patches upstream by default via
>> some pre-merge checks in phabricator. I was thinking of turning
>> them on for lldb as well. Mostly it well just help people know
>> whether or not they've broken lldb before they commit something,
>> but won't stop committing or do anything else that direction.
> 
> I am all for it! 
> 
>> Let me know what you think and otherwise I'd like to turn it on in
>> a week or so. This will also help keep the test suite a little
>> cleaner on linux FWIW.
> 
> Please do.
> 
>> There are a few additional links down below and if you have any
>> questions send them my way.
> 
> Will the lldb tests be run automagically if and only if lldb code is
> modified in the patch? 
> 
> 
> I don't think our dependencies in cmake are that good for tests ...
> especially since lldb uses a largeish chunk of clang and llvm anyhow :)
> 
> -eric
> 
>  
> 
>> Thanks!
>>
>> -eric
>>
>>
>> [1]
>> 
>> https://github.com/google/llvm-premerge-checks/blob/master/docs/user_doc.md
>> [2] https://reviews.llvm.org/project/members/78/
>> [3] https://github.com/google/llvm-premerge-checks/issues
>> ___
>> lldb-dev mailing list
>> lldb-dev@lists.llvm.org 
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
> 
> 
> ___
> lldb-dev mailing list
> lldb-dev@lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
> 

___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] Is there a just-my-code like debugging mode for LLDB?

2020-05-14 Thread Pavel Labath via lldb-dev
On 14/05/2020 11:56, Jaroslav Sevcik wrote:
> 
> The svr4 support seems to be off by
> default: 
> https://github.com/llvm/llvm-project/blob/2974b3c566d68f1d7c907f891137cf0292dd35aa/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemoteProperties.td#L14
> 
> It would definitely make sense to turn it on by default.

Done (deea174ee5).

pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] Is there a just-my-code like debugging mode for LLDB?

2020-05-14 Thread Pavel Labath via lldb-dev
On 14/05/2020 03:50, Emre Kultursay via lldb-dev wrote:
> One thing I want to try is "settings set
> plugin.process.gdb-remote.use-libraries-svr4 true".

Isn't that the default? The reason this setting was added was so we
could test the non-svr4 code path without forcibly disabling xml support
(and possibly work around any svr4 issues). However, having it on by
default definitely makes sense.

pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] Is there a just-my-code like debugging mode for LLDB?

2020-05-11 Thread Pavel Labath via lldb-dev
Hi Emre,

I have to say I'm pretty sceptical about this approach, for all the
reasons that Jim already mentioned. Making it so that lldb pretends the
other shared libraries don't exist (which is what I assume your
prototype does) is fairly easy, but it does create a very long tail of
"feature X does not work in this mode" problems (my app crashed in a
third-party library -- maybe because I passed it a wrong argument -- but
I cannot even unwind into my code to see what I passed).

It's fixing these problems that will make the solution very complicated.

On 08/05/2020 22:21, Jim Ingham via lldb-dev wrote:
> Note, if you are reading the binaries out of memory from the device, and
> don’t have local symbols, things go much more slowly.  gdb-remote is NOT
> a high bandwidth protocol, and fetching all the symbols through a series
> of memory reads is pretty slow.  lldb does have a setting to control
> what you do with binaries that don’t exist on the host
> (target.memory-module-load-level) that controls this behavior.  But it
> just deals with what we do and don’t read and makes no attempt to
> ameliorate the fallout from having a reduced view of the symbols in the
> program.
We don't load modules from memory on android (with elf that doesn't even
work right now in lldb). We download files directly via the adb
protocol, which is much faster than gdb-remote's qFile. So I'm afraid
the problem is going to be more complicated than that.


pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [RFC] Upstreaming Reproducer Capture/Replay for the API Test Suite

2020-04-07 Thread Pavel Labath via lldb-dev
Hi Jonas, Davide,

I am not exactly thrilled by the ever-growing number of "modes" our test
suite can be run in. However, it seems that's a battle I am destined to
lose, so I'll just repeat what I've been saying for some time now.

I don't believe that either of these funny "modes" should be the _only_
way to test a given piece of code. Using the extra modes to increase
test coverage is fine, and I can certainly appreciate the value of this
kind of exploratory testing (I've added some temporary modes locally
myself when working on various patches), but I still believe that every
patch should have an accompanying test(s) which can run in the default
"mode" and with as few dependencies as possible.

I believe Jonas is aware of that, and his existing work on reproducers
reflects that philosophy, but I think it's still important to spell this
out.

regards,
pl

On 06/04/2020 23:32, Davidino Italiano via lldb-dev wrote:
> 
> 
>> On Apr 6, 2020, at 2:24 PM, Jonas Devlieghere via lldb-dev 
>>  wrote:
>>
>> Hi everyone,
>>
>> Reproducers in LLDB are currently tested through (1) unit tests, (2) 
>> dedicated end-to-end shell tests and (3) the `lldb-check-repro` suite which 
>> runs all the shell tests against a replayed reproducer. While this already 
>> provides great coverage, we're still missing out on about 800 API tests. 
>> These tests are particularly interesting to the reproducers, because as 
>> opposed to the shell tests, which only exercises a subset of SB API calls 
>> used to implement the driver, they cover the majority of the API surface.
>>
>> To further qualify reproducer and to improve test coverage, I want to 
>> capture and replay the API test suite as well. Conceptually, this can be 
>> split up in two stages: 
>>
>>  1. Capture a reproducer and replay it with the driver. This exercises the 
>> reproducer instrumentation (serialization and deserialization) for all the 
>> APIs used in our test suite. While a bunch of issues with the reproducer 
>> instrumentation can be detected at compile time, a large subset only 
>> triggers through assertions at runtime. However, this approach by itself 
>> only verifies that we can (de)serialize API calls and their arguments. It 
>> has no knowledge of the expected results and therefore cannot verify the 
>> results of the API calls.
>>
>>  2. Capture a reproducer and replay it with dotest.py. Rather than having 
>> the command line driver execute every API call one after another, we can 
>> have dotest.py call the Python API as it normally would, intercept the call, 
>> replay it from the reproducer, and return the replayed result. The 
>> interception can be hidden behind the existing LLDB_RECORD_* macros, which 
>> contains sufficient type info to drive replay. It then simply re-invokes 
>> itself with the arguments deserialized from the reproducer and returns that 
>> result. Just as with the shell tests, this approach allows us to reuse the 
>> existing API tests, completely transparently, to check the reproducer output.
>>
>> I have worked on this over the past month and have shown that it is possible 
>> to achieve both stages. I have a downstream fork that contains the necessary 
>> changes.
>>
>> All the runtime issues found in stage 1 have been fixed upstream. With the 
>> exception of about 30 tests that fail because the GDB packets diverge during 
>> replay, all the tests can be replayed with the driver.
>>
>> About 120 tests, which include the 30 mentioned earlier, fail to replay for 
>> stage 2. This isn't entirely unexpected, just like the shell tests, there 
>> are tests that simply are not expected to work. The reproducers don't 
>> currently capture the output of the inferior and synchronization through 
>> external files won't work either, as those paths will get remapped by the 
>> VFS. This requires manually triage.
>>
>> I would like to start upstreaming this work so we can start running this in 
>> CI. The majority of the changes are limited to the reproducer 
>> instrumentation, but some changes are needed in the test suite as well, and 
>> there would be a new decorator to skip the unsupported tests. I'm splitting 
>> up the changes in self-contained patches, but wanted to send out this RFC 
>> with the bigger picture first.
> 
> I personally believe this is a required step to make sure:
> a) Reproducers can jump from being a prototype idea to something that can 
> actually run in production
> b) Whenever we add a new test [or presumably a new API] we get coverage 
> for-free.
> c) We have a verification mechanism to make sure we don’t regress across the 
> large surface API and not only what the unittests & shell tests cover.
> 
> I personally would be really glad to see this being upstreamed. I also would 
> like to thank you for doing the work in a downstream branch until you proved 
> this was achievable.
> 
> —
> D
> 
> ___
> lldb-dev mailing list
> lldb-dev@lists.llvm.org
> 

Re: [lldb-dev] Saving and restoring STDIN in the ScriptInterpreter

2020-04-07 Thread Pavel Labath via lldb-dev
Hi Davide,

I believe your guess about background processes is correct. I think that
the lldb process is stopped (or is continually getting stopped and
restarted) by SIGTTOU.


Macro: int SIGTTOU

This is similar to SIGTTIN, but is generated when a process in a
background job attempts to write to the terminal or ***set its modes***.
Again, the default action is to stop the process. SIGTTOU is only
generated for an attempt to write to the terminal if the TOSTOP output
mode is set; see Output Modes.


Saving/restoring tty state before/after entering the python interpreter
does not sound like an unreasonable thing to do. However, I do see two
problems with this code:
- it unconditionally uses STDIN_FILENO -- it should use
SBDebugger::GetInputFileHandle (or equivalent) instead
- it has no test, and it's impossible to track down why exactly it
exists and whether it is really needed

With that in mind, I don't have a problem with deleting this code (and
later re-adding it properly, if needed) -- I might even say it's a good
idea. I cannot guarantee this will solve your problem completely, since
any other operation which attempts to access stdin will trigger the same
problem. However, this, in combination with
SBDebugger::SetInputFileHandle(/dev/null), should in theory be
sufficient, since nothing should be accessing the process stdin.

That said, if you just want to make your creduce script work,
redirecting stdin to /dev/null (lldb ... < /dev/null) might also do the
trick.

> Hi Pavel, Jonas,
> 
> I was trying to reduce a bug through c-reduce, so I decided to write a
> SBAPI script to make it easier.
> I did find out, that after the first iteration, the reduction gets stuck
> forever.
> I sampled the process and I saw the following (trimmed for readability).
> 
> Call graph:
> […]
> 8455 
> lldb_private::CommandInterpreter::GetScriptInterpreter(bool)  (in _lldb.so) + 
> 84  [0x111aff826]
>   8455 
> lldb_private::PluginManager::GetScriptInterpreterForLanguage(lldb::ScriptLanguage,
>  lldb_private::CommandInterpreter&)  (in _lldb.so) + 99  [0x111a1efcf]
> 8455 
> lldb_private::ScriptInterpreterPython::CreateInstance(lldb_private::CommandInterpreter&)
>   (in _lldb.so) + 26  [0x111d128f4]
>   8455 
> std::__1::shared_ptr 
> std::__1::shared_ptr::make_shared(lldb_private::CommandInterpreter&&&)
>   (in _lldb.so) + 72  [0x111d1b976]
> 8455 
> lldb_private::ScriptInterpreterPython::ScriptInterpreterPython(lldb_private::CommandInterpreter&)
>   (in _lldb.so) + 353  [0x111d11ff3]
>   8455 
> lldb_private::ScriptInterpreterPython::InitializePrivate()  (in _lldb.so) + 
> 494  [0x111d12594]
> 8455 
> (anonymous namespace)::InitializePythonRAII::~InitializePythonRAII()  (in 
> _lldb.so) + 146  [0x111d1b446]
>   
> 8455 lldb_private::TerminalState::Restore() const  (in _lldb.so) + 74  
> [0x111ac8268]
> 
> 8455 tcsetattr  (in libsystem_c.dylib) + 110  [0x7fff7b95b585]
>   
> 8455 ioctl  (in libsystem_kernel.dylib) + 151  [0x7fff7ba19b44]
>   
>   8455 __ioctl  (in libsystem_kernel.dylib) + 10  [0x7fff7ba19b5a]
> 
> 
> It looks like lldb gets stuck forever in `tcsetattr()`, and there are no
> other threads waiting so it’s not entirely obvious to me why it’s
> waiting there.
> I was never able to reproduce this with an interactive session, I
> suspect this is somehow related to the fact that c-reduce spawns a
> thread in the background, hence it doesn’t have a TTY associated.
> I looked at the code that does this, and I wasn’t really able to find a
> reason why we need to do this work. Jim thinks it might have been needed
> historically.
> `git blame` doesn’t really help that much either. If I remove the code,
> everything still passes and it’s functional, but before moving forward
> with this I would like to collect your opinions.
> 
> $ git diff
> diff --git
> a/lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPython.cpp
> b/lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPython.cpp
> index ee94a183e0d..c53b3bd0fb6 100644
> ---
> a/lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPython.cpp
> +++
> b/lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPython.cpp
> @@ -224,10 +224,6 @@ struct InitializePythonRAII {
>  public:
>    InitializePythonRAII()
>        

Re: [lldb-dev] Default script language

2020-04-02 Thread Pavel Labath via lldb-dev
+1 for making this a cmake option.

That said, I don't think we can implement this using #ifdefs.
lldb-enumerations.h is a part of our public API; Config.h isn't (it
theoretically could be, but I don't think we want that).

I think the simplest way to achieve this would be to make
eScriptLanguageDefault an enum value in its own right, and handle the
translation to an actual language internally.
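
I.e., something along these lines (a sketch, not an actual patch):

enum ScriptLanguage {
  eScriptLanguageNone,
  eScriptLanguagePython,
  eScriptLanguageLua,
  eScriptLanguageDefault, // placeholder, resolved internally
};

ScriptLanguage ResolveScriptLanguage(ScriptLanguage lang) {
  if (lang != eScriptLanguageDefault)
    return lang;
#if LLDB_ENABLE_PYTHON
  return eScriptLanguagePython;
#elif LLDB_ENABLE_LUA
  return eScriptLanguageLua;
#else
  return eScriptLanguageNone;
#endif
}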

pl


On 02/04/2020 01:12, Ted Woodward via lldb-dev wrote:
> I agree with Jim - it should be a cmake setting, defaulting to Python. If the 
> person building lldb wants to change the default scripting language from 
> Python to Lua, it should be easy. Since we now support 2 scripting languages, 
> we should have an easy way for the user to see which are supported, and which 
> is the default if there are more than 1 supported. Maybe in lldb --version?
> 
> Ted
> 
> -Original Message-
> From: lldb-dev  On Behalf Of Jim Ingham via 
> lldb-dev
> Sent: Wednesday, April 1, 2020 5:43 PM
> To: Ed Maste 
> Cc: LLDB 
> Subject: [EXT] Re: [lldb-dev] Default script language
> 
> Right now, Lua is not nearly as well supported as Python, so it makes sense 
> that if both Python and Lua are available Python should be the default.  But 
> at some point Lua will become an equal to Python.  When that happens, it 
> seems to me the default scripting language choice should be up to the package 
> distributor.  I don’t see why we need to weigh in on that.  That would imply 
> that the default should be an independent build setting.  Not sure that means 
> we need to do it that way now, but if we don’t want to do it twice…
> 
> Jim
> 
> 
>> On Apr 1, 2020, at 2:09 PM, Ed Maste via lldb-dev  
>> wrote:
>>
>> In lldb/include/lldb/lldb-enumerations.h we have:
>> eScriptLanguageDefault = eScriptLanguagePython
>>
>> I'd like to do something like:
>> #if LLDB_ENABLE_PYTHON
>> eScriptLanguageDefault = eScriptLanguagePython
>> #elif LLDB_ENABLE_LUA
>> eScriptLanguageDefault = eScriptLanguageLua
>> #else
>> eScriptLanguageDefault = eScriptLanguageNone
>> #endif
>>
>> if we could include Config.h, or achieve the same effect in some other
>> way if we cannot. Does this seem reasonable?
>>
>> I'm interested in this for lldb in the FreeBSD base system. We have
>> lua available already (and no python) and I've integrated our liblua
>> it into lldb, but it required "--script-language lua" on the command
>> line. For now I'll just change the default to be eScriptLanguageLua in
>> our tree, but would like to have this "just work" upstream.
>> ___
>> lldb-dev mailing list
>> lldb-dev@lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
> 
> ___
> lldb-dev mailing list
> lldb-dev@lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
> ___
> lldb-dev mailing list
> lldb-dev@lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
> 

___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] How to associate a bug and CL?

2020-03-26 Thread Pavel Labath via lldb-dev
On 26/03/2020 00:36, Emre Kultursay via lldb-dev wrote:
> llvm-project dev noob here.
> 
> I opened a Bug and have some CLs to fix them. How do I link
> the CL with the bug so that the code reviewer sees what bug I'm fixing?
> 
> 

Hi Emre,

we have no metadata or anything like that to do this. Normally, one just
mentions the bug number somewhere in the patch title or description. The
canonical way to reference llvm bugs is llvm.org/PR12345.

regards,
pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] LLDB problem about building with Python

2020-03-12 Thread Pavel Labath via lldb-dev
On 12/03/2020 10:50, Rui Hong via lldb-dev wrote:
> Hi LLDB devs,
> 
> I'm working on porting LLDB to my own architecture.
> I choose to use the target-definition-file(python) to let LLDB support
> my architecture based on my situation(already have a workable GDB-stub
> and the target is an embedded DSP). The usage:
>  (lldb) settings set plugin.process.gdb-remote.target-definition-file
> /path/to/_target_definition.py
>  (lldb) gdb-remote 
> So I think I definitely need to rebuild LLDB with python support(cmake
> with -DLLDB_ENABLE_PYTHON=1 according to
> https://lldb.llvm.org/resources/build.html). But problem comes:
> getting this error:
> 
> CMake Error: The following variables are used in this project, but they
> are set to NOTFOUND.
> Please set them or make sure they are set and tested correctly in the
> CMake files:
> /export/pfs/home/lte_dsp/hongrui/DEBUG/jihai_lldb/jihai_lldb/lldb/scripts/Python/modules/readline/libedit_INCLUDE_DIRS
>    used as include directory in directory
> /export/pfs/home/lte_dsp/hongrui/DEBUG/jihai_lldb/jihai_lldb/lldb/scripts/Python/modules/readline
> libedit_LIBRARIES (ADVANCED)
>     linked by target "readline" in directory
> /export/pfs/home/lte_dsp/hongrui/DEBUG/jihai_lldb/jihai_lldb/lldb/scripts/Python/modules/readline
> 
> This problem persists even after I add -DLLDB_ENABLE_LIBEDIT=0
> My cmake version is 3.5.2, the LLDB/LLVM version I choose to work with
> is 7.0.1

Hi Rui,

LLDB_ENABLE_LIBEDIT was introduced only recently (you're using the
current documentation but applying it to an old lldb). Back in 7.0,
this was called LLDB_DISABLE_LIBEDIT (with the inverted meaning of values).

pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [llvm-dev] Continuing from dbgtrap on different targets

2020-03-04 Thread Pavel Labath via lldb-dev
On 04/03/2020 21:45, Jim Ingham via llvm-dev wrote:
> As you have seen, different machine architectures do different things after 
> hitting a trap.  On x86_64, the trap instruction is executed, and then you 
> stop, so the PC is left after the stop.  On arm64, when execution halts the 
> pc is still pointing at the trap instruction.
> 
> I don't think lldb should be in the business of telling systems how they 
> should report stops, especially since that is certainly something we can 
> handle in lldb.
> 
> For traps that lldb recognizes as ones it is using for breakpoints, it 
> already has to handle this difference for you.  But for traps we know nothing 
> about we don't do anything special. 
> 
> I think it would be entirely reasonable that whenever lldb encounters a trap 
> instruction that isn't one of ours it should always move the PC after the 
> trap before returning control to the user.  I can't see why you would want to 
> keep hitting the trap over and over.  I've received several bugs (on the 
> Apple bug reporter side) for this feature.  This might be something we teach 
> lldb-server & debugserver to do, rather than lldb but that's an 
> implementation detail...
> 
> For now, on architectures where the trap doesn't execute, you just need to 
> move the pc past the trap by hand (with the "thread jump" command) before 
> continuing.  That has always been safe on arm64 so far as I can tell.
> 
> Jim

Yes, this is something that has bugged me too.

While I think it would be nice if the OSes hid these architecture quirks
(hell, I think it would be nice if the CPU manufacturers made this
consistent so that the OS doesn't need to hide it), I think that
changing that at this point is very unlikely, and so working around it
in lldb is probably the best we can do.
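
For reference, the manual workaround (e.g. on arm64, where the trap
instruction is four bytes) looks something like this:

(lldb) register write pc `$pc+4`
(lldb) continue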

I am not sure what the official position on continuing from a debug
trap is, but I think that without that ability, the concept would be
pretty useless. A quick example shows that clang
produces the "expected" output even at -O3. In fact, on aarch64,
__builtin_debugtrap() and __builtin_trap() produce the same instruction,
and the only difference between them is that the latter also triggers
DCE of everything coming after it.

pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] Odd behavior with Python on Windows (loading 2 copies of liblldb.dll/_lldb.pyd)

2020-02-24 Thread Pavel Labath via lldb-dev
On 21/02/2020 23:32, Ted Woodward via lldb-dev wrote:
> Looking into differences, I’m using swig 3.0.12 and the bot is using
> swig 3.0.2. I’m building with 3.0.2 on my machine right now, but it will
> take a while to finish!

I think this could very likely be the cause. We use a different
mechanism for importing the lldb module starting with swig-3.0.11. This
could very well be one of the side effects.

pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] How to tell if an address belongs to the heap?

2020-02-07 Thread Pavel Labath via lldb-dev
Thanks for the explanation, Vangelis.

It sounds like binary instrumentation would be the best approach for this,
as this is pretty much exactly what msan does. If recompilation is not an
option, then you might be able to get something to work via lldb, but I
expect this to be _incredibly_ slow (like 1000x, or more). One thing I
might consider in your place is some kind of an in-process solution. For
instance, if you intercept mmap (via LD_PRELOAD or something), then you
could have it map all anonymous memory (aka the heap) as read-only. This
way you'll get a SIGSEGV every time somebody tries to write to that
address. You could intercept that signal and do your analysis there.
Assuming heap writes are not very common, this might even give you
reasonable performance.

But this is not going to be super easy either. The trickiest part here will
be resuming the program -- you'll need to remap the page read-write, do a
single step, and then set it to read-only again.

pl

On Fri, 7 Feb 2020 at 01:40, Vangelis Tsiatsianas wrote:

> Thank you for your thorough and timely response, Pavel! 
>
> Your suggestions might actually cover completely what I am attempting to
> achieve.
>
> Unfortunately, I am not able to disclose the exact reason I need it, but I
> want to track all heap writes, in order to detect modifications in the heap
> and save both the old and the newly written value.
>
> For now, this translates to tracking common x86 assembly instructions (mov{l,
> w, d, q}) for a single thread ―supporting more “exotic” instructions like
> SIMD, multiple architectures or threads is not currently a goal.
>
> Another method could also be an LLVM instrumentation pass, however I
> would like to avoid recompiling and modifying the binary, thus I focus on
> LLDB, even if I end up missing a few writes that way.
>
> I was initially looking for a more complete, cross-platform solution (see:
> http://lists.llvm.org/pipermail/llvm-dev/2019-November/136876.html), but
> the solution proved to be too time consuming for the timeframe I have
> available for my master’s (ending in March).
>
>
> ― Vangelis
>
>
> On 7 Feb 2020, at 01:20, Pavel Labath  wrote:
>
> In general, getting this kind of information is pretty hard, so lldb does
> not offer you an out-of-the-box solution for it, but it does give you tools
> which you can use to approximate that.
>
> If I wanted to do something like this, the first thing I'd try to do is
> run "image lookup -a 0xaddr". If this doesn't return anything then the
> address does not correspond to any known module. This rules out code,
> global variables, and similar. Then you can run through all of the threads
> and do a "memory region $SP", which will give you bounds of the memory
> allocation around the stack pointer. If your address is in one of these
> ranges, then it's a stack address. Otherwise, it's probably heap (though
> you can never be 100% sure of that).
>
> However, it's not fully clear to me what it is that you're trying to do
> here. Maybe if you explain the higher level problem that you're trying to
> solve, we can come up with a better solution.
>
> pl
>
> On Thu, 6 Feb 2020 at 07:40, Vangelis Tsiatsianas via lldb-dev <
> lldb-dev@lists.llvm.org> wrote:
>
>> Hi everyone,
>>
>> I am looking for a way to tell whether a memory address belongs to the
>> heap or not.
>>
>> In other words, I would like to make sure that the address does not
>> reside within any stack frame (even if the stack of the thread has been
>> allocated in the heap) and that it’s not a global variable or instruction.
>>
>> Checking whether it is a valid or correctly allocated address or a
>> memory-mapped file or register is not a goal, so accessing it in order to
>> decide, at the risk of causing a segmentation fault, is an accepted
>> solution.
>>
>> I have been thinking of manually checking the address against the
>> boundaries of each active stack frame, the start and end of the instruction
>> segment and the locations of all global variables.
>>
>> However, I would like to ask where there are better ways to approach this
>> problem in LLDB.
>>
>> Thank you very much, advance! 
>>
>>
>> ― Vangelis
>>
>> ___
>> lldb-dev mailing list
>> lldb-dev@lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>>
>
>
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] How to tell if an address belongs to the heap?

2020-02-06 Thread Pavel Labath via lldb-dev
In general, getting this kind of information is pretty hard, so lldb does
not offer you an out-of-the-box solution for it, but it does give you tools
which you can use to approximate that.

If I wanted to do something like this, the first thing I'd try to do is run
"image lookup -a 0xaddr". If this doesn't return anything then the address
does not correspond to any known module. This rules out code, global
variables, and similar. Then you can run through all of the threads and do
a "memory region $SP", which will give you bounds of the memory allocation
around the stack pointer. If your address is in one of these ranges, then
it's a stack address. Otherwise, it's probably heap (though you can never
be 100% sure of that).
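
In concrete terms (addresses made up; the exact output format varies by
platform), the checks look something like this:

(lldb) image lookup -a 0x00007f8e4c000010
(lldb) memory region $sp
[0x00007ffeefb00000-0x00007ffeefc00000) rw-

If the first command prints nothing, the address is not backed by any
module, and if it also falls outside every thread's stack region, heap
is the most likely remaining candidate.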

However, it's not fully clear to me what it is that you're trying to do
here. Maybe if you explain the higher level problem that you're trying to
solve, we can come up with a better solution.

pl

On Thu, 6 Feb 2020 at 07:40, Vangelis Tsiatsianas via lldb-dev <
lldb-dev@lists.llvm.org> wrote:

> Hi everyone,
>
> I am looking for a way to tell whether a memory address belongs to the
> heap or not.
>
> In other words, I would like to make sure that the address does not reside
> within any stack frame (even if the stack of the thread has been allocated
> in the heap) and that it’s not a global variable or instruction.
>
> Checking whether it is a valid or correctly allocated address or a
> memory-mapped file or register is not a goal, so accessing it in order to
> decide, at the risk of causing a segmentation fault, is an accepted
> solution.
>
> I have been thinking of manually checking the address against the
> boundaries of each active stack frame, the start and end of the instruction
> segment and the locations of all global variables.
>
> However, I would like to ask where there are better ways to approach this
> problem in LLDB.
>
> Thank you very much, advance! 
>
>
> ― Vangelis
>
> ___
> lldb-dev mailing list
> lldb-dev@lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] gdb-remote protocol questions

2020-01-28 Thread Pavel Labath via lldb-dev
On 27/01/2020 19:43, Alexander Zhang via lldb-dev wrote:
> Hi,
> 
> Thanks for pointing me towards stack unwinding. I don't have debug
> information much of the time, so I'm depending on the architecture rules
> for backtracing. A look at the mips ABI plugin shows it uses dwarf
> register numbers to get the register values it needs, and I wasn't
> including them in my qRegisterInfo responses. After fixing this, step
> over and step out appear to work correctly, which is a great help.
> 
> However, backtraces only show 2 frames with the current pc and ra
> values, no matter where I am, so it seems there's some problem getting
> stack frame info from the actual stack. I've attached an unwind log from
> running bt inside a function that should have a deeper backtrace. The
> afa value of 0x looks suspicious to me, but I don't
> really understand where it comes from. The frame before 0x8002ee70
> should, I think, be 0x80026a6c, as that's the pc after stepping out twice.
> 
> Thanks,
> Alexander 
> 

Hi Alexander,

I am pretty sure the AFA is a red herring and you needn't worry about
it. It is only used in some very specific circumstances, when a function
realigns the stack pointer (e.g. when you have an over-aligned local
variable), and only on x86 I believe. Everyone else gets a 0xfff...f
value, and that's fine.
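
For reference, the DWARF numbers travel in the "dwarf" key of each
qRegisterInfo response. A response line might look something like this
(the field values here are made up for illustration):

name:ra;bitsize:32;offset:124;encoding:uint;format:hex;set:General Purpose Registers;dwarf:31;generic:ra;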

pl


Re: [lldb-dev] The future of modern-type-lookup in LLDB

2019-12-12 Thread Pavel Labath via lldb-dev
Thanks for the nice summary Raphael,


this isn't exactly my playground (though I may end up having to poke in
this soon), but overall, it sounds to me like this "modern" thingy is
a better overall design. However, your penultimate bullet point seems
pretty important to me. I don't think we can avoid "merging" namespaces
defined in two shared libraries. If the ExternalASTMerger does not
support this, then it may not work for us.

Given that it is only used from lldb, it may be possible to refactor it
to support this use case, but then again, we should be able to do the
same with the existing mechanism too.

So at the end of the day, I think I'd go for removing this, but I think
that you're probably the most qualified to make a decision here..

pl

On 11/12/2019 11:13, Raphael “Teemperor” Isemann via lldb-dev wrote:
> Hi,
> 
> some of you may have seen that I’ve been working on the ‘modern-type-lookup’ 
> mode in LLDB over the last months. I thought it’s a good time now to start a 
> discussion about what’s the current state, goals and the future of this 
> feature.
> 
> Some background about the topic to get everyone on the same page:
> * LLDB uses Clang’s AST to model the types/functions/variables/etc. in the 
> target program.
> * Clang’s AST is usually represented as an ASTContext (= AST + associated 
> data).
> * LLDB has several ASTContexts for different purposes:
>   - Usually we have several ASTContexts that just contain the AST nodes we 
> create from debug information on the disk.
>   - There are some minor ones floating around like the ASTContext we use to 
> store AST nodes that we load from Clang modules (*.pcm files).
>   - There are temporary ASTContexts we create to run expressions (every 
> expression is its own temporary ASTContext that gets deleted after we ran the 
> expression).
>   - There is one ASTContext into which we put all the persistent 
> $-variables and their associated types.
> * The last two ASTContexts don’t have any associated data source that they 
> represent but are supposed to be ‘filled' by the other two kinds of 
> ASTContexts.
> * To fill an ASTContext we move AST nodes from one ASTContext to another.
> * Moving ASTNodes is done with clang’s ASTImporter implementation which 
> ‘imports’ AST nodes.
> * Nearly all importing in LLDB is done lazily, which the ASTImporter calls 
> ‘MinimalImport’. This means the ASTImporter only imports what it needs for 
> certain declarations and then imports the rest as it is needed.
> * The lazy importing is a big source of problems in LLDB. If we don’t 
> correctly import enough information to make Clang happy then we usually end 
> up hitting an assert. However it is also avoiding the loading unnecessary 
> debug information from disk which makes LLDB faster. There are no accurate 
> numbers on how much faster as we don’t have an easy way to run LLDB without 
> MinimalImport.
> 
> Now let’s move on to the modern-type-lookup part.
> 
> What is modern-type-lookup?
> * modern-type-lookup is a flag that makes LLDB use `clang::ExternalASTMerger` 
> instead of directly using the `clang::ASTImporter` via our ClangASTImporter 
> wrapper.
> * `clang::ExternalASTMerger` is some kind of manager for clang’s ASTImporter. 
> It keeps track of several ASTImporter instances (that each have an associated 
> ASTContext source) and one target ASTContext that all ASTImporters import 
> nodes into. When the ASTImporters import certain nodes into the single target 
> ASTContext, the merger does the bookkeeping to associate the imported information with 
> the ASTImporter/source ASTContext.
> * The ExternalASTMerger also does some other smaller bookkeeping, such as 
> having a ‘reverse ASTImporter’ for every ‘ASTImporter’.
> * The ExternalASTMerger is only used by LLDB (and clang-import-test which is 
> its testing binary).
> 
> What is good about modern-type-lookup:
> * The ExternalASTMerger is better code than our current system of directly 
> (ab-)using the ASTImporter. Most notably it doesn’t use globals like our 
> current system (yay).
> * The ExternalASTMerger is easier to test because it comes with a testing 
> binary (clang-import-test) and it doesn’t depend on anything besides the 
> ASTImporter. In comparison our current system depends on literally all of 
> LLDB.
> * It brings better organisation to our ASTImporter network which allows us to 
> implement some tricky features (e.g. https://reviews.llvm.org/D67803).
> * It actually is a quite small (but sadly very invasive) change to LLDB and 
> Clang.
> 
> What is bad and ugly about modern-type-lookup:
> * modern-type-lookup in LLDB was completely untested until recently. The idea 
> was that testing would be done by running the whole test suite with the setting 
> enabled by default [1] and then fixing the issues found [2], but none of this 
> happened for various reasons. The only dedicated tests for it are the ones I 
> added while I was trying to see what it even does (see the 
> 

Re: [lldb-dev] [RFC] Supporting Lua Scripting in LLDB

2019-12-10 Thread Pavel Labath via lldb-dev
On 09/12/2019 18:27, Jonas Devlieghere wrote:
> On Mon, Dec 9, 2019 at 1:55 AM Pavel Labath  wrote:
>>
>> I think this would be a very interesting project, and would allow us to
>> flesh out the details of the script interpreter interface.
>>
>> A lot of the complexity in our python code comes from the fact that
>> python can be (a) embedded into lldb and (b) lldb can be embedded into
>> python. It's been a while since I worked with lua, but from what I
>> remember, lua was designed to make (a) easy, and I don't think (b) was
>> ever a major goal (though it can always be done somehow, of course)..
>>
>> Were you intending to implement both of these directions or just one of
>> them ((a), I guess)?
> 
> Thanks for pointing this out. Indeed, my goal is only to support (a)
> for exactly the reasons you brought up.
> 
>> The reason I am asking this is because doing only (a) will definitely
>> make lua support simpler than python, but it will also mean it won't be
>> a "python-lite".
>>
>> Both of these options are fine -- I just want to understand where you're
>> going with this. It also has some impact on the testing strategy, as our
>> existing python tests are largely using mode (b).
> 
> That's part of my motivation for *not* doing (b). I really don't want
> to create/maintain another (Lua driven) test suite.

I certainly see where you're coming from, but I'm not sure if this will
actually achieve the intended effect. The thing is, not doing (b) does
not really reduce the testing surface that much -- it just makes the
tested APIs harder to reach. If Python didn't have (b), we wouldn't be
able to do "import lldb" in python, but that's about it. The full lldb
python api would still be reachable by starting lldb and typing "script".
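
For example (an illustrative session):

(lldb) script
>>> print(lldb.debugger.GetVersionString())
lldb version ...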

What this means is that if lua doesn't support (b) then the lua bindings
will need to be tested by driving lldb from within the lua interpreter
embedded within lldb -- which doesn't exactly sound like a win. I'm not
saying this means we *must* implement (b), or that the alternative
solution will be more complex than testing via (b) (though I'm sure we
could come up with something way simpler than dotest), but I think we
should try to come up with a good testing story very early on.

Speaking of testing, will there be any bot configured to build the
lua code?

> 
>> Another question I'm interested in is how deeply will this
>> multi-interpreter thing go? Will it be a build time option, will it be
>> selectable at runtime, but we'll have only one script interpreter per
>> SBDebugger, or will we be able to freely mix'n'match scripting languages?
> 
> There is one script interpreter per debugger. As far as I can tell
> from the code this is already enforced.
> 
>> I think the last option would be best because of data formatters
>> (otherwise one would have a problem if some of his data formatters are
>> written in python and others in lua), but it would also create a lot
>> more of new api surface, as one would have to worry about consistency of
>> the lua and python views of lldb, etc.
> 
> That's an interesting problem I didn't think of. I'm definitely not
> excited about having the same data formatter implemented in both
> scripting languages. Mixing scripting languages makes sense for when
> your LLDB is configured to support both Python and Lua, but what do
> you do for people that want only Lua? They might still want to
> re-implement some data formatters they care about...

Well, if they really have an lldb build which only supports one scripting
language, then yes, they'd have to reimplement something -- there isn't
anything else that can be done. But it'd be a pity if someone had lldb 
which supports *both* languages and he is forced to choose which data
structures he wants pretty-printed.

> Anyway, given
> that we don't maintain/ship data formatters in Python ourselves, maybe
> this isn't that big of an issue at all?

Hard to say without this thing actually being used. I certainly don't
think this is something that we need to solve right now, though I think
it something that we should be aware of, and not close the door on that
possibility completely.

And BTW we do ship python data formatters right now. The libc++ and
libstdc++ have some formatters written in python -- with the choice of
formatters being pretty arbitrary.
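
For anyone unfamiliar, such a formatter is just a python function registered
with "type summary add". A minimal, made-up example:

# my_formatters.py -- the Point type and its x/y members are hypothetical.
def point_summary(valobj, internal_dict):
    x = valobj.GetChildMemberWithName("x").GetValue()
    y = valobj.GetChildMemberWithName("y").GetValue()
    return "(%s, %s)" % (x, y)

# (lldb) command script import my_formatters.py
# (lldb) type summary add -F my_formatters.point_summary Point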

pl


Re: [lldb-dev] [RFC] Supporting Lua Scripting in LLDB

2019-12-09 Thread Pavel Labath via lldb-dev
I think this would be a very interesting project, and would allow us to
flesh out the details of the script interpreter interface.

A lot of the complexity in our python code comes from the fact that
python can be (a) embedded into lldb and (b) lldb can be embedded into
python. It's been a while since I worked with lua, but from what I
remember, lua was designed to make (a) easy, and I don't think (b) was
ever a major goal (though it can always be done somehow, of course)..

Were you intending to implement both of these directions or just one of
them ((a), I guess)?

The reason I am asking this is because doing only (a) will definitely
make lua support simpler than python, but it will also mean it won't be
a "python-lite".

Both of these options are fine -- I just want to understand where you're
going with this. It also has some impact on the testing strategy, as our
existing python tests are largely using mode (b).

Another question I'm interested in is how deeply will this
multi-interpreter thing go? Will it be a build time option, will it be
selectable at runtime, but we'll have only one script interpreter per
SBDebugger, or will we be able to freely mix'n'match scripting languages?

I think the last option would be best because of data formatters
(otherwise one would have a problem if some of his data formatters are
written in python and others in lua), but it would also create a lot
more of new api surface, as one would have to worry about consistency of
the lua and python views of lldb, etc.

On 09/12/2019 01:25, Jonas Devlieghere via lldb-dev wrote:
> Hi everyone,
> 
> Earlier this year, when I was working on the Python script
> interpreter, I thought it would be interesting to see what it would
> take to support other scripting languages in LLDB. Lua, being designed
> to be embedded, quickly came to mind. The idea remained in the back of
> my head, but I never really got around to it, until now.
> 
> I was pleasantly surprised to see that it only took me a few hours to
> create a basic but working prototype. It supports running single
> commands as well as an interactive interpreter and has access to most
> of the SB API through bindings generated by SWIG. Of course it's far
> from complete.
> 
> Before I invest more time in this, I'm curious to hear what the
> community thinks about adding support for another scripting language
> to LLDB. Do we need both Lua and Python?
> 
> Here are some of the reasons off the top of my head as to why the
> answer might be
> "yes":
> 
>  - The cost for having another scripting language is pretty small. The
> Lua script interpreter is very simple and SWIG can reuse the existing
> interfaces to generate the bindings.
>  - LLDB is designed to support multiple script interpreters, but in
> reality we only have one. Actually exercising this property ensures
> that we don't unintentionally break those design assumptions.
>  - The Python script interpreter is complex. It's hard to figure out
> what's really needed to support another language. The Lua script
> interpreter on the other hand is pretty straightforward. Common code
> can be shared by both.
>  - Currently Python support is disabled for some targets, like Android
> and iOS. Lua could enable scripting for these environments where
> having all of Python is overkill or undesirable.
> 
> Reasons why the answer might be "no":
> 
>  - Are our users going to use this?
>  - Supporting Python is an ongoing pain. Do we really want to risk
> burdening ourselves with another scripting language?
>  - The Python API is very well tested. We'd need to add test for the
> Lua bindings as well. It's unlikely this will match the coverage of
> Python, and probably even undesirable, because what's the point of
> testing the same thing twice. Also, do we want to risk fragmenting
> tests across two scripting languages?
> 
> There's probably a bunch more stuff that I didn't even think of. :-)
> 
> Personally I lean towards "yes" because I feel the benefits outweigh
> the costs, but of course that remains to be seen. Please let me know
> what you think!
> 
> If you're curious about what this looks like, you can find the patches
> on my fork on GitHub:
> https://github.com/JDevlieghere/llvm-project/tree/lua
> 
> Cheers,
> Jonas

On 09/12/2019 09:33, Raphael “Teemperor” Isemann via lldb-dev wrote:
> I think this is great, thanks for working on this! My only concern
> is that I would prefer if we could limit the Lua tests to just the
> Lua->C++ calling machinery (e.g., that we handle Lua strings
> correctly and all that jazz) and not fragment our test suite.
> Otherwise Lua seems to require far less maintenance work than
> Python, so I am not worried about the technical debt this adds.
I agree -- I think our position should be (at least until lua support is
very mature) that the 

Re: [lldb-dev] SBValues referencing deallocated memory

2019-11-27 Thread Pavel Labath via lldb-dev

On 27/11/2019 08:47, Raphael “Teemperor” Isemann via lldb-dev wrote:

This can also be reproduced in the command line like this:

(lldb) expr "foo"
(const char [4]) $0 = "foo"
(lldb) expr "bar"
(const char [4]) $1 = "bar"
(lldb) expr $0
(const char [4]) $0 = "bar"

This however works just fine:

(lldb) expr char c[] = "foo"; c
(char [4]) $0 = "foo"
(lldb) expr char c[] = "bar"; c
(char [4]) $1 = "bar"
(lldb) expr $0
(char [4]) $0 = "foo"

I don’t know the related code so well, but from what I remember we have 
a storage mechanism for persistent variables that we fill up (in 
the ‘Materializer’ IIRC). We probably just copy the pointer itself to 
this storage but not the memory it points to. I guess we could tweak 
that logic to detect pointers that point into memory LLDB allocated and 
then either extract the necessary memory into our storage or keep the 
related sections around.


Anyway, I filed https://bugs.llvm.org/show_bug.cgi?id=44155 and I will 
ask around what solution people would prefer once thanksgiving is over.




You can find a kind of description of how this is meant to work in 
.


Persisting string literals that were typed into the expression seems 
reasonable and hopefully not too difficult, and it would kind of match 
what happens during "normal" compilation. Doing that for random "const 
char *"s that you happen to stumble upon in the result variable seems 
more problematic, and I'm not sure we should even try...


pl


Re: [lldb-dev] Inconsistencies in CIE pointer in FDEs in .debug_frame

2019-11-25 Thread Pavel Labath via lldb-dev

On 25/11/2019 10:46, Martin Storsjö wrote:

On Mon, 25 Nov 2019, Pavel Labath wrote:


On 24/11/2019 23:16, Martin Storsjö via lldb-dev wrote:

Hi,

I'm looking into something that seems like an inconsistency in 
handling of the CIE pointer in FDEs in .debug_frame, between how 
debug info is generated in LLVM and consumed in LLDB.


For FDEs in .eh_frame, the CIE pointer/cie_id field is interpreted as 
an offset from the current FDE - this seems to be consistent.


But for cases in .debug_frame, they are treated differently. In LLDB, 
the cie_id field is assumed to be relative to the begin of the 
.debug_frame section: 
https://github.com/llvm/llvm-project/blob/master/lldb/source/Symbol/DWARFCallFrameInfo.cpp#L482-L495 

However, when this field is produced in LLVM, it can, depending on 
MCAsmInfo flags, end up written as a plain absolute address to the 
CIE: 
https://github.com/llvm/llvm-project/blob/master/llvm/lib/MC/MCDwarf.cpp#L1699-L1705 

That code in MCDwarf.cpp hasn't been touched in many years, so I 
would expect that the info it generates actually has been used since 
and been found to be correct. Or are most cases built with 
-funwind-tables or similar (enabled by default?), so this path is only 
exercised in untested cases?


In the case where I'm running in this, LLDB reports "error: Invalid 
cie offset" when running executables with such .debug_frame sections.


By adding an ", true" to the end of the EmitSymbolValue call in 
MCDwarf.cpp, the symbol reference is made section relative and the 
code seems to do what LLDB expects. Is that correct, or should LLDB 
learn the cases (which?) where the cie_id is an absolute address 
instead of a section relative one?


// Martin


What's the target you're encountering this behavior on? Can you maybe 
provide a short example of what the CIE/FDE entries in question look like?


I'm seeing this behaviour for mingw targets. GCC produces debug_frame 
sections where the CIE pointer is a section relative address (with a 
SECTREL relocation), while LLVM produces debug_frame sections with 
absolute (global/virtual) addresses.


Right. That's the part I was missing. Thanks.



LLDB seems to expect the format that GCC produces here.

I could be wrong (I'm not really an expert on this), but my 
understanding is that 
"asmInfo->doesDwarfUseRelocationsAcrossSections()" is basically 
equivalent to "is target MachO"


Yes, that's pretty much my take of it as well. The BPF target also has 
an option for setting this flag in asminfo, but other than that, it's 
not modified.

That said, if that is all there is here, then it does not seem to me 
like there's any special support in lldb needed, as the cie offset 
will always be a correct absolute offset from the start of the section 
by the time lldb gets to see it (and so it shouldn't matter if the 
offset was put there by the compiler or the linker). This makes me 
think that I am missing something, but I have no idea what could that 
be..


This wasn't the inconsistency I'm looking into.

I'm looking into an inconsistency between section relative and absolute 
addresses. The default case in MCDwarf.cpp calls 
EmitSymbolValue(, 4).


By default EmitSymbolValue emits _absolute_ addresses (or more 
precisely, relocations that make the linker produce absolute 
addresses), i.e. the full address of the CIE, instead of section relative.


The EmitSymbolValue function, declared at 
https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/MC/MCStreamer.h#L669-L670, 
takes an IsSectionRelative parameter, which defaults to false here (as 
it isn't specified). I would expect that it should be true, as LLDB 
expects a section relative address here.


I think this is a bug in LLVM's MCDwarf.cpp, but it puzzles me how it 
can have gone unnoticed.


But now I tested this a bit more with ELF setups, and realized that it 
somehow does seem to do the right thing. It might have something to do 
with how ELF linkers handle this kind of section that isn't loaded at 
runtime (and thus perhaps doesn't really have a virtual address assigned).


So that pretty much clears the question regarding inconsistency, and 
raises more questions about how this really works in ELF and MCDwarf.



A test procedure that shows off the issue is this:

$ cat test.c
void entry(void) { }

$ bin/clang -fno-unwind-tables test.c -c -g -o test.o -target 
i686-linux-gnu

$ bin/llvm-objdump -r test.o

test.o: file format ELF32-i386



RELOCATION RECORDS FOR [.debug_frame]:
0018 R_386_32 .debug_frame
001c R_386_32 .text

# As far as I know, these two R_386_32 relocations both indicate that the
# full, absolute address of these two locations should be inserted in
# these two locations.

$ bin/ld.lld test.o -o exe -e entry
$ bin/llvm-dwarfdump --eh-frame exe

exe:    file format ELF32-i386

.debug_frame contents:

 0010  CIE


0014 0018  FDE cie= pc=004010c0...004010c5
   ^
# The CIE offset, the 

Re: [lldb-dev] Inconsistencies in CIE pointer in FDEs in .debug_frame

2019-11-25 Thread Pavel Labath via lldb-dev

On 24/11/2019 23:16, Martin Storsjö via lldb-dev wrote:

Hi,

I'm looking into something that seems like an inconsistency in handling 
of the CIE pointer in FDEs in .debug_frame, between how debug info is 
generated in LLVM and consumed in LLDB.


For FDEs in .eh_frame, the CIE pointer/cie_id field is interpreted as an 
offset from the current FDE - this seems to be consistent.


But for cases in .debug_frame, they are treated differently. In LLDB, 
the cie_id field is assumed to be relative to the begin of the 
.debug_frame section: 
https://github.com/llvm/llvm-project/blob/master/lldb/source/Symbol/DWARFCallFrameInfo.cpp#L482-L495 



However, when this field is produced in LLVM, it can, depending on 
MCAsmInfo flags, end up written as a plain absolute address to the CIE: 
https://github.com/llvm/llvm-project/blob/master/llvm/lib/MC/MCDwarf.cpp#L1699-L1705 



That code in MCDwarf.cpp hasn't been touched in many years, so I would 
expect that the info it generates actually has been used since and been 
found to be correct. Or are most cases built with -funwind-tables or 
similar (enabled by default?), so this path is only exercised in untested cases?


In the case where I'm running in this, LLDB reports "error: Invalid cie 
offset" when running executables with such .debug_frame sections.


By adding an ", true" to the end of the EmitSymbolValue call in 
MCDwarf.cpp, the symbol reference is made section relative and the code 
seems to do what LLDB expects. Is that correct, or should LLDB learn the 
cases (which?) where the cie_id is an absolute address instead of a 
section relative one?


// Martin


What's the target you're encountering this behavior on? Can you maybe 
provide a short example of what the CIE/FDE entries in question look like?


I could be wrong (I'm not really an expert on this), but my 
understanding is that "asmInfo->doesDwarfUseRelocationsAcrossSections()" 
is basically equivalent to "is target MachO", and the reason that we 
don't emit section relative addresses there is because MachO does not 
link debug info sections. This means there will only ever be a single 
debug_frame contribution in one file, and so we can just put offsets 
directly, instead of relying on linker to patch things up. Doing 
anything like this in a format which links (concatenates) debug info 
sections would certainly result in irreparably corrupted unwind info, 
since you have no idea what will be present at a certain absolute 
address (offset) once the linker has finished its thing.


That said, if that is all there is here, then it does not seem to me 
like there's any special support in lldb needed, as the cie offset will 
always be a correct absolute offset from the start of the section by the 
time lldb gets to see it (and so it shouldn't matter if the offset was 
put there by the compiler or the linker). This makes me think that I am 
missing something, but I have no idea what could that be..


Anyway, I hope this helps somehow..

pl


Re: [lldb-dev] The pre-built Windows LLDB binary has a dependency on an external python36.dll?

2019-11-21 Thread Pavel Labath via lldb-dev

On 22/11/2019 01:26, Adrian McCarthy wrote:
Yes, that sounds plausible, but I don't recall for sure.  I think 
there's a build-time option to say you don't want Python at all, but I 
can't remember if there was a load-as-needed option.


I'm pretty sure we have never had explicit support for anything like 
this. The only way I can see this happening is if this fell 
"accidentally" out of our lazy python initialization and some windows 
dll behavior, but I don't think windows has anything like that. (At 
least on linux, I know that lazy binding can delay library mismatch 
errors until the first time you call some function, but they won't help 
you if the library is not there at all.)


I think the more likely scenario is that python was disabled in the 
previous "official" releases, and that some of the python changes 
enabled it.




In any event, the current situation is what it is.  What's feasible and 
worth doing for the future?


* Hard dependency (as we have right now)
I'm fine with that. We could add a note on the website that one needs to 
have python installed for this to work. Or we could disable python for 
the official releases.


* Dynamically load Python DLL on startup if it exists, or provide a 
better error message with instructions
* Dynamically load Python DLL on startup if it exists, otherwise disable 
Python-dependent features

* Dynamically load a specific version of the Python DLL if/when needed
All of these seem fine too, if anyone is willing to invest the time to 
make it work (it shouldn't be _that_ hard). Since python is pretty 
compartmentalized nowadays, it should be relatively easy to disable 
python features at runtime instead of just exiting.


The main question I have here is whether we should dlopen python.dll, or some 
lldb wrapper of it (the entire "script interpreter" plugin).


I'd also like to note that this isn't the only external dependency of 
lldb. (Well, it might be on windows..) Lldb can use libcurses, libedit, 
libz, etc. Libedit is fairly likely to not be present on a random linux 
system. libcurses are almost certainly there, but it's not always a 
compatible version, etc.



* Dynamically load any supported Python DLL if/when needed
That might be tricky since the different versions are not binary 
compatible in general. But it is possible, as Apple folks have shown, 
though it amounts to building multiple copies of ScriptInterpreterPython 
and then choosing the right one at runtime.


pl


Re: [lldb-dev] The pre-built Windows LLDB binary has a dependency on an external python36.dll?

2019-11-20 Thread Pavel Labath via lldb-dev

On 20/11/2019 23:53, Adrian McCarthy via lldb-dev wrote:

That said, I didn't expect an explicit dependency on python36.dll.


What kind of behavior did you expect?

pl


Re: [lldb-dev] https://reviews.llvm.org/D69273

2019-11-04 Thread Pavel Labath via lldb-dev

On 04/11/2019 18:19, Jim Ingham wrote:

Sorry, my brain is not working this morning, I answered your question in the 
review comments…

Jim



NP, maybe let's continue the discussion there? I find it useful to have 
the actual code change around..



Re: [lldb-dev] https://reviews.llvm.org/D69273

2019-11-04 Thread Pavel Labath via lldb-dev

On 31/10/2019 20:51, Jim Ingham via lldb-dev wrote:

It looks like this change is causing problems with swift.  I was talking a 
little bit with Davide about this and it seems like it wasn't obvious how this 
was designed to work.  So here's what this was intended to do (apologies if 
this is at too basic a level and the issue was something deeper I missed, but 
anyway this might get us started...)

The lldb ValueObject system supports two fairly different kinds of values, live 
and frozen.

The classic example of a live ValueObject is ValueObjectVariable.  That ValueObject is backed by an 
entity in the target, and knows when that entity is valid and not.  So it can always try to do 
"UpdateValueIfNeeded" and that will always return good values.  However, there's on 
complication with this, which is that we also want ValueObjectVariable to be able to answer 
"IsChanged".  That's so in a UI you can mark values that change over a step in red, which 
is very helpful for following along in a debugging session.  So you have to copy the values into 
host memory, in order to have something to compare against when you stop again.  That's why there's 
this slightly complex dance between host and target memory for the live ValueObjects.
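
In SB API terms that check looks roughly like this (a sketch; the variable
name is made up):

v = frame.FindVariable("counter")
old = v.GetValue()           # triggers UpdateValueIfNeeded
thread.StepOver()
new = v.GetValue()           # re-reads the entity from the target
if v.GetValueDidChange():    # what a UI uses to mark the value red
    print("changed: %s -> %s" % (old, new))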

The example of a frozen object is the ValueObjectConstResult that is returned 
from expression evaluation.  That value is fixed to a StopID, so the backing 
entity is only known to be good at that stop id.  This is implemented by 
copying the value into Host memory and fetching it from there when requested.

The use case for this is for people like me who have a bad memory.  So I can 
stop somewhere and do:

(lldb) expr foo
struct baz $1 = {
   bar = 20
}

Then later on when I forget what foo.bar was at that time, I can do:

(lldb) expr $1.bar
bar = 20

At a first approximation, this leads to the statement that ConstValues should 
fetch what they fetch when made, and then not offer any information that wasn't 
gathered when the variable was fetched, and you certainly don't ever want these 
values to be updated.

A little complication arises because I might do:

(lldb) expr foo_which_has_a_pointer
$1 = ...
(lldb) expr *$1->the_pointer

If the StopID is the same between the first and second evaluation, then you 
should follow the pointer into target memory and fetch the value.  But if the 
StopID has changed, then trying to dereference a pointer should be an error.  
After all, you are now accessing an incoherent object, and if you try to do 
anything fancier with it than just print some memory (like asking the Swift 
Language Runtime what this value happens to be) you are very likely to get into 
trouble.

So it's clear we need two different behaviors w.r.t. how we treat live or 
frozen values.  Pavel's change was addressing a failure in ValueObjectChild, 
and the solution was to move the ValueObjectVariable behavior up to the 
ValueObject level.  But then that means that ValueObjectConstResults are no 
longer obeying the ConstResult rules.

But it seems like the problem really is that we have only one ValueObjectChild 
class, but child value objects can either be live or frozen, depending on the 
nature of their Root ValueObject.  And this is made a little more complicated 
by the fact that frozen values only freeze when the stop ID changes.



Thanks for the writeup Jim. I haven't managed to dive into the source 
code yet, but the thing that's not clear to me from this otherwise 
detailed and understandable explanation is what is the interaction 
between this ConstResult stuff and the above patch.


Superficially, it doesn't sound like that patch should do anything bad 
here. As the ValueObjectConstResult's data is located in host memory, 
the patch will compute that its pointer children will be of "load 
address" type, which sounds like precisely what's needed here.


Of course, under the surface, there are plenty of ways this can go 
wrong, but precisely because of that, it's hard to say what's the right 
thing to do. Is it that ValueObjectConstResult uses the "address type" 
field to implement the "are the children valid at this stop ID" logic, 
and so this patch interferes with that? What's exactly the nature of the 
crash/misbehavior you were witnessing?


pl


Re: [lldb-dev] https://reviews.llvm.org/D69273

2019-10-31 Thread Pavel Labath via lldb-dev
(Just writing to say that tomorrow is a public holiday in most of 
Europe, so I won't be able to meaningfully reply to this until 
monday/tuesday. But if, in the mean time, you want to revert this, or 
just limit the scope of that patch somehow, then that's fine with me.)


On 31/10/2019 20:51, Jim Ingham via lldb-dev wrote:

It looks like this change is causing problems with swift.  I was talking a 
little bit with Davide about this and it seems like it wasn't obvious how this 
was designed to work.  So here's what this was intended to do (apologies if 
this is at too basic a level and the issue was something deeper I missed, but 
anyway this might get us started...)

The lldb ValueObject system supports two fairly different kinds of values, live 
and frozen.

The classic example of a live ValueObject is ValueObjectVariable.  That ValueObject is backed by an 
entity in the target, and knows when that entity is valid and not.  So it can always try to do 
"UpdateValueIfNeeded" and that will always return good values.  However, there's on 
complication with this, which is that we also want ValueObjectVariable to be able to answer 
"IsChanged".  That's so in a UI you can mark values that change over a step in red, which 
is very helpful for following along in a debugging session.  So you have to copy the values into 
host memory, in order to have something to compare against when you stop again.  That's why there's 
this slightly complex dance between host and target memory for the live ValueObjects.

The example of a frozen object is the ValueObjectConstResult that is returned 
from expression evaluation.  That value is fixed to a StopID, so the backing 
entity is only known to be good at that stop id.  This is implemented by 
copying the value into Host memory and fetching it from there when requested.

The use case for this is for people like me who have a bad memory.  So I can 
stop somewhere and do:

(lldb) expr foo
struct baz $1 = {
   bar = 20
}

Then later on when I forget what foo.bar was at that time, I can do:

(lldb) expr $1.bar
bar = 20

At a first approximation, this leads to the statement that ConstValues should 
fetch what they fetch when made, and then not offer any information that wasn't 
gathered when the variable was fetched, and you certainly don't ever want these 
values to be updated.

A little complication arises because I might do:

(lldb) expr foo_which_has_a_pointer
$1 = ...
(lldb) expr *$1->the_pointer

If the StopID is the same between the first and second evaluation, then you 
should follow the pointer into target memory and fetch the value.  But if the 
StopID has changed, then trying to dereference a pointer should be an error.  
After all, you are now accessing an incoherent object, and if you try to do 
anything fancier with it than just print some memory (like asking the Swift 
Language Runtime what this value happens to be) you are very likely to get into 
trouble.

So it's clear we need two different behaviors w.r.t. how we treat live or 
frozen values.  Pavel's change was addressing a failure in ValueObjectChild, 
and the solution was to move the ValueObjectVariable behavior up to the 
ValueObject level.  But then that means that ValueObjectConstResults are no 
longer obeying the ConstResult rules.

But it seems like the problem really is that we have only one ValueObjectChild 
class, but child value objects can either be live or frozen, depending on the 
nature of their Root ValueObject.  And this is made a little more complicated 
by the fact that frozen values only freeze when the stop ID changes.

Jim




Re: [lldb-dev] issue with lldb9 and python3.5

2019-10-30 Thread Pavel Labath via lldb-dev

On 30/10/2019 11:18, Kamil Rytarowski wrote:

On 30.10.2019 11:06, Pavel Labath wrote:

On 29/10/2019 21:40, Christos Zoulas wrote:

On Oct 29,  6:54pm, pa...@labath.sk (Pavel Labath) wrote:
-- Subject: Re: [lldb-dev] issue with lldb9 and python3.5

| On 29/10/2019 09:31, Serge Guelton via lldb-dev wrote:
| > On Mon, Oct 28, 2019 at 10:09:53AM -0700, Adrian McCarthy wrote:
| >> +1 Yes, for Windows, I'd be happy if we said Python 3.6+.
| >
| > I investigated the bug yesterday, and filled some of my
discoveries in
| >
| >  https://bugs.llvm.org/show_bug.cgi?id=43830
| >
| > TLDR: libpython uses libreadline and lldb uses libedit, and that's
a mess.
|
| Hey Christos,
|
| could I bother you to take a look at this python PR
| , and the related lldb
bug
| ?
|
| The executive summary is that there is an incompatibility between
| readline and its libedit emulation, which python needs to work around.
| Is there any way this can be fixed in libedit?
|
| I guess the presence of the workaround will make the fix tricky,
because
| then the workaround will be wrong for the "fixed" libedit, but it's
| still probably worth it to try to resolve this somehow.
|
| WDYT?

I don't know what I have to do here. Can someone explain to me what the
issue is?


I haven't dug into this (maybe Serge can explain in more detail), but I
think this comment (Modules/readline.c in python sources) gives a
general overview of the problem. Ignore the "On OSX" part, the same
should apply to any OS.

/*
  * It is possible to link the readline module to the readline
  * emulation library of editline/libedit.
  *
  * On OSX this emulation library is not 100% API compatible
  * with the "real" readline and cannot be detected at compile-time,
  * hence we use a runtime check to detect if we're using libedit
  *
  * Currently there is one known API incompatibility:
  * - 'get_history' has a 1-based index with GNU readline, and a 0-based
  *   index with older versions of libedit's emulation.
  * - Note that replace_history and remove_history use a 0-based index
  *   with both implementations.
  */

Furthermore, you can probably look at every instance of
if(using_libedit_emulation) in that file (or via this link
),
as each one indicates a workaround for some libedit incompatibility. It
looks like not all of them are still relevant, as it seems some of them
are there just for the sake of old libedit bugs which have since been
fixed, but it looks like at least some of them are. Serge, can you tell
what exactly was the problem which caused the crash?

pl


Is this a packaging issue?

There are good reasons to use libedit as a gnu readline drop-in
replacement. The most important one is certainly saner licensing state
as gnu readline is GPLv3 and it makes it incompatible with at least
GPLv2 users (there are users that pick old gnureadline GPLv2 around).

If there are known incompatibilities, I think (not speaking on behalf of
Christos) it would be best to contribute patches with rationale. A generic
call for compatibility might not work well.




Well, I was hoping that for someone intimately familiar with libedit 
(i.e., Christos) the difference in behavior would be obvious from just 
looking at that patch. :) But, of course, I can't make anyone do 
anything (nor I want to do that), so if fixing/figuring that out 
requires more time than you're willing to devote to that right now, then 
yes, I guess it means I or someone else will have to dive in and figure 
out what's wrong (eventually).


pl


Re: [lldb-dev] issue with lldb9 and python3.5

2019-10-30 Thread Pavel Labath via lldb-dev

On 29/10/2019 21:40, Christos Zoulas wrote:

On Oct 29,  6:54pm, pa...@labath.sk (Pavel Labath) wrote:
-- Subject: Re: [lldb-dev] issue with lldb9 and python3.5

| On 29/10/2019 09:31, Serge Guelton via lldb-dev wrote:
| > On Mon, Oct 28, 2019 at 10:09:53AM -0700, Adrian McCarthy wrote:
| >> +1 Yes, for Windows, I'd be happy if we said Python 3.6+.
| >
| > I investigated the bug yesterday, and filled some of my discoveries in
| >
| >  https://bugs.llvm.org/show_bug.cgi?id=43830
| >
| > TLDR: libpython uses libreadline and lldb uses libedit, and that's a mess.
|
| Hey Christos,
|
| could I bother you to take a look at this python PR
| , and the related lldb bug
| ?
|
| The executive summary is that there is an incompatibility between
| readline and its libedit emulation, which python needs to work around.
| Is there any way this can be fixed in libedit?
|
| I guess the presence of the workaround will make the fix tricky, because
| then the workaround will be wrong for the "fixed" libedit, but it's
| still probably worth it to try to resolve this somehow.
|
| WDYT?

I don't know what I have to do here. Can someone explain to me what the
issue is?


I haven't dug into this (maybe Serge can explain in more detail), but I 
think this comment (Modules/readline.c in python sources) gives a 
general overview of the problem. Ignore the "On OSX" part, the same 
should apply to any OS.


/*
 * It is possible to link the readline module to the readline
 * emulation library of editline/libedit.
 *
 * On OSX this emulation library is not 100% API compatible
 * with the "real" readline and cannot be detected at compile-time,
 * hence we use a runtime check to detect if we're using libedit
 *
 * Currently there is one known API incompatibility:
 * - 'get_history' has a 1-based index with GNU readline, and a 0-based
 *   index with older versions of libedit's emulation.
 * - Note that replace_history and remove_history use a 0-based index
 *   with both implementations.
 */

Furthermore, you can probably look at every instance of 
if(using_libedit_emulation) in that file (or via this link 
), 
as each one indicates a workaround for some libedit incompatibility. It 
looks like not all of them are still relevant, as it seems some of them 
are there just for the sake of old libedit bugs which have since been 
fixed, but it looks like at least some of them are. Serge, can you tell 
what exactly was the problem which caused the crash?
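
For illustration, the python-level version of that runtime check (the same
trick the workarounds in readline.c rely on) is roughly:

import readline

# libedit's emulation advertises itself in the module docstring; python
# uses a runtime check like this because it cannot detect the emulation
# at compile time.
if "libedit" in (readline.__doc__ or ""):
    first = readline.get_history_item(0)   # older libedit: 0-based
else:
    first = readline.get_history_item(1)   # GNU readline: 1-based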


pl


Re: [lldb-dev] issue with lldb9 and python3.5

2019-10-29 Thread Pavel Labath via lldb-dev

On 29/10/2019 09:31, Serge Guelton via lldb-dev wrote:

On Mon, Oct 28, 2019 at 10:09:53AM -0700, Adrian McCarthy wrote:

+1 Yes, for Windows, I'd be happy if we said Python 3.6+.


I investigated the bug yesterday, and filled some of my discoveries in

 https://bugs.llvm.org/show_bug.cgi?id=43830

TLDR: libpython uses libreadline and lldb uses libedit, and that's a mess.


Hey Christos,

could I bother you to take a look at this python PR 
, and the related lldb bug 
?


The executive summary is that there is an incompatibility between 
readline and its libedit emulation, which python needs to work around. 
Is there any way this can be fixed in libedit?


I guess the presence of the workaround will make the fix tricky, because 
then the workaround will be wrong for the "fixed" libedit, but it's 
still probably worth it to try to resolve this somehow.


WDYT?

pavel


Re: [lldb-dev] [RFC] Adding a clang-style LLVM.h (or, "Are you tired of typing 'llvm::' everywhere ?")

2019-10-08 Thread Pavel Labath via lldb-dev

On 08/10/2019 02:45, Larry D'Anna via lldb-dev wrote:

Pavel Labath said


some llvm classes are so well-known and widely used that qualifying
them with "llvm::" serves no useful purpose and only adds visual noise.
I'm thinking here mainly of ADT classes like String/ArrayRef,
Optional/Error, etc. I propose we stop explicitly qualifying these classes.

We can implement this proposal the same way as clang solved the same
problem, which is by creating a special LLVM.h

header in the Utility library. This header would adopt these classes
into the lldb_private namespace via a series of forward and "using"
declarations.

I think clang's LLVM.h contains a well-balanced collection of adopted
classes, and it should cover the most widely-used classes in lldb too,
so I propose we use that as a starting point.


I think this is a great idea, particularly for llvm::Expected.   The signatures 
of functions
using Expected are kind of noisy already, and adding llvm:: doesn’t help.

Anyone object to this idea?


I am still in favour of that. :) I consider the following points to 
be the benefits of this proposal:

- consistency with llvm/clang/lld
- the extra llvm:: qualifications make people want to do away with the 
"cruft" via "auto", which *decreases* consistency with llvm 


- better formatting of code in the 80 columns we have available

On 08/10/2019 10:14, Jan Kratochvil wrote:
> If I should say something I would keep llvm::.
>
> My reason: The LLVM types are in many cases emulating classes adopted
> in future C++ standards and I find llvm:: vs. std:: clearer than
> "" vs. std::. Moreover, std:: is commonly omitted in other projects.


Which classes do you have in mind exactly? I know a lot of llvm 
*functions* mimic similar std:: versions, but I can't think of any 
*classes* right now. I mean StringRef is similar to std::string_view and 
so on, but they still differ in the spelling of the base name...


Also, I am not proposing importing any llvm functions this way (and in 
fact, I am against a blanket "using namespace llvm", even in c++ files, 
save for some files with heavy llvm ties). It is true that often these 
llvm functions can be accessed unqualified thanks to ADL, but this 
proposal has nothing to do with that.


pl


Re: [lldb-dev] Rust support in LLDB, again

2019-10-01 Thread Pavel Labath via lldb-dev

+1 to everything that Jonas said.

On 30/09/2019 18:28, Jonas Devlieghere via lldb-dev wrote:

Hi Vadim,

On Sat, Sep 28, 2019 at 4:00 PM Vadim Chugunov via lldb-dev
 wrote:


Hi,
Last year there was an effort led by Tom Tromey to add Rust language support 
into LLDB.  He had implemented a fairly complete language plugin, however it 
was not accepted into mainline because of supportability concerns. I guess 
these concerns had some merit, because this change did not survive even in 
Rust's private branch due to the difficulty of rebasing on top of LLVM 9.


Unless my memory is failing me, I don't think we ever explicitly
rejected Rust's language plugin. We removed a few other language
plugins (Go, Java) that were not maintained and were becoming an
increasing burden on the community. At the same time we agreed that we
didn't want to make the same mistake again. Some of the things that
come to mind are having a working implementation, testing, CI, etc. If
the rust community can show that they're dedicated to maintaining Rust
support in LLDB, I wouldn't expect a lot of resistance. I just bring
this up because I don't want to discourage anyone from adding support
for new languages to LLDB.


I am wondering if there's a more limited version of this, that can be merged 
into mainline:
In terms of its memory model, Rust is not that far off from C++, so treating 
Rust types as if they were C++ types basically works.  There is only one major 
problem: currently LLDB cannot deal with tagged unions, which Rust code uses 
quite heavily.   When such a type is encountered, LLDB just emits an empty 
struct, which makes it impossible to examine the contents.

My tentative proposal is to modify LLDB's DWARFASTParserClang to handle 
DW_TAG_variant et al, and create a C++ approximation of these types, e.g. as a 
polymorphic class, or just an untagged union.   This would provide at least a 
minimal level of functionality for Rust (and possibly other languages) and be a 
much lesser maintenance burden on LLDB core team.
What would y'all say?


The people that actually work on this code should answer this, but
personally I don't have strong objections to this. That said, of
course I would prefer to have a (maintained) language plugin instead.

PS: Are there other changes that live downstream that are not Rust
specific and would benefit upstream LLDB and would potentially improve
Rust debugging?

Jonas




Re: [lldb-dev] lldb warning or developer options

2019-09-27 Thread Pavel Labath via lldb-dev

On 27/09/2019 07:52, Sourabh Singh Tomar via lldb-dev wrote:

Hi Folks,

Is there a developer or diagnostic switch that we can set in
lldb to show warnings and other useful output when lldb processes the
binary, builds tables and utilizes DWARF?


I stumbled upon a rather trivial test case where I deleted the *.dwo
file and loaded the primary binary in lldb. lldb loaded it silently without
any error or diagnostic report for the missing *.dwo file. Obviously, the
debug data wasn't loaded; when I entered the list command, nothing happened.


Why is this needed? As new DWARF 5 features are in development, it would
be immensely useful if lldb could report diagnostics such as a section
not being found or an ill-formed section.


Thanks!
-- Sourabh Singh Tomar



Hi Sourabh,

there is the Module::ReportWarning function, which will print a warning (to 
stderr generally). We should probably make use of that functionality to 
report missing dwo files. Care to put up a patch?


pl


Re: [lldb-dev] RFC: full support for python files, and avoid using FILE* internally

2019-09-24 Thread Pavel Labath via lldb-dev

On 23/09/2019 20:54, Larry D'Anna wrote:




On Sep 23, 2019, at 7:11 AM, Pavel Labath  wrote:

On 20/09/2019 17:35, Larry D'Anna via lldb-dev wrote:

Hi lldb-dev.
I want to be able to use LLDB inside of iPython, so I can have mixed python and 
LLDB debug session.
To this end, I’d like to update LLDB to have full support for python file 
objects, so the outputs of debugger commands can be redirected into iPython’s 
own streams.
This however, is difficult to do, because LLDB makes use of FILE* streams in a 
number of places.   This presents two problems.  The first is that there is
really no correct way to create SWIG typemaps that handle conversion to FILE* and get the 
ownership semantics correct.   The second problem is that there is not a 
portable
way to make a FILE* with arbitrary callbacks for reading and writing.   On 
Darwin and BSD there’s funopen, and on linux there’s something else, and I 
don’t know if
there’s any way on windows.
I made an attempt at this using funopen a while ago, here:
https://reviews.llvm.org/D38829
Zachary Turner suggested a more thorough approach, where instead of trying to 
use funopen to paper over all the use of FILE* streams, we should make
lldb_private::File capable of doing the dynamic dispatch and excise all the 
unnecessary FILE* stuff in favor of lldb_private::File.
That’s what I’ve done here: https://github.com/smoofra/llvm-project/tree/files
I’ve posted the first few patches to phabricator for review.
https://reviews.llvm.org/D67793
https://reviews.llvm.org/D67792
https://reviews.llvm.org/D67789
What do you think?




Hello Larry,

thanks for starting this thread.

So, judging by your problem description, it sounds to me like you're primarily 
interested in the SBCommandInterpreter::HandleCommand family of functions (and 
by extension, the SBCommandReturnObject class). Would that be a fair thing to 
say?


Not really.  I want to be able to embed a full LLDB session inside of iPython, 
which means redirecting anything that prints to the debugger's main output and 
error streams. Yes, in most cases that will be coming from HandleCommand(), 
but I really want to avoid the situation where some output that would normally 
be printed to the terminal is missed under iPython.


Ok, that's fair.




The reason I am asking this is that I'm wondering what is the scope of the 
thing you're proposing to do (and then, whether this is the best way to 
accomplish that). For instance, if we were only interested in the HandleCommand 
api, then it might be possible to plug the python in at a higher level (Stream 
instead of File). I am hoping that doing that might be easier as the Stream 
class has a simpler interface, and already supports multiple backing 
implementations (StreamFile, StreamString, ...).

Also, doing that would allow to side step some complicated questions. One of 
the reasons why getting rid of FILE* is so complicated (you're not the first 
person to try that) is that there are some APIs (libedit mainly), that we just 
cannot change, and which require a FILE*.


I saw that.   My strategy for dealing with that was to audit the codebase for 
any use of File::GetStream().   I found the only two places I could not remove 
the use of GetStream() were libedit and IOHandlerCursesGUI. In my prototype, 
I deal with that by checking for NULL from GetStream() before libedit or 
IOHandlerCursesGUI are enabled. In other words, if a File can produce a 
FILE*, it will.   But you can still  have a valid File that will return NULL 
from GetStream.   If you set your debugger streams to Files that return 
NULL from GetStream, then libedit and the curses GUI will be disabled.I 
think this is a reasonable approach.For my use-case in particular, there is 
no need for either libedit or the curses gui, because the whole point is to use 
iPython as the gui.  In general, libedit and curses only really make sense 
if the IO streams are a terminal anyway, so it’s not a problem to disable these 
features if the IO streams are redirected to python.


Ok, that also sounds like a reasonable position to take. Might be the 
only reasonable position, even. Theoretically, one might try to go the 
extra mile and try to synthesize a FILE* using fopencookie et al. on 
platforms that support that (the only platforms that support libedit and 
curses also happen to have a fopencookie equivalent). That's probably 
overkill now, but it is nice to have that option open for the future.





If you do want to go with the more general change, then I'd like to ask you to 
give a bit more detail about your vision of the new role of the 
lldb_private::File class and its interaction with other major lldb components 
(SBFile, StreamFile, ???). My understanding (it's been a while since I looked 
at this in detail) is that the File class can be constructed from both FILE* 
and a file descriptor and (crucially) it is also able to give back these 
underlying objects, 

Re: [lldb-dev] RFC: full support for python files, and avoid using FILE* internally

2019-09-23 Thread Pavel Labath via lldb-dev

On 20/09/2019 17:35, Larry D'Anna via lldb-dev wrote:

Hi lldb-dev.

I want to be able to use LLDB inside of iPython, so I can have mixed 
python and LLDB debug session.


To this end, I’d like to update LLDB to have full support for python 
file objects, so the outputs of debugger commands can be redirected 
into iPython’s own streams.


This, however, is difficult to do, because LLDB makes use of FILE* 
streams in a number of places. This presents two problems. The first 
is that there is no really correct way to create SWIG typemaps that 
handle conversion to FILE* and get the ownership semantics correct. The 
second problem is that there is no portable way to make a FILE* with 
arbitrary callbacks for reading and writing. On Darwin and BSD there’s 
funopen, and on Linux there’s something else, and I don’t know if 
there’s any way on Windows.

I made an attempt at this a while ago using funopen, here:

https://reviews.llvm.org/D38829

Zachary Turner suggested a more thorough approach, where instead of 
trying to use funopen to paper over all the uses of FILE* streams, we 
should make lldb_private::File capable of doing the dynamic dispatch and 
excise all the unnecessary FILE* stuff in favor of lldb_private::File.


That’s what I’ve done here: 
https://github.com/smoofra/llvm-project/tree/files


I’ve posted the first few patches to phabricator for review.

https://reviews.llvm.org/D67793
https://reviews.llvm.org/D67792
https://reviews.llvm.org/D67789

What do you think?






Hello Larry,

thanks for starting this thread.

So, judging by your problem description, it sounds to me like you're 
primarily interested in the SBCommandInterpreter::HandleCommand family 
of functions (and by extension, the SBCommandReturnObject class). Would 
that be a fair thing to say?


The reason I am asking this is that I'm wondering what is the scope of 
the thing you're proposing to do (and then, whether this is the best way 
to accomplish that). For instance, if we were only interested in the 
HandleCommand api, then it might be possible to plug the python in at a 
higher level (Stream instead of File). I am hoping that doing that might 
be easier as the Stream class has a simpler interface, and already 
supports multiple backing implementations (StreamFile, StreamString, ...).


Also, doing that would allow us to side-step some complicated questions. 
One of the reasons why getting rid of FILE* is so complicated (you're 
not the first person to try that) is that there are some APIs (libedit 
mainly), that we just cannot change, and which require a FILE*.


If you do want to go with the more general change, then I'd like to ask 
you to give a bit more detail about your vision of the new role of 
the lldb_private::File class and its interaction with other major lldb 
components (SBFile, StreamFile, ???). My understanding (it's been a 
while since I looked at this in detail) is that the File class can be 
constructed from both FILE* and a file descriptor and (crucially) it is 
also able to give back these underlying objects, including converting 
between the two. Now, I am assuming you're intending to add a third 
method of constructing a File object (using some python callbacks), but 
I assume that (due to the mentioned lack of funopen etc.) you won't be 
trying to convert between these types. So, it would be good to spell out 
what exactly the File class promises to do, and what happens when 
(e.g.) a pythonified File object makes its way to code (libedit) which 
requires a FILE*.


regards,
pavel
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] help, how to get a debug build on windows (python37_d.lib)

2019-09-23 Thread Pavel Labath via lldb-dev

On 22/09/2019 20:20, Larry D'Anna via lldb-dev wrote:

Hi lldb-dev.

I can’t seem to figure out how to build a debug lldb on Windows. It 
wants to link against a debug version of Python, which isn’t there.


My cmake line looks like this:

cmake -G Ninja `
    "-DPYTHON_HOME=C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64" `
    "-DLLVM_ENABLE_PROJECTS=clang;lldb;libcxx;libcxxabi;lld" `
    "-DSWIG_EXECUTABLE=C:\ProgramData\chocolatey\bin\swig.exe" `
    "C:\Users\smoofra\llvm-project\llvm"

I also made this change, to tell it to link against the release python.

--- a/lldb/cmake/modules/LLDBConfig.cmake
+++ b/lldb/cmake/modules/LLDBConfig.cmake
@@ -227,7 +227,7 @@ function(find_python_libs_windows)
    else()
      # Lookup for concrete python installation depending on build type
      if (CMAKE_BUILD_TYPE STREQUAL Debug)
-      set(LOOKUP_DEBUG_PYTHON TRUE)
+      set(LOOKUP_DEBUG_PYTHON FALSE)
      else()
        set(LOOKUP_DEBUG_PYTHON FALSE)
      endif()

But somehow at the very end, the link still fails because python37_d.lib 
isn’t there.


Anybody know what I’m doing wrong?  Thank you.



Hi Larry,

I don't know the full details, but it is my understanding that due to 
how windows runtime libraries work (they have a separate debug and 
release CRT), all libraries in a single application need to be linked 
against the same CRT flavour. IIRC, the default python installation does 
not come with a debug python, but it should be possible to install it 
somehow (possibly via checking some box in the installation dialog, but 
I don't remember the details).


It should also be possible to create a "fake" debug build by setting the 
CMAKE_BUILD_TYPE to Release, and enabling debug info (and disabling 
optimizations) via CMAKE_CXX_FLAGS, but it's probably better to just get 
the debug python installed.


pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] Breakpoint matching with -n seems to have gotten too generous

2019-08-30 Thread Pavel Labath via lldb-dev

On 30/08/2019 02:33, Jim Ingham via lldb-dev wrote:

If I have a program like:

class A {
public:
   int AMethod() { return 100; }
};

class AA {
public:
   int AMethod() { return 200; }
};

int
main()
{
   A myA;
   AA myAA;
   myA.AMethod();
   myAA.AMethod();
   return 0;
}

Build and run it under lldb, and do:

(lldb) b s -n A::AMethod
Breakpoint 1: 2 locations.
(lldb) break list
Current breakpoints:
1: name = 'A::AMethod', locations = 2
   1.1: where = many_names`A::AMethod() + 8 at many_names.cpp:3:19, address = 
many_names[0x00010f78], unresolved, hit count = 0
   1.2: where = many_names`AA::AMethod() + 8 at many_names.cpp:8:19, address = 
many_names[0x00010f88], unresolved, hit count = 0

I think that's wrong.  The point of the fuzziness in -n is that you can leave 
out containing namespaces, or arguments, and we'll still match what you've 
given us.  But IMO that should only expand the search into containing contexts. 
 It is surprising to me that if I specify A::AMethod, I also match the one in 
the namespace AA.  If you wanted to match .*A::AMethod, you could do that with 
a regular expression.  But there's no easy way to not pick up extra breakpoints 
if you happen to have overlaps like this, so expanding -n to strstr-type 
matches seems like a bad idea.

I wondered if other folks thought this was desirable behavior.


I was surprised by this behavior too. I wouldn't have expected 
AA::AMethod to match this.


pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


[lldb-dev] RFC: Support for unwinding on windows

2019-08-23 Thread Pavel Labath via lldb-dev

Hello everyone,

after a short recess, I have started to resume work on the breakpad 
symbols project. The project is "nearly" finished, but there is one more 
thing remaining to be done. This is to support stack unwinding for 32 
bit windows targets.


The unwind info is represented differently in breakpad, because 
unwinding on 32-bit Windows is... different. For example, an 
"unwind plan" for a typical win32 function would be described by a 
string like "$T0 .raSearch = $eip $T0 ^ = $esp $T0 4 + =". Now, this 
string is basically a program in a special postfix language, and if you 
look closely, you can recognise fragments like "$eip = deref($T0)" and 
"$esp = $T0 + 4", which look fairly typical. However, there's this 
".raSearch" bit that is unlike anything we have in lldb at the moment.


The way that .raSearch works is that it asks the debugger to look around 
the stack for something that looks like a plausible return address, and 
then return the address of that return address. Now, the reason that 
this works (mostly) reliably is that on windows, the debug info contains 
information about the size of functions' stack frames -- the number of 
bytes taken up by local variables, arguments, etc. Armed with this 
knowledge, the debugger can skip portions of the stack that definitely 
do not hold the return address and can e.g. avoid confusing function 
pointer arguments for the return addresses.


The way I propose to implement this in lldb is:
- SymbolFile: add two new APIs to get the number of stack bytes used by 
a function for locals and the size of function arguments. Two functions 
are needed because (as usual) the function arguments are considered a 
part of the caller's stack frame, and so the number of arguments for a 
function is relevant not when unwinding this function, but for the 
unwinding of its caller. SymbolFileBreakpad (and later SymbolFilePDB -- 
Adrian is working on that) will implement these functions to return the 
appropriate values.
- UnwindPlan: add a new value (isFoundHeuristically) to the CFA kind 
enumeration. When encountering this value, the unwinder will search the 
stack for a value that looks like a pointer to a known code section. It 
will use the above SymbolFile APIs to skip over uninteresting parts of 
the stack. SymbolFileBreakpad will search for the known usage patterns 
of the .raSearch keyword and generate appropriate unwind plans (i.e. set 
CFA to "isFoundHeuristically", and appropriate rules for the other 
registers)
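
To make the mechanics concrete, here is a rough sketch of such a heuristic 
search. Process::ReadMemory is real lldb API, but the helper itself, its 
name, and the exact skip logic are assumptions for illustration, not the 
contents of the patches:

```cpp
#include "lldb/Target/Process.h"
#include "lldb/Utility/Status.h"

// Scan stack slots above the callee's locals until one holds a value that
// plausibly points into a known code section, as .raSearch does.
static lldb::addr_t FindReturnAddressSlot(lldb_private::Process &process,
                                          lldb::addr_t sp,
                                          uint32_t callee_locals_size,
                                          bool (*looks_like_code)(uint32_t)) {
  lldb::addr_t slot = sp + callee_locals_size; // skip the callee's locals
  for (int i = 0; i < 64; ++i, slot += 4) {    // 32-bit stack slots
    uint32_t value = 0;
    lldb_private::Status error;
    if (process.ReadMemory(slot, &value, sizeof value, error) != sizeof value)
      break;
    if (looks_like_code(value)) // plausible pointer into a code section?
      return slot;              // "$T0" in the breakpad program above
  }
  return LLDB_INVALID_ADDRESS;
}
```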


I've created a couple of patches which demonstrate how this could be 
implemented. The most interesting one is WIP, but it should be 
sufficient for demonstration purposes. Two more patches are also needed 
for it to work, but they are largely uninteresting from a general 
unwind perspective.


Let me know if you have any questions, concerns, suggestions, etc.

Pavel
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [RFC] Fast Conditional Breakpoints (FCB)

2019-08-23 Thread Pavel Labath via lldb-dev

On 23/08/2019 00:58, Ismail Bennani via lldb-dev wrote:

Hi Greg,

Thanks for your suggestion!


On Aug 22, 2019, at 3:35 PM, Greg Clayton wrote:

Another possibility is to have the IDE insert NOP opcodes for you when you 
write a breakpoint with a condition and compile NOPs into your program.

So the flow is:
- set a breakpoint in IDE
- modify breakpoint to add a condition
- compile and debug, the IDE inserts NOP instructions at the right places


We’re trying to avoid rebuilding every time we want to debug, but I’ll keep
this in mind as an eventual fallback.



A slight variation on that feature would be to just have the compiler 
guarantee that there will always be enough space between two jump 
targets for us to insert a trampoline jump. One way to guarantee that 
would be to align all jump targets to 16-byte boundaries (on x86 anyway).


I say this because I have a vague recollection that some of the more 
exotic llvm backends (webassembly?) may already have such a requirement, 
albeit for different reasons (to do with being able to statically 
analyze control flow), so the code for doing this might already be 
there, and maybe all it would take is a little tinkering with the 
codegen options to enable it.


Unfortunately, I don't remember the details of this, but someone on this 
list might...


pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [RFC] Fast Conditional Breakpoints (FCB)

2019-08-15 Thread Pavel Labath via lldb-dev

On 15/08/2019 20:15, Jim Ingham wrote:

Thanks for your great comments.  A few replies...


On Aug 15, 2019, at 10:10 AM, Pavel Labath via lldb-dev wrote:
I am wondering whether we really need to involve the memory allocation 
functions here. What's the size of this address structure? I would expect it to 
be relatively small compared to the size of the entire register context that we 
have just saved to the stack. If that's the case, the case then maybe we could 
have the trampoline allocate some space on the stack and pass that as an 
argument to the $__lldb_arg building code.


You have no guarantee that only one thread is running this code at any given 
time.  So you would have to put a mutex in the condition to guard the use of 
this stack allocation.  That's not impossible but it means you're changing 
threading behavior.  Calling the system allocator might take a lock but a lot 
of allocation systems can hand out small allocations without locking, so it 
might be simpler to just take advantage of that.


I am sorry, but I am confused. I am suggesting we take a slice of the 
stack from the thread that happened to hit that breakpoint, and use that 
memory for the __lldb_arg structure for the purpose of evaluating the 
condition on that very thread. If two threads hit the breakpoint 
simultaneously, then we just allocate two chunks of memory on their 
respective stacks. Or am I misunderstanding something about how this 
structure is supposed to be used?





Another possible fallback behavior would be to still do the whole trampoline 
stuff and everything, but avoid needing to overwrite opcodes in the target by 
having the gdb stub do this work for us. So, we could teach the stub that some 
addresses are special and when a breakpoint at this location gets hit, it 
should automatically change the program counter to some other location (the 
address of our trampoline) and let the program continue. This way, you would 
only need to insert a single trap instruction, which is what we know how to do 
already. And I believe this would still bring a major speedup compared to the 
current implementation (particularly if the target is remote on a high-latency 
link, but even in the case of local debugging, I would expect maybe an order of 
magnitude faster processing of conditional breakpoints).


This is a clever idea.  It would also mean that you wouldn't have to figure out 
how to do register saves and restores in code, since debugserver already knows 
how to do that, and once you are stopped it is probably not much slower to have 
debugserver do that job than have the trampoline do it.  It also has the 
advantage that you don't need to deal with the problem where the space that you 
are able to allocate for the trampoline code is too far away from the code you 
are patching for a simple jump.  It would certainly be worth seeing how much 
faster this makes conditions.


I actually thought we would use the exact same trampoline that would be 
used for the full solution (so it would do the register saves, restores, 
etc), and the stub would only help us to avoid trampling over a long 
sequence of instructions. But other solutions are certainly possible too...




Unless I'm missing something you would still need two traps. One in the main instruction stream and one to stop when the condition is true. But maybe you meant "a single kind of insertion - a trap", not "a single trap instruction".


I meant "a single in the application's instruction stream". The counts 
of traps in the code that we generate aren't that important, as we can 
do what we want there. But if we insert just a single trap opcode, then 
we are guaranteed to overwrite only one instruction, which means the 
whole "are we overwriting a jump target" discussion becomes moot. OTOH, 
if we write a full jump code then we can overwrite a *lot* of 
instructions -- the shortest sequence that can jump anywhere in the 
address space I can think of is something like pushq %rax; movabsq 
$WHATEVER, %rax; jmpq *%rax. Something as big as that is fairly likely 
to overwrite a jump target.



...




This would be kind of similar to the "cond_list" in the gdb-remote 
"Z0;addr,kind;cond_list" packet <https://sourceware.org/gdb/onlinedocs/gdb/Packets.html>.

In fact, given that this "instruction shifting" is the most unpredictable part 
of this whole architecture (because we don't control the contents of the inferior 
instructions), it might make sense to do this approach first, and then do the instruction 
shifting as a follow-up.


One side-benefit we are trying to get out of the instruction shifting approach is not 
having to stop all threads when inserting breakpoints as often as possible.  Since we can 
inject thread ID tests into the condition as well, doing the instruction shifting would 
mean you could specify thread-specific breakpoints, and then ONLY the threads that ma

Re: [lldb-dev] [RFC] Fast Conditional Breakpoints (FCB)

2019-08-15 Thread Pavel Labath via lldb-dev
Hello Ismail, and welcome to LLDB. You have a very interesting (and not 
entirely trivial) project, and I wish you the best of luck in your work. 
I think this will be a very useful addition to lldb.


It sounds like you have researched the problem very well, and the 
overall direction looks good to me. However, I do have some ideas and 
suggestions about possible tweaks/improvements that I would like to hear 
your thoughts on. Please find my comments inline.


On 14/08/2019 22:52, Ismail Bennani via lldb-dev wrote:

Hi everyone,

I’m Ismail, a compiler engineer intern at Apple. As a part of my internship,
I'm adding Fast Conditional Breakpoints to LLDB, using code patching.

Currently, the expressions that power conditional breakpoints are lowered
to LLVM IR and LLDB knows how to interpret a subset of it. If that fails,
the debugger JIT-compiles the expression (compiled once, and re-run on each
breakpoint hit). In both cases LLDB must collect all program state used in
the condition and pass it to the expression.

The goal of my internship project is to make conditional breakpoints faster by:

1. Compiling the expression ahead-of-time, when setting the breakpoint, and
injecting it into the inferior memory only once.
2. Re-routing the inferior execution flow to run the expression and check
whether it needs to stop, in-process.

This saves the cost of having to do the context switch between debugger and
the inferior program (about 10 times) to compile and evaluate the condition.

This feature is described on the [LLDB Project 
page](https://lldb.llvm.org/status/projects.html#use-the-jit-to-speed-up-conditional-breakpoint-evaluation).
The goal would be to have it working for most languages and architectures
supported by LLDB, however my original implementation will be for C-based
languages targeting x86_64. It will be extended to AArch64 afterwards.

Note that the way my prototype is implemented makes it fully extensible to
other languages and architectures.

## High Level Design

Every time a breakpoint that holds a condition is hit, multiple context
switches are needed in order to compile and evaluate the condition.

First, the breakpoint is hit and the control is given to the debugger.
That's where LLDB wraps the condition expression into a UserExpression that
will get compiled and injected into the program memory. Another round-trip
between the inferior and LLDB is needed to run the compiled expression 
and extract the expression results that will tell LLDB to stop or not.

To get rid of those context switches, we will evaluate the condition inside
the program, and only stop when the condition is true. LLDB will achieve this
by inserting a jump from the breakpoint address to a code section that will
be allocated into the program memory. It will save the thread state, run the
condition expression, restore the thread state and then execute the copied
instruction(s) before jumping back to the regular program flow.
Then we only trap and return control to LLDB when the condition is true.

## Implementation Details

To be able to evaluate a breakpoint condition without interacting with the
debugger, LLDB changes the inferior program execution flow by overwriting
the instruction at which the breakpoint was set with a branching instruction.

The original instruction(s) are copied to a memory stub allocated in the
inferior program memory called the __Fast Conditional Breakpoint Trampoline__
or __FCBT__. The FCBT will allow us to re-route the program execution flow to 
check the condition in-process while preserving the original program behavior.
This part is critical to setup Fast Conditional Breakpoints.

```
     Inferior Binary                     Trampoline

  |          .           |        +-------------------------+
  |          .           |        |                         |
  |          .           |   +--->+  Save RegisterContext   |
  |          .           |   |    |                         |
  +----------------------+   |    +-------------------------+
  |                      |   |    |                         |
  |      Instruction     |   |    |  Build Arguments Struct |
  |                      |   |    |                         |
  +----------------------+   |    +-------------------------+
  |                      +---+    |                         |
  | Branch to Trampoline |        |  Call Condition Checker |
  |                      +<---+   |                         |
  +----------------------+    |   +-------------------------+
  |                      |    |   |                         |
  |      Instruction     |    |   | Restore RegisterContext |
  |                      |    |   |                         |
  +----------------------+        +-------------------------+
```

Re: [lldb-dev] STABS

2019-07-30 Thread Pavel Labath via lldb-dev

On 26/07/2019 20:43, Jim Ingham wrote:

Amplifying Fred's comments:

Most of the code in ParseSymtab is parsing the nlist records in the binary.  Only a tiny 
subset of those nlist records are "stabs".  Most are just the format by which 
MachO expresses its symbol table.  So all that needs to be there.

Over the past couple of years, the linker on MachO has switched from using 
nlist records to using the dyld trie data structure.  You can also see evidence 
of that in ParseSymtab.  At this point the nlist records are there because 
there are lots of analysis tools that haven't been updated to support the new 
dyld trie.  At some point, everything will be updated and the linker will 
switch over to only emitting the dyld trie, and not emitting the symbol table 
in nlist form.  When that is done and we convince ourselves we no longer need 
to support older binaries that still use nlist records, we can then remove the 
nlist parsing code.  But until then, this is how the symbol table is expressed. 
 The symbol parsing is actually the majority of the code in ParseSymtab.

Not all nlist records are stabs.  Stabs, per se, are the nlist records that have the 
is_debug flag set.  As Fred said, MachO uses the debug nlist records as the format for 
it's "debug map" which tells us where symbols from .o files ended up in the 
final linked product.  This is definitely NOT a stabs parser, we only support a tiny 
subset of the full stabs debug format, just what is needed for the debug map.  We've 
talked on and off about coming up with a dedicated format for the debug map, but so far 
there's been no strong motivation to actually do that, so we've continued to borrow a 
subset of stabs for the purpose.

There is one bit of ugliness, which is that the debug map parsing is essentially duplicated.  Look 
for: "if (is_debug)" and you will see two very similar blocks (2860 and 3826 in the 
current sources.)  Jason will remember the details better, but there was something gnarly about how 
libraries in the "shared cache" on Darwin systems work that made it awkward to use the 
same code for it and normal libraries.  Some ambitious person could probably go through and unify 
the two sections, but this is code that doesn't see much active development, it pretty much does 
what's needed, so it's not clear what the benefit would be at this point.

  Jim



Thanks for the detailed explanation Jim. I've found it very useful, as 
it plugs a large gap I've had in the knowledge of how debug info works 
on apple platforms.


The reason I was looking at this code in the first place is because I'm 
trying to add unwinding support on windows platforms. It is mostly 
straight-forward, but there is one large hickup in the form of the 
__stdcall calling convention on x86. This is a callee-cleanup 
convention, which AFAICT is a new thing to lldb.


The interesting bit here is that it becomes important to know the size 
of the arguments to a function during unwinding. This size is encoded in 
the symbol names (e.g. "_foo@4"). Due to the way that unwind info is 
represented (the argument pushes aren't represented in the caller), it 
may be necessary to look at the argument size of one function (the callee) 
when unwinding another function (the caller).
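
To illustrate, a hypothetical helper (not actual lldb code) that recovers 
that size from a decorated name:

```cpp
#include <cctype>
#include <optional>
#include <string>

// Recover the __stdcall argument byte count from a decorated symbol name
// such as "_foo@4".
static std::optional<unsigned> GetStdcallArgumentBytes(const std::string &name) {
  size_t at = name.rfind('@');
  if (at == std::string::npos || at + 1 == name.size())
    return std::nullopt;
  unsigned bytes = 0;
  for (size_t i = at + 1; i < name.size(); ++i) {
    if (!std::isdigit(static_cast<unsigned char>(name[i])))
      return std::nullopt;
    bytes = bytes * 10 + static_cast<unsigned>(name[i] - '0');
  }
  return bytes; // e.g. 4 for "_foo@4": the bytes the callee pops on return
}
```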


I was hoping I could represent this information in the Symbol class 
without increasing its size. Hence I was looking at the various other 
bits of information stored in there, and seeing if any of those can be 
removed.


Anyway, it looks like the "stabs" code and the various Symtab bits 
associated with it are going to stay. I'm not sure yet what this means 
for my unwinding effort, as I am still in the process of learning how 
this stuff actually works, and whether the Symtab stuff is really 
needed, but I figured it would be good to at least explain my 
motivations here.



On 26/07/2019 22:57, Chris Bowler wrote:
> IBM is currently adding support for AIX to LLVM and we still have
> customers that use STABS for debug.  I expect customers to try to move
> to DWARF but I think the DWARF support on AIX needs some improvement
> before we can fully transition. I kindly request that we defer removal
> of the STABS support until IBM has a better handle on whether or not
> we'll want it for AIX.

Chris,

it seems that the code in question is going to stay for a while. 
Unfortunately, it looks like it won't be of much use to you, should you 
decide to add STABS support to lldb (it is in macho-specific parts of 
code, and is not a real STABS parser anyway).


Cheers,
pavel

___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [llvm-dev] RFC: changing variable naming rules in LLVM codebase & git-blame

2019-07-30 Thread Pavel Labath via lldb-dev

On 30/07/2019 11:53, Raphael “Teemperor” Isemann wrote:

Is the plan for LLDB to just adapt the code that is trying to follow the new 
code style or also the code using the LLDB code style?


I don't think there's a "plan" at this moment, but I believe Chris meant 
all of LLDB.




I’m in general in favor of moving LLDB to the LLVM code style because it makes 
the LLDB code that interfaces with Clang/LLVM less awkward to write (e.g. no 
more code style confusion when inheriting from a Clang classes inside the LLDB 
code base). But I think if we do this, then it should be discussed/planned in 
more detail and in an lldb-dev thread that actually reaches all/most LLDB devs. 
I wouldn’t even have read this thread if Pavel didn’t CC lldb-dev.

As a side note: LLDB has downstream projects that will suffer from this, but I 
believe (?) LLD has no downstream projects. So I think LLD is maybe also a good 
candidate to test this?


The details of this may have gotten lost in the long thread, but 
actually, LLD has gone through the reformatting in the beginning of this 
month. You can look up the details in the thread, but the short summary 
is that it was done via an automated script in a manner very similar to 
the Great LLDB Reformat a couple of years ago. Judging by the thread, 
there are downstream lld users, and while they encountered some hiccups, 
it looks like the overall merge process has been relatively painless.


The topic of discussion now is "where do we go from here" and LLDB has 
been proposed as the next step.


pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [llvm-dev] RFC: changing variable naming rules in LLVM codebase & git-blame

2019-07-30 Thread Pavel Labath via lldb-dev

On 30/07/2019 01:57, Chris Lattner via llvm-dev wrote:
On Jul 29, 2019, at 10:58 AM, JF Bastien wrote:
I think that Rui rolled this out in an incredibly great way with LLD, 
incorporating a lot of community feedback and discussion, and (as you 
say) this thread has accumulated many posts and a lot of discussion, 
so I don’t see the concern about lack of communication.


I think there’s lack of proper communication for this effort. The RFC 
is all about variable naming, with 100+ responses. Sounds like a 
bikeshed I’ve happily ignored, and I know many others have. Even if 
you don’t think I’m right, I’d appreciate a separate RFC with details 
of what’s actually being proposed. Off the top of my head I’d expect 
at least these questions answered:


  * What’s the final naming convention?
  * Will we have tools to auto-flag code that doesn’t follow it, and
can auto-fix it?
  * Will we clang-format everything while we’re at it?
  * Will we run clang modernizer to move code to C++11 / C++14 idioms
while we’re doing all this?
  * What’s the timeline for this change?
  * Is it just a single huge commit?
  * After the monorepo and GitHub move?
  * Is there a dev meeting roundtable scheduled?
  * What tooling exists to ease transition?
  * Out-of-tree LLVM backends are a normal thing. They use internal
LLVM APIs that should all be auto-updatable, has this been tried?
  * Some folks have significant non-upstream code. Have they signed up
to remedy that situation before the deadline (either by
upstreaming or trying out auto-update scripts)?


LLD and LLDB are indeed good small-scale experiments. However, I think 
the rest of the project is quite different in the impact such a change 
would have. LLVM and clang expose many more C++ APIs, and have many 
more out-of-tree changes (either on top of upstream, or in sub-folders 
such as backends or clang tools). They also have many more 
contributors affected, and not all those contributors have the same 
constraints, making this much more complex. So far this discussion 
hasn’t seemed to care about these concerns, and I’m worried we’re 
about to burn a bunch of bridges. Maybe I missed this part of the 
discussion in the 100+ emails! Sorry if I did… but again, a simple 
updated RFC would solve everything.


Thanks for the detailed list here.  I have no idea what the status of 
most of these are - it sounds like you’re generally asking “what is the 
plan?” beyond LLD.  :-)


Rui, what are your thoughts on next steps?  LLDB seems like a logical 
step, particularly because it uses its own naming convention that is 
completely unlike the rest of the project.




I don't speak for LLDB, but I personally would welcome such a change, 
particularly as there is some newer code in lldb now that attempts to 
follow the about-to-be-changed llvm conventions.


If we're going to go in that direction, it would be good to loop in 
lldb-dev, as I think some people don't follow llvm-dev regularly (and 
this thread in particular).


pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


[lldb-dev] STABS

2019-07-26 Thread Pavel Labath via lldb-dev

Hello everyone,

I recently found myself looking at ObjectFileMachO.cpp. I noticed that 
nearly half of that file (2700 LOC) is taken up by the ParseSymtab 
function, and that maybe one third of that is taken up by what appears 
to be STABS parsing code.


Is anyone still using STABS debug info? If not, can we remove it?

pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [RFC] Removing lldb-mi

2019-07-02 Thread Pavel Labath via lldb-dev

On 02/07/2019 00:09, Jonas Devlieghere via lldb-dev wrote:

Hi everyone,

After long consideration, I want to propose removing lldb-mi from the 
repository. It is basically unmaintained, doesn't match the LLDB code 
style, and worst of all the tests are unreliable if not already 
disabled. As far as I can tell it's missing core functionality to be 
usable from something like, say, emacs.




I am in favour of this proposal, for all of the reasons stated above. 
Since lldb-mi uses lldb stable API, any interested parties can easily 
create a separate project for it on github (or wherever). Maybe this 
would even serve as a spark to reignite lldb-mi development (which has 
been sitting idle for quite some time now).


pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] Trouble running Python tests on Windows

2019-06-25 Thread Pavel Labath via lldb-dev

On 25/06/2019 17:00, Ted Woodward via lldb-dev wrote:
The bug that makes swig before 4 fail with Python 3.7 may turn into a 
big issue, given that swig is now licensed under GPL v3. I believe Apple 
has said in the past that they can’t move past a certain version of swig 
2.x, since the license changed from GPL v2 to GPL v3.





I believe that problem has already arrived, as there is no GPL v2 swig 
that can target any version of python3, much less python 3.7.


pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


[lldb-dev] RFC: Removing SymbolVendor indirection

2019-06-11 Thread Pavel Labath via lldb-dev

Hello all,

we currently have an almost identical set of 20-odd functions on the 
ModuleList, Module, SymbolVendor and SymbolFile classes. The only thing 
the SymbolVendor functions do is take the Module mutex and then call the 
equivalent SymbolFile function to do the work. The only useful task 
SymbolVendor objects perform is holding the list of parsed compile units 
and types.


It also seems like at some point the intention was for the SymbolVendor 
to be able to host multiple SymbolFiles, but right now nobody makes use 
of that feature.


Therefore, I propose to remove the SymbolVendor indirection and have 
Module functions (and everybody else) call the SymbolFile directly when 
they need to.


Holding the compile unit and type lists is something that can be easily 
implemented in the base SymbolFile class instead of the symbol vendor. 
In fact, I would say that it is more natural to implement it that way, 
as it would change code like

  m_obj_file->GetModule()->GetSymbolVendor()->SetCompileUnitAtIndex(
      dwarf_cu->GetID(), cu_sp);

into

  SetCompileUnitAtIndex(dwarf_cu->GetID(), cu_sp);

Locking the module mutex is also a responsibility that can be handled by 
the SymbolFile class. I would say that this also makes things cleaner, 
because even right now, the SymbolFile instances need to be aware of 
locking. E.g., some functions in SymbolFileDWARF (which is the only 
battle-tested symbol file implementation right now) take the module 
mutex themselves, while others assert that the module mutex is already 
taken.
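
As a hedged sketch of that direction (not the actual proof-of-concept 
patch; names simplified), the base class could look something like this:

```cpp
#include <mutex>
#include <vector>
#include "lldb/lldb-forward.h"

// The base SymbolFile class could own both the compile unit list and the
// locking that the SymbolVendor provides today.
namespace lldb_private {
class SymbolFileBase {
public:
  void SetCompileUnitAtIndex(size_t idx, const lldb::CompUnitSP &cu_sp) {
    std::lock_guard<std::recursive_mutex> guard(m_module_mutex);
    if (m_compile_units.size() <= idx)
      m_compile_units.resize(idx + 1);
    m_compile_units[idx] = cu_sp;
  }

private:
  std::recursive_mutex m_module_mutex; // in reality, the owning Module's mutex
  std::vector<lldb::CompUnitSP> m_compile_units;
};
} // namespace lldb_private
```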


The ability to have more than one SymbolFile for a module sounds useful, 
but: a) nobody uses it; b) I'm not sure it would really work in the 
current form. The reason for (b) is that the SymbolFile interface 
assumes a fixed numbering of compile units starting with zero. As such, 
this would require two SymbolFile instances to be aware of themselves 
anyway, in order to present a coherent compile unit list to the rest of 
lldb.


I don't think that the removal of SymbolVendor indirection would remove 
the possibility of multiple SymbolFiles though. Should anyone wish to do 
so, he can always implement a CombiningSymbolFile class, which will 
contain other symbol files, and delegate to them. This is kind of what 
is already happening, with SymbolFileDWARFDebugMap, and the external 
module lists in SymbolFileDWARF.


I already have a proof of concept patch which implements this idea. It's 
not a particularly big one -- it touches about 1k lines of code. If we 
agree this is the way to go, I can clean it up and submit for review as 
a sequence of smaller patches.


regards,
pavel

PS: I am not saying anything about the role of the SymbolVendor in 
finding the symbol files. This is because I am not planning to change 
that in any way. The idea is that the SymbolVendor will still be 
responsible for finding the symbol file, but it will then return the 
symbol file directly, instead of wrapping it in a SymbolVendor instance.


PPS: To get an idea of the changes involved, here's a stat of the 
changes required. Most of the changes are in SymbolVendor.cpp, and 
involve deleting code. The changes in other files are mostly mechanical 
and involve changing code fetching the SymbolVendor to access the 
SymbolFile directly.


 include/lldb/Core/Module.h                         |  43 +-
 include/lldb/Symbol/SymbolFile.h                   |  39 +-
 include/lldb/Symbol/SymbolVendor.h                 | 135 +-
 include/lldb/Symbol/Type.h                         |   2 -
 include/lldb/lldb-private-interfaces.h             |   4 +-
 source/API/SBCompileUnit.cpp                       |   9 +-
 source/API/SBModule.cpp                            |  37 +-
 source/Commands/CommandObjectTarget.cpp            | 173
 source/Core/Address.cpp                            |  38 +-
 source/Core/Module.cpp                             | 121 +++---
 source/Core/SearchFilter.cpp                       |   8 +-
 source/Expression/IRExecutionUnit.cpp              |   6 +-
 .../MacOSX-DYLD/DynamicLoaderMacOS.cpp             |  27 +-
 .../ExpressionParser/Clang/ClangASTSource.cpp      |  27 +-
 source/Plugins/Platform/MacOSX/PlatformDarwin.cpp  | 196 +
 .../SymbolFile/Breakpad/SymbolFileBreakpad.cpp     |  45 +-
 .../SymbolFile/Breakpad/SymbolFileBreakpad.h       |  15 +-
 .../SymbolFile/DWARF/DWARFASTParserClang.cpp       |  16 +-
 .../Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp   | 125 +++---
 source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h  |  14 +-
 .../SymbolFile/DWARF/SymbolFileDWARFDebugMap.cpp   | 105 ++---
 .../SymbolFile/DWARF/SymbolFileDWARFDebugMap.h     |  10 +-
 .../SymbolFile/DWARF/SymbolFileDWARFDwo.cpp        |  13 +-
 .../Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.h  |   1 -
 .../SymbolFile/NativePDB/SymbolFileNativePDB.cpp   |  47 ++-
 .../SymbolFile/NativePDB/SymbolFileNativePDB.h     |  11 +-
 source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp    | 104

Re: [lldb-dev] Are overlapping ELF sections problematic?

2019-06-05 Thread Pavel Labath via lldb-dev

On 04/06/2019 11:21, Thomas Goodfellow wrote:

Hi Pavel


I can't say what the situation is in the rest of llvm, but right now lldb
has zero test coverage for the flow you are using, so the fact that this
has worked until now was pretty much an accident.


It was a pleasant surprise that it worked at all, since flat memory
maps have become near-ubiquitous. But it's good to at least know that
the conceptual ice hasn't become any thinner through the patch, i.e.
it refines the existing state rather than reflecting a more explicit
policy change.


Yes, I didn't mean to make anything drastic with this patch. However, I 
would say that independently of this patch, in the past few years, lldb 
has gotten more strict in accepting features/fixes which don't have test 
coverage and/or are useful in only some peculiar downstream use case 
(see removal of ocaml/go/java language support, etc.).





In the mean time, I believe you can just patch out the part which drops
the overlapping sections from the section list and get behavior which
was more-or-less identical to the old one.


I think this also requires reverting the use of the IntervalMap as the
VM address container, since that relies upon non-overlapping
intervals? That smells like a bigger fork than I would want like to
keep indefinitely alive.


It sounds like you might be able to just skip adding some (all?) of the 
sections into the interval map, which should result in all of them being 
created, like they used to be.


Or maybe you could fudge their "file addresses" and remap them into 
non-overlapping regions at this level too. It would break lookups by 
file addresses for the remapped sections, but this is something that 
didn't work already when the addresses overlapped. I'm not sure what 
else could be broken by this. We already do some fudging like this for 
relocatable (.o) files, which have all addresses starting at zero, so it 
seems like at least something can work here.


For my own education, would you be able to send me one of your files 
with these overlapping sections (or maybe just the output of "readelf 
-e" or something)? I don't know much about these more exotic platforms, 
so being aware things like these might be of help when doing future changes.


Incidentally, I was just made aware that this change also breaks for 
thread-local sections, which can appear to have overlapping file 
addresses with other sections. So I will probably be revisiting this 
piece of code soon. However, right now my thinking is to simply stop 
putting thread-local sections into the address range map while simultaneously 
starting to ignore them for file address lookups (as thread-local 
sections need to be handled in a more complex manner anyway). This won't 
help your use case much...





I believe that a long term solution here would be to introduce some
concept of address spaces to lldb. Then these queries would no longer be
ambiguous as the function FindSectionContainingFileAddress would
(presumably) take an additional address-space identifier as an argument.
I know this is what some downstream users are doing to make things like
this work. However, this is a fairly invasive change, so doing something
like this upstream would require a lot of previous discussion.


Would this also extend the GDB remote protocol, where the single flat
address space seems the only current option? (at least the common
solution in various GDB discussions of DSP targets is address muxing
of the sort we're using)


I would say "hopefully yes", but I not very familiar with these kinds of 
targets.




I imagine such changes are hampered by the lack of in-tree targets
that require them, both to motivate the change and to keep it testable
(the recent "removing magic numbers assuming 8-bit bytes" discussion
in llvm-dev features the same issue). Previously Embecosm was
attempting to upstream a LLVM target for its demonstration AAP
architecture (features multiple address spaces), e.g.
http://lists.llvm.org/pipermail/llvm-dev/2017-February/109776.html .
However their public forks on GitHub only reveal GDB support rather
than LLDB, and that implementation is by an address mux.

Unfortunately the architecture I'm working with is (yet another) poor
candidate for upstreaming, since it lacks general availability, but
hopefully one of the exotic architectures lurking in the LLVM shadows
someday steps forth with a commitment to keep it alive in-tree.



Yeah, the lack of in-tree targets is one of the causes of (but also a 
consequence of?) the lack of address space support. I've been following 
the non-8-bit thread from a distance, and FWIW, I would be fine with 
having some kind of a mock target supporting these things in lldb. I 
might even prefer debugging things against a simple mock instead of some 
complicated-but-real target.


The other causes are the main contributors not knowing enough about 
these architectures to help drive this, and just being generally busy 
with other stuff. :/



Re: [lldb-dev] Are overlapping ELF sections problematic?

2019-06-03 Thread Pavel Labath via lldb-dev

On 03/06/2019 10:19, Thomas Goodfellow via lldb-dev wrote:

I'm working with an embedded platform that segregates memory between
executable code, RAM, and constant values. The three kinds occupy
three separate address spaces, accessed by specific instructions (e.g.
"load from RAM address #0" vs "load from constant ROM address #0")
with fairly small ranges for literal address values. So necessarily
all three address spaces all start at zero.

We're using the LLVM toolchain with ELF32 files, mapping the three
spaces as .text, .data, and .crom sections, with a linker script
setting the address for all three sections to zero and so producing a
non-relocatable executable image (the .text section becomes a ROM for
an embedded device so final addresses are required). To support
debugging with LLDB (where the GDB server protocol presumes a single
flat memory space) the sections are mapped to address ranges in a
larger space (using the top two bits), and the debugger stub of the
platform then demuxes the memory accesses to the appropriate address
spaces.

Until recently this was done by loading the ELF file in LLDB, e.g:
"target modules load --file test.elf .data 0 .crom 0x4000 .text
0x8000". However the changes introduced through
https://reviews.llvm.org/D55998 removed support for overlapping
sections, with a remark "I don't anticipate running into this
situation in the real world. However, if we do run into it, and the
current behavior is not suitable for some reason, we can implement
this logic differently."

Our immediate coping strategy was implementing the remapping in the
file parsing of ObjectFileELF, but this LLDB change makes us
apprehensive that we may start encountering similar issues elsewhere
in the LLVM tooling. Are ELF sections with overlapping addresses so
rare (or even actually invalid) that ongoing support will be fragile?
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev



Hi Thomas,

I can't say what the situation is in the rest of llvm, but right now lldb 
has zero test coverage for the flow you are using, so the fact that this 
has worked until now was pretty much an accident.


The reason I chose to disallow the overlapping sections in the patch you 
quote was because it was very hard to say what will be the meaning of 
this to the upper layers of lldb. For instance, a lot things in lldb 
work with "file addresses" (that is, virtual address, as they are known 
in the file, without any remapping). This means that the overlapping 
sections become ambiguous even though you have remapped them to 
non-overlapping "load addresses" with the "target modules load" command. 
For instance, the result of a query like 
"SectionList::FindSectionContainingFileAddress(lldb::addr_t)" would 
depend on how exactly the search algorithm was implemented.


I believe that a long term solution here would be to introduce some 
concept of address spaces to lldb. Then these queries would no longer be 
ambiguous as the function FindSectionContainingFileAddress would 
(presumably) take an additional address-space identifier as an argument. 
I know this is what some downstream users are doing to make things like 
this work. However, this is a fairly invasive change, so doing something 
like this upstream would require a lot of previous discussion.
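
To make that concrete, here is a sketch of what such a query could look 
like; "asid_t" and the extra parameter are hypothetical, not existing lldb 
API:

```cpp
#include "lldb/lldb-forward.h"
#include "lldb/lldb-types.h"

// An address-space-qualified section lookup: the same file address can map
// to different sections in different address spaces, so the caller must say
// which space it means.
namespace lldb_private {
using asid_t = uint32_t; // 0 could denote the default/flat address space

lldb::SectionSP FindSectionContainingFileAddress(const SectionList &sections,
                                                 lldb::addr_t file_addr,
                                                 asid_t address_space);
} // namespace lldb_private
```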


In the mean time, I believe you can just patch out the part which drops 
the overlapping sections from the section list and get behavior which 
was more-or-less identical to the old one. However, I can't guarantee 
that nothing else will break in this scenario. I also wouldn't be 
opposed to making some change to this logic upstream too, if we can come 
up with some consistent story as to what exactly this means.


regards,
pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [Lldb-commits] [lldb] r360757 - Group forward declarations in one namespace lldb_private {}

2019-05-16 Thread Pavel Labath via lldb-dev

On 16/05/2019 01:10, Jim Ingham via lldb-dev wrote:

When you add to them you are often adding some larger feature which would have 
required a rebuild anyway, and they go long times with no change...  I have 
never found the rebuild required when these files are touched to be a drag on 
my productivity.  And I really appreciate their convenience.

But thanks for your friendly advice.

Jim



I don't want to make a big deal out of it, but I'm also not a fan of the 
lldb-forward header. My two main reasons are:
- it's inconsistent with the rest of llvm, which does not have any 
headers of such sort (LLVM.h, which we talked about last time, is the 
only thing remotely similar, but that still has a much narrower scope)
- it makes it easier to violate layering. E.g. right now I can type 
something like:

void do_stuff_with(Target *);
in to a "Utility" header, and it will compile just fine because it will 
have the forward-declaration of the Target class available even though 
nothing in Utility should know about that class.


Neither of these is a big problem: this is not the most important way in 
which we differ from llvm, and also the layering violation will become obvious 
once you start to implement the "do_stuff_with" function (because you 
will hopefully need to include the right header to get the full 
definition). However, for these reasons, I would prefer if we got rid of 
this header, or at least moved towards a world where we have one 
forward-declaring header for each top-level module (so 
"lldb/Utility/forward.h", would forward-declare only Utility stuff, etc.).


pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [RFC] Support for AArch64/Arm64 scalable vector extension (SVE)

2019-04-26 Thread Pavel Labath via lldb-dev

On 25/04/2019 00:07, Omair Javaid via lldb-dev wrote:
I would also like to open a discussion on how we can implement variable 
length registers in LLDB and what could be the consequences of those 
changes.


I am not saying I am particularly happy with how it was implemented, but 
you should take a look at the "dynamic_size_dwarf_expr" members of the 
RegisterInfo struct. These were added to support some variable-length 
registers on mips, and it sounds like they could be useful here too.
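
For reference, the relevant part of that struct looks roughly like this 
(field names quoted from memory; check lldb/lldb-private-types.h for the 
authoritative definition):

```cpp
#include <cstddef>
#include <cstdint>

// Approximate shape of the hook mentioned above, shown in isolation.
struct RegisterInfo {
  // ... name, byte_size, encoding, format, register numbers, etc. ...

  // A DWARF expression that, when evaluated in the context of the current
  // frame, yields the register's size in bytes (added for MIPS registers
  // whose width depends on processor state).
  const uint8_t *dynamic_size_dwarf_expr_bytes;
  size_t dynamic_size_dwarf_len; // number of bytes in the expression
};
```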


pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] LLDB Website

2019-04-24 Thread Pavel Labath via lldb-dev

On 24/04/2019 03:19, Jonas Devlieghere via lldb-dev wrote:



On Tue, Apr 23, 2019 at 6:04 PM Jonas Devlieghere wrote:

On Tue, Apr 23, 2019 at 5:43 PM Tanya Lattner wrote:

On Apr 23, 2019, at 5:06 PM, Jonas Devlieghere wrote:

On Tue, Apr 23, 2019 at 5:00 PM Tanya Lattner wrote:

On Apr 23, 2019, at 11:54 AM, Jonas Devlieghere wrote:

Hey Tanya,

On Tue, Apr 23, 2019 at 11:51 Tanya Lattner wrote:

Jonas,

Ignore what I said before as these do need to be
separate targets. It appears the new targets are
running doxygen. This isn’t something we typically do
as a post-commit hook since it takes a while. I’ll
need to do this via the doxygen nightly script. Any
concerns?

That sounds perfect. Can we still do the regular website
post commit? 


Yes, so it will do docs-lldb-html on every commit.


Perfect!


So I am able to generate the cpp reference docs:
https://lldb.llvm.org/cpp_reference/index.html

However, the main website links to
https://lldb.llvm.org/cpp_reference/html/index.html. Do
you want the html in that url? I can change the alias. We
strip for other doxygen.


Let's keep it without the html. I'll update a link on the
website and add a redirect.


As for python docs, what is required to build those? It's
not showing up as a target for me.


This is probably because you don't have `epydoc` installed
(sudo pip install epydoc).
I think you'll have to re-run cmake afterwards for it to pick it
up. The corresponding target should then be `lldb-python-doc`.



Well installing epydoc did the trick, but I don’t think the
doxygen script is the right place for this target. I have not
dug into it yet but it appears to require some LLVM libraries
and is building those. I’m letting it finish to verify it builds
but I’ll have to sort out the best way of doing this on the
server. We have other scripts that generate other documentation
that build parts of LLVM. Ideally, I would want to leverage that
and reduce build times.


Yeah, the annoying thing about the Python documentation is that it
builds the C++ API, then runs swig to generate the Python wrapper,
and finally generates the docs from that.


It should be possible to solve this by tweaking the dependency graph a 
bit. There's no fundamental reason why you need to build anything in 
order to run swig. It is purely a textual step -- it ingests header 
files and interface definitions and spits out python and cpp files. The 
inputs are present as static checked in source, so the swig step could 
theoretically be the very first build command that we run.



I wonder if we can just use the static bindings that are checked-in
instead. I will look into that later today/tomorrow. 



Right, so the reason is that we don't have the static bindings on 
llvm.org  (we have them for swift-lldb on GitHub).
Maybe we should check them in upstream too? That's something the 
community will have to weigh in on...




I think it would be good to avoid that...

pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [RFC] Adding a clang-style LLVM.h (or, "Are you tired of typing 'llvm::' everywhere ?")

2019-04-18 Thread Pavel Labath via lldb-dev

On 18/04/2019 20:03, Robbert Haarman via lldb-dev wrote:

On Thu, Apr 18, 2019, at 7:31 AM, Pavel Labath via lldb-dev wrote:


Out of interest, I took a look at what lld is doing. I've found that
while it doesn't have a LLVM.h equivalent, it is a heavy user of "using
namespace llvm" (about 2 out of 3 cpp files have it).


LLD has include/lld/Common/LLVM.h, which is included in 60 or so places.



Woops. That's actually a pretty embarrassing mistake. Thanks for 
correcting me. :)

___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] [RFC] Adding a clang-style LLVM.h (or, "Are you tired of typing 'llvm::' everywhere ?")

2019-04-18 Thread Pavel Labath via lldb-dev
Thanks for the replies. I was hoping to get more positive feedback for 
this, so given the current mixed-feelings replies, I think I'll just 
give up on this idea, unless a more vocal supporter appears (probably 
not the best idea to send this out just before the easter holidays).


In the mean time, here are my thoughts on what was said.


On 18/04/2019 01:54, Adrian McCarthy wrote:

I don't have a strong opinion, but I lean against the idea for two reasons:

1.  The `llvm::` prefixes don't really hinder readability for me.  
They're like `std::` prefixes on all the C++ standard library types, 
which I'm perfectly happy to type and read--moreso than using 
declarations.  Sure, anybody who's been here a while knows which classes 
come from LLVM, but new folks might build that knowledge by seeing the 
prefixes.


Yeah, I was wondering why I'm bothered by typing "llvm::" and not by 
"std::". I concluded that this is down to two things:
1. we don't use that many things from the std:: namespace actually. 
Pretty much everything except std::string and std::vector is discouraged 
because llvm has better alternatives


2. llvm names are longer. This is not just due to the "llvm" prefix, 
which is just one char, but also the class names themselves tend to be 
longer. std::vector vs llvm::SmallVector, std::map vs. llvm::DenseMap, 
std::string vs. llvm::StringRef, etc.


This effect gets multiplied once you start to combine things. For 
instance if you have a function returning Expected<std::vector<T>> (which 
is not an unreasonable thing to do), then by the time you spell out the 
full type, more than half of your usable horizontal space is gone. 
Because of this, I've found myself using "auto" or relying on ADL more 
and more often, which I don't consider very ideal either.


I don't think using "auto" is always a good choice because it hides 
interesting details. E.g. an Optional can look a lot like 
Expected, but there are differences in how they are supposed used 
which should not be overlooked (I wish I was able to type 
"Expected" :P). And ADL is sometimes just too magical...





2.  I'm not a fan of forward declaring types provided by other parts of 
the code, as it requires intimate knowledge of implementation details.  
In practice this may not matter much for the types we're considering.  
If it grew more widespread, however, I'd be more concerned.  (Somewhere 
I've written a long explanation of this opinion.  I'll go search for it 
if anyone cares.  The Google style guide discourages forward 
declarations, but the rationale given there isn't as persuasive.)


Yeah, I agree the forward declarations are not ideal (and the clang file 
did raise my eyebrows when I first saw it), but after a while I started 
to like it.


FWIW, I wouldn't be opposed to just #including the relevant files 
instead of forward-declaring stuff, but I think doing it the same way is 
better for consistency.



Out of interest, I took a look at what lld is doing. I've found that 
while it doesn't have a LLVM.h equivalent, it is a heavy user of "using 
namespace llvm" (about 2 out of 3 cpp files have it). This approach 
wouldn't work that well for us because of naming conflicts ("Module"), 
and I would consider it inferior for the same reason that "using 
namespace std" is discouraged -- it just brings in too much stuff into 
your scope.


regards,
pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


[lldb-dev] [RFC] Adding a clang-style LLVM.h (or, "Are you tired of typing 'llvm::' everywhere ?")

2019-04-17 Thread Pavel Labath via lldb-dev

Hello all,

some llvm classes are so well-known and widely used that qualifying 
them with "llvm::" serves no useful purpose and only adds visual noise. 
I'm thinking here mainly of ADT classes like String/ArrayRef, 
Optional/Error, etc. I propose we stop explicitly qualifying these classes.


We can implement this proposal the same way clang solved the same 
problem, which is by creating a special LLVM.h header in the Utility 
library. This header would adopt these classes into the lldb_private 
namespace via a series of forward and "using" declarations.


I think clang's LLVM.h contains a well-balanced collection of adopted 
classes, and it should cover the most widely-used classes in lldb too, 
so I propose we use that as a starting point.
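
For concreteness, a minimal sketch of what such a header could contain, 
modeled on clang's LLVM.h (the exact set of adopted classes is up for 
discussion):

```cpp
// Forward-declare the widely used llvm classes...
namespace llvm {
class StringRef;
class Error;
template <typename T> class ArrayRef;
template <typename T> class Optional;
template <typename T> class Expected;
} // namespace llvm

// ...and adopt them into lldb_private, so code can say e.g. "StringRef"
// without the llvm:: qualifier.
namespace lldb_private {
using llvm::ArrayRef;
using llvm::Error;
using llvm::Expected;
using llvm::Optional;
using llvm::StringRef;
} // namespace lldb_private
```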


What do you think?

regards,
pavel

PS: I'm not proposing any wholesale removal of "llvm::" qualifiers from 
these types, though I may do some smaller-scale removals if I'm about to 
substantially modify a file.

___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev

