Re: [lldb-dev] Proposal for refactoring process launching

Greg Clayton Wed, 24 Sep 2014 11:03:52 -0700

Comments:

- When running processes one might want to get signals for the process that are 
delivered while the child process is running. There can be any amount of 
signals over the child process' lifetime. This must be included in the design. 
It might not work for windows, but it will need to be worked into the design.


- You can change the internal API, but you can't change the public API. If you 
make changes to any of the launch info classes internally, you will need to 
keep the SBLaunchInfo API the same. I know you like cleaning things up, but we 
have clients that use this API and we don't want it broken. You will need to 
carefully check for any other things that get modified in the launch info class 
and make sure any such cases are fixed.

- We should make ProcessLauncher a plug-in and be able to request them by name. 
Then we can add things to the launch info to be able to specify the process 
launcher plug-in name ("posix_spawn", "fork", "apple-xpc", "default" which can 
map to the default launcher for a host, etc).

More comments below

> On Sep 23, 2014, at 1:38 PM, Zachary Turner <ztur...@google.com> wrote:
> 
> Apologies in advance for the long read, tried to write this up as clearly as 
> possible.  At a minimum I'd like to complete the HostProcess abstraction and 
> the LaunchProcess signature change, but I think the part that follows (which 
> I've listed as optional) is a net gain for the codebase and improves the 
> maintainability and clarity of all of this code.
> 
> 
> This will discuss a plan for refactoring process management and launching 
> code in the Host layer into a set of reusable abstractions.  This is part of 
> a continued effort to refactor Host.cpp in a way that lends itself to better 
> re-usability, better platform-separation (code specific to platform X is 
> isolated to a file specific to platform X), and a stronger barrier between 
> generic code and platform specific code.  One of the biggest difficulties in 
> my efforts to port LLDB to Windows so far has been locating and fixing places 
> in LLDB where an operation was assumed to be generic (either logically or 
> through the use of a particular API) when it actually was not.  By 
> re-thinking the Host layer from the bottom up with Windows in mind, we can 
> arrive at a set of abstractions that is truly generic enough to be used 
> across all platforms, while still making it possible to use platform specific 
> features from other platform specific code.
> 
> The HostProcess abstraction
> 
> It's probably easiest to start with the core abstraction itself, and then 
> discuss the other supporting changes needed to arrive at that.  At the core 
> of this refactor is the |HostProcess| abstraction.  This is a replacement for 
> what is currently an lldb::process_t, and serves as an abstraction that 
> allows one to manipulate a process running on the system.  In order to create 
> a strong barrier between generic code and platform specific code, 
> |HostProcess| will use a facade pattern similar to that used by |HostThread|, 
> whereby |HostProcess| provides only methods which are generic enough to work 
> on every platform, and wraps a |HostNativeProcessBase| for which there might 
> be many derived implementations, each implementing the generic options for 
> the particular platform, as well as potentially offering any relevant 
> platform specific operations so that other platform specific code can take 
> full advantage of the platform.
> 
> A generic interface for process will provide at a minimum the following 
> methods:
> 
>       • Terminate
>       • GetMainModule
>       • GetProcessId

"GetProcessID" instead of "GetProcessId".

>       • IsRunning
> 
> Additionally, there may be use for methods to inspect the process while 
> running, such as methods like the following:
> 
>       • ReadProcessMemory
>       • WriteProcessMemory

Why have "Process" in the name? It is on a process class.

You might want suspend/resume as well.

> 
> Additional platform-specific methods will be exposed through the 
> corresponding subclasses of |HostNativeProcessBase|.  The most obvious 
> example of this is a Signal method that will be exposed through 
> |HostProcessPosix|, although the proposed design provides the flexibility for 
> this set of platform-specific methods to be arbitrary.
> 
> 
> Launching Processes
> Currently the ability to launch a process is exposed through a single method 
> Host::LaunchProcess which takes a |ProcessLaunchInfo| argument.  Upon 
> completion, LaunchProcess writes the pid of the newly created process into 
> the |ProcessLaunchInfo|.  Because of this, the ProcessLaunchInfo structure is 
> required to be passed in as a non-const reference so it can be modified.  To 
> address this, the launch method will be modified to return a HostProcess.  
> The PID will not be written back into the |ProcessLaunchInfo| (and if 
> possible the pid field will ideally be removed from |ProcessLaunchInfo| 
> entirely), allowing it to be passed const.

We should make this a plug-in type for reasons mentioned at the beginning. You 
also need to keep the public API unchanged for SBLaunchInfo. You might need to 
make a small wrapper class for this to wrap the SBLaunchInfo and the resulting 
HostProcess into a single class to be able to wrap this. You should also add 
the ability to set a process launcher plug-in by name both internally and in 
the SB layer. It is ok to add functions to a public API, but not to change it. 
You can add new API to the SB layer ( like adding a new SBHostProcess and 
having "SBHostProcess SBLaunchInfo::GetHostProcess()", but you need to maintain 
the old API as well).

> 
> Optional code re-org related to process launching
> The process launching code is relatively confusing and difficult to maintain. 
>  Because the code is not modular and not written to be re-usable at a 
> granular level, some hackery exists to make the correct methods be found by 
> the linker on various platforms.  The following sections describes a possible 
> code re-organization that makes everything more modular and re-usable, and 
> makes the code easier to understand and maintain as a result.
> 
> The basis of this code re-organization is the creation of a |ProcessLauncher| 
> interface will be created.  Instead of using Host::LaunchProcess, anyone 
> wishing to launch a process will instantiate some subclass of 
> |ProcessLauncher|, and then invoke the Launch method on this interface, which 
> will return a HostProcess.  For those just wishing to get the default 
> behavior on a particular platform (i.e. get the behavior equivalent to 
> calling Host::LaunchProcess now), |DefaultProcessLauncher| will be typedef'ed 
> to the most sensible process launcher on each platform.  This allows existing 
> code to be written by simply instantiating a |DefaultProcessLauncher| and 
> using it.  The interface of |ProcessLauncher| will provide the following 
> methods:
> 
>       • LaunchProcess
>       • AddExitCallback

> 
> The second of these two methods, AddExitCallback, replaces and improves upon 
> the existing mechanism in |ProcessLaunchInfo| which allows the user to 
> specify an exit callback.  The improvement comes from the fact that multiple 
> exit callbacks can be specified.

That is fine. Not really needed by any case I can think of but a good change.

> 
> This approach yields a number of benefits.  For starters, it allows us to 
> abstract out the notion of the launching strategy into a reusable piece of 
> code so that platforms which have multiple launching strategies have the 
> ability to make use of specific strategies in other platform specific code, 
> or so that remote process launches can be treated the same as local process 
> launches.  As an example, on MacOSX there are three different launching 
> strategies:  Launch with XPC services, launch with Applescript, and launch 
> with posix_spawn.  Each of these could be independent implementations of 
> |ProcessLauncher|, with the default implementation on MacOSX being a fourth 
> implementation that chooses one of the first 3 implementations to delegate to 
> at runtime.  As a result, the 3 implementations can be used independently of 
> each other should the need arise.
> 
> A second benefit is that it opens up the possibility of treating remote 
> process launch transparently with local process launch -- just have a 
> |RemoteProcessLauncher| class.

Uh... Then don't call your class HostProcess. I really would rather not have 
remote launching involved in this at all. Remote launching should result in a 
lldb::pid_t and it won't use the native process layer for anything, so I would 
avoid including remote process launching in this as it mucks up what info might 
need to be in the launch info (remote host/port, password, etc). So no remote 
process support please.

> 
> There are additional benefits in the form of overall code health, platform 
> separation, and just generally improving the modularity of the codebase, 
> although those are perhaps less tangible.
> 
> Monitoring for process exit
> It is important to know when a process exits, for 2 reasons.  First, on some 
> platforms the process needs to be reaped.  Second, asynchronous exit 
> callbacks need to be invoked.  The implementation of the monitoring is 
> platform dependent, and a strategy for each platform can be summarized as 
> follows:
> 
>       • Windows
>               • Create a background thread, use WaitForSingleObject
>               • [Better, alternative implementation] Create one "static" 
> background thread for all processes, use WaitForMultipleObjects() to monitor 
> many processes at once, like select().
>       • MacOSX
>               • Processes spawned with posix_spawn - Use MacOSX "dispatch" 
> library to queue monitoring code on a system thread.
>               • Processes spawned with XPC - Use MacOSX "dispatch" library to 
> queue monitoring code on a system thread.
>               • Processes spawned with Applescript - No monitoring necessary
>       • Non-MacOSX posix systems
>               • Processes spawned with posix_spawn - Create a background 
> thread, use waitpid in the background thread.

We also need to be able to deliver signals during child process execution.

>   
> To handle this, a |ProcessExitMonitor| abstraction will be created, with 
> implementations for each of the above strategies.  Individual implementations 
> of |ProcessLauncher| will provide a method CreateProcessMonitor() which 
> returns an instance to the correct monitor type, and the resulting 
> |ProcessExitMonitor| will be stored in the |HostProcess|. 

That is fine as long as NULL can be returned in cases where a process can't be 
monitored (AppleScript).
> 
> It’s possible to see some of the benefit in code re-organization from the 
> above breakdown of implementations.  All posix-based systems can sometimes 
> use a posix_spawn based strategy for launching processes, but MacOSX might 
> use other strategies depending on the situation.  Even when MacOSX uses 
> posix_spawn though, it still uses a different exit monitoring implementation. 
>  This set of modular abstractions handles this nicely.  
> |ProcessLauncherPosixSpawn| goes in host/posix, and takes a 
> |ProcessExitMonitor| to its constructor.  If MacOSX decides at runtime to 
> launch using posix_spawn, it passes a |DispatchExitMonitor|, which lives in 
> host/macosx to the constructor.  All code always lives at the most generic 
> level where it makes sense for everyone, and there’s no code pollution or 
> pre-processor hackery to get certain functions to be visible.

We might want the rename the ProcessExitMonitor to ProcessStatusMonitor since 
signals can be delivered from the child process that don't cause it to exit 
(SIGSTOP, etc).
> 
> 
> Since the |HostProcess| can go out of scope while the process is still 
> running on the system, it is important that the lifetime of the 
> |ProcessExitMonitor| be tied to the lifetime of the process running on the 
> system,

> and not to the lifetime of the |HostProcess| object inside of LLDB.  This can 
> be achieved by having |HostProcess| store an 
> std::weak_ptr<ProcessExitMonitor>, and having the background thread own the 
> |ProcessExitMonitor| through an std::shared_ptr<ProcessExitMonitor>.  This 
> way, even if LLDB decides it no longer needs the |HostProcess|, if the 
> process is still running in the background it will get reaped and callbacks 
> fired correctly.
> 
> Getting it done
> In an effort to keep the CLs manageable, this work will be split into 
> multiple changes, in the order described below:
>       • Create the HostProcess abstraction, implement it for all platforms.  
> Do not update any code to use HostProcess yet.
>       • Modify the signature of Host::LaunchProcess to return a HostProcess, 
> and remove the pid from ProcessLaunchInfo.  Update calling code accordingly.

Add step here: 
        Create a SBHostProcess.h/.cpp/.i file and plumb it through the public 
API. Modify SBLaunchInfo to be able to return a SBHostProcess, yet still 
maintain its current API. I actually just looked and it doesn't look like we 
have a  SBLaunchInfo::GetProcessID() so there might not be anything to do. Just 
make sure not to remove any API calls from the public API, and feel free to add 
what you want.

>       • Create the various implementations of ProcessExitMonitor, update 
> Host::LaunchProcess to use the appropriate ProcessExitMonitor instead of 
> calling StartMonitoringChildProcess.
>       • Create the various implementations of ProcessLauncher, update calling 
> code to use them instead of Host::LaunchProcess, delete Host::LaunchProcess.

Let me know what you think about the new ProcessLauncher plug-in type and all 
my other comments. Overall sounds like a good plan.

Greg



_______________________________________________
lldb-dev mailing list
lldb-dev@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev

Re: [lldb-dev] Proposal for refactoring process launching

Reply via email to