Re: [Lldb-commits] [PATCH] D42210: Re-enable logging on the child side of a fork() in lldb-server platform mode

Pavel Labath via lldb-commits Tue, 06 Feb 2018 02:19:48 -0800

Right I see. Thanks for the explanation.

I feel this is a good time to bring up again my idea for implementing
gdb-server connections in a port-forwarder-friendly way. It's
described at 
<http://lists.llvm.org/pipermail/lldb-dev/2017-November/012973.html>,
so I am not going to repeat the details here, but the crux of it is
that it would allow us to connect to gdb-server without requiring the
running of any special port-forwarding programs(*), which is an issue
that periodically comes up as a problem for some one. I don't know if
you're happy with the way you are currently doing the port forwarding,
but just in case you're not, I want to tell you that there is a way to
avoid it. :)


(*) In some situations (which probably includes your iphone scenario
as well) you will still need to forward *one* port for the initial
platform connection, but that can be simpler than needing to forward
dozens of ports to enable fast parallel testing. Also, in a lot of
scenarios (like a typical NAT setup) you wouldn't need to forward any
ports at all.

pl

On 5 February 2018 at 19:39, Jason Molenda <jmole...@apple.com> wrote:
> Yeah I saw the TCPSocket::Accept differences on top of tree after sending the 
> email - unfortunately I have to do my work on this old branch, and I can't 
> build our top of tree sources for the iphone target right now. It's not a 
> great situation!
>
> The min/max ports are needed for running the testsuite on a device, we run a 
> program to relay a specific list of ports between the mac and the phone so we 
> have to depend on that.
>
> I'll see if I can get the testsuite running on my local mac via lldb-server 
> platform on top of tree and see what changes are needed there, hopefully I 
> can repo the same problems I'm hitting on this old branch if they're still 
> applicable.
>
>
>> On Feb 5, 2018, at 2:56 AM, Pavel Labath <lab...@google.com> wrote:
>>
>> Hi Jason,
>>
>> Thanks for the heads up. I look forward to getting rid of fork()
>> there, but there is one thing that's not clear to me. You say that
>> TCPSocket::Accept() calls CloseListenSockets().. I see don't see
>> anything like that in the current code, and I know for a fact that we
>> are able to handle multiple (concurrent) connections, as that is how
>> we run tests on android right now.  Could this be some problem with
>> the branch that you were working on?
>>
>> In general, if you don't need to use the --min-port/--max-port
>> arguments (that is indeed broken in the fork mode), then everything
>> should already be set-up for the parallel remote testing. In fact,
>> unless you have a hard requirement to use --min/max-port then I would
>> advise against it. I think it can cause you a lot of problems down the
>> line. The issue with vending out ports like this is that the
>> "PortCoordinator" does not have control over the whole system, and it
>> cannot prevent some other process from claiming one of the ports that
>> it "owns". So, you may need to implement some retry logic to handle
>> the case when debugserver cannot bind to the port that it was assigned
>> to.
>>
>> I think that a much more reliable and simpler solution would be to
>> have debugserver ask the OS to assign it an unused port (by binding to
>> port zero), and then report that port back to lldb-platform (via the
>> --named-pipe arg -- this is how we use llgs right now). Besides fixing
>> the port allocation problem, this will also ensure that by the time
>> the platform reports the port back to the client, debugserver is
>> already primed and waiting for a connection. I'm not sure if
>> debugserver right now supports the --named-pipe argument, but it
>> should be fairly trivial to add it if needed.
>>
>> Of course, if you need to the min/max-port mode of operation then
>> that's fine, but if this were up to me, i'd try hard to find a way to
>> avoid it. :)
>>
>> regards,
>> pavel
>>
>> On 3 February 2018 at 04:12, Jason Molenda <jmole...@apple.com> wrote:
>>> Hi Pavel, I closed this phabracator, I've been working on this on-and-off 
>>> and I think a different approach would be better.
>>>
>>> I am working up a patch where I change tools/lldb-server/lldb-platform.cpp 
>>> from using a fork() model to using std::thread's.  The main thread listens 
>>> for incoming connections, and when one is received, it starts up a new 
>>> std::thread to handle that connection.  I have a PortCoordinator singleton 
>>> object that manages the ports we can use for communications.  Newly created 
>>> threads request ports from (& free them when the thread is finished) so 
>>> that it would be possible to run multiple tests at the same time.
>>>
>>> The current code fork()'s when it receives a connection, and gives the 
>>> child process a copy of all the ports that are available.  Because it's in 
>>> a separate process, if we were to receive a second connection and fork off 
>>> a new process, it would have the same list of ports listed as available.  
>>> When debugserver is being used, this is a problem - when the lldb-server 
>>> platform thread/process gets a qLaunchGDBServer packet, we run debugserver 
>>> saying "hey debugserver listen for a connection on port N" and then tell 
>>> the remote lldb process "there's a debugserver waiting to hear from you on 
>>> port N" - so lldb-server can't test whether port N is in use or not.
>>>
>>> (there was also a problem in 
>>> GDBRemoteCommunication::StartDebugserverProcess where it has a url like 
>>> localhost:1111 and then it tries to run debugserver in --named-pipe mode 
>>> even though we already have a port # to use in the url.)
>>>
>>> The final problem is that TCPSocket::Accept waits for an incoming 
>>> connection on the assigned port #, and when it comes in it gets a new file 
>>> descriptor for that connection.  It should resume listening to that 
>>> assigned port for the next connection that comes in ... but today 
>>> TCPSocket::Accept calls CloseListenSockets() so after opening the first 
>>> lldb-server platform connection that comes in, when we go to accept() the 
>>> second, the socket has been closed and we exit immediately.  To quickly 
>>> recap the POSIX API (I never do network programming so I have to remind 
>>> myself of this every time), the workflow for a server like this usually 
>>> looks like
>>>
>>> parentfd = socket(AF_INET, SOCK_STREAM, 0);
>>> optval = 1;
>>> setsockopt(parentfd, SOL_SOCKET, SO_REUSEADDR,
>>>             (const void *)&optval , sizeof(int));
>>> serveraddr.sin_family = AF_INET;
>>> serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);
>>> serveraddr.sin_port = htons((unsigned short)portno);
>>> bind(parentfd, (struct sockaddr *) &serveraddr, sizeof(serveraddr);
>>> listen(parentfd, 100); // allow 100 connections to queue up -- whatever
>>> childfd = accept(parentfd, (struct sockaddr *) &clientaddr, &clientlen);
>>>
>>> and then we go back to accept()'ing on parentfd after we've spun off 
>>> something to talk to childfd.
>>>
>>>
>>> I've been doing all of my work on an old branch for reasons that are 
>>> boring, so some/all of this may be fixed on top of tree already!  But this 
>>> is what I had to do to get my branch to work correctly on a remote system.
>>>
>>> I also noticed that dotest.py won't run multiple debug sessions with a 
>>> remote target.  That was part of my goal, to speed these up, but it seems 
>>> to run in --no-multiprocess mode currently.
>>>
>>>
>>> I'll be picking this up next week - my changes right now are a mess, and 
>>> they're against this old branch that I have to work on, but I'll get them 
>>> cleaned up & updated to top of tree and make up a patchset.  But I wanted 
>>> to give you a heads up on where I'm headed as this touches a lot of your 
>>> code.
>>>
>>> Thanks!
>>>
>>>
>>> J
>>>
>>>
>>>
>>>> On Jan 18, 2018, at 3:44 AM, Pavel Labath via Phabricator 
>>>> <revi...@reviews.llvm.org> wrote:
>>>>
>>>> labath added a comment.
>>>>
>>>> In https://reviews.llvm.org/D42210#979608, @jasonmolenda wrote:
>>>>
>>>>> Jim remembers some problem with logging across a fork()'ed process, Pavel 
>>>>> does this ring a bell?  This change might be completely bogus -- but it's 
>>>>> very difficult to debug the child side of an lldb-server platform today.
>>>>
>>>>
>>>> I believe Jim is thinking of https://reviews.llvm.org/D38938. The issue is 
>>>> that if another thread holds the logging mutex while we fork(), the mutex 
>>>> will remain in a bogus state in the child process. This would mean that 
>>>> any operation on the Log subsystem (such as enabling logging) could hang. 
>>>> We hold the mutex for just a couple of instructions (just enough to copy a 
>>>> shared_ptr), but after some code reshuffles, we were hitting this case 
>>>> very often in liblldb.
>>>>
>>>> Now, I don't think this can happen here as at the point where we are 
>>>> forking, the platform process is single-threaded. So, enabling logging 
>>>> here should be safe, but only accidentally. It should be possible to make 
>>>> this always safe, but that will need to be done with a very steady hand. 
>>>> My opinion is we should not bother with that (i.e., just check this in 
>>>> as-is) until we actually need more threads in the platform, particularly 
>>>> as I think the long-term direction here should be to replace the fork with 
>>>> a new thread for handling the connection.
>>>>
>>>> As for testing, the problem we have now is that we have no direct platform 
>>>> tests today -- remote tests will test platform by the virtue of all 
>>>> connections going through it but a standard check-lldb will not even run 
>>>> this code. I have been planning to add tests like these, but I haven't got 
>>>> around to that yet, and I'm not sure when will that happen. If you want to 
>>>> take a stab at it, I can give you some pointers.
>>>>
>>>> BTW: for most debugging needs you should be able to just start the 
>>>> platform without the --server argument, which will disable forking and 
>>>> handle each connection in the main process.
>>>>
>>>>
>>>> Repository:
>>>> rL LLVM
>>>>
>>>> https://reviews.llvm.org/D42210
>>>>
>>>>
>>>>
>>>
>
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Re: [Lldb-commits] [PATCH] D42210: Re-enable logging on the child side of a fork() in lldb-server platform mode

Reply via email to