Right, I see. Thanks for the explanation. I feel this is a good time to bring up again my idea for implementing gdb-server connections in a port-forwarder-friendly way. It's described at <http://lists.llvm.org/pipermail/lldb-dev/2017-November/012973.html>, so I am not going to repeat the details here, but the crux of it is that it would allow us to connect to gdb-server without requiring the running of any special port-forwarding programs (*), which is an issue that periodically comes up as a problem for someone. I don't know if you're happy with the way you are currently doing the port forwarding, but just in case you're not, I want to tell you that there is a way to avoid it. :)
(*) In some situations (which probably includes your iphone scenario as well) you will still need to forward *one* port for the initial platform connection, but that can be simpler than needing to forward dozens of ports to enable fast parallel testing. Also, in a lot of scenarios (like a typical NAT setup) you wouldn't need to forward any ports at all. pl On 5 February 2018 at 19:39, Jason Molenda <jmole...@apple.com> wrote: > Yeah I saw the TCPSocket::Accept differences on top of tree after sending the > email - unfortunately I have to do my work on this old branch, and I can't > build our top of tree sources for the iphone target right now. It's not a > great situation! > > The min/max ports are needed for running the testsuite on a device, we run a > program to relay a specific list of ports between the mac and the phone so we > have to depend on that. > > I'll see if I can get the testsuite running on my local mac via lldb-server > platform on top of tree and see what changes are needed there, hopefully I > can repro the same problems I'm hitting on this old branch if they're still > applicable. > > >> On Feb 5, 2018, at 2:56 AM, Pavel Labath <lab...@google.com> wrote: >> >> Hi Jason, >> >> Thanks for the heads up. I look forward to getting rid of fork() >> there, but there is one thing that's not clear to me. You say that >> TCPSocket::Accept() calls CloseListenSockets(). I don't see >> anything like that in the current code, and I know for a fact that we >> are able to handle multiple (concurrent) connections, as that is how >> we run tests on android right now. Could this be some problem with >> the branch that you were working on? >> >> In general, if you don't need to use the --min-port/--max-port >> arguments (that is indeed broken in the fork mode), then everything >> should already be set up for the parallel remote testing. In fact, >> unless you have a hard requirement to use --min/max-port then I would >> advise against it. 
I think it can cause you a lot of problems down the >> line. The issue with vending out ports like this is that the >> "PortCoordinator" does not have control over the whole system, and it >> cannot prevent some other process from claiming one of the ports that >> it "owns". So, you may need to implement some retry logic to handle >> the case when debugserver cannot bind to the port that it was assigned >> to. >> >> I think that a much more reliable and simpler solution would be to >> have debugserver ask the OS to assign it an unused port (by binding to >> port zero), and then report that port back to lldb-platform (via the >> --named-pipe arg -- this is how we use llgs right now). Besides fixing >> the port allocation problem, this will also ensure that by the time >> the platform reports the port back to the client, debugserver is >> already primed and waiting for a connection. I'm not sure if >> debugserver right now supports the --named-pipe argument, but it >> should be fairly trivial to add it if needed. >> >> Of course, if you need the min/max-port mode of operation then >> that's fine, but if this were up to me, I'd try hard to find a way to >> avoid it. :) >> >> regards, >> pavel >> >> On 3 February 2018 at 04:12, Jason Molenda <jmole...@apple.com> wrote: >>> Hi Pavel, I closed this Phabricator review, I've been working on this on-and-off >>> and I think a different approach would be better. >>> >>> I am working up a patch where I change tools/lldb-server/lldb-platform.cpp >>> from using a fork() model to using std::thread's. The main thread listens >>> for incoming connections, and when one is received, it starts up a new >>> std::thread to handle that connection. I have a PortCoordinator singleton >>> object that manages the ports we can use for communications. Newly created >>> threads request ports from it (& free them when the thread is finished) so >>> that it would be possible to run multiple tests at the same time. 
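The PortCoordinator idea described above -- one pool of ports shared by all connection threads, guarded by a mutex so two threads can never hand out the same port (the guarantee the fork() model can't make, since each child gets its own copy of the pool) -- can be sketched roughly like this. This is a hypothetical illustration in plain C; the names, pool size, and API are mine, not lldb's:

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

enum { POOL_SIZE = 16 };

static uint16_t g_ports[POOL_SIZE];
static bool g_in_use[POOL_SIZE];
static pthread_mutex_t g_pool_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Seed the pool with POOL_SIZE consecutive ports starting at min_port. */
void pool_init(uint16_t min_port) {
  for (int i = 0; i < POOL_SIZE; i++) {
    g_ports[i] = min_port + i;
    g_in_use[i] = false;
  }
}

/* Claim the lowest free port; returns 0 if the pool is exhausted.
   The mutex makes this safe to call from many connection threads. */
uint16_t pool_acquire(void) {
  uint16_t port = 0;
  pthread_mutex_lock(&g_pool_mutex);
  for (int i = 0; i < POOL_SIZE; i++) {
    if (!g_in_use[i]) {
      g_in_use[i] = true;
      port = g_ports[i];
      break;
    }
  }
  pthread_mutex_unlock(&g_pool_mutex);
  return port;
}

/* Return a port to the pool when the connection thread finishes. */
void pool_release(uint16_t port) {
  pthread_mutex_lock(&g_pool_mutex);
  for (int i = 0; i < POOL_SIZE; i++)
    if (g_ports[i] == port)
      g_in_use[i] = false;
  pthread_mutex_unlock(&g_pool_mutex);
}
```

Note this still has the weakness Pavel points out: the pool "owns" the ports only by convention, so a caller still needs retry logic for the case where some unrelated process has already bound one of them.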
>>> >>> The current code fork()'s when it receives a connection, and gives the >>> child process a copy of all the ports that are available. Because it's in >>> a separate process, if we were to receive a second connection and fork off >>> a new process, it would have the same list of ports marked as available. >>> When debugserver is being used, this is a problem - when the lldb-server >>> platform thread/process gets a qLaunchGDBServer packet, we run debugserver >>> saying "hey debugserver listen for a connection on port N" and then tell >>> the remote lldb process "there's a debugserver waiting to hear from you on >>> port N" - so lldb-server can't test whether port N is in use or not. >>> >>> (there was also a problem in >>> GDBRemoteCommunication::StartDebugserverProcess where it has a url like >>> localhost:1111 and then it tries to run debugserver in --named-pipe mode >>> even though we already have a port # to use in the url.) >>> >>> The final problem is that TCPSocket::Accept waits for an incoming >>> connection on the assigned port #, and when it comes in it gets a new file >>> descriptor for that connection. It should resume listening on that >>> assigned port for the next connection that comes in ... but today >>> TCPSocket::Accept calls CloseListenSockets() so after opening the first >>> lldb-server platform connection that comes in, when we go to accept() the >>> second, the socket has been closed and we exit immediately. 
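Pavel's suggestion earlier in the thread -- have debugserver bind to port 0 so the kernel picks an unused port, then report that port back -- sidesteps exactly this "is port N actually free?" problem. A minimal sketch of the discovery half (the function name and fd handling are illustrative, not lldb's API):

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind to port 0 so the OS assigns an unused port, then recover the
   assigned port number with getsockname().  On success, stores the
   listening fd in *out_fd and returns the port in host byte order;
   returns -1 on failure. */
int listen_on_ephemeral_port(int *out_fd) {
  int fd = socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0)
    return -1;
  struct sockaddr_in addr;
  memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = htons(0); /* 0 = "kernel, pick any free port" */
  if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0 ||
      listen(fd, 5) != 0) {
    close(fd);
    return -1;
  }
  socklen_t len = sizeof(addr);
  if (getsockname(fd, (struct sockaddr *)&addr, &len) != 0) {
    close(fd);
    return -1;
  }
  *out_fd = fd;
  /* This is the number the server would report back to the platform,
     e.g. by writing it into the --named-pipe. */
  return ntohs(addr.sin_port);
}
```

Because the port comes out of bind() itself, there is no window where the number is handed out but some other process grabs it first, and the server is already listening by the time the port is reported.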
To quickly >>> recap the POSIX API (I never do network programming so I have to remind >>> myself of this every time), the workflow for a server like this usually >>> looks like >>> >>> parentfd = socket(AF_INET, SOCK_STREAM, 0); >>> optval = 1; >>> setsockopt(parentfd, SOL_SOCKET, SO_REUSEADDR, >>> (const void *)&optval, sizeof(int)); >>> serveraddr.sin_family = AF_INET; >>> serveraddr.sin_addr.s_addr = htonl(INADDR_ANY); >>> serveraddr.sin_port = htons((unsigned short)portno); >>> bind(parentfd, (struct sockaddr *) &serveraddr, sizeof(serveraddr)); >>> listen(parentfd, 100); // allow 100 connections to queue up -- whatever >>> childfd = accept(parentfd, (struct sockaddr *) &clientaddr, &clientlen); >>> >>> and then we go back to accept()'ing on parentfd after we've spun off >>> something to talk to childfd. >>> >>> >>> I've been doing all of my work on an old branch for reasons that are >>> boring, so some/all of this may be fixed on top of tree already! But this >>> is what I had to do to get my branch to work correctly on a remote system. >>> >>> I also noticed that dotest.py won't run multiple debug sessions with a >>> remote target. That was part of my goal, to speed these up, but it seems >>> to run in --no-multiprocess mode currently. >>> >>> >>> I'll be picking this up next week - my changes right now are a mess, and >>> they're against this old branch that I have to work on, but I'll get them >>> cleaned up & updated to top of tree and make up a patchset. But I wanted >>> to give you a heads up on where I'm headed as this touches a lot of your >>> code. >>> >>> Thanks! >>> >>> >>> J >>> >>> >>> >>>> On Jan 18, 2018, at 3:44 AM, Pavel Labath via Phabricator >>>> <revi...@reviews.llvm.org> wrote: >>>> >>>> labath added a comment. >>>> >>>> In https://reviews.llvm.org/D42210#979608, @jasonmolenda wrote: >>>> >>>>> Jim remembers some problem with logging across a fork()'ed process, Pavel >>>>> does this ring a bell? 
This change might be completely bogus -- but it's >>>>> very difficult to debug the child side of an lldb-server platform today. >>>> >>>> >>>> I believe Jim is thinking of https://reviews.llvm.org/D38938. The issue is >>>> that if another thread holds the logging mutex while we fork(), the mutex >>>> will remain in a bogus state in the child process. This would mean that >>>> any operation on the Log subsystem (such as enabling logging) could hang. >>>> We hold the mutex for just a couple of instructions (just enough to copy a >>>> shared_ptr), but after some code reshuffles, we were hitting this case >>>> very often in liblldb. >>>> >>>> Now, I don't think this can happen here as at the point where we are >>>> forking, the platform process is single-threaded. So, enabling logging >>>> here should be safe, but only accidentally. It should be possible to make >>>> this always safe, but that will need to be done with a very steady hand. >>>> My opinion is we should not bother with that (i.e., just check this in >>>> as-is) until we actually need more threads in the platform, particularly >>>> as I think the long-term direction here should be to replace the fork with >>>> a new thread for handling the connection. >>>> >>>> As for testing, the problem we have now is that we have no direct platform >>>> tests today -- remote tests will test platform by virtue of all >>>> connections going through it but a standard check-lldb will not even run >>>> this code. I have been planning to add tests like these, but I haven't got >>>> around to that yet, and I'm not sure when that will happen. If you want to >>>> take a stab at it, I can give you some pointers. 
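For reference, the standard guard for the fork()-while-mutex-held hazard described above is pthread_atfork(): acquire the lock just before the fork and release it on both sides afterwards, so the child can never inherit a locked mutex. A hedged sketch -- the mutex here merely stands in for lldb's logging mutex, it is not lldb's actual code:

```c
#include <pthread.h>

/* Stand-in for the logging mutex discussed in the thread. */
static pthread_mutex_t g_log_mutex = PTHREAD_MUTEX_INITIALIZER;

static void lock_log_mutex(void)   { pthread_mutex_lock(&g_log_mutex); }
static void unlock_log_mutex(void) { pthread_mutex_unlock(&g_log_mutex); }

/* Call once at startup.  Afterwards, every fork() in the process
   acquires g_log_mutex first (the "prepare" handler) and releases it
   in both the parent and the child (the "parent"/"child" handlers),
   so the child always starts with the mutex in a known, unlocked
   state.  Returns 0 on success, like pthread_atfork() itself. */
int install_fork_guard(void) {
  return pthread_atfork(lock_log_mutex,    /* prepare: parent, pre-fork  */
                        unlock_log_mutex,  /* parent:  parent, post-fork */
                        unlock_log_mutex); /* child:   child, post-fork  */
}
```

This only protects the mutexes you register, which is why doing it for a whole logging subsystem "needs a very steady hand" as Pavel says: every lock that could be held across a fork has to be covered, in a consistent order.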
>>>> >>>> >>>> Repository: >>>> rL LLVM >>>> >>>> https://reviews.llvm.org/D42210 >>>> >>>> >>>> >>> > _______________________________________________ lldb-commits mailing list lldb-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits