Re: [Lldb-commits] [PATCH] D42210: Re-enable logging on the child side of a fork() in lldb-server platform mode

2018-02-06 Thread Pavel Labath via lldb-commits
Right I see. Thanks for the explanation.

I feel this is a good time to bring up again my idea for implementing
gdb-server connections in a port-forwarder-friendly way. It's
described at 
,
so I am not going to repeat the details here, but the crux of it is
that it would allow us to connect to gdb-server without requiring the
running of any special port-forwarding programs(*), which is an issue
that periodically comes up as a problem for some one. I don't know if
you're happy with the way you are currently doing the port forwarding,
but just in case you're not, I want to tell you that there is a way to
avoid it. :)

(*) In some situations (which probably includes your iphone scenario
as well) you will still need to forward *one* port for the initial
platform connection, but that can be simpler than needing to forward
dozens of ports to enable fast parallel testing. Also, in a lot of
scenarios (like a typical NAT setup) you wouldn't need to forward any
ports at all.

pl

On 5 February 2018 at 19:39, Jason Molenda  wrote:
> Yeah I saw the TCPSocket::Accept differences on top of tree after sending the 
> email - unfortunately I have to do my work on this old branch, and I can't 
> build our top of tree sources for the iphone target right now. It's not a 
> great situation!
>
> The min/max ports are needed for running the testsuite on a device, we run a 
> program to relay a specific list of ports between the mac and the phone so we 
> have to depend on that.
>
> I'll see if I can get the testsuite running on my local mac via lldb-server 
> platform on top of tree and see what changes are needed there, hopefully I 
> can repo the same problems I'm hitting on this old branch if they're still 
> applicable.
>
>
>> On Feb 5, 2018, at 2:56 AM, Pavel Labath  wrote:
>>
>> Hi Jason,
>>
>> Thanks for the heads up. I look forward to getting rid of fork()
>> there, but there is one thing that's not clear to me. You say that
>> TCPSocket::Accept() calls CloseListenSockets().. I see don't see
>> anything like that in the current code, and I know for a fact that we
>> are able to handle multiple (concurrent) connections, as that is how
>> we run tests on android right now.  Could this be some problem with
>> the branch that you were working on?
>>
>> In general, if you don't need to use the --min-port/--max-port
>> arguments (that is indeed broken in the fork mode), then everything
>> should already be set-up for the parallel remote testing. In fact,
>> unless you have a hard requirement to use --min/max-port then I would
>> advise against it. I think it can cause you a lot of problems down the
>> line. The issue with vending out ports like this is that the
>> "PortCoordinator" does not have control over the whole system, and it
>> cannot prevent some other process from claiming one of the ports that
>> it "owns". So, you may need to implement some retry logic to handle
>> the case when debugserver cannot bind to the port that it was assigned
>> to.
>>
>> I think that a much more reliable and simpler solution would be to
>> have debugserver ask the OS to assign it an unused port (by binding to
>> port zero), and then report that port back to lldb-platform (via the
>> --named-pipe arg -- this is how we use llgs right now). Besides fixing
>> the port allocation problem, this will also ensure that by the time
>> the platform reports the port back to the client, debugserver is
>> already primed and waiting for a connection. I'm not sure if
>> debugserver right now supports the --named-pipe argument, but it
>> should be fairly trivial to add it if needed.
>>
>> Of course, if you need to the min/max-port mode of operation then
>> that's fine, but if this were up to me, i'd try hard to find a way to
>> avoid it. :)
>>
>> regards,
>> pavel
>>
>> On 3 February 2018 at 04:12, Jason Molenda  wrote:
>>> Hi Pavel, I closed this phabracator, I've been working on this on-and-off 
>>> and I think a different approach would be better.
>>>
>>> I am working up a patch where I change tools/lldb-server/lldb-platform.cpp 
>>> from using a fork() model to using std::thread's.  The main thread listens 
>>> for incoming connections, and when one is received, it starts up a new 
>>> std::thread to handle that connection.  I have a PortCoordinator singleton 
>>> object that manages the ports we can use for communications.  Newly created 
>>> threads request ports from (& free them when the thread is finished) so 
>>> that it would be possible to run multiple tests at the same time.
>>>
>>> The current code fork()'s when it receives a connection, and gives the 
>>> child process a copy of all the ports that are available.  Because it's in 
>>> a separate process, if we were to receive a second connection and fork off 
>>> a new process, it would have the same list of ports listed as available.  
>>> When debugserver is being 

Re: [Lldb-commits] [PATCH] D42210: Re-enable logging on the child side of a fork() in lldb-server platform mode

2018-02-05 Thread Jason Molenda via lldb-commits
Yeah I saw the TCPSocket::Accept differences on top of tree after sending the 
email - unfortunately I have to do my work on this old branch, and I can't 
build our top of tree sources for the iphone target right now. It's not a great 
situation!

The min/max ports are needed for running the testsuite on a device, we run a 
program to relay a specific list of ports between the mac and the phone so we 
have to depend on that.

I'll see if I can get the testsuite running on my local mac via lldb-server 
platform on top of tree and see what changes are needed there, hopefully I can 
repo the same problems I'm hitting on this old branch if they're still 
applicable.


> On Feb 5, 2018, at 2:56 AM, Pavel Labath  wrote:
> 
> Hi Jason,
> 
> Thanks for the heads up. I look forward to getting rid of fork()
> there, but there is one thing that's not clear to me. You say that
> TCPSocket::Accept() calls CloseListenSockets().. I see don't see
> anything like that in the current code, and I know for a fact that we
> are able to handle multiple (concurrent) connections, as that is how
> we run tests on android right now.  Could this be some problem with
> the branch that you were working on?
> 
> In general, if you don't need to use the --min-port/--max-port
> arguments (that is indeed broken in the fork mode), then everything
> should already be set-up for the parallel remote testing. In fact,
> unless you have a hard requirement to use --min/max-port then I would
> advise against it. I think it can cause you a lot of problems down the
> line. The issue with vending out ports like this is that the
> "PortCoordinator" does not have control over the whole system, and it
> cannot prevent some other process from claiming one of the ports that
> it "owns". So, you may need to implement some retry logic to handle
> the case when debugserver cannot bind to the port that it was assigned
> to.
> 
> I think that a much more reliable and simpler solution would be to
> have debugserver ask the OS to assign it an unused port (by binding to
> port zero), and then report that port back to lldb-platform (via the
> --named-pipe arg -- this is how we use llgs right now). Besides fixing
> the port allocation problem, this will also ensure that by the time
> the platform reports the port back to the client, debugserver is
> already primed and waiting for a connection. I'm not sure if
> debugserver right now supports the --named-pipe argument, but it
> should be fairly trivial to add it if needed.
> 
> Of course, if you need to the min/max-port mode of operation then
> that's fine, but if this were up to me, i'd try hard to find a way to
> avoid it. :)
> 
> regards,
> pavel
> 
> On 3 February 2018 at 04:12, Jason Molenda  wrote:
>> Hi Pavel, I closed this phabracator, I've been working on this on-and-off 
>> and I think a different approach would be better.
>> 
>> I am working up a patch where I change tools/lldb-server/lldb-platform.cpp 
>> from using a fork() model to using std::thread's.  The main thread listens 
>> for incoming connections, and when one is received, it starts up a new 
>> std::thread to handle that connection.  I have a PortCoordinator singleton 
>> object that manages the ports we can use for communications.  Newly created 
>> threads request ports from (& free them when the thread is finished) so that 
>> it would be possible to run multiple tests at the same time.
>> 
>> The current code fork()'s when it receives a connection, and gives the child 
>> process a copy of all the ports that are available.  Because it's in a 
>> separate process, if we were to receive a second connection and fork off a 
>> new process, it would have the same list of ports listed as available.  When 
>> debugserver is being used, this is a problem - when the lldb-server platform 
>> thread/process gets a qLaunchGDBServer packet, we run debugserver saying 
>> "hey debugserver listen for a connection on port N" and then tell the remote 
>> lldb process "there's a debugserver waiting to hear from you on port N" - so 
>> lldb-server can't test whether port N is in use or not.
>> 
>> (there was also a problem in GDBRemoteCommunication::StartDebugserverProcess 
>> where it has a url like localhost: and then it tries to run debugserver 
>> in --named-pipe mode even though we already have a port # to use in the url.)
>> 
>> The final problem is that TCPSocket::Accept waits for an incoming connection 
>> on the assigned port #, and when it comes in it gets a new file descriptor 
>> for that connection.  It should resume listening to that assigned port for 
>> the next connection that comes in ... but today TCPSocket::Accept calls 
>> CloseListenSockets() so after opening the first lldb-server platform 
>> connection that comes in, when we go to accept() the second, the socket has 
>> been closed and we exit immediately.  To quickly recap the POSIX API (I 
>> never do network programming so I have to 

Re: [Lldb-commits] [PATCH] D42210: Re-enable logging on the child side of a fork() in lldb-server platform mode

2018-02-05 Thread Pavel Labath via lldb-commits
Hi Jason,

Thanks for the heads up. I look forward to getting rid of fork()
there, but there is one thing that's not clear to me. You say that
TCPSocket::Accept() calls CloseListenSockets().. I see don't see
anything like that in the current code, and I know for a fact that we
are able to handle multiple (concurrent) connections, as that is how
we run tests on android right now.  Could this be some problem with
the branch that you were working on?

In general, if you don't need to use the --min-port/--max-port
arguments (that is indeed broken in the fork mode), then everything
should already be set-up for the parallel remote testing. In fact,
unless you have a hard requirement to use --min/max-port then I would
advise against it. I think it can cause you a lot of problems down the
line. The issue with vending out ports like this is that the
"PortCoordinator" does not have control over the whole system, and it
cannot prevent some other process from claiming one of the ports that
it "owns". So, you may need to implement some retry logic to handle
the case when debugserver cannot bind to the port that it was assigned
to.

I think that a much more reliable and simpler solution would be to
have debugserver ask the OS to assign it an unused port (by binding to
port zero), and then report that port back to lldb-platform (via the
--named-pipe arg -- this is how we use llgs right now). Besides fixing
the port allocation problem, this will also ensure that by the time
the platform reports the port back to the client, debugserver is
already primed and waiting for a connection. I'm not sure if
debugserver right now supports the --named-pipe argument, but it
should be fairly trivial to add it if needed.

Of course, if you need to the min/max-port mode of operation then
that's fine, but if this were up to me, i'd try hard to find a way to
avoid it. :)

regards,
pavel

On 3 February 2018 at 04:12, Jason Molenda  wrote:
> Hi Pavel, I closed this phabracator, I've been working on this on-and-off and 
> I think a different approach would be better.
>
> I am working up a patch where I change tools/lldb-server/lldb-platform.cpp 
> from using a fork() model to using std::thread's.  The main thread listens 
> for incoming connections, and when one is received, it starts up a new 
> std::thread to handle that connection.  I have a PortCoordinator singleton 
> object that manages the ports we can use for communications.  Newly created 
> threads request ports from (& free them when the thread is finished) so that 
> it would be possible to run multiple tests at the same time.
>
> The current code fork()'s when it receives a connection, and gives the child 
> process a copy of all the ports that are available.  Because it's in a 
> separate process, if we were to receive a second connection and fork off a 
> new process, it would have the same list of ports listed as available.  When 
> debugserver is being used, this is a problem - when the lldb-server platform 
> thread/process gets a qLaunchGDBServer packet, we run debugserver saying "hey 
> debugserver listen for a connection on port N" and then tell the remote lldb 
> process "there's a debugserver waiting to hear from you on port N" - so 
> lldb-server can't test whether port N is in use or not.
>
> (there was also a problem in GDBRemoteCommunication::StartDebugserverProcess 
> where it has a url like localhost: and then it tries to run debugserver 
> in --named-pipe mode even though we already have a port # to use in the url.)
>
> The final problem is that TCPSocket::Accept waits for an incoming connection 
> on the assigned port #, and when it comes in it gets a new file descriptor 
> for that connection.  It should resume listening to that assigned port for 
> the next connection that comes in ... but today TCPSocket::Accept calls 
> CloseListenSockets() so after opening the first lldb-server platform 
> connection that comes in, when we go to accept() the second, the socket has 
> been closed and we exit immediately.  To quickly recap the POSIX API (I never 
> do network programming so I have to remind myself of this every time), the 
> workflow for a server like this usually looks like
>
> parentfd = socket(AF_INET, SOCK_STREAM, 0);
> optval = 1;
> setsockopt(parentfd, SOL_SOCKET, SO_REUSEADDR,
>  (const void *) , sizeof(int));
> serveraddr.sin_family = AF_INET;
> serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);
> serveraddr.sin_port = htons((unsigned short)portno);
> bind(parentfd, (struct sockaddr *) , sizeof(serveraddr);
> listen(parentfd, 100); // allow 100 connections to queue up -- whatever
> childfd = accept(parentfd, (struct sockaddr *) , );
>
> and then we go back to accept()'ing on parentfd after we've spun off 
> something to talk to childfd.
>
>
> I've been doing all of my work on an old branch for reasons that are boring, 
> so some/all of this may be fixed on top of tree already!  But this is what I 
> had to do to get 

Re: [Lldb-commits] [PATCH] D42210: Re-enable logging on the child side of a fork() in lldb-server platform mode

2018-02-02 Thread Jason Molenda via lldb-commits
Hi Pavel, I closed this phabracator, I've been working on this on-and-off and I 
think a different approach would be better.

I am working up a patch where I change tools/lldb-server/lldb-platform.cpp from 
using a fork() model to using std::thread's.  The main thread listens for 
incoming connections, and when one is received, it starts up a new std::thread 
to handle that connection.  I have a PortCoordinator singleton object that 
manages the ports we can use for communications.  Newly created threads request 
ports from (& free them when the thread is finished) so that it would be 
possible to run multiple tests at the same time.  

The current code fork()'s when it receives a connection, and gives the child 
process a copy of all the ports that are available.  Because it's in a separate 
process, if we were to receive a second connection and fork off a new process, 
it would have the same list of ports listed as available.  When debugserver is 
being used, this is a problem - when the lldb-server platform thread/process 
gets a qLaunchGDBServer packet, we run debugserver saying "hey debugserver 
listen for a connection on port N" and then tell the remote lldb process 
"there's a debugserver waiting to hear from you on port N" - so lldb-server 
can't test whether port N is in use or not.

(there was also a problem in GDBRemoteCommunication::StartDebugserverProcess 
where it has a url like localhost: and then it tries to run debugserver in 
--named-pipe mode even though we already have a port # to use in the url.)

The final problem is that TCPSocket::Accept waits for an incoming connection on 
the assigned port #, and when it comes in it gets a new file descriptor for 
that connection.  It should resume listening to that assigned port for the next 
connection that comes in ... but today TCPSocket::Accept calls 
CloseListenSockets() so after opening the first lldb-server platform connection 
that comes in, when we go to accept() the second, the socket has been closed 
and we exit immediately.  To quickly recap the POSIX API (I never do network 
programming so I have to remind myself of this every time), the workflow for a 
server like this usually looks like

parentfd = socket(AF_INET, SOCK_STREAM, 0);
optval = 1;
setsockopt(parentfd, SOL_SOCKET, SO_REUSEADDR, 
 (const void *) , sizeof(int));
serveraddr.sin_family = AF_INET;
serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);
serveraddr.sin_port = htons((unsigned short)portno);
bind(parentfd, (struct sockaddr *) , sizeof(serveraddr);
listen(parentfd, 100); // allow 100 connections to queue up -- whatever
childfd = accept(parentfd, (struct sockaddr *) , );

and then we go back to accept()'ing on parentfd after we've spun off something 
to talk to childfd.


I've been doing all of my work on an old branch for reasons that are boring, so 
some/all of this may be fixed on top of tree already!  But this is what I had 
to do to get my branch to work correctly on a remote system.

I also noticed that dotest.py won't run multiple debug sessions with a remote 
target.  That was part of my goal, to speed these up, but it seems to run in 
--no-multiprocess mode currently.


I'll be picking this up next week - my changes right now are a mess, and 
they're against this old branch that I have to work on, but I'll get them 
cleaned up & updated to top of tree and make up a patchset.  But I wanted to 
give you a heads up on where I'm headed as this touches a lot of your code.

Thanks!


J



> On Jan 18, 2018, at 3:44 AM, Pavel Labath via Phabricator 
>  wrote:
> 
> labath added a comment.
> 
> In https://reviews.llvm.org/D42210#979608, @jasonmolenda wrote:
> 
>> Jim remembers some problem with logging across a fork()'ed process, Pavel 
>> does this ring a bell?  This change might be completely bogus -- but it's 
>> very difficult to debug the child side of an lldb-server platform today.
> 
> 
> I believe Jim is thinking of https://reviews.llvm.org/D38938. The issue is 
> that if another thread holds the logging mutex while we fork(), the mutex 
> will remain in a bogus state in the child process. This would mean that any 
> operation on the Log subsystem (such as enabling logging) could hang. We hold 
> the mutex for just a couple of instructions (just enough to copy a 
> shared_ptr), but after some code reshuffles, we were hitting this case very 
> often in liblldb.
> 
> Now, I don't think this can happen here as at the point where we are forking, 
> the platform process is single-threaded. So, enabling logging here should be 
> safe, but only accidentally. It should be possible to make this always safe, 
> but that will need to be done with a very steady hand. My opinion is we 
> should not bother with that (i.e., just check this in as-is) until we 
> actually need more threads in the platform, particularly as I think the 
> long-term direction here should be to replace the fork with a new thread for 
> handling the connection.