Re: mod-jk (1.2.37) crashes Apache 2 (2.4.7) occasionally with a buffer overflow on Ubuntu 14.04 x64

2016-07-25 Thread Michael Diener
FYI, the bug is submitted:
https://bz.apache.org/bugzilla/show_bug.cgi?id=59897

Chris and Rainer, thanks for pointing me in the right direction!

Michael

On 19 July 2016 at 11:42, Michael Diener  wrote:

> Chris,
>
> thanks a lot for explaining what could be overflowing the FD_SETSIZE of
> 1024.
>
> Just for reference, I use mpm_event mostly with SSL connections, so it
> should behave like mpm_worker. This is my configuration:
>
> StartServers 2
> ServerLimit  16
>
> MinSpareThreads  256
> MaxSpareThreads  1280
>
> ThreadLimit  1024
> ThreadsPerChild  1024
>
> MaxRequestWorkers 16384
> MaxConnectionsPerChild   0
>
>
> One more thing, poll() seems to be used in jk_connect.c in other places
> already, just not at the spot that matters in my case.
>
> Anyhow, I will submit a bug report later this week with all the
> information and will post a link over here as well.
>
> Thank you,
> Michael
>
>
>
>
> On 18 July 2016 at 16:56, Christopher Schultz <
> ch...@christopherschultz.net> wrote:
>
>> -BEGIN PGP SIGNED MESSAGE-
>> Hash: SHA256
>>
>> Michael,
>>
>> On 7/18/16 10:10 AM, Christopher Schultz wrote:
>> > Michael,
>> >
>> > On 7/18/16 8:53 AM, Michael Diener wrote:
>> >> On 6 July 2016 at 00:09, Christopher Schultz
>> >>  wrote:
>> >
>> >>>> From what I understand a buffer overflow would only happen
>> >>>> for FD_SET if the fd_set gets over 1024 descriptors. I made
>> >>>> sure that my ulimit for open files is set and applied large
>> >>>> enough, so that's not it.
>> >>>
>> >>> There's nothing magic about the ulimit. An fd_set should size
>> >>> appropriately for your OS. On my Linux system, FD_SETSIZE
>> >>> happens to be set to 1024. Reading through the byzantine
>> >>> labyrinth of includes, it appears that FD_SET has zero
>> >>> boundary-checking, so it's therefore possible that overflow
>> >>> will occur.
>> >
>> >
>> >> Regarding the FD_SETSIZE, it is also set for me to 1024 although
>> >> the ulimit is set higher.
>> >
>> > Well, the FD_SETSIZE is just the maximum size of the fd set that
>> > can be passed-into select() specifically. That doesn't limit the
>> > number of file descriptors your processes can open. The ulimit
>> > settings are the limits on those.
>> >
>> >> I'm a bit lost now on what I should do now. What makes me wonder
>> >> is, that nobody else seems to hit this limitation of FD_SET and
>> >> this makes me think something on my Linux machine is not right.
>> >
>> > Well, not everyone is using the same setup. For example, using NIO
>> > through Java is likely to get you the poll() call under the hood,
>> > so supporting more than e.g. 1024 file descriptors is not an issue
>> > there. That's just a guess, but I think it's a reasonable guess.
>> >
>> > I think tcnative+APR is not widely deployed. I have no numbers to
>> > back that up, but we don't get a huge volume of questions in the
>> > list about it.
>>
>> Of course, the above statement makes no sense whatsoever. This is
>> about mod_jk and not tcnative. Sorry for the confusion.
>>
>> But I was thinking, the case above would require a single httpd thread
>> to accumulate more than 1024 connections, and that would require some
>> very specific circumstances.
>>
>> First, you'd have to be using an MPM that allowed more than one
>> connection per process/thread, so that means no pre-fork, leaving
>> basically event or worker available.
>>
>> Then, you'd have to have enough traffic to cause one thread to grow to
>> more than 1024 connections. The default for httpd's worker MPM has 64
>> threads per child and 16 server processes.
>>
>> For one process to exceed the 1024 fd limit, you'd have to be handling
>> roughly 16 simultaneous connections per thread PER PROCESS. I'm not
>> sure how httpd auto-scales-out its child processes, so you could
>> conceivably get 1024 simultaneous connections in a burst of requests
>> to a single child process, but it seems likely that load would be
>> roughly-split between 16 child processes, keeping those 1024
>> connections down to a mild 64-jk-connections per-process.
>>
>> Assuming you have 16 child processes and a perfectly uniform
>> load-distribution between them, you'd have to get a burst of 16k
>> simultaneous requests in order to max-out that fd array on a single
>> server. That's probably why it's quite rare.
>>
>> - -chris
>> 

Re: mod-jk (1.2.37) crashes Apache 2 (2.4.7) occasionally with a buffer overflow on Ubuntu 14.04 x64

2016-07-19 Thread Michael Diener
Chris,

thanks a lot for explaining what could be overflowing the FD_SETSIZE of
1024.

Just for reference, I use mpm_event mostly with SSL connections, so it
should behave like mpm_worker. This is my configuration:

StartServers 2
ServerLimit  16

MinSpareThreads  256
MaxSpareThreads  1280

ThreadLimit  1024
ThreadsPerChild  1024

MaxRequestWorkers 16384
MaxConnectionsPerChild   0


One more thing, poll() seems to be used in jk_connect.c in other places
already, just not at the spot that matters in my case.

Anyhow, I will submit a bug report later this week with all the information
and will post a link over here as well.

Thank you,
Michael




On 18 July 2016 at 16:56, Christopher Schultz 
wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
>
> Michael,
>
> On 7/18/16 10:10 AM, Christopher Schultz wrote:
> > Michael,
> >
> > On 7/18/16 8:53 AM, Michael Diener wrote:
> >> On 6 July 2016 at 00:09, Christopher Schultz
> >>  wrote:
> >
> >>>> From what I understand a buffer overflow would only happen
> >>>> for FD_SET if the fd_set gets over 1024 descriptors. I made
> >>>> sure that my ulimit for open files is set and applied large
> >>>> enough, so that's not it.
> >>>
> >>> There's nothing magic about the ulimit. An fd_set should size
> >>> appropriately for your OS. On my Linux system, FD_SETSIZE
> >>> happens to be set to 1024. Reading through the byzantine
> >>> labyrinth of includes, it appears that FD_SET has zero
> >>> boundary-checking, so it's therefore possible that overflow
> >>> will occur.
> >
> >
> >> Regarding the FD_SETSIZE, it is also set for me to 1024 although
> >> the ulimit is set higher.
> >
> > Well, the FD_SETSIZE is just the maximum size of the fd set that
> > can be passed-into select() specifically. That doesn't limit the
> > number of file descriptors your processes can open. The ulimit
> > settings are the limits on those.
> >
> >> I'm a bit lost now on what I should do now. What makes me wonder
> >> is, that nobody else seems to hit this limitation of FD_SET and
> >> this makes me think something on my Linux machine is not right.
> >
> > Well, not everyone is using the same setup. For example, using NIO
> > through Java is likely to get you the poll() call under the hood,
> > so supporting more than e.g. 1024 file descriptors is not an issue
> > there. That's just a guess, but I think it's a reasonable guess.
> >
> > I think tcnative+APR is not widely deployed. I have no numbers to
> > back that up, but we don't get a huge volume of questions in the
> > list about it.
>
> Of course, the above statement makes no sense whatsoever. This is
> about mod_jk and not tcnative. Sorry for the confusion.
>
> But I was thinking, the case above would require a single httpd thread
> to accumulate more than 1024 connections, and that would require some
> very specific circumstances.
>
> First, you'd have to be using an MPM that allowed more than one
> connection per process/thread, so that means no pre-fork, leaving
> basically event or worker available.
>
> Then, you'd have to have enough traffic to cause one thread to grow to
> more than 1024 connections. The default for httpd's worker MPM has 64
> threads per child and 16 server processes.
>
> For one process to exceed the 1024 fd limit, you'd have to be handling
> roughly 16 simultaneous connections per thread PER PROCESS. I'm not
> sure how httpd auto-scales-out its child processes, so you could
> conceivably get 1024 simultaneous connections in a burst of requests
> to a single child process, but it seems likely that load would be
> roughly-split between 16 child processes, keeping those 1024
> connections down to a mild 64-jk-connections per-process.
>
> Assuming you have 16 child processes and a perfectly uniform
> load-distribution between them, you'd have to get a burst of 16k
> simultaneous requests in order to max-out that fd array on a single
> server. That's probably why it's quite rare.
>
> - -chris
> 

Re: mod-jk (1.2.37) crashes Apache 2 (2.4.7) occasionally with a buffer overflow on Ubuntu 14.04 x64

2016-07-18 Thread Christopher Schultz
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Michael,

On 7/18/16 10:10 AM, Christopher Schultz wrote:
> Michael,
> 
> On 7/18/16 8:53 AM, Michael Diener wrote:
>> On 6 July 2016 at 00:09, Christopher Schultz 
>>  wrote:
> 
>>>> From what I understand a buffer overflow would only happen
>>>> for FD_SET if the fd_set gets over 1024 descriptors. I made
>>>> sure that my ulimit for open files is set and applied large
>>>> enough, so that's not it.
>>> 
>>> There's nothing magic about the ulimit. An fd_set should size 
>>> appropriately for your OS. On my Linux system, FD_SETSIZE
>>> happens to be set to 1024. Reading through the byzantine
>>> labyrinth of includes, it appears that FD_SET has zero
>>> boundary-checking, so it's therefore possible that overflow
>>> will occur.
> 
> 
>> Regarding the FD_SETSIZE, it is also set for me to 1024 although 
>> the ulimit is set higher.
> 
> Well, the FD_SETSIZE is just the maximum size of the fd set that
> can be passed-into select() specifically. That doesn't limit the
> number of file descriptors your processes can open. The ulimit
> settings are the limits on those.
> 
>> I'm a bit lost now on what I should do now. What makes me wonder 
>> is, that nobody else seems to hit this limitation of FD_SET and 
>> this makes me think something on my Linux machine is not right.
> 
> Well, not everyone is using the same setup. For example, using NIO 
> through Java is likely to get you the poll() call under the hood,
> so supporting more than e.g. 1024 file descriptors is not an issue
> there. That's just a guess, but I think it's a reasonable guess.
> 
> I think tcnative+APR is not widely deployed. I have no numbers to
> back that up, but we don't get a huge volume of questions in the
> list about it.

Of course, the above statement makes no sense whatsoever. This is
about mod_jk and not tcnative. Sorry for the confusion.

But I was thinking, the case above would require a single httpd thread
to accumulate more than 1024 connections, and that would require some
very specific circumstances.

First, you'd have to be using an MPM that allowed more than one
connection per process/thread, so that means no pre-fork, leaving
basically event or worker available.

Then, you'd have to have enough traffic to cause one thread to grow to
more than 1024 connections. The default for httpd's worker MPM has 64
threads per child and 16 server processes.

For one process to exceed the 1024 fd limit, you'd have to be handling
roughly 16 simultaneous connections per thread PER PROCESS. I'm not
sure how httpd auto-scales-out its child processes, so you could
conceivably get 1024 simultaneous connections in a burst of requests
to a single child process, but it seems likely that load would be
roughly-split between 16 child processes, keeping those 1024
connections down to a mild 64-jk-connections per-process.

Assuming you have 16 child processes and a perfectly uniform
load-distribution between them, you'd have to get a burst of 16k
simultaneous requests in order to max-out that fd array on a single
server. That's probably why it's quite rare.
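
The arithmetic above can be written down as a small C sketch. This is only an illustration: the function names are made up, and it makes the same simplification as the estimate itself, namely that each connection costs one descriptor and that nothing else in the process holds descriptors.

```c
#include <assert.h>

/* Simultaneous connections each thread must hold before one child
 * process owns fd_setsize sockets (rounded up). */
unsigned per_thread_connections_to_reach(unsigned fd_setsize,
                                         unsigned threads_per_child)
{
    assert(threads_per_child > 0);
    return (fd_setsize + threads_per_child - 1) / threads_per_child;
}

/* Total simultaneous connections needed to put fd_setsize sockets into
 * every child when load is spread evenly across child_processes. */
unsigned total_connections_to_reach(unsigned fd_setsize,
                                    unsigned child_processes)
{
    return fd_setsize * child_processes;
}
```

With 64 threads per child the first function gives 16, and with 16 children the second gives 16384, which matches the "16k" figure. With a ThreadsPerChild of 1024, as in the configuration posted elsewhere in this thread, the first function gives 1, so the threshold is much easier to reach.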

- -chris
-BEGIN PGP SIGNATURE-
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBCAAGBQJXjO4tAAoJEBzwKT+lPKRYb9wP/j2i6FW7gOGf/G4/CTundEMY
WcKKPfNXAOU270KyiBkuSLw06Cjw8mk3vsjpg5i5JkPC4TUNy1jyhaI5cs0RvisV
pzqFgZCuyz4gPtg4TRCXw5RrmLwe7unBldTETjK54P9/Nd6Vuj34mUV8OLcnYZh6
UMCe0ULbBV5IoGhLmGQ5yy7MfHGdq1vwmxR41i4A4rc4J9fOC1UI4pIOvJRM1cnT
5cfLEavT3tbqxaxLCs5V/pQkkuwQMKITSW+JGdDPxN3oXL7b1QCw8RGBqhpDgjE/
FIIIBGfbMGHDXstmSBRXGlxkASKKdYlK9qoYU3f7G0PK053zx5TD+2vOfb0u0Vi/
lwIkk3lhL7Azw9hYKFr1R+PW3ewUqXI7Nh05HldNWlJ48I91cTGLF2mC7uRo6uQ2
M9pUCuyCtL1ajgG6eUmBlo2soIAaHPkorCmCdUAiv/zWfHKSSEGTwr3l81DtSapE
iORRoCyLVIhxKQprvBlTHp2uDIa7lzXOI83RcMb6ZqdxiNe3LscTRsVl/Ll8cVHj
Fw7klJgbPrRPmMn02hANdalDE96CvagPlEmgCGhp9h3TPD7fV28a3vY154IYxLBW
C2ksoMNv12ha+kiTvrYLc7j85drtZg7V0C/5TfRdBRSxxJ0KF/ye7ed2SN/8RRKC
QgyoIkZ2oSBIoHc/ss26
=f33p
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: mod-jk (1.2.37) crashes Apache 2 (2.4.7) occasionally with a buffer overflow on Ubuntu 14.04 x64

2016-07-18 Thread Christopher Schultz
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Michael,

On 7/18/16 10:10 AM, Christopher Schultz wrote:
> Michael,
> 
> On 7/18/16 8:53 AM, Michael Diener wrote:
>> On 6 July 2016 at 00:09, Christopher Schultz 
>>  wrote:
> 
>>>> From what I understand a buffer overflow would only happen
>>>> for FD_SET if the fd_set gets over 1024 descriptors. I made
>>>> sure that my ulimit for open files is set and applied large
>>>> enough, so that's not it.
>>> 
>>> There's nothing magic about the ulimit. An fd_set should size 
>>> appropriately for your OS. On my Linux system, FD_SETSIZE
>>> happens to be set to 1024. Reading through the byzantine
>>> labyrinth of includes, it appears that FD_SET has zero
>>> boundary-checking, so it's therefore possible that overflow
>>> will occur.
> 
> 
>> Regarding the FD_SETSIZE, it is also set for me to 1024 although 
>> the ulimit is set higher.
> 
> Well, the FD_SETSIZE is just the maximum size of the fd set that
> can be passed-into select() specifically. That doesn't limit the
> number of file descriptors your processes can open. The ulimit
> settings are the limits on those.
> 
>> I'm a bit lost now on what I should do now. What makes me wonder 
>> is, that nobody else seems to hit this limitation of FD_SET and 
>> this makes me think something on my Linux machine is not right.
> 
> Well, not everyone is using the same setup. For example, using NIO 
> through Java is likely to get you the poll() call under the hood,
> so supporting more than e.g. 1024 file descriptors is not an issue
> there. That's just a guess, but I think it's a reasonable guess.
> 
> I think tcnative+APR is not widely deployed. I have no numbers to
> back that up, but we don't get a huge volume of questions in the
> list about it.
> 
>> What would you guys suggest? Should I file a bug report? My
>> system runs stable now after the change to poll() and I don't hit
>> that problem anymore.
> 
> You should absolutely file the bug report and attach your patch to
> it. Thanks for your great analysis and a proposed patch.

Oh, it's worth pointing-out that while poll() is part of the
POSIX.1-2001, not all supported platforms also support that standard.
So we'll probably need to have a fallback mode that uses select()
instead of poll() when poll() isn't available.

That probably also means that we need to detect the unavailability of
poll() and (a) issue a warning to the logger and (b) write-down the
maximum size of that fd-array to avoid the buffer overflow. (Or, add
better checking when select() is in use... we must always ensure that
the "sd" local in nb_connect stays below FD_SETSIZE.) In fact,
I'll make that fix right now.

Finally, poll() has a few versions, some of which allow e.g. -1 to be
used in the array to mean "ignore this fd" while others do not. That
means that, on some platforms, we'll need to be very careful about the
management of that fd array, possibly re-sizing it, etc. when necessary.
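
The two ways of managing the array can be sketched as follows. The helper names are hypothetical. POSIX specifies that an entry with a negative fd is ignored and gets revents set to 0; the second helper avoids relying on that by compacting the array instead.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Mark slot i as unused in place. Relies on poll() ignoring entries
 * whose fd is negative, as POSIX specifies. */
void pollset_disable(struct pollfd *fds, size_t i)
{
    fds[i].fd = -1;
    fds[i].events = 0;
    fds[i].revents = 0;
}

/* Alternative for platforms without that behavior: drop slot i by
 * moving the last entry into it. Returns the new entry count. */
size_t pollset_remove(struct pollfd *fds, size_t n, size_t i)
{
    assert(i < n);
    fds[i] = fds[n - 1];
    return n - 1;
}
```

Note that pollset_remove() changes the order of the entries, so any index kept elsewhere for the moved entry has to be updated.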

- -chris
-BEGIN PGP SIGNATURE-
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBCAAGBQJXjOnAAAoJEBzwKT+lPKRYKQcP/RwOPb7Jm0w2kEDI+2DAhlHS
+hho+yR9N/CfoK73Zo1PDDc/MLT69SFbkrP4jUtAWL3r9mzzpTfn5IuLuY/XxDjE
2hZdLpu3tG8NFjgZ9i1Z4fauAifMS7WVnyWH2oFWP37BF2s3d8ZeAWWdfsgrVPBJ
ojNc3hsDPpJb3DGbRMgEVs43SfRNoKLvTCotlcRozadDDi/pAcoMft1IpOPnwh77
Sr9e17YiFCIJvOmqw57ljfbeWLeFTH3kbmDQSNNbiSdZ9ZKrTXBJOyVCxPvZW3uu
uz5/GIWhmHyjSIEC3TiQKgt8DFLkcfZDU7LUD2gYPyRsmKw0KwEFl8pPBQJWST98
lV3xUODfr1KcM2rmNoisnIPsVUrDH7biu/qGYPqPBeWxwuJhgwrYwTExmkmpbcce
1IRk+5P/DvZHYn9LVBcNIb+sE5TroMeCxSB+rt6kN6L0kPX1Zar97Qels2Xd5VVp
g650gOuNcgJAXJyJx/y3/7oe/GJKdscbY+W23JhMp1gWEdGbrf8Ki6WogOEcfNPR
Bj+DxpINJfYqCpqE1GluF7s0eou/ybaFzmanIwHzV2xCezA06VJ7GtzJgvwbuSq4
dbf05eV4TNvq7BRZ+nuT3UKB6Ds824Kfmc0fPy9AimIdmuqlgoB1nR6ZLromuweV
7glBmIsnBXO+ZomoItxT
=Oet1
-END PGP SIGNATURE-




Re: mod-jk (1.2.37) crashes Apache 2 (2.4.7) occasionally with a buffer overflow on Ubuntu 14.04 x64

2016-07-18 Thread Christopher Schultz
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Michael,

On 7/18/16 8:53 AM, Michael Diener wrote:
> On 6 July 2016 at 00:09, Christopher Schultz
>  wrote:
> 
>>> From what I understand a buffer overflow would only happen for 
>>> FD_SET if the fd_set gets over 1024 descriptors. I made sure
>>> that my ulimit for open files is set and applied large enough,
>>> so that's not it.
>> 
>> There's nothing magic about the ulimit. An fd_set should size 
>> appropriately for your OS. On my Linux system, FD_SETSIZE happens
>> to be set to 1024. Reading through the byzantine labyrinth of
>> includes, it appears that FD_SET has zero boundary-checking, so
>> it's therefore possible that overflow will occur.
> 
> 
> Regarding the FD_SETSIZE, it is also set for me to 1024 although
> the ulimit is set higher.

Well, the FD_SETSIZE is just the maximum size of the fd set that can
be passed-into select() specifically. That doesn't limit the number of
file descriptors your processes can open. The ulimit settings are the
limits on those.
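
To illustrate the distinction, here is a minimal C sketch that is not taken from mod_jk; both helper names are made up for the example. It compares the soft RLIMIT_NOFILE limit (the one "ulimit -n" adjusts) with FD_SETSIZE, and checks whether a given descriptor can be handed to FD_SET() at all. What matters for FD_SET() is the numeric value of the descriptor, not how many descriptors are in the set.

```c
#include <sys/resource.h>
#include <sys/select.h>

/* Returns 1 if the soft open-file limit allows descriptor values that
 * do not fit into an fd_set, 0 if it does not, and -1 if the limit
 * cannot be read. */
int fd_limit_exceeds_fd_setsize(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    return rl.rlim_cur > (rlim_t)FD_SETSIZE ? 1 : 0;
}

/* Returns 1 if sd may safely be passed to FD_SET(), 0 otherwise. */
int fd_fits_in_fd_set(int sd)
{
    return sd >= 0 && sd < FD_SETSIZE;
}
```

On a system where the open-file limit has been raised above 1024, the first function returns 1, which is exactly the situation where a select()-based code path can be handed a descriptor it cannot represent.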

> I'm a bit lost now on what I should do now. What makes me wonder
> is, that nobody else seems to hit this limitation of FD_SET and
> this makes me think something on my Linux machine is not right.

Well, not everyone is using the same setup. For example, using NIO
through Java is likely to get you the poll() call under the hood, so
supporting more than e.g. 1024 file descriptors is not an issue there.
That's just a guess, but I think it's a reasonable guess.

I think tcnative+APR is not widely deployed. I have no numbers to back
that up, but we don't get a huge volume of questions in the list about
it.

> What would you guys suggest? Should I file a bug report? My system
> runs stable now after the change to poll() and I don't hit that
> problem anymore.

You should absolutely file the bug report and attach your patch to it.
Thanks for your great analysis and a proposed patch.

- -chris
-BEGIN PGP SIGNATURE-
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBCAAGBQJXjONYAAoJEBzwKT+lPKRYW/4QAJs/o5JFxPHhv5+mNLquUxCS
VwaDoqQX2t8jXkkpAauRYxpotTBKl3XYOnXzB8XDtuGhdKzuuIDga6EQiRj0vGMP
/wOcLoXBzXsEt03ORiIv/KJMiH3VGHkl9EXtbkEcuX/qLjsv9hdsALmtmzR/v8sY
1VZqPMJ7PT3Lzs62Jf+WpVH9yp+88SDbT8LlS1PijvOe9t/2jzTThmH/Tfm5cYjh
Fay4Klbvye98hzB60sF9ReesDd6O0nxjksPnSgae3DHXF4nbMpgwtn3VQKALzbMG
wFPRGUFJWPmEpvsyM3kdVINuXxakk7bh3xDXhiMJ4etjQZbT0blMA7ZM0hoMAK3d
PH0KC9Y10IwUqv/98BjBx+4Y6zT59m9fEMeX54upDplMP/muR9SCezWWMfgG+CeG
2xJKZjjHAB0WEdpARPgbraYvxm5oyXiBdYlYuK7Wum+jt6bpd1cIQwrlKSKsexIg
YhLaU2zR5pcu2OBxuK7E9C3isQ35ESfd7ErS/v366rwBdsbKHilV7afWJilpjtnI
w4ui+58fpPSt5+L+X5udk13nbTEYfXcIaejvXRTL33+wiqIfr87r+B4PdTfzb3Gh
7apf0rdMLBb57/KqJfMy75L95vtbyzMhIase/cB7WjGu5CO0CGL05xypTq6oh6hY
a1VE+7031PBgjOhnC3qR
=jplT
-END PGP SIGNATURE-




Re: mod-jk (1.2.37) crashes Apache 2 (2.4.7) occasionally with a buffer overflow on Ubuntu 14.04 x64

2016-07-18 Thread Michael Diener
On 6 July 2016 at 00:09, Christopher Schultz 
wrote:

> > From what I understand a buffer overflow would only happen for
> > FD_SET if the fd_set gets over 1024 descriptors. I made sure that
> > my ulimit for open files is set and applied large enough, so that's
> > not it.
>
> There's nothing magic about the ulimit. An fd_set should size
> appropriately for your OS. On my Linux system, FD_SETSIZE happens to
> be set to 1024. Reading through the byzantine labyrinth of includes,
> it appears that FD_SET has zero boundary-checking, so it's therefore
> possible that overflow will occur.


Regarding the FD_SETSIZE, it is also set for me to 1024 although the ulimit
is set higher.


I'm a bit lost now on what I should do now. What makes me wonder is, that
nobody else seems to hit this limitation of FD_SET and this makes me think
something on my Linux machine is not right.

What would you guys suggest? Should I file a bug report? My system runs
stable now after the change to poll() and I don't hit that problem anymore.

Thanks,
Michael


Re: mod-jk (1.2.37) crashes Apache 2 (2.4.7) occasionally with a buffer overflow on Ubuntu 14.04 x64

2016-07-05 Thread Christopher Schultz
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Michael,

On 7/5/16 11:31 AM, Michael Diener wrote:
> Alright, I did my homework this time and worked with a self
> compiled version of mod_jk (1.2.41). Still the same error is
> happening. I traced the buffer overflow down to line 291 in
> jk_connect.c (nb_connect):
> 
> 280>    do {
> 281>        rc = connect(sd, (const struct sockaddr *)&addr->sa.sin, addr->salen);
> 282>    } while (rc == -1 && errno == EINTR);
> 283>
> 284>    if ((rc == -1) && (errno == EINPROGRESS || errno == EALREADY)
> 285>                   && (timeout > 0)) {
> 286>        fd_set wfdset;
> 287>        struct timeval tv;
> 288>        socklen_t rclen = (socklen_t)sizeof(rc);
> 289>
> 290>        FD_ZERO(&wfdset);
> *291>       FD_SET(sd, &wfdset);*
> 292>        tv.tv_sec = timeout / 1000;
> 293>        tv.tv_usec = (timeout % 1000) * 1000;
> 294>        rc = select(sd + 1, NULL, &wfdset, NULL, &tv);
> 
> 
> From what I understand a buffer overflow would only happen for
> FD_SET if the fd_set gets over 1024 descriptors. I made sure that
> my ulimit for open files is set and applied large enough, so that's
> not it.

There's nothing magic about the ulimit. An fd_set should size
appropriately for your OS. On my Linux system, FD_SETSIZE happens to
be set to 1024. Reading through the byzantine labyrinth of includes,
it appears that FD_SET has zero boundary-checking, so it's therefore
possible that overflow will occur.

> I tried to switch FD_SET to poll and it seems to work now also for
> sd > 1024:
> 
> struct pollfd pfd_read;
> pfd_read.fd = sd;
> pfd_read.events = POLLOUT;
> rc = poll(&pfd_read, 1, timeout);
> 
> As C/C++ is not my preferred language and I understand the
> internals for mod_jk not well enough for a change like this, I have
> a few questions:
> 
> 1. Is it normal/expected for nb_connect() to evaluate the IF in
> line 284 to TRUE? I wonder if this might be the real cause for my
> problems in the first place.
> 
> 2. In line 305 of the original jk_connect.c there is a FD_ISSET
> inside an IF. Is there an equivalent operation for poll or is the
> whole IF unnecessary then?

IMHO poll() is superior to select() but unfortunately somewhat less
portable (and also requires a bit more maintenance). It means being
able to handle more than some arbitrary limit of fds (1024 in my case).

I'm unsure if the goal for tcnative is to get away from more
dependencies on APR, but presumably APR has a portable-poll() function
of some kind?

- -chris
-BEGIN PGP SIGNATURE-
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBCAAGBQJXfDAwAAoJEBzwKT+lPKRYT7YQAIPsBTAx+zk2SYVJHzP/86v4
6p7mmUDCcb/EAuyMsWkIqueYlDEGJ5syox4JnoQ987wFBzmrTpuLAnn2x3HcrzE6
ruNjxKtqnDhWwBkLHd8ZzlfiTTP+xRAiWlxb1kA+EHJxSaMeimJSBHJvzjP3sTG6
jJg8XJqz4IeGc3oSf0VdxPEFLDlpA8ixPffnfq2yZry+ux8Y3NuW3S9k1mORNavm
gdJs7fhEj1hzSGtb788LQwmXFH5HXC1mprFvnDQQs47wY72ELe4nk03AT1LNEGeE
dNLvREPRqG+fDglcbJH9UctfyOZZAu67a1sdy71SW2Coa1Od8TlidXACO2L/0NXK
dJMmf1i19wumwZPmTvZP+MXk9qp1OYFN4mG1hWIOA6A/8KfUcYi221tIYqAc8L1w
rm8W0Rf/QlyyZdWOeu1FG4XWmJEg2rf79YlD1sDj5VO7K2Po92rFlaDxaESoiFmb
qa+mDRFQVAYxti25jGuawnHLMdRcaa/j86buwVn9xSwI9Ij4UVxNv5tqWSPG6K9V
rZ4SG8dcoR8roGWAXtm5oLPtDutXvvm4VxFC3sbxzDiwZHzix6k/lbaIT13GG5aJ
3VGpNAqCnwsOeTMjoN5amuWnJJo8Hrb3Qw/Jr6AhYlofKvwizukmjBT0qXuhggpw
IxOdAPzXMS0ppcc9Nsbb
=YviT
-END PGP SIGNATURE-




Re: mod-jk (1.2.37) crashes Apache 2 (2.4.7) occasionally with a buffer overflow on Ubuntu 14.04 x64

2016-07-05 Thread Michael Diener
Alright, I did my homework this time and worked with a self compiled
version of mod_jk (1.2.41). Still the same error is happening. I traced the
buffer overflow down to line 291 in jk_connect.c (nb_connect):

280>    do {
281>        rc = connect(sd, (const struct sockaddr *)&addr->sa.sin, addr->salen);
282>    } while (rc == -1 && errno == EINTR);
283>
284>    if ((rc == -1) && (errno == EINPROGRESS || errno == EALREADY)
285>                   && (timeout > 0)) {
286>        fd_set wfdset;
287>        struct timeval tv;
288>        socklen_t rclen = (socklen_t)sizeof(rc);
289>
290>        FD_ZERO(&wfdset);
*291>       FD_SET(sd, &wfdset);*
292>        tv.tv_sec = timeout / 1000;
293>        tv.tv_usec = (timeout % 1000) * 1000;
294>        rc = select(sd + 1, NULL, &wfdset, NULL, &tv);


From what I understand a buffer overflow would only happen for FD_SET if
the fd_set gets over 1024 descriptors. I made sure that my ulimit for open
files is set and applied large enough, so that's not it.

I tried to switch FD_SET to poll and it seems to work now also for sd >
1024:

struct pollfd pfd_read;
pfd_read.fd = sd;
pfd_read.events = POLLOUT;
rc = poll(&pfd_read, 1, timeout);

As C/C++ is not my preferred language and I understand the internals for
mod_jk not well enough for a change like this, I have a few questions:

1. Is it normal/expected for nb_connect() to evaluate the IF in line 284 to
TRUE? I wonder if this might be the real cause for my problems in the first
place.

2. In line 305 of the original jk_connect.c there is a FD_ISSET inside an
IF. Is there an equivalent operation for poll or is the whole IF
unnecessary then?
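
On question 2, a hedged sketch of how the wait could look with poll(): the revents field of struct pollfd plays the role that FD_ISSET() plays for select(), and the pending socket error still has to be fetched with getsockopt(SO_ERROR). This is not the mod_jk implementation. The function name, the return convention, and the use of ETIMEDOUT for a timeout are assumptions made for the example.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

/* Hypothetical helper: wait for a non-blocking connect() on sd to
 * finish. Returns 0 when the connection is established, -1 otherwise
 * with errno describing the failure. */
int wait_connect_poll(int sd, int timeout_ms)
{
    struct pollfd pfd;
    int err = 0;
    socklen_t errlen = (socklen_t)sizeof(err);
    int rc;

    pfd.fd = sd;
    pfd.events = POLLOUT;   /* a finished connect makes the socket writable */
    pfd.revents = 0;

    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc == -1 && errno == EINTR);

    if (rc == 0) {          /* nothing happened within timeout_ms */
        errno = ETIMEDOUT;
        return -1;
    }
    if (rc < 0)
        return -1;

    /* revents is the poll() counterpart of FD_ISSET(). */
    if (pfd.revents & POLLNVAL) {
        errno = EBADF;
        return -1;
    }

    /* Writable (or POLLERR/POLLHUP) does not prove success: ask the
     * socket for the pending error of the connect attempt. */
    if (getsockopt(sd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0)
        return -1;
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}
```

Because poll() takes the descriptor as a plain int in the array, there is no fixed-size bitmap involved and descriptor values of 1024 and above are handled like any other.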

Thanks,
Michael


On 30 June 2016 at 12:16, Michael Diener  wrote:

> Thank you Rainer!
>
> On 29 June 2016 at 14:50, Rainer Jung  wrote:
>
>> Can you reproduce? Does it also happen on a test system?
>
>
> It only happens on a live system and I'm not able to reproduce it.
>
>
>
>> Latest we provide in the project is 1.2.41. It is pretty easy to compile
>> yourself and would be an interesting check to see, whether it is just an
>> old already fixed problem.
>
>
>
> You are right, I will test and get back.
>
>
> Best regards,
> Michael
>
>


Re: mod-jk (1.2.37) crashes Apache 2 (2.4.7) occasionally with a buffer overflow on Ubuntu 14.04 x64

2016-06-30 Thread Michael Diener
Thank you Rainer!

On 29 June 2016 at 14:50, Rainer Jung  wrote:

> Can you reproduce? Does it also happen on a test system?


It only happens on a live system and I'm not able to reproduce it.



> Latest we provide in the project is 1.2.41. It is pretty easy to compile
> yourself and would be an interesting check to see, whether it is just an
> old already fixed problem.



You are right, I will test and get back.


Best regards,
Michael


--


Re: mod-jk (1.2.37) crashes Apache 2 (2.4.7) occasionally with a buffer overflow on Ubuntu 14.04 x64

2016-06-29 Thread Rainer Jung

On 29.06.2016 at 11:58, Michael Diener wrote:

I get occasional Apache 2 crashes being caused by mod_jk and I'm running
out of ideas about the cause of the problem. I hope somebody here can point
me in the right direction.


Can you reproduce? Does it also happen on a test system?


tomcat6 6.0.39-1

libapache2-mod-jk 1:1.2.37-3


The latest version we provide in the project is 1.2.41. It is pretty easy
to compile yourself and would be an interesting check to see whether this
is just an old, already fixed problem.



apache2 2.4.7-1ubuntu4

java version "1.6.0_45"
Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)

/var/log/apache2/error.log



*** buffer overflow detected ***: /usr/sbin/apache2 terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x7329f)[0x7fe9aa7de29f]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7fe9aa875bbc]
/lib/x86_64-linux-gnu/libc.so.6(+0x109a90)[0x7fe9aa874a90]
/lib/x86_64-linux-gnu/libc.so.6(+0x10ab07)[0x7fe9aa875b07]
/usr/lib/apache2/modules/mod_jk.so(jk_open_socket+0x8d8)[0x7fe9a7c60cb8]
/usr/lib/apache2/modules/mod_jk.so(ajp_connect_to_endpoint+0x65)[0x7fe9a7c7bf75]
/usr/lib/apache2/modules/mod_jk.so(+0x36422)[0x7fe9a7c7d422]
/usr/lib/apache2/modules/mod_jk.so(+0x1674c)[0x7fe9a7c5d74c]
/usr/sbin/apache2(ap_run_handler+0x40)[0x7fe9ab65fbe0]
/usr/sbin/apache2(ap_invoke_handler+0x69)[0x7fe9ab660129]
/usr/sbin/apache2(ap_process_async_request+0x20a)[0x7fe9ab6756ca]
/usr/sbin/apache2(+0x69500)[0x7fe9ab672500]
/usr/sbin/apache2(ap_run_process_connection+0x40)[0x7fe9ab669220]
/usr/lib/apache2/modules/mod_mpm_event.so(+0x681b)[0x7fe9a783981b]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7fe9aab38184]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fe9aa86537d]
======= Memory map: ========
7fe68800-7fe68806a000 rw-p  00:00 0
7fe68806a000-7fe68c00 ---p  00:00 0
...
7fffa6c27000-7fffa6c48000 rw-p  00:00 0 [stack]
7fffa6c86000-7fffa6c88000 r-xp  00:00 0 [vdso]
ff60-ff601000 r-xp  00:00 0 [vsyscall]
[Wed Jun 29 05:01:50.052325 2016] [core:notice] [pid 1747:tid
140641581987712] AH00051: child pid 17018 exit signal Aborted (6), possible
coredump in /etc/apache2

The log indicates there might be a coredump, but there is not.


Configure your system so that it writes core dump files, look at the 
core dump file with "gdb", e.g. running "thread apply all bt full" and 
provide the output.


Likely your httpd is running as a suid process (started as root but the 
children switch uid to something else, like www or similar). You have to 
enable core dump for suid processes explicitly in your Linux.



There is no log in /var/log/apache2/mod_jk.log at the same time.

/var/log/tomcat6/catalina.out

Jun 29, 2016 5:01:49 AM org.apache.jk.common.ChannelSocket processConnection
WARNING: processCallbacks status 2
Jun 29, 2016 5:01:49 AM org.apache.jk.common.ChannelSocket processConnection
WARNING: processCallbacks status 2

The Tomcat log indicates AFAIK that the client connection has been lost.

/etc/libapache2-mod-jk/httpd-jk.conf



JkWorkersFile /etc/libapache2-mod-jk/workers.properties
JkLogFile /var/log/apache2/mod_jk.log
JkLogLevel warn
JkShmFile /var/log/apache2/jk-runtime-status



/etc/libapache2-mod-jk/workers.properties


The next lines look like they come from very old and partly nonsensical 
recipes. Your crash won't be a result of those, but you should probably 
base your config on what's inside the conf folder of our source 
distribution for mod_jk. The files can also be found under 
http://svn.apache.org/viewvc/tomcat/jk/trunk/conf/.



workers.tomcat_home=/usr/share/tomcat6
workers.java_home=/usr/lib/jvm/java-6-sun
ps=/

worker.list=loadbalancer

worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=ajp13_worker,ajp13_worker2
worker.loadbalancer.sticky_session=0

worker.ajp13_worker.port=xxx
worker.ajp13_worker.host=localhost
worker.ajp13_worker.type=ajp13
worker.ajp13_worker.lbfactor=50


If you don't really need the increased max_packet_size, comment it out 
here, along with the packetSize below, and retry. Those settings are not 
very common, and there were changes related to them in newer versions.



worker.ajp13_worker.max_packet_size=65536


Not crash related, but I don't like the general socket_timeout. Look at 
the configs proposed above. They use lots of good timeouts, but 
socket_timeout is not one of them.



worker.ajp13_worker.socket_timeout=300
worker.ajp13_worker.ping_mode=A
worker.ajp13_worker.secret=xxx
worker.ajp13_worker.fail_on_status=503
worker.ajp13_worker.connection_pool_size=32768

#worker.ajp13_worker.activation=disabled
worker.ajp13_worker.redirect=ajp13_worker2

worker.ajp13_worker2.port=xxx
worker.ajp13_worker2.host=otherhost
worker.ajp13_worker2.type=ajp13
worker.ajp13_worker2.lbfactor=1
worker.ajp13_worker2.max_packet_size=65536

mod-jk (1.2.37) crashes Apache 2 (2.4.7) occasionally with a buffer overflow on Ubuntu 14.04 x64

2016-06-29 Thread Michael Diener
I get occasional Apache 2 crashes caused by mod_jk and I'm running
out of ideas about the cause of the problem. I hope somebody here can point
me in the right direction.


-Michael


tomcat6 6.0.39-1

libapache2-mod-jk 1:1.2.37-3
apache2 2.4.7-1ubuntu4

java version "1.6.0_45"
Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)

/var/log/apache2/error.log

*** buffer overflow detected ***: /usr/sbin/apache2 terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x7329f)[0x7fe9aa7de29f]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7fe9aa875bbc]
/lib/x86_64-linux-gnu/libc.so.6(+0x109a90)[0x7fe9aa874a90]
/lib/x86_64-linux-gnu/libc.so.6(+0x10ab07)[0x7fe9aa875b07]
/usr/lib/apache2/modules/mod_jk.so(jk_open_socket+0x8d8)[0x7fe9a7c60cb8]
/usr/lib/apache2/modules/mod_jk.so(ajp_connect_to_endpoint+0x65)[0x7fe9a7c7bf75]
/usr/lib/apache2/modules/mod_jk.so(+0x36422)[0x7fe9a7c7d422]
/usr/lib/apache2/modules/mod_jk.so(+0x1674c)[0x7fe9a7c5d74c]
/usr/sbin/apache2(ap_run_handler+0x40)[0x7fe9ab65fbe0]
/usr/sbin/apache2(ap_invoke_handler+0x69)[0x7fe9ab660129]
/usr/sbin/apache2(ap_process_async_request+0x20a)[0x7fe9ab6756ca]
/usr/sbin/apache2(+0x69500)[0x7fe9ab672500]
/usr/sbin/apache2(ap_run_process_connection+0x40)[0x7fe9ab669220]
/usr/lib/apache2/modules/mod_mpm_event.so(+0x681b)[0x7fe9a783981b]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7fe9aab38184]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fe9aa86537d]
======= Memory map: ========
7fe68800-7fe68806a000 rw-p  00:00 0
7fe68806a000-7fe68c00 ---p  00:00 0
...
7fffa6c27000-7fffa6c48000 rw-p  00:00 0 [stack]
7fffa6c86000-7fffa6c88000 r-xp  00:00 0 [vdso]
ff60-ff601000 r-xp  00:00 0 [vsyscall]
[Wed Jun 29 05:01:50.052325 2016] [core:notice] [pid 1747:tid
140641581987712] AH00051: child pid 17018 exit signal Aborted (6), possible
coredump in /etc/apache2

The log indicates there might be a coredump, but there is not.
There is no log in /var/log/apache2/mod_jk.log at the same time.

/var/log/tomcat6/catalina.out

Jun 29, 2016 5:01:49 AM org.apache.jk.common.ChannelSocket processConnection
WARNING: processCallbacks status 2
Jun 29, 2016 5:01:49 AM org.apache.jk.common.ChannelSocket processConnection
WARNING: processCallbacks status 2

The Tomcat log indicates AFAIK that the client connection has been lost.

/etc/libapache2-mod-jk/httpd-jk.conf



JkWorkersFile /etc/libapache2-mod-jk/workers.properties
JkLogFile /var/log/apache2/mod_jk.log
JkLogLevel warn
JkShmFile /var/log/apache2/jk-runtime-status



/etc/libapache2-mod-jk/workers.properties

workers.tomcat_home=/usr/share/tomcat6
workers.java_home=/usr/lib/jvm/java-6-sun
ps=/

worker.list=loadbalancer

worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=ajp13_worker,ajp13_worker2
worker.loadbalancer.sticky_session=0

worker.ajp13_worker.port=xxx
worker.ajp13_worker.host=localhost
worker.ajp13_worker.type=ajp13
worker.ajp13_worker.lbfactor=50
worker.ajp13_worker.max_packet_size=65536
worker.ajp13_worker.socket_timeout=300
worker.ajp13_worker.ping_mode=A
worker.ajp13_worker.secret=xxx
worker.ajp13_worker.fail_on_status=503
worker.ajp13_worker.connection_pool_size=32768

#worker.ajp13_worker.activation=disabled
worker.ajp13_worker.redirect=ajp13_worker2

worker.ajp13_worker2.port=xxx
worker.ajp13_worker2.host=otherhost
worker.ajp13_worker2.type=ajp13
worker.ajp13_worker2.lbfactor=1
worker.ajp13_worker2.max_packet_size=65536
worker.ajp13_worker2.socket_timeout=300
worker.ajp13_worker2.ping_mode=A
worker.ajp13_worker2.secret=xxx
worker.ajp13_worker2.fail_on_status=503
worker.ajp13_worker2.connection_pool_size=32768

worker.ajp13_worker2.activation=disabled
#worker.ajp13_worker2.redirect=ajp13_worker


/etc/tomcat6/server.xml



ls /etc/apache2/mods-enabled/
access_compat.load auth_basic.load authz_core.load autoindex.conf
deflate.load env.load headers.load mime.conf mpm_event.load rewrite.load
socache_shmcb.load status.conf
alias.conf authn_core.load authz_host.load autoindex.load dir.conf
expires.load jk.conf mime.load negotiation.conf setenvif.conf ssl.conf
status.load
alias.load authn_file.load authz_user.load deflate.conf dir.load
filter.load jk.load mpm_event.conf negotiation.load setenvif.load ssl.load

lsb_release -rd
Description: Ubuntu 14.04.4 LTS
Release: 14.04