Re: Listeners buckets and duplication w/ and w/o SO_REUSEPORT on trunk

2014-11-06 Thread Yann Ylavic
Rebasing discussion here since this thread seems to be referenced in
PR55897, and the discussion has somehow been forked and continued in
[1].

[1]. 
http://mail-archives.apache.org/mod_mbox/httpd-dev/201410.mbox/%3c9acd5b67aac5594cb6268234cf29cf9aa37e9...@orsmsx113.amr.corp.intel.com%3E

On Sat, Oct 11, 2014 at 1:55 AM, Lu, Yingqi yingqi...@intel.com wrote:
 Attached patch is generated based on current trunk. It covers for 
 prefork/worker/event/eventopt MPM.

The patch (modified) has now been applied to trunk with r1635521.

On Thu, Oct 30, 2014 at 5:10 PM, Lu, Yingqi yingqi...@intel.com wrote:
 As this is getting better, I am wondering if you guys have plan to put this 
 SO_REUSEPORT patch into the stable version.
 If yes, do you have a rough timeline?

The whole feature could certainly be proposed for 2.4.x since there is
no (MAJOR) API change.

On Thu, Nov 6, 2014 at 6:52 AM, Lu, Yingqi yingqi...@intel.com wrote:
 I just took some testing on the most recent trunk version.
 I found out that num_buckets is default to 1 (ListenCoresBucketsRatio is 
 default to 0).
 Adding ListenCoresBucketsRatio is great since user can have control over this.
 However, I am thinking it may be better to make this default at 8. This will 
 make the SO_REUSEPORT
 support to be default enabled (8 buckets).
(8 buckets with 64 CPU cores, lucky you...).

Yes this change wrt to your original patch is documented in the commit
message, including how change it to an opt-out.
I chose the opt-in way because I almost always find it safer,
especially for backports to stable.
I have no strong opinion on this regarding trunk, though, could be an
opt-out (easily) there.

Let's see what others say on this and the backport to 2.4.x.
Anyone?

 In case users are not aware of this new ListenCoresBucketsRatio
 configurable flag, they can still enjoy the performance benefits.

Users with 64 cores available should always care about performances tuning ;)

Btw, I wonder if there are other ways to take advantage of the
listeners buckets (even with fewer cores).
The other advantage of SO_REUSEPORT is that, provided that each child
has its own listeners bucket, we can avoid the accept mutex lock
(which also seemed yo be a bottleneck if I recall your original
proposal's discussion correctly).

Did you make any testing without the accept mutex and with a number of
buckets equal to some reasonable ServerLimit (and then play with
ThreadPerChild to reach the MaxClient/MaxRequestWorkers goal)?

Regards,
Yann.


Re: Listeners buckets and duplication w/ and w/o SO_REUSEPORT on trunk

2014-11-06 Thread Yann Ylavic
Hi Yingqi,

let's continue discussing this on the original thread if you don't
mind, I made an update there.

Thanks,
Yann.

On Thu, Nov 6, 2014 at 6:52 AM, Lu, Yingqi yingqi...@intel.com wrote:
 Hi Yann,

 I just took some testing on the most recent trunk version. I found out that 
 num_buckets is default to 1 (ListenCoresBucketsRatio is default to 0). Adding 
 ListenCoresBucketsRatio is great since user can have control over this. 
 However, I am thinking it may be better to make this default at 8. This will 
 make the SO_REUSEPORT support to be default enabled (8 buckets). In case 
 users are not aware of this new ListenCoresBucketsRatio configurable flag, 
 they can still enjoy the performance benefits.

 Please let me know what you think.

 Thanks,
 Yingqi

 -Original Message-
 From: Lu, Yingqi [mailto:yingqi...@intel.com]
 Sent: Thursday, October 30, 2014 9:10 AM
 To: dev@httpd.apache.org
 Subject: RE: Listeners buckets and duplication w/ and w/o SO_REUSEPORT on 
 trunk

 Hi Yann,

 Thank you very much for your help!

 As this is getting better, I am wondering if you guys have plan to put this 
 SO_REUSEPORT patch into the stable version. If yes, do you have a rough 
 timeline?

 The performance gain is great from the patch, I just want to more people 
 being able to take advantage of it.

 Thanks,
 Lucy

 -Original Message-
 From: Yann Ylavic [mailto:ylavic@gmail.com]
 Sent: Thursday, October 30, 2014 8:29 AM
 To: httpd
 Subject: Re: Listeners buckets and duplication w/ and w/o SO_REUSEPORT on 
 trunk

 Hi Yingqi,

 commited in r1635521, with some changes (please see commit log).
 These are not big changes, and your work on removing the global variables and 
 restoring some previous behaviour is great, thanks for the good work.

 Regards,
 Yann.


 On Wed, Oct 29, 2014 at 6:36 PM, Lu, Yingqi yingqi...@intel.com wrote:
 Thank you very much for your help!

 Thanks,
 Yingqi

 -Original Message-
 From: Yann Ylavic [mailto:ylavic@gmail.com]
 Sent: Wednesday, October 29, 2014 10:34 AM
 To: httpd
 Subject: Re: Listeners buckets and duplication w/ and w/o SO_REUSEPORT
 on trunk

 Hi Yingqi,

 I'm working on it currently, will commit soon.

 Regards,
 Yann.

 On Wed, Oct 29, 2014 at 6:20 PM, Lu, Yingqi yingqi...@intel.com wrote:
 Hi All,

 I just want to check if there is any feedback/comments on this?

 For details, please refer to Yann Ylavic's notes and my responses below.

 Thanks,
 Yingqi

 -Original Message-
 From: Lu, Yingqi [mailto:yingqi...@intel.com]
 Sent: Friday, October 10, 2014 4:56 PM
 To: dev@httpd.apache.org
 Subject: RE: Listeners buckets and duplication w/ and w/o
 SO_REUSEPORT on trunk

 Dear All,

 Attached patch is generated based on current trunk. It covers for 
 prefork/worker/event/eventopt MPM. It supposes to address following issues 
 regarding to SO_RESUEPORT support vs. current trunk version:

 1. Same as current trunk version implementation, when active_CPU_num = 8 
 or when so_reuseport is not supported by the kernel, ap_num_buckets is set 
 to 1. In any case, there is 1 dedicated listener per bucket.

 2. Remove global variables (mpm_listen, enable_default_listeners and 
 num_buckets). mpm_listen is changed to MPM local. enabled_default_listener 
 is completely removed. num_buckets is changed to MPM local 
 (ap_num_buckets). I rename have_so_reuseport to ap_have_so_reuseport. The 
 reason for keeping that global is because this variable is being used by 
 ap_log_common(). Based on the feedback here, I think it may not be a good 
 idea to change the function interface.

 3. Change ap_duplicated_listener to have more parameters. This function is 
 being called from MPM local (prefork.c/worker.c/event.c/eventopt.c). In 
 this function, prefork_listener (or worker_listener/even_listener/etc) 
 array will be initialized and be set value. ap_num_buckets is also 
 calculated inside this function. In addition, this version solves the issue 
 with one_process case (current trunk version has issue with one_process 
 enabled).

 4. Change ap_close_listener() back to previous (2.4.X version).

 5. Change dummy_connection back to previous (2.4.X version).

 6. Add ap_close_duplicated_listeners(). This is called from mpms when 
 stopping httpd.

 7. Add ap_close_child_listener(). When listener_thread (child process in 
 prefork) exit, only the dedicated listener needs to be closed (the rest are 
 already being closed in child_main when the child process starts).

 8. Remove duplication of listener when ap_num_buckets = 1 or without 
 SO_REUSEPORT support (ap_num_buckets = 1). With so_reuseport, only 
 duplicated (ap_num_buckets - 1) listeners (1 less duplication less current 
 trunk implementation).

 9. Inside each mpm, move child_bucket, child_pod and child_mutex 
 (worker/prefork only) to a struct. Also, add member bucket to the same 
 struct.

 Please review and let me know your feedback.

 Thanks,
 Yingqi

 -Original Message-
 From: Yann 

Re: [Patch] mod_ssl SSL_CLIENT_CERT_SUBJECTS - access to full client certificate chain

2014-11-06 Thread Andreas B.

Am 06.11.2014 um 08:34 schrieb Dirk-Willem van Gulik:

On 06 Nov 2014, at 07:05, Kaspar Brand httpd-dev.2...@velox.ch wrote:


11.3.1 Certificate exact match
…
  CertificateExactAssertion ::= SEQUENCE {
  serialNumber  CertificateSerialNumber,
  issuerName }

...

(i.e., we are again back at the point that uniqueness of an X.509
certificate is achieved by its issuer DN plus serial number)

And even that is LDAPs take on it - in a lot of practical cases it gets more 
complex; especially if the leave of the  one but last intermediate is cross 
signed.

Making above more of an LDAP thing than a ‘protocol’ thing.

So therefore:


Is there another way to do this?

Manually performing what certificateExactMatch is specifying, I would
say - i.e., use the (SSL_CLIENT_M_SERIAL,SSL_CLIENT_I_DN) tuple as a
unique identifier for a specific client certificate.

I would argue that the ‘best’ thing we can do is first on the SSL termination 
side — follow the chain; and if there we *conclude* that all is well - and we 
present the evidence of that conclusion ’to what is behind’.

And ideally make that as hard to mis-interpret. So Graham his idea of providing 
a single ‘unique' string which gives the DN’s in order (along with the usual 
SSL_CLIENT.. env vars; including the full ANS.1 if you are so inclined) is 
quite pragmatic. As that string is ‘unique’ within the promise the web frontend 
with its current settings is making.

And it cuts out a lot of easy to make errors. And those who want to do better 
can simply use SSL_CLIENT_CERT and SSL_CLIENT_CERT_0…N — with sufficient code 
to understand things like cross signing and funny order issues. As parsing that 
is not trivial when there are multiple selfsigned/roots in a chain ‘up’.

Dw.
If you want to identify a specific certificate, wouldn't it be possible 
to use sha1/256 fingerprints created with X509_digest||? Or is that 
something LDAP doesn't support? That could be exported with 
SSL_CLIENT_CERT_THUMB and SSL_CLIENT_CERT_THUMB_ALG for the algorithm used.


Re: PR still opened but stated as fixed in CHANGES (2.4 branch)

2014-11-06 Thread Jeff Trawick
On Thu, Nov 6, 2014 at 2:21 AM, Christophe JAILLET 
christophe.jail...@wanadoo.fr wrote:

 Hi,

 if someone is interested, here is a list of all PR still marked as NEW in
 bugzilla but that are stated as fixed in CHANGES in the 2.4 branch.

 I had started to have a look at them in order to close them but:
 - some are confusing to me
 - some have comments that they should be backported to 2.2.


My 2 cents:

It is worth simplifying the process for bug handlers/helpers to reduce the
chance that bugs stay open too long, even if it means that a user
interested in a fix has to know that closed !=
released-from-desired-branch.  I suspect that some of the confusion about
older bugs wouldn't exist if the bug was dispensed with sooner by the
person working on the fix.

* close no later than when the fix is in a stable branch; some fixes may
not seem necessary to backport when the developer first handles, such as
when there is a trivial workaround or it is reported on a dev branch; those
can be closed ASAP
* update a bug with a one-liner when it is committed to any earlier branch
* update a bug with a one-liner when it is first released from any branch
(part of RM?)
* if a user explicitly requests that it be backported, place in appropriate
status file (even without votes if you're just helping address bug updates)
but keep the bug closed; update the bug to mention that it has been proposed



 So I provide it as-is, in case someone want to have a look at it, close
 and comment the ones that should be closed and propose for backport the
 ones that worth it.

 I did the same for 2.2, but the list is more or less empty.

 [{status:NEW,id:56924,version:2.4.10}]
 [{status:NEW,id:57092,version:2.5-HEAD}]
 [{status:NEW,id:56960,version:2.4.10}]
 [{status:NEW,id:53218,version:2.4.7}]
 [{status:NEW,id:56881,version:2.4.10}]
 [{status:NEW,id:56832,version:2.4.10}]
 [{status:NEW,id:56858,version:2.5-HEAD}]
 [{status:NEW,id:53420,version:2.4.1}]
 [{status:NEW,id:56870,version:2.4.10}]
 [{status:NEW,id:56241,version:2.4.1}]
 [{status:NEW,id:53824,version:2.2.22}]
 [{status:NEW,id:55306,version:2.2.25}]
 [{status:NEW,id:54936,version:2.4.4}]
 [{status:NEW,id:54254,version:2.4.3}]
 [{status:NEW,id:54145,version:2.5-HEAD}]
 [{status:NEW,id:54030,version:2.5-HEAD}]
 [{status:NEW,id:53539,version:2.4.2}]
 [{status:NEW,id:45923,version:2.2.9}]
 [{status:NEW,id:52275,version:2.3.15-beta}]
 [{status:NEW,id:53023,version:2.4.1}]
 [{status:NEW,id:52845,version:2.4.1}]
 [{status:NEW,id:52439,version:2.5-HEAD}]
 [{status:NEW,id:52623,version:2.2.22}]
 [{status:NEW,id:52342,version:2.2.21}]
 [{status:NEW,id:42682,version:HEAD}]
 [{status:NEW,id:51709,version:2.2.19}]
 [{status:NEW,id:26005,version:2.0.48}]
 [{status:NEW,id:50592,version:2.2.17}]
 [{status:NEW,id:48958,version:2.2.3}]
 [{status:NEW,id:47634,version:2.2.11}]
 [{status:NEW,id:24243,version:2.0.47}]
 [{status:NEW,id:40312,version:2.2-HEAD}]
 [{status:NEW,id:46270,version:2.5-HEAD}]
 [{status:NEW,id:41646,version:2.2.4}]
 [{status:NEW,id:39299,version:2.0.55}]


 Best regards,
 CJ




-- 
Born in Roswell... married an alien...
http://emptyhammock.com/


Re: [Patch] mod_ssl SSL_CLIENT_CERT_SUBJECTS - access to full client certificate chain

2014-11-06 Thread Dirk-Willem van Gulik

 On 06 Nov 2014, at 14:14, Andreas B. regis...@progandy.de wrote:
 
 Am 06.11.2014 um 08:34 schrieb Dirk-Willem van Gulik:
 On 06 Nov 2014, at 07:05, Kaspar Brand httpd-dev.2...@velox.ch 
 mailto:httpd-dev.2...@velox.ch wrote:
 
 11.3.1 Certificate exact match
 …
  CertificateExactAssertion ::= SEQUENCE {
  serialNumber  CertificateSerialNumber,
  issuerName }
 ...
 (i.e., we are again back at the point that uniqueness of an X.509
 certificate is achieved by its issuer DN plus serial number)
 And even that is LDAPs take on it - in a lot of practical cases it gets more 
 complex; especially if the leave of the  one but last intermediate is cross 
 signed.
 
 Making above more of an LDAP thing than a ‘protocol’ thing.
 
 So therefore:
 
 Is there another way to do this?
 Manually performing what certificateExactMatch is specifying, I would
 say - i.e., use the (SSL_CLIENT_M_SERIAL,SSL_CLIENT_I_DN) tuple as a
 unique identifier for a specific client certificate.
 I would argue that the ‘best’ thing we can do is first on the SSL 
 termination side — follow the chain; and if there we *conclude* that all is 
 well - and we present the evidence of that conclusion ’to what is behind’.
 
 And ideally make that as hard to mis-interpret. So Graham his idea of 
 providing a single ‘unique' string which gives the DN’s in order (along with 
 the usual SSL_CLIENT.. env vars; including the full ANS.1 if you are so 
 inclined) is quite pragmatic. As that string is ‘unique’ within the promise 
 the web frontend with its current settings is making.
 
 And it cuts out a lot of easy to make errors. And those who want to do 
 better can simply use SSL_CLIENT_CERT and SSL_CLIENT_CERT_0…N — with 
 sufficient code to understand things like cross signing and funny order 
 issues. As parsing that is not trivial when there are multiple 
 selfsigned/roots in a chain ‘up’.
 
 Dw.
 If you want to identify a specific certificate, wouldn't it be possible to 
 use sha1/256 fingerprints created with X509_digest? Or is that something LDAP 
 doesn't support? That could be exported with SSL_CLIENT_CERT_THUMB and 
 SSL_CLIENT_CERT_THUMB_ALG for the algorithm used.

One could. The issue is that in some systems (e.g. a medical one I am currently 
trying to come to terms with) the certs ar renewed very often; and the 
fingerprint does not stay stable. This is also an issue with using the serial 
and the issuer DN.

Dw.

Re: APR_INT32_MAX used by mod_delate-2.2.29

2014-11-06 Thread Jeff Trawick
On Sat, Nov 1, 2014 at 6:03 PM, Yann Ylavic ylavic@gmail.com wrote:

 mod_deflate in httpd-2.2.29 is using APR_INT32_MAX which is only
 available since APR-1.3.

 However httpd-2.x seems to require APR-1.2 only
 (http://httpd.apache.org/docs/2.2/install.html#requirements).

 Should we apply something like :

 Index: modules/filters/mod_deflate.c
 ===
 --- modules/filters/mod_deflate.c(revision 1635743)
 +++ modules/filters/mod_deflate.c(working copy)
 @@ -47,6 +47,10 @@
  #define APR_WANT_STRFUNC
  #include apr_want.h

 +#ifndef APR_INT32_MAX
 +#define APR_INT32_MAX 0x7fff
 +#endif
 +
  #include zlib.h

  static const char deflateFilterName[] = DEFLATE;
 ?

 Regards,
 Yann.


+1 here

(babbling about generic concerns with using a level of APR that we haven't
used in years has been removed)

-- 
Born in Roswell... married an alien...
http://emptyhammock.com/


Re: APR_INT32_MAX used by mod_delate-2.2.29

2014-11-06 Thread Yann Ylavic
On Thu, Nov 6, 2014 at 2:26 PM, Jeff Trawick traw...@gmail.com wrote:
 On Sat, Nov 1, 2014 at 6:03 PM, Yann Ylavic ylavic@gmail.com wrote:

 mod_deflate in httpd-2.2.29 is using APR_INT32_MAX which is only
 available since APR-1.3.

 However httpd-2.x seems to require APR-1.2 only
 (http://httpd.apache.org/docs/2.2/install.html#requirements).

 Should we apply something like :

 Index: modules/filters/mod_deflate.c
 ===
 --- modules/filters/mod_deflate.c(revision 1635743)
 +++ modules/filters/mod_deflate.c(working copy)
 @@ -47,6 +47,10 @@
  #define APR_WANT_STRFUNC
  #include apr_want.h

 +#ifndef APR_INT32_MAX
 +#define APR_INT32_MAX 0x7fff
 +#endif
 +
  #include zlib.h

  static const char deflateFilterName[] = DEFLATE;
 ?

 Regards,
 Yann.


 +1 here

 (babbling about generic concerns with using a level of APR that we haven't
 used in years has been removed)

OK, patch against 2.2.x (only) proposed by r1637104.


Re: Building APR using cmake on Windows

2014-11-06 Thread Jeff Trawick
On Thu, Oct 30, 2014 at 1:53 PM, Edward Lu chaos...@gmail.com wrote:

 Using Visual Studio 2013 command line tools, on Windows Server 2012 rc2.
 Running everything from batch files through Cygwin. I cloned the APR repo
 from here: https://github.com/apache/apr

 I downloaded some prebuilt libxml2 and iconv binaries for Windows. Then, I
 ran the commands:

 C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\vcvars32.bat
 C:\Program Files (x86)\CMake\bin\cmake -G NMake Makefiles
 -DLIBXML2_LIBRARIES=C:\lib\libxml2-2.7.8.win32\lib\libxml2.lib
 -DLIBXML2_INCLUDE_DIR=C:\lib\libxml2-2.7.8.win32\include
 -DLIBXML2_ICONV_INCLUDE_DIR=C:\lib\iconv-1.9.2.win32\include
 -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=..\built ..

 That worked fine. Then, I tried running nmake on the generated makefiles,
 which ran fine until it tried to link libapr-2.dll. That got me the
 following errors:

 Linking C shared library libapr-1.dll
Creating library libapr-1.lib and object libapr-1.exp
 apr_atomic.c.obj : error LNK2019: unresolved external symbol
 __InterlockedIncrement referenced in function _apr_atomic_inc32@4
 apr_atomic.c.obj : error LNK2019: unresolved external symbol
 __InterlockedExchangeAdd referenced in function _apr_atomic_add32@8
 apr_atomic.c.obj : error LNK2019: unresolved external symbol
 __InterlockedExchange referenced in function _apr_atomic_set32@8
 apr_atomic.c.obj : error LNK2019: unresolved external symbol
 __InterlockedDecrement referenced in function _apr_atomic_dec32@4
 apr_atomic.c.obj : error LNK2019: unresolved external symbol
 __InterlockedCompareExchange referenced in function _apr_atomic_cas32@12
 libapr-1.dll : fatal error LNK1120: 5 unresolved externals

 After some searching, I ran across a unimrcp thread that talked about
 solving this issue. I tracked down the APR patches they are using in that
 project here:
 https://sites.google.com/a/unimrcp.org/unimrcp/dependencies/apr-1.5.1.tar.gz?attredirects=0d=1

 Specifically, check out the patch labeled as 0002. it boils down to
 removing all the typesafe casts in apr_atomic.c, which introduces a bunch
 of compiler warnings. That seems to work, though I doubt it's the right
 thing to do. I also tried checking out the 1.5.x branch of APR, which give
 the same errors. I don't generally develop on Windows, so this one has me
 stumped. Any ideas?


I just filed this APR bug to track this:
https://issues.apache.org/bugzilla/show_bug.cgi?id=57191

Note that dev@apr is the proper mailing list for this issue.  And I'm
pretty sure that if I had the mail setup I had 15 years ago I would have
noticed that a lot sooner :(

-- 
Born in Roswell... married an alien...
http://emptyhammock.com/


Re: Building APR using cmake on Windows

2014-11-06 Thread Edward Lu
Thanks Jeff. Will move further discussion to the bug/proper mailing list.

On Thu, Nov 6, 2014 at 10:15 AM, Jeff Trawick traw...@gmail.com wrote:

 On Thu, Oct 30, 2014 at 1:53 PM, Edward Lu chaos...@gmail.com wrote:

 Using Visual Studio 2013 command line tools, on Windows Server 2012 rc2.
 Running everything from batch files through Cygwin. I cloned the APR repo
 from here: https://github.com/apache/apr

 I downloaded some prebuilt libxml2 and iconv binaries for Windows. Then,
 I ran the commands:

 C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\vcvars32.bat
 C:\Program Files (x86)\CMake\bin\cmake -G NMake Makefiles
 -DLIBXML2_LIBRARIES=C:\lib\libxml2-2.7.8.win32\lib\libxml2.lib
 -DLIBXML2_INCLUDE_DIR=C:\lib\libxml2-2.7.8.win32\include
 -DLIBXML2_ICONV_INCLUDE_DIR=C:\lib\iconv-1.9.2.win32\include
 -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=..\built ..

 That worked fine. Then, I tried running nmake on the generated makefiles,
 which ran fine until it tried to link libapr-2.dll. That got me the
 following errors:

 Linking C shared library libapr-1.dll
Creating library libapr-1.lib and object libapr-1.exp
 apr_atomic.c.obj : error LNK2019: unresolved external symbol
 __InterlockedIncrement referenced in function _apr_atomic_inc32@4
 apr_atomic.c.obj : error LNK2019: unresolved external symbol
 __InterlockedExchangeAdd referenced in function _apr_atomic_add32@8
 apr_atomic.c.obj : error LNK2019: unresolved external symbol
 __InterlockedExchange referenced in function _apr_atomic_set32@8
 apr_atomic.c.obj : error LNK2019: unresolved external symbol
 __InterlockedDecrement referenced in function _apr_atomic_dec32@4
 apr_atomic.c.obj : error LNK2019: unresolved external symbol
 __InterlockedCompareExchange referenced in function _apr_atomic_cas32@12
 libapr-1.dll : fatal error LNK1120: 5 unresolved externals

 After some searching, I ran across a unimrcp thread that talked about
 solving this issue. I tracked down the APR patches they are using in that
 project here:
 https://sites.google.com/a/unimrcp.org/unimrcp/dependencies/apr-1.5.1.tar.gz?attredirects=0d=1

 Specifically, check out the patch labeled as 0002. it boils down to
 removing all the typesafe casts in apr_atomic.c, which introduces a bunch
 of compiler warnings. That seems to work, though I doubt it's the right
 thing to do. I also tried checking out the 1.5.x branch of APR, which give
 the same errors. I don't generally develop on Windows, so this one has me
 stumped. Any ideas?


 I just filed this APR bug to track this:
 https://issues.apache.org/bugzilla/show_bug.cgi?id=57191

 Note that dev@apr is the proper mailing list for this issue.  And I'm
 pretty sure that if I had the mail setup I had 15 years ago I would have
 noticed that a lot sooner :(

 --
 Born in Roswell... married an alien...
 http://emptyhammock.com/




Re: Server(_rec) unique identifier

2014-11-06 Thread Yann Ylavic
On Sat, Nov 1, 2014 at 10:25 AM, Kaspar Brand httpd-dev.2...@velox.ch wrote:
 On 29.10.2014 11:41, Yann Ylavic wrote:
 I chose to use (MD5 digest) all the IP:port from the s-addrs list
 (ie. VitualHost IP|*|_default_:port ...), plus s-server_hostname
 and s-port (ie. ServerName, be it configured or not, knowing that in
 the latter case, apr_gethostname() is used fot the main server, and
 the main server's one is used for the vhosts).

 Just an observation on the digest you're proposing: while it doesn't
 seem necessary to proactively kill MD5 in httpd when it is used for
 non-crypto purposes (see also RFC 6151), I would prefer another digest
 algorithm being picked for new things (apr_sha1_* perhaps, considering
 that APR doesn't currently have SHA-2 support?).

I tend to agree, although SHA-1 is not much better wrt to security,
and SHA-2 is not available in the minimal APR version(s) supported by
httpd(s).
In this case (non-crypto purpose, avoid non-malicious collisions),
SHA-1's digest has 4 more bytes (per scoreboard entry here), is
probably slower than MD5 (not in a fast path though), and this
concerns 2.2.x only.
Not sure it is worth the effort...


RE: Listeners buckets and duplication w/ and w/o SO_REUSEPORT on trunk

2014-11-06 Thread Lu, Yingqi
Hi Yann,

I don't mind at all. I will keep discussion following your reply there.

Thanks,
Yingqi

-Original Message-
From: Yann Ylavic [mailto:ylavic@gmail.com] 
Sent: Thursday, November 06, 2014 5:00 AM
To: httpd
Subject: Re: Listeners buckets and duplication w/ and w/o SO_REUSEPORT on trunk

Hi Yingqi,

let's continue discussing this on the original thread if you don't mind, I made 
an update there.

Thanks,
Yann.

On Thu, Nov 6, 2014 at 6:52 AM, Lu, Yingqi yingqi...@intel.com wrote:
 Hi Yann,

 I just took some testing on the most recent trunk version. I found out that 
 num_buckets is default to 1 (ListenCoresBucketsRatio is default to 0). Adding 
 ListenCoresBucketsRatio is great since user can have control over this. 
 However, I am thinking it may be better to make this default at 8. This will 
 make the SO_REUSEPORT support to be default enabled (8 buckets). In case 
 users are not aware of this new ListenCoresBucketsRatio configurable flag, 
 they can still enjoy the performance benefits.

 Please let me know what you think.

 Thanks,
 Yingqi

 -Original Message-
 From: Lu, Yingqi [mailto:yingqi...@intel.com]
 Sent: Thursday, October 30, 2014 9:10 AM
 To: dev@httpd.apache.org
 Subject: RE: Listeners buckets and duplication w/ and w/o SO_REUSEPORT 
 on trunk

 Hi Yann,

 Thank you very much for your help!

 As this is getting better, I am wondering if you guys have plan to put this 
 SO_REUSEPORT patch into the stable version. If yes, do you have a rough 
 timeline?

 The performance gain is great from the patch, I just want to more people 
 being able to take advantage of it.

 Thanks,
 Lucy

 -Original Message-
 From: Yann Ylavic [mailto:ylavic@gmail.com]
 Sent: Thursday, October 30, 2014 8:29 AM
 To: httpd
 Subject: Re: Listeners buckets and duplication w/ and w/o SO_REUSEPORT 
 on trunk

 Hi Yingqi,

 commited in r1635521, with some changes (please see commit log).
 These are not big changes, and your work on removing the global variables and 
 restoring some previous behaviour is great, thanks for the good work.

 Regards,
 Yann.


 On Wed, Oct 29, 2014 at 6:36 PM, Lu, Yingqi yingqi...@intel.com wrote:
 Thank you very much for your help!

 Thanks,
 Yingqi

 -Original Message-
 From: Yann Ylavic [mailto:ylavic@gmail.com]
 Sent: Wednesday, October 29, 2014 10:34 AM
 To: httpd
 Subject: Re: Listeners buckets and duplication w/ and w/o 
 SO_REUSEPORT on trunk

 Hi Yingqi,

 I'm working on it currently, will commit soon.

 Regards,
 Yann.

 On Wed, Oct 29, 2014 at 6:20 PM, Lu, Yingqi yingqi...@intel.com wrote:
 Hi All,

 I just want to check if there is any feedback/comments on this?

 For details, please refer to Yann Ylavic's notes and my responses below.

 Thanks,
 Yingqi

 -Original Message-
 From: Lu, Yingqi [mailto:yingqi...@intel.com]
 Sent: Friday, October 10, 2014 4:56 PM
 To: dev@httpd.apache.org
 Subject: RE: Listeners buckets and duplication w/ and w/o 
 SO_REUSEPORT on trunk

 Dear All,

 Attached patch is generated based on current trunk. It covers for 
 prefork/worker/event/eventopt MPM. It supposes to address following issues 
 regarding to SO_RESUEPORT support vs. current trunk version:

 1. Same as current trunk version implementation, when active_CPU_num = 8 
 or when so_reuseport is not supported by the kernel, ap_num_buckets is set 
 to 1. In any case, there is 1 dedicated listener per bucket.

 2. Remove global variables (mpm_listen, enable_default_listeners and 
 num_buckets). mpm_listen is changed to MPM local. enabled_default_listener 
 is completely removed. num_buckets is changed to MPM local 
 (ap_num_buckets). I rename have_so_reuseport to ap_have_so_reuseport. The 
 reason for keeping that global is because this variable is being used by 
 ap_log_common(). Based on the feedback here, I think it may not be a good 
 idea to change the function interface.

 3. Change ap_duplicated_listener to have more parameters. This function is 
 being called from MPM local (prefork.c/worker.c/event.c/eventopt.c). In 
 this function, prefork_listener (or worker_listener/even_listener/etc) 
 array will be initialized and be set value. ap_num_buckets is also 
 calculated inside this function. In addition, this version solves the issue 
 with one_process case (current trunk version has issue with one_process 
 enabled).

 4. Change ap_close_listener() back to previous (2.4.X version).

 5. Change dummy_connection back to previous (2.4.X version).

 6. Add ap_close_duplicated_listeners(). This is called from mpms when 
 stopping httpd.

 7. Add ap_close_child_listener(). When listener_thread (child process in 
 prefork) exit, only the dedicated listener needs to be closed (the rest are 
 already being closed in child_main when the child process starts).

 8. Remove duplication of listener when ap_num_buckets = 1 or without 
 SO_REUSEPORT support (ap_num_buckets = 1). With so_reuseport, only 
 duplicated (ap_num_buckets - 1) listeners (1 

RE: Listeners buckets and duplication w/ and w/o SO_REUSEPORT on trunk

2014-11-06 Thread Lu, Yingqi
Hi Yann,

I do not see any documents regarding to this new configurable flag 
ListenCoresBucketsRatio (maybe I missed it) and also users may not be familiar 
with it, I still think maybe it is better to keep the default to 8 at least in 
the trunk. 

Regarding to how to make small systems take advantage of this patch, I actually 
did some testing on system with less cores. The data show that when system has 
less than 16 cores, more than 1 bucket does not bring any throughput and 
response time benefits. The patch is used mainly for big systems to resolve the 
scalability issue. That is the reason why we previously hard coded the ratio to 
8 (impact only on system has 16 cores or more). 

The accept_mutex is not much a bottleneck anymore with the current patch 
implantation. Current implementation already cut 1 big mutex into multiple 
smaller mutexes in the multiple listen statements case (each bucket has its 
dedicated accept_mutex). To prove this, our data show performance parity 
between 1 listen statement (listen 80, no accept_mutex) and 2 listen statements 
(listen 192.168.1.1 80, listen 192.168.1.2 80, with accept_mutex) with current 
trunk version. Comparing against without SO_REUSEPORT patch, we see 28% 
performance gain with 1 listen statement case and 69% gain with 2 listen 
statements case. 

Regarding to the approach that enables each child has its own listen socket, I 
did some testing with current trunk version to increase the number of buckets 
to be equal to a reasonable serverlimit (this avoids number of child processes 
changes). I also verified that MaxClient and ThreadPerChild were set properly. 
I used single listen statement so that accept_mutex was disabled. Comparing 
against the current approach, this has ~25% less throughput with significantly 
higher response time.

In addition to this, implementing the listen socket for each child separately 
has less performance as well as connection loss/timeout issues with current 
Linux kernel. Below are more information/data we collected with each child 
process has its own listen socket approach:
1. During the run, we noticed that there are tons of “read timed out” errors. 
These errors not only happen when the system is highly utilized, it even 
happens when system is only 10% utilized. The response time was high.
2. Compared to current trunk implementation, we found each child has its own 
listen socket approach results in significantly higher (up to 10X) response 
time at different CPU utilization levels. At peak performance level, it has 
20+% less throughput with tons of “connection reset” errors in additional to 
“read timed out” errors. Current trunk implementation does not have errors.
3. During the graceful restart, there are tons of connection losses. 

Based on the above findings, I think we may want to keep the current approach. 
It is a clean, working and better performing one :-)

Thanks,
Yingqi


-Original Message-
From: Yann Ylavic [mailto:ylavic@gmail.com]
Sent: Thursday, November 06, 2014 4:59 AM
To: httpd
Subject: Re: Listeners buckets and duplication w/ and w/o SO_REUSEPORT on trunk

Rebasing discussion here since this thread seems to be referenced in PR55897, 
and the discussion has somehow been forked and continued in [1].

[1]. 
http://mail-archives.apache.org/mod_mbox/httpd-dev/201410.mbox/%3c9acd5b67aac5594cb6268234cf29cf9aa37e9...@orsmsx113.amr.corp.intel.com%3E

On Sat, Oct 11, 2014 at 1:55 AM, Lu, Yingqi yingqi...@intel.com wrote:
 Attached patch is generated based on current trunk. It covers for 
 prefork/worker/event/eventopt MPM.

The patch (modified) has now been applied to trunk with r1635521.

On Thu, Oct 30, 2014 at 5:10 PM, Lu, Yingqi yingqi...@intel.com wrote:
 As this is getting better, I am wondering if you guys have plan to put this 
 SO_REUSEPORT patch into the stable version.
 If yes, do you have a rough timeline?

The whole feature could certainly be proposed for 2.4.x since there is no 
(MAJOR) API change.

On Thu, Nov 6, 2014 at 6:52 AM, Lu, Yingqi yingqi...@intel.com wrote:
 I just took some testing on the most recent trunk version.
 I found out that num_buckets is default to 1 (ListenCoresBucketsRatio is 
 default to 0).
 Adding ListenCoresBucketsRatio is great since user can have control over this.
 However, I am thinking it may be better to make this default at 8. 
 This will make the SO_REUSEPORT support to be default enabled (8 buckets).
(8 buckets with 64 CPU cores, lucky you...).

Yes this change wrt to your original patch is documented in the commit message, 
including how change it to an opt-out.
I chose the opt-in way because I almost always find it safer, especially for 
backports to stable.
I have no strong opinion on this regarding trunk, though, could be an opt-out 
(easily) there.

Let's see what others say on this and the backport to 2.4.x.
Anyone?

 In case users are not aware of this new ListenCoresBucketsRatio 
 configurable flag, they can still enjoy the