Re: mod_fcgid: Excessive memory usage when large files are uploaded

2017-01-19 Thread Ivan Zahariev

  
  
Hello,
It turns out that there has been an open bug report for this since
  2011: https://bz.apache.org/bugzilla/show_bug.cgi?id=51747
The latest proposed patch, by Jan Stürtz on 2015-08-28, fixes both
  the Linux and Windows versions and still applies cleanly. Furthermore,
  it uses the same algorithm as mine but is better implemented, using
  the existing Apache APR library functions.
This patch seems suitable for production, and we're already
  evaluating it on our servers.
Best regards.
  --Ivan


On 17.01.2017 at 17:49, Ivan Zahariev wrote:

  Hi,
  Here is the patch --
https://github.com/famzah/mod_fcgid/commit/84c7c2dbf2047745c6aea87a4bc6b4061c98ac8e
  A few notes:
  
I tested it with several files of random data and various
  lengths: 0b, 1b, 127b, 128b, 129b, 8191b, 8192b, 8193b, 1MB,
  150MB, 250MB, 350MB. The uploaded size and SHA1 checksum
  match in every case.

I also made sure that the special case "There are something
  left" in the code (at the end, when "nvec != 0") was exercised.
I can't test this on Windows, but porting the patch to
  "fcgid_proc_win.c" looks easy, since the same loop method is
  used there too. Additionally, "fcgid_proc_win.c" doesn't group
  write requests into chunks of 8; it does everything one by one
  (one read, then one write to the IPC socket).
  
  
Please review the patch. If it looks good, I'll test it on our
  production machines and then submit it for an upstream merge.
  
  Best regards.
  --Ivan
  
On 17.01.2017 at 13:03, Yann Ylavic wrote:

    On Tue, Jan 17, 2017 at 9:06 AM, Ivan Zahariev <fam...@famzah.net> wrote:


1. Delete each bucket after sending it to the "ipc_handle". I've looked through
the call tree, and the *output_brigade is last used by proc_write_ipc().
Therefore, it should be safe to empty it while it is being processed there.
2. Take the same approach as mod_http2, which handles FILE buckets
differently. Instead of using apr_bucket_read(), it processes FILE buckets
with apr_file_read() and manages the data buffer manually. This way the
original *output_brigade won't be modified (and implicitly split) by
apr_bucket_read(). This requires more coding work.


I would go for 1., but keep in mind that you can't delete a bucket
while it is still pointed to by the iovec...

An (optimal) alternative would be to use apr_socket_sendfile() (for
file buckets) like sendfile_nonblocking() does in httpd's core output
filter (see server/core_filters.c).
But 1. is still wise for non-file buckets.

Regards,
Yann.

  
  


  



Re: mod_fcgid: Excessive memory usage when large files are uploaded

2017-01-17 Thread Ivan Zahariev

  
  
Hi,
Here is the patch --
https://github.com/famzah/mod_fcgid/commit/84c7c2dbf2047745c6aea87a4bc6b4061c98ac8e
A few notes:

  I tested it with several files of random data and various
lengths: 0b, 1b, 127b, 128b, 129b, 8191b, 8192b, 8193b, 1MB,
150MB, 250MB, 350MB. The uploaded size and SHA1 checksum
match in every case.

  I also made sure that the special case "There are something
left" in the code (at the end, when "nvec != 0") was exercised.
  I can't test this on Windows, but porting the patch to
"fcgid_proc_win.c" looks easy, since the same loop method is
used there too. Additionally, "fcgid_proc_win.c" doesn't group
write requests into chunks of 8; it does everything one by one
(one read, then one write to the IPC socket).


Please review the patch. If it looks good, I'll test it on our
production machines and then submit it for an upstream merge.

Best regards.
--Ivan

On 17.01.2017 at 13:03, Yann Ylavic wrote:

      On Tue, Jan 17, 2017 at 9:06 AM, Ivan Zahariev <fam...@famzah.net> wrote:

  

1. Delete each bucket after sending it to the "ipc_handle". I've looked through
the call tree, and the *output_brigade is last used by proc_write_ipc().
Therefore, it should be safe to empty it while it is being processed there.
2. Take the same approach as mod_http2, which handles FILE buckets
differently. Instead of using apr_bucket_read(), it processes FILE buckets
with apr_file_read() and manages the data buffer manually. This way the
original *output_brigade won't be modified (and implicitly split) by
apr_bucket_read(). This requires more coding work.

  
  
I would go for 1., but keep in mind that you can't delete a bucket
while it is still pointed to by the iovec...

An (optimal) alternative would be to use apr_socket_sendfile() (for
file buckets) like sendfile_nonblocking() does in httpd's core output
filter (see server/core_filters.c).
But 1. is still wise for non-file buckets.

Regards,
Yann.



  



mod_fcgid: Excessive memory usage when large files are uploaded

2017-01-17 Thread Ivan Zahariev

Hello,

If a large amount of data is POST'ed to a process running mod_fcgid, the 
Apache child uses an excessive amount of memory when processing it.


The client request is properly received and the following statement from 
the documentation is true: "Once the amount of request body read from 
the client exceeds FcgidMaxRequestInMem bytes, the remainder of the 
request body will be stored in a temporary file."


The problem occurs when the temporary file is being sent to the FastCGI 
handling process via its IPC socket. Here is the corresponding function 
which sends the prepared "output_brigade" to the IPC socket: 
https://gist.github.com/famzah/125a91971fba3450dd4926fe13e0ede6


The documentation of apr_bucket_read() clearly states that "if buckets 
are read in a loop, and aren't deleted after being processed, the 
potentially large bucket will slowly be converted into RAM resident heap 
buckets. If the file is larger than available RAM, an out of memory 
condition could be caused."


I need your guidance in order to fix this properly. I've researched a 
bit and see the following possible options:


1. Delete each bucket after sending it to the "ipc_handle". I've looked
   through the call tree, and the *output_brigade is last used by
   proc_write_ipc(). Therefore, it should be safe to empty it while it
   is being processed there.
2. Take the same approach as mod_http2, which handles FILE buckets
   differently. Instead of using apr_bucket_read(), it processes FILE
   buckets with apr_file_read() and manages the data buffer manually.
   This way the original *output_brigade won't be modified (and
   implicitly split) by apr_bucket_read(). This requires more coding
   work.


Best regards.
--Ivan


Re: mod_fcgid: Immediate HTTP error 503 if the max total process count is reached

2016-09-30 Thread Ivan Zahariev

Hi,

Thanks for the review. You can find my comments below in the reply.


On 20.09.2016 at 16:33, Eric Covener wrote:

Unfortunately there are not many reviewers for mod_fcgid.
I tried to take a look as a relative layman and had a few comments/questions:

* a few C99 // comments were added and should be zapped

Fixed.


* the assert looks funny -- no assert.   maybe ap_debug_assert() so it
blows up in maintainer builds only?
It's a paranoid check. If it ever blows the whistle, it will be in some 
very hard-to-reproduce case, and I don't expect that we'd catch it in a 
maintainer build. I made this an APLOG_CRIT message because a mismatch 
between the counts from proctable_count_all_nonfree_nodes() and 
"g_total_process" could make FastCGI requests fail erroneously with an 
immediate HTTP 503 error. A system administrator who encounters this 
bug will appreciate the immediate APLOG_CRIT message in the 
"error_log".


I don't insist that it remain like this, and I will change it as you 
say in order to comply with the usual development practices. Or we can 
give this check function a better name.



* The new flag is merged, but cannot be unset?
** Probably need to make it look like an AP_INIT_FLAG

Right. I've converted this to an AP_INIT_FLAG which makes more sense.

I don't understand what you mean by "it cannot be unset". I'm merging 
it in the same way as the other mod_fcgid config options. Please read 
below for more information on this matter.



* I couldn't follow the concept in the comments around
max_process_count moving or being per-request. Is it maybe out of
date? I think the current impl takes an existing global only and
merges it with vhost server confs but it will always be a copy.


The original implementation allows you to set "max_process_count" only 
in the GLOBAL_ONLY context. This is enforced by ap_check_cmd_context() 
in the AP_INIT_TAKE1() handler, because RSRC_CONF alone would otherwise 
also allow this config option to be defined per vhost.


I've added the new variable "max_process_used_no_wait_enable" in the 
same way.


The original implementation works well without further magic because it 
only uses "max_process_count" in code that works with the "main_server" 
(global) context. Now I need to access the value of "max_process_count" 
through a "request_rec" (per-VirtualHost) structure. If one or more 
mod_fcgid directives are defined not only in the main server config but 
also in a VirtualHost context, then we need to merge 
"max_process_count", too. If we don't merge it, the value of 
"max_process_count" seen through the "request_rec" will be the default 
init value (zero, in our case).


Long story short -- you are right: this is a global-only variable, and 
it always merges into vhost confs as a copy of the global value. This 
happens behind the scenes and doesn't affect Apache administrators, 
since they can't define this variable in a vhost context anyway. I 
tried to summarize this in the comment ["global only" but we use its 
main_server value in a "request_rec"], but maybe someone can propose a 
better text.


I've committed the source code changes and you can review them again. 
When we're satisfied with the changes, I'll test the final version on 
our server fleet.


Best regards.
--Ivan


Re: mod_fcgid: Immediate HTTP error 503 if the max total process count is reached

2016-09-19 Thread Ivan Zahariev

Hello devs,

It's been four months since I originally proposed this new feature for 
"mod_fcgid". During this time I've tested and deployed it on hundreds 
of busy production servers, and it works as expected. If enabled, web 
visitors get an immediate response when FastCGI is overloaded, and no 
RAM is over-utilized.


You can find all the information and a patch in the enhancement request 
at Bugzilla: https://bz.apache.org/bugzilla/show_bug.cgi?id=59656


Can we get this merged into "mod_fcgid"?

Best regards.
--Ivan


On 02.06.2016 at 11:57, Ivan Zahariev wrote:

Hi Nick,

Thanks for the info.

I've followed your instructions and submitted an enhancement request: 
https://bz.apache.org/bugzilla/show_bug.cgi?id=59656


Cheers.
--Ivan


On 31.05.2016 at 13:45, Nick Kew wrote:

On Tue, 2016-05-31 at 11:15 +0300, Ivan Zahariev wrote:

Hello,

I got no feedback. Am I posting this suggestion at the right mailing
list?

Sorry, I see your original post marked for attention in my mail
folder, but languishing hitherto unattended.  Just now opened your
link in a browser to take a look.  There could be others who
have done something similar.

As a general reply to this question, yes, this is the best
available mailinglist.  The other place to post it would be
as an enhancement request in bugzilla (issues.apache.org).
The keyword "PatchAvailable" there may help by marking it as
low-hanging fruit.







Re: mod_fcgid: Immediate HTTP error 503 if the max total process count is reached

2016-06-02 Thread Ivan Zahariev

Hi Nick,

Thanks for the info.

I've followed your instructions and submitted an enhancement request: 
https://bz.apache.org/bugzilla/show_bug.cgi?id=59656


Cheers.
--Ivan


On 31.05.2016 at 13:45, Nick Kew wrote:

On Tue, 2016-05-31 at 11:15 +0300, Ivan Zahariev wrote:

Hello,

I got no feedback. Am I posting this suggestion at the right mailing
list?

Sorry, I see your original post marked for attention in my mail
folder, but languishing hitherto unattended.  Just now opened your
link in a browser to take a look.  There could be others who
have done something similar.

As a general reply to this question, yes, this is the best
available mailinglist.  The other place to post it would be
as an enhancement request in bugzilla (issues.apache.org).
The keyword "PatchAvailable" there may help by marking it as
low-hanging fruit.





Re: mod_fcgid: Immediate HTTP error 503 if the max total process count is reached

2016-05-31 Thread Ivan Zahariev

Hello,

I got no feedback. Am I posting this suggestion at the right mailing list?

Best regards.
--Ivan

On 19.05.2016 at 10:40, Ivan Zahariev wrote:

Hi all,

I'd like to propose a new configuration setting for "mod_fcgid". The 
source code changes to review follow:


  * The whole patch compared to version 2.3.9:

https://github.com/famzah/mod_fcgid/compare/2.3.9...maxnowait?diff=split=maxnowait
  * The whole patch as a single file:

https://github.com/famzah/mod_fcgid/compare/2.3.9...maxnowait.patch?diff=unified=maxnowait
  * Every single commit compared to version 2.3.9:
https://github.com/famzah/mod_fcgid/commits/maxnowait
  * There should be no merge conflicts with the current "trunk"
version 2.3.10.


The motivation is that the current behavior of queuing up new pending 
requests differs from the behavior of the RLimitNPROC directive. When 
there is a spike in web hits, lots of Apache children get busy just 
waiting, for up to 30+ seconds, to get an idle FastCGI process. Thus we 
"waste" Apache children doing nothing while they could be serving 
static content. This also adds unneeded memory pressure. Additionally, 
users today won't wait for a page to load if it takes more than a few 
seconds, and we'd rather notify them that we are currently overloaded 
by sending a 503 HTTP response immediately.


Here is the documentation for the new directive: 
http://www.famzah.net/temp/FcgidMaxProcessesUsedNoWait.docs.txt


Let me know what you think.

Best regards.
--Ivan





mod_fcgid: Immediate HTTP error 503 if the max total process count is reached

2016-05-19 Thread Ivan Zahariev

  
  
Hi all,

I'd like to propose a new configuration setting for "mod_fcgid". The
source code changes to review follow:

  The whole patch compared to version 2.3.9:
https://github.com/famzah/mod_fcgid/compare/2.3.9...maxnowait?diff=split=maxnowait
  The whole patch as a single file:
https://github.com/famzah/mod_fcgid/compare/2.3.9...maxnowait.patch?diff=unified=maxnowait
  Every single commit compared to version 2.3.9:
https://github.com/famzah/mod_fcgid/commits/maxnowait
  There should be no merge conflicts with the current "trunk"
version 2.3.10.


The motivation is that the current behavior of queuing up new pending
requests differs from the behavior of the RLimitNPROC directive. When
there is a spike in web hits, lots of Apache children get busy just
waiting, for up to 30+ seconds, to get an idle FastCGI process. Thus
we "waste" Apache children doing nothing while they could be serving
static content. This also adds unneeded memory pressure. Additionally,
users today won't wait for a page to load if it takes more than a few
seconds, and we'd rather notify them that we are currently overloaded
by sending a 503 HTTP response immediately.

Here is the documentation for the new directive:
http://www.famzah.net/temp/FcgidMaxProcessesUsedNoWait.docs.txt

Let me know what you think.

Best regards.
--Ivan