Re: mod_fcgi: Excessive memory usage when large files are uploaded

2017-01-19 Thread Ivan Zahariev

Hello,
It turns out that there has been an open bug report for this since
  2011: https://bz.apache.org/bugzilla/show_bug.cgi?id=51747
The latest proposed patch, by Jan Stürtz on 2015-08-28, fixes both
  the Linux and Windows versions and still applies cleanly. It uses
  the same algorithm as mine but is implemented better, relying on
  existing Apache APR library functions.
This patch seems suitable for production, and we're already
  evaluating it on our servers.
Best regards.
  --Ivan


On 17.01.2017 at 17:49, Ivan Zahariev wrote:

  Hi,
  Here is the patch --
https://github.com/famzah/mod_fcgid/commit/84c7c2dbf2047745c6aea87a4bc6b4061c98ac8e
  A few notes:
  
I tested it with several files containing random data, of
  various lengths: 0b, 1b, 127b, 128b, 129b, 8191b, 8192b,
  8193b, 1MB, 150MB, 250MB and 350MB. The uploaded size and
  SHA1 checksum match in every case.

I also made sure that the special case "There are something
  left" in the code was exercised, i.e. the final flush at the
  end when "nvec != 0".
I can't test this on Windows, but porting the patch to
  "fcgid_proc_win.c" looks easy, since the same loop method is
  used there too. Note that "fcgid_proc_win.c" doesn't group
  write requests into batches of 8 and instead does everything
  one by one (one read, then one write to the IPC socket).
  
  
Please review the patch. If it looks good, I'll test it on our
  production machines and then submit it upstream for merging.
  
  Best regards.
  --Ivan
  
On 17.01.2017 at 13:03, Yann Ylavic wrote:

On Tue, Jan 17, 2017 at 9:06 AM, Ivan Zahariev  wrote:


  1. Delete each bucket after sending it to the "ipc_handle". I've looked through
the call tree and the *output_brigade is last used by proc_write_ipc().
Therefore, it should be safe to empty it while being processed there.
2. Take the same approach as mod_http2, which handles FILE buckets in a
different way. Instead of using apr_bucket_read(), they process FILE buckets
by apr_file_read() and manage the data buffer manually. This way the
original *output_brigade won't be modified and automatically split by
apr_bucket_read(). This requires more coding work.


I would go for 1., but keep in mind that you can't delete the bucket
while it is still pointed to by the iovec...

An (optimal) alternative would be to use apr_socket_sendfile() (for
file buckets) like sendfile_nonblocking() does in httpd's core output
filter (see server/core_filters.c).
But 1. is still wise for non-file buckets.

Regards,
Yann.



Re: mod_fcgi: Excessive memory usage when large files are uploaded

2017-01-17 Thread Jim Jagielski

> On Jan 17, 2017, at 6:03 AM, Yann Ylavic  wrote:
> 
> On Tue, Jan 17, 2017 at 9:06 AM, Ivan Zahariev  wrote:
>> 
>> 1. Delete each bucket after sending it to the "ipc_handle". I've looked 
>> through
>> the call tree and the *output_brigade is last used by proc_write_ipc().
>> Therefore, it should be safe to empty it while being processed there.
>> 2. Take the same approach as mod_http2, which handles FILE buckets in a
>> different way. Instead of using apr_bucket_read(), they process FILE buckets
>> by apr_file_read() and manage the data buffer manually. This way the
>> original *output_brigade won't be modified and automatically split by
>> apr_bucket_read(). This requires more coding work.
> 
> I would go for 1., but keep in mind that you can't delete the bucket
> while it is still pointed to by the iovec...
> 

+1



Re: mod_fcgi: Excessive memory usage when large files are uploaded

2017-01-17 Thread Ivan Zahariev

  
  
Hi,
Here is the patch --
https://github.com/famzah/mod_fcgid/commit/84c7c2dbf2047745c6aea87a4bc6b4061c98ac8e
A few notes:

  I tested it with several files containing random data, of
various lengths: 0b, 1b, 127b, 128b, 129b, 8191b, 8192b,
8193b, 1MB, 150MB, 250MB and 350MB. The uploaded size and
SHA1 checksum match in every case.
  
  I also made sure that the special case "There are something
left" in the code was exercised, i.e. the final flush at the
end when "nvec != 0".
  I can't test this on Windows, but porting the patch to
"fcgid_proc_win.c" looks easy, since the same loop method is
used there too. Note that "fcgid_proc_win.c" doesn't group
write requests into batches of 8 and instead does everything
one by one (one read, then one write to the IPC socket).


Please review the patch. If it looks good, I'll test it on our
production machines and then submit it upstream for merging.

Best regards.
--Ivan

On 17.01.2017 at 13:03, Yann Ylavic wrote:


  On Tue, Jan 17, 2017 at 9:06 AM, Ivan Zahariev  wrote:

  

1. Delete each bucket after sending it to the "ipc_handle". I've looked through
the call tree and the *output_brigade is last used by proc_write_ipc().
Therefore, it should be safe to empty it while being processed there.
2. Take the same approach as mod_http2, which handles FILE buckets in a
different way. Instead of using apr_bucket_read(), they process FILE buckets
by apr_file_read() and manage the data buffer manually. This way the
original *output_brigade won't be modified and automatically split by
apr_bucket_read(). This requires more coding work.

  
  
I would go for 1., but keep in mind that you can't delete the bucket
while it is still pointed to by the iovec...

An (optimal) alternative would be to use apr_socket_sendfile() (for
file buckets) like sendfile_nonblocking() does in httpd's core output
filter (see server/core_filters.c).
But 1. is still wise for non-file buckets.

Regards,
Yann.



  



Re: mod_fcgi: Excessive memory usage when large files are uploaded

2017-01-17 Thread Yann Ylavic
On Tue, Jan 17, 2017 at 9:06 AM, Ivan Zahariev  wrote:
>
> 1. Delete each bucket after sending it to the "ipc_handle". I've looked 
> through
> the call tree and the *output_brigade is last used by proc_write_ipc().
> Therefore, it should be safe to empty it while being processed there.
> 2. Take the same approach as mod_http2, which handles FILE buckets in a
> different way. Instead of using apr_bucket_read(), they process FILE buckets
> by apr_file_read() and manage the data buffer manually. This way the
> original *output_brigade won't be modified and automatically split by
> apr_bucket_read(). This requires more coding work.

I would go for 1., but keep in mind that you can't delete the bucket
while it is still pointed to by the iovec...

An (optimal) alternative would be to use apr_socket_sendfile() (for
file buckets) like sendfile_nonblocking() does in httpd's core output
filter (see server/core_filters.c).
But 1. is still wise for non-file buckets.

Regards,
Yann.


mod_fcgi: Excessive memory usage when large files are uploaded

2017-01-17 Thread Ivan Zahariev

Hello,

If a large amount of data is POSTed to a process running mod_fcgid, the 
Apache child process uses an excessive amount of memory while handling it.


The client request is properly received and the following statement from 
the documentation is true: "Once the amount of request body read from 
the client exceeds FcgidMaxRequestInMem bytes, the remainder of the 
request body will be stored in a temporary file."


The problem occurs when the temporary file is being sent to the FastCGI 
handling process via its IPC socket. Here is the corresponding function 
which sends the prepared "output_brigade" to the IPC socket: 
https://gist.github.com/famzah/125a91971fba3450dd4926fe13e0ede6


The documentation of apr_bucket_read() clearly states that "if buckets 
are read in a loop, and aren't deleted after being processed, the 
potentially large bucket will slowly be converted into RAM resident heap 
buckets. If the file is larger than available RAM, an out of memory 
condition could be caused."


I need your guidance in order to fix this properly. After some research, 
I see the following possible options:


1. Delete each bucket after sending it to the "ipc_handle". I've looked
   through the call tree and the *output_brigade is last used by
   proc_write_ipc(). Therefore, it should be safe to empty it while
   being processed there.
2. Take the same approach as mod_http2, which handles FILE buckets in a
   different way. Instead of using apr_bucket_read(), they process FILE
   buckets by apr_file_read() and manage the data buffer manually. This
   way the original *output_brigade won't be modified and automatically
   split by apr_bucket_read(). This requires more coding work.


Best regards.
--Ivan