Hello,

I recently ran into some problems because a customer is doing large file uploads with PHP (run by mod_fcgid, of course): mod_fcgid seems to consume a lot of memory when a file is uploaded to PHP. I found the cause at fcgid_bridge.c:473: as the comment in the source says, the entire request body (stdin/POST input) is loaded into memory before it is sent to the FastCGI application server (PHP in our case). Although this is a workable workaround for dealing with slow clients, I don't think it is the right behaviour to implement. Here are the main points:

- Uploading files is becoming a major security problem, since a DoS can be triggered by uploading a very large file (I have seen attacks with 1/2 GB uploads over a fast connection).
- Additionally, video (i.e. large) file uploading is becoming more and more popular, increasing memory consumption.
- Dealing with slow clients should be done by the application server, which can take any appropriate measure (e.g. a special processing queue for slow clients).
- An upload progress meter is not possible if all the input data is buffered before being sent to the FastCGI process (see RFC 1867: file upload progress hook handler).
- Uploads of large files are better handled by the FastCGI application server, because of the various methods used to store the upload data while it is in progress (at the application level, not at the communication level where FastCGI sits). E.g. PHP handles file uploads by creating temporary files, whose location can be customised by a php.ini directive. I don't think this task should be handled by the FastCGI layer (which is a communication/bridge protocol, not an input processor).
- There is no need for the FastCGI process manager to handle and buffer slow clients: a FastCGI application designed for load can handle multiple connections, AND the mod_fcgid process manager already does multiple-connection management, with the adaptive spawning feature for applications which are not multi-tasked/threaded.
(I even know of FastCGI applications which embed a process manager themselves.)
What are the problems with slow clients?

- Sending the input takes a long time and is not constant: e.g. with shaped connections, data arrives in "peaks" followed by no input at all for a variable time.
- The connection stays busy longer, not only at the Apache level but also at the FastCGI application level (the buffer-everything workaround prevents the FastCGI app from being tied up during the input loading).

How to deal with this? My proposal:

- What about buffering the input like the output buffer, in chunks of, say, 8 Kbytes? The main problem is the time needed to fill a chunk: if it takes too long, the application can time out, but I think managing communication timeouts is the normal job of an application. Or what about not buffering the input at all? That way both the data flow AND the data flow rate can be processed by the application (such as measuring the flow rate to put a slow request in a special queue).
- Because some users may prefer the current behaviour of buffering all the input data, a compatibility switch would be a nice thing (e.g. InputBuffering On / Off).

What do you think about it?

BTW: who are the current maintainer(s) of this project? The documentation is not very up to date and I had to read the source code to discover all the directives... Maybe I can be of some help?

Regards,
Gabriel

_______________________________________________
Mod-fcgid-users mailing list
Mod-fcgid-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mod-fcgid-users