Randolf Richardson wrote:
I know that it's possible (and arguably best practice) to use Apache to
download large files efficiently and quickly, without passing them through
mod_perl. However, the data I need to download from my application is both
dynamically generated and sensitive so I cannot expose it to the internet
for anonymous download via Apache. So, I'm wondering if mod_perl has a
capability similar to the output stream of a java servlet. Specifically, I
want to return bits and pieces of the file at a time over the wire so that
I can avoid loading the entire file into memory prior to sending it to the
browser. Currently, I'm loading the entire file into memory before sending
it and

Is this possible with mod_perl and, if so, how should I go about
implementing it?

Yes, it is possible -- instead of loading the entire contents of a file into RAM, just read it in blocks in a loop and send each block as you go, until you reach EOF (end of file).
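
As a rough sketch of that loop (not a definitive implementation): the sub name stream_blocks, the 64 KB default block size and the sink callback are my own assumptions for illustration. The sink is just a code ref, so the same loop can be tried outside Apache; under mod_perl 2 it would typically wrap $r->print.

```perl
use strict;
use warnings;

# Copy everything from $in to a sink in fixed-size blocks, so that only one
# block is held in memory at a time. $sink is any code ref that accepts a
# chunk; under mod_perl 2 it could be sub { $r->print($_[0]) }.
sub stream_blocks {
    my ($in, $sink, $block_size) = @_;
    $block_size ||= 64 * 1024;    # assumed default, tune as needed
    binmode $in;
    my $total = 0;
    while (1) {
        my $n = read($in, my $buf, $block_size);
        die "read failed: $!" unless defined $n;
        last if $n == 0;          # EOF
        $sink->($buf);
        $total += $n;
    }
    return $total;                # number of bytes sent
}

# Hypothetical use inside a mod_perl 2 response handler, after your own
# access check has passed (needs Apache2::RequestRec and Apache2::RequestIO):
#
#   open my $fh, '<', $path or return Apache2::Const::SERVER_ERROR;
#   $r->content_type('application/octet-stream');
#   stream_blocks($fh, sub { $r->print($_[0]) });
#   return Apache2::Const::OK;
```

If the data is generated on the fly rather than read from a file, the same idea applies: print each piece as soon as it is produced instead of accumulating it in one big scalar.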

You can also use $r->flush along the way if you like, but as I understand it this isn't necessary because Apache HTTPd will send the data as soon as its internal buffers contain enough data. Of course, if you can tune your block size in your loop to match Apache's output buffer size, then that will probably help. (I don't know much about the details of Apache's output buffers because I've not read up too much on them, so I hope my assumptions about this are correct.)

One of the added benefits you get from using a loop is that you can also implement rate limiting if that becomes useful. You can implement access controls too, by checking the file being requested against whatever internal database queries you'd normally use, so that you know it's okay to send before you start sending it.
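
For the rate limiting idea, something along these lines might do. This is only a sketch: the sub name stream_rate_limited, its parameters and the 16 KB default block size are made up for illustration, and it limits the average rate by sleeping between blocks, which is one simple approach among several.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Like a plain block loop, but pauses between blocks so that the average
# throughput stays at or below $bytes_per_sec. $sink is any code ref that
# accepts a chunk, e.g. sub { $r->print($_[0]) } under mod_perl 2.
sub stream_rate_limited {
    my ($in, $sink, $bytes_per_sec, $block_size) = @_;
    $block_size ||= 16 * 1024;    # assumed default
    binmode $in;
    my $start = time;
    my $sent  = 0;
    while (1) {
        my $n = read($in, my $buf, $block_size);
        die "read failed: $!" unless defined $n;
        last if $n == 0;          # EOF
        $sink->($buf);
        $sent += $n;
        # How far ahead of the target schedule we are, in seconds.
        my $ahead = $sent / $bytes_per_sec - (time - $start);
        sleep($ahead) if $ahead > 0;
    }
    return $sent;
}
```

Smaller blocks give a smoother rate at the cost of more loop iterations. Keep in mind that a sleeping handler still ties up an Apache child for the duration of the download.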


You can also:
1) write the data to a file
2) $r->sendfile(...);
3) add a cleanup handler, to delete the file when the request has been served.
See here for details:
http://perl.apache.org/docs/2.0/api/Apache2/RequestIO.html#C_sendfile_
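
The three steps above could look roughly like this in a mod_perl 2 response handler. Treat it as a sketch under assumptions: the package name My::SendFileHandler and generate_report() are hypothetical stand-ins for your own code, and I have used a cleanup callback registered on the request pool as one way of doing step 3 (a PerlCleanupHandler would be another).

```perl
package My::SendFileHandler;    # hypothetical package name

use strict;
use warnings;

use Apache2::RequestRec ();     # content_type, pool
use Apache2::RequestIO  ();     # sendfile
use APR::Pool           ();     # cleanup_register
use Apache2::Const -compile => qw(OK SERVER_ERROR);
use File::Temp qw(tempfile);

# Hypothetical stand-in for the application's own generator: it writes the
# sensitive data to the filehandle it is given, piece by piece.
sub generate_report {
    my ($fh) = @_;
    print {$fh} "example data\n";
}

sub handler {
    my $r = shift;

    # Do your own access check here before generating anything.

    # 1) Write the generated data to a temporary file.
    my ($fh, $path) = tempfile(UNLINK => 0);
    binmode $fh;

    # 3) Register the cleanup early, so the file is removed when the
    #    request pool is destroyed, even if a later step fails.
    $r->pool->cleanup_register(sub { unlink $_[0] }, $path);

    generate_report($fh);
    close $fh or return Apache2::Const::SERVER_ERROR;

    # 2) Hand the file over to Apache for delivery.
    $r->content_type('application/octet-stream');
    $r->sendfile($path);

    return Apache2::Const::OK;
}

1;
```

The temporary file lives outside the document root, so it is never reachable by URL; only the handler, after its access check, can send it.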

For this to take advantage of the operating system's sendfile support, there is an Apache configuration directive which must be set to "On". It is called "EnableSendfile". Essentially, what sendfile() does is delegate the actual reading and sending of the file to Apache httpd and the underlying OS, using code which is specifically optimised for this purpose. It is much more efficient than doing this in a read/write loop by yourself, at the cost of having less fine control over the operation.
