You can effectively stream a file byte by byte - you just need to print
a chunk at a time, and mod_perl and Apache will handle it
appropriately. I do this all the time to handle large data downloads
(the systems I manage are backed by petabytes of data).
The art is often not in the output - but in the way you get and process
data before sending it. I have code that will upload/download arbitrarily
large files (using HTML5's File objects) without using excessive amounts
of memory (all data is stored in chunks in a MySQL database).
Streaming has other advantages with large data - if you wait until you
have generated all the data, you will often hit a timeout. I have a
script which can take up to 2 hours to generate all the output, but it
never times out because it sends a line of data at a time, so data is
sent every 5-10 seconds. The memory footprint is trivial too, as only
the data for one line of output is in memory at a time.
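A minimal sketch of that line-at-a-time pattern, assuming $r is the Apache2::RequestRec passed to the handler and that Apache2::RequestIO has been loaded so that print and rflush are available on it. The helper name stream_lines and the $next_line iterator are hypothetical, made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of mod_perl). $r is the request object from
# the handler; $next_line is a code ref that returns the next line of
# output, or undef when there is nothing left to send.
sub stream_lines {
    my ($r, $next_line) = @_;
    my $count = 0;
    while (defined(my $line = $next_line->())) {
        $r->print("$line\n");   # only this one line is held in memory
        $r->rflush;             # push it towards the client right away
        $count++;
    }
    return $count;              # number of lines sent
}
```

Flushing after every line suits slow generators like the one described above; for fast output it is probably better to flush less often.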
On 28/03/2015 16:25, John Dunlap wrote:
sendfile sounds like it's exactly what I'm looking for. I see it in the
API documentation for Apache2::RequestIO but how do I get a reference
to it from the reference to Apache2::RequestRec which is passed to my
handler?
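As far as I know, no separate reference is needed: in mod_perl 2, loading Apache2::RequestIO makes its methods callable on the Apache2::RequestRec object itself. A minimal sketch, where the package name and the file path are hypothetical placeholders:

```perl
package My::Download;    # hypothetical handler package

use strict;
use warnings;

use Apache2::RequestRec ();    # content_type
use Apache2::RequestIO  ();    # adds print, rflush and sendfile to $r
use Apache2::Const -compile => qw(OK);

sub handler {
    my $r = shift;             # the Apache2::RequestRec passed by mod_perl

    $r->content_type('application/octet-stream');
    $r->sendfile('/path/to/generated/file');   # hypothetical path

    return Apache2::Const::OK;
}

1;
```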
On Sat, Mar 28, 2015 at 9:54 AM, Perrin Harkins <phark...@gmail.com> wrote:
Yeah, sendfile() is how I've done this in the past, although I was
using mod_perl 1.x for it.
On Sat, Mar 28, 2015 at 5:55 AM, André Warnier <a...@ice-sa.com> wrote:
Randolf Richardson wrote:
I know that it's possible (and arguably best practice) to use Apache to
download large files efficiently and quickly, without passing them
through mod_perl. However, the data I need to download from my
application is both dynamically generated and sensitive so I cannot
expose it to the internet for anonymous download via Apache. So, I'm
wondering if mod_perl has a capability similar to the output stream of
a java servlet. Specifically, I want to return bits and pieces of the
file at a time over the wire so that I can avoid loading the entire
file into memory prior to sending it to the browser. Currently, I'm
loading the entire file into memory before sending it and
Is this possible with mod_perl and, if so, how should I go about
implementing it?
Yes, it is possible -- instead of loading the entire contents of a
file into RAM, just read blocks in a loop and keep sending them until
you reach EOF (end of file).
You can also use $r->rflush along the way if you like, but as I
understand it this isn't necessary because Apache httpd will send the
data as soon as its internal buffers contain enough data. Of course,
if you can tune your block size in your loop to match Apache's output
buffer size, then that will probably help. (I don't know much about
the details of Apache's output buffers because I've not read up too
much on them, so I hope my assumptions about this are correct.)
One of the added benefits you get from using a loop is that you can
also implement rate limiting if that becomes useful. You can certainly
implement access controls as well, by cross-checking the file being
sent against whatever internal database queries you'd normally use to
ensure it's okay to send the file first.
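The read loop described above might look roughly like the following sketch. The helper name stream_file and its block_size, flush and pause options are hypothetical, and the 64 KiB default is an arbitrary assumption rather than Apache's actual buffer size. $r is assumed to be the request object with Apache2::RequestIO loaded, and any access check is assumed to have passed before the call:

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep);   # allows fractional-second pauses

# Hypothetical helper (not part of mod_perl): send $path to the client in
# fixed-size blocks so that only one block is in memory at a time.
sub stream_file {
    my ($r, $path, %opt) = @_;
    my $block_size = $opt{block_size} || 64 * 1024;  # assumed default
    my $pause      = $opt{pause}      || 0;          # seconds between blocks

    open my $fh, '<:raw', $path or die "Cannot open $path: $!";

    my $sent = 0;
    while (1) {
        my $n = read($fh, my $buf, $block_size);
        die "Read error on $path: $!" unless defined $n;
        last if $n == 0;                 # EOF reached

        $r->print($buf);
        $r->rflush if $opt{flush};       # optional, see the note above
        $sent += $n;

        sleep($pause) if $pause;         # very crude rate limiting
    }
    close $fh;

    return $sent;                        # total bytes sent
}
```

A call such as stream_file($r, $path, pause => 0.1) would cap throughput at roughly ten blocks per second; a real rate limiter would more likely measure elapsed time instead of sleeping a fixed amount.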
You can also:
1) write the data to a file
2) $r->sendfile(...);
3) add a cleanup handler, to delete the file when the request
has been served.
See here for details:
http://perl.apache.org/docs/2.0/api/Apache2/RequestIO.html#C_sendfile_
For this to work, there is an Apache configuration directive
which must be set to "on". I believe it is called "EnableSendfile".
Essentially what sendfile() does, is to delegate the actual
reading and sending of the file to Apache httpd and the
underlying OS, using code which is specifically optimised for
this purpose. It is much more efficient than doing this in a
read/write loop by yourself, at the cost of having less fine
control over the operation.
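The three steps listed above could be sketched as follows. This is only one possible arrangement: the package name, the write_report stub and the temporary file template are hypothetical, and the cleanup is registered on the request pool via APR::Pool's cleanup_register rather than as a PerlCleanupHandler, which should have a similar effect:

```perl
package My::TempDownload;    # hypothetical handler package

use strict;
use warnings;

use Apache2::RequestRec ();  # content_type, pool
use Apache2::RequestIO  ();  # sendfile
use APR::Pool           ();  # cleanup_register
use Apache2::Const -compile => qw(OK);
use File::Temp ();

# Hypothetical stand-in for whatever generates the sensitive data.
sub write_report {
    my $fh = shift;
    print {$fh} "line $_\n" for 1 .. 1000;
    return;
}

sub handler {
    my $r = shift;

    # 1) write the data to a file
    my ($fh, $path) =
        File::Temp::tempfile('download-XXXXXX', TMPDIR => 1, UNLINK => 0);
    binmode $fh;
    write_report($fh);
    close $fh or die "Cannot close $path: $!";

    # 3) delete the file once the request pool is destroyed, which
    #    happens after the response has been served
    $r->pool->cleanup_register(
        sub {
            my $file = shift;
            unlink $file or warn "Cannot remove $file: $!";
        },
        $path,
    );

    # 2) let Apache httpd and the OS send the file
    $r->content_type('text/plain');
    $r->sendfile($path);

    return Apache2::Const::OK;
}

1;
```

The cleanup is registered before sendfile is called so that the temporary file is still removed if sending fails partway through.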
--
John Dunlap