Graham Leggett wrote:
> Davi Arnaut wrote:
> 
>> Yes, first because size_t is 32 bits :). When you do a read like this on
>> a file bucket, the whole bucket is not read into memory. The read
>> function splits the bucket and changes the current bucket to refer to
>> data that was read.
> 
> 32 bits is 4GB. A large number of webservers don't have memory that 
> size, thus the problem.
> 
>> The problem lies that those new buckets keep accumulating in the
>> brigade! See my patch again.
> 
> Where?

I was referring to "it will attempt to read all 4.7GB". Such a thing
does not exist!

> We start with one 4.7GB bucket in a brigade.

No, we start with a file bucket on the brigade. I will try to explain
what happens, to see if we are talking about the same thing.

Suppose we have a brigade containing a file bucket, and the file size
is 4.7GB. We want to read it fully.

When we call apr_bucket_read() on this bucket, we end up calling the
bucket's read function (file_bucket_read). What does file_bucket_read do?

If mmap is supported, it mmaps APR_MMAP_LIMIT bytes (4MB) by creating a
new mmap bucket and splits the bucket. So, after calling read on a file
bucket you have two buckets on the brigade. The first one is the mmap
bucket and the last is the file bucket with an updated offset.

The same thing happens if mmap is not supported, but the new bucket
will be a heap bucket. If we don't delete or flush those implicitly
created buckets, they will end up keeping the whole file in memory, but
a single read will not put the entire file in memory.
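
To make the splitting concrete, here is a toy model in C. It is only a
sketch and not the real APR code: every model_* name is made up for
illustration, and MODEL_MMAP_LIMIT simply mirrors the 4MB
APR_MMAP_LIMIT figure mentioned above. It shows that one read carves a
single chunk off the front of the file bucket, and that the chunks only
add up to the whole file if nobody removes them.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: these names are hypothetical, not part of APR. */
#define MODEL_MMAP_LIMIT (4u * 1024u * 1024u) /* mirrors the 4MB limit above */

/* The file bucket: the part of the file that has not been read yet. */
typedef struct {
    uint64_t offset;
    uint64_t length;
} model_file_bucket;

/* The implicitly created bucket (mmap or heap) holding one chunk. */
typedef struct {
    uint64_t offset;
    uint64_t length;
} model_data_bucket;

/* One read: split at most MODEL_MMAP_LIMIT bytes off the front of the
 * file bucket and advance the file bucket past them. */
model_data_bucket model_file_read(model_file_bucket *f)
{
    model_data_bucket d;
    uint64_t chunk;

    assert(f != 0);
    chunk = f->length < MODEL_MMAP_LIMIT ? f->length : MODEL_MMAP_LIMIT;
    d.offset = f->offset;
    d.length = chunk;
    f->offset += chunk;
    f->length -= chunk;
    return d;
}

/* Read the whole file without ever deleting the data buckets.
 * Returns how many data buckets pile up; *resident_bytes receives the
 * number of bytes they hold together. */
uint64_t model_read_all(uint64_t file_size, uint64_t *resident_bytes)
{
    model_file_bucket f = { 0, file_size };
    uint64_t count = 0;

    assert(resident_bytes != 0);
    *resident_bytes = 0;
    while (f.length > 0) {
        model_data_bucket d = model_file_read(&f);
        *resident_bytes += d.length;
        count++;
    }
    return count;
}
```

In this model a single model_file_read() on a 10MB file yields one 4MB
data bucket and leaves the file bucket at offset 4MB; only the loop in
model_read_all(), which never frees anything, ends up holding all 10MB.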

What Joe's patch does is remove this first implicitly created bucket
from the brigade, placing it on a temporary brigade for sending it to
the client.
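
The effect on memory can be sketched with another toy model in C. This
is not Joe's patch and not APR code; sketch_peak_resident and
SKETCH_CHUNK_LIMIT are hypothetical names, and the 4MB chunk size is
assumed from the APR_MMAP_LIMIT figure above. It compares the peak
number of bytes held by data buckets when each chunk is moved out and
released right after the read, against leaving every chunk on the
brigade.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: hypothetical names, chunk size assumed to be 4MB. */
#define SKETCH_CHUNK_LIMIT (4u * 1024u * 1024u)

/* Walk through a file of file_size bytes one read at a time and return
 * the peak number of bytes held by data buckets on the brigade.
 * If move_out_each_chunk is nonzero, the chunk produced by each read is
 * moved to a temporary brigade, sent and released before the next read.
 * Otherwise the chunks are left to accumulate. */
uint64_t sketch_peak_resident(uint64_t file_size, int move_out_each_chunk)
{
    uint64_t remaining = file_size;
    uint64_t resident = 0;
    uint64_t peak = 0;

    while (remaining > 0) {
        uint64_t chunk = remaining < SKETCH_CHUNK_LIMIT
                       ? remaining : SKETCH_CHUNK_LIMIT;

        /* The read splits one chunk off the file bucket. */
        remaining -= chunk;
        resident += chunk;
        if (resident > peak) {
            peak = resident;
        }

        /* Moving the chunk out and releasing it frees its memory. */
        if (move_out_each_chunk) {
            assert(resident >= chunk);
            resident -= chunk;
        }
    }
    return peak;
}
```

For a 10MB file this model peaks at 4MB when each chunk is moved out,
and at the full 10MB when the chunks are left on the brigade.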

That's why splitting the brigade with magical values (16MB) is not such
a good idea, because the bucket type knows better and will split the
bucket anyway.

--
Davi Arnaut
