On 8/23/2011 1:49 PM, Stefan Fritsch wrote:
>
> From looking at the code, I think the problem is the bucket structs.
> With N the number of requested ranges, the initial brigade is
> partitioned into 2*N buckets at the maximum. Then those buckets are
> copied into the output brigade N times, which means that O(N^2)
> buckets are created. The data is not copied, and only N "A-B" strings
> are allocated from the pool. But the sum of those is limited by
> LimitRequestFieldSize, so it shouldn't be a problem.
>
> Maybe the byte-range filter should call ap_pass_brigade every 10
> ranges or so? Then the buckets should be freed earlier (at least if
> all filters down the chain behave correctly).
I suggest we parse and reassemble the list of ranges before we start
the bucket logic. I'd also suggest the following...
This example from the spec...
- Several legal but not canonical specifications of the second 500
bytes (byte offsets 500-999, inclusive):
bytes=500-600,601-999
bytes=500-700,601-999
does not describe the last as a ~200-byte range plus a ~400-byte range;
both are explicitly specifications of the second 500 bytes.
I propose we satisfy range requests in the only sensible manner: return
the ranges in sequence, using a linked list of ranges (or another
mechanism) to combine overlapping ranges and work out the applicable set.
Range processing is then limited to some 4000 parts (a header consisting
entirely of invalid "-," segments), and as a practical matter to far
fewer than 2500, since any valid range takes more characters.
Reassemble the list of ranges in sequence as a pre-parsing step, and we
can generate the response much more efficiently, with no duplication.
The spec is ambiguous, but nowhere does it suggest that duplicate
ranges would be legitimate.