The current logic for concatenating small buckets in core_output_filter()
will have performance problems in two cases that I can think of:

* If you have a brigade consisting of several dozen small buckets,
  each one can get copied several times (because we only concatenate
  MAX_IOVEC_TO_WRITE of them at a time).

* If the brigade consists of, say, MAX_IOVEC_TO_WRITE+1 buckets of
  size 1MB each, the code will do a huge memory copy to (needlessly)
  concatenate the first MAX_IOVEC_TO_WRITE of them.

My proposed solution is to change the logic as follows:

* Skip the concatenation if there's >= 8KB of data already
  referenced in the iovec.

* Rather than creating a temporary brigade for concatenation,
  create a heap bucket.  Make it big enough to hold 8KB.  Pop
  the small buckets from the brigade, concatenate their contents
  into the heap bucket, and push the heap bucket onto the brigade.

* If we reach the concatenation step again during the foreach loop
  through the brigade, append the small buckets to the end of the
  previously allocated heap bucket.  If appending would push the
  heap bucket past 8KB, stop.
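
To make the three rules concrete, here is a rough sketch in plain C.
It is not the actual core_output_filter() code and does not use the
real APR bucket API: struct bucket, struct heap_buf,
should_concatenate() and absorb() are hypothetical stand-ins invented
for illustration, and the value 16 for MAX_IOVEC_TO_WRITE is an
assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_IOVEC_TO_WRITE 16   /* assumed value, for illustration only */
#define CONCAT_LIMIT 8192       /* the 8KB threshold from the proposal */

/* Hypothetical stand-in for a bucket: a data pointer and a length. */
struct bucket {
    const char *data;
    size_t len;
};

/* Hypothetical stand-in for the single heap bucket that absorbs
 * the small buckets.  It is sized to hold exactly 8KB. */
struct heap_buf {
    char data[CONCAT_LIMIT];
    size_t len;
};

/* Rule 1: skip concatenation when the iovec already references
 * 8KB or more.  Returns nonzero if concatenation is worthwhile. */
static int should_concatenate(const struct bucket *vec, size_t nvec)
{
    size_t total = 0;
    size_t i;

    for (i = 0; i < nvec; i++) {
        total += vec[i].len;
    }
    return total < CONCAT_LIMIT;
}

/* Rules 2 and 3: append buckets to the heap buffer, stopping before
 * the first bucket that would push it past 8KB.  Calling this again
 * later appends to the same buffer, so earlier data is not recopied.
 * Returns the number of buckets consumed. */
static size_t absorb(struct heap_buf *hb, const struct bucket *b, size_t n)
{
    size_t i;

    assert(hb->len <= CONCAT_LIMIT);
    for (i = 0; i < n; i++) {
        if (hb->len + b[i].len > CONCAT_LIMIT) {
            break;  /* about to exceed 8KB: stop */
        }
        memcpy(hb->data + hb->len, b[i].data, b[i].len);
        hb->len += b[i].len;
    }
    return i;
}
```

In this model each small bucket is copied exactly once, into the
heap buffer, and a bucket that is already 8KB or larger is never
copied at all, which is the behavior the proposal is after for both
of the problem cases above.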

Comments?

Thanks,
--Brian


