> > Why write to the brigade instead of the top filter?
>
> Because we aren't always writing to the top filter. Sometimes, we are
> writing to a lower filter. This is the problem with current
> implementation.

I meant the top from the perspective of the writer (i.e., a lower filter
only knows about filters below it). Sorry for the confusion.

> > The problem with this type of optimization is that the brigade code
> > cannot know the optimal size of a bucket -- only the next
> > filter in the chain can know, since what is optimal will depend
> > on what kind of processing is done next.
>
> However, the programmer knows what kind of data they are dealing with. If
> it is bufferable, then it should be written to the brigade using a
> buffering API. Otherwise, it should be written using a direct bucket API,
> IMO.

Hmmm... but I don't want to write to the brigade. I want to write to the
filter stack just like I would write to any file handle. All of the
complexity should be handled within the filter implementations and not
exposed to the users of the filter, IMO.

> > I think that if we are suffering from wasted cycles and pallocs
> > due to premature brigade formations, then we should try it the other
> > way -- always allocate the bucket structure off the stack, use a
> > simple next pointer to connect brigades, and force the filter that
> > needs to setaside the data to do so in a way that coalesces the
> > bucket data. That was the main difference between the design we
> > are using now and the one Greg proposed prior to the filters meeting.
>
> However, we don't always want to coalesce bucket data. I am picturing
> this case.
>
> file bucket 9k -> 10 byte pool bucket -> 10 byte pool bucket.
>
> We want to coalesce the 2 10 byte buckets, but we don't want to coalesce
> the file bucket. If the buckets are allocated off the stack, how do you
> keep the file bucket around?

I don't. Every buffering mechanism needs a threshold against which it
writes to the next output (if it isn't blocked by waiting for something
like an end-of-record) or writes to a large processing buffer if it is
blocked.
That prevents latency from getting too high, and provides the
intermediate files that can be identified and cached just like a proxy
does caching.

> I have tried to allocate the brigades off the stack, but we rely on the
> cleanups to ensure that buckets are freed properly.
>
> I guess I don't see what you are thinking.

If the bucket structure is allocated on the stack, then it is cleaned
or reused when the caller returns. The bucket data would either be sent
out the filter tail (if it suffices for a large write) or copied to some
other buffer as part of the coalescing of small writes. We are left
with no need for cleanups, aside from the basic ones for the filter chain
and the pools.

This is hard to describe because I am still thinking of buckets in terms
of the abstract model and not the current bucket code. In the abstract
model, bucket structures are like windows -- they point to segments of
data that are managed somewhere else. They don't really contain
anything -- they just provide a sequence of instructions in how the
data stream should be constructed at whatever point it needs to be
constructed (that point is either when it becomes desirable to write to
a non-brigade interface or when some filter needs to do per-character
processing).

....Roy
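
The abstract model described above can be sketched in C. This is only a
minimal illustration, not the current bucket code: the names window, sink,
coalescer, and filter_pass, and the 64-byte COALESCE_THRESHOLD, are all
assumptions invented for the example. Each bucket structure is a window
onto bytes owned elsewhere, linked by a plain next pointer; the filter
passes large windows straight through and copies small ones into a buffer
that is flushed against a threshold.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical; this is not the Apache/APR bucket API. */

#define COALESCE_THRESHOLD 64     /* assumed cutoff: smaller windows are copied */
#define SINK_CAPACITY      16384  /* assumed size of the demo output buffer */

/* A bucket is only a window: it points at bytes that are managed elsewhere. */
typedef struct window {
    const char *data;             /* not owned by the window */
    size_t len;
    const struct window *next;    /* a simple next pointer forms the brigade */
} window;

/* Stand-in for the tail of the filter chain; it records what was written. */
typedef struct sink {
    char bytes[SINK_CAPACITY];
    size_t used;
    int writes;                   /* number of separate write calls */
} sink;

/* Holding area for small writes that have not yet reached the threshold. */
typedef struct coalescer {
    char buf[COALESCE_THRESHOLD];
    size_t used;
} coalescer;

static void sink_write(sink *s, const char *p, size_t n)
{
    assert(s->used + n <= SINK_CAPACITY);
    memcpy(s->bytes + s->used, p, n);
    s->used += n;
    s->writes++;
}

/* Push any buffered small writes to the sink as one write. */
static void coalescer_flush(coalescer *c, sink *s)
{
    if (c->used > 0) {
        sink_write(s, c->buf, c->used);
        c->used = 0;
    }
}

/*
 * Walk the chain once. A window at or above the threshold is written
 * directly (after flushing, to preserve byte order); a smaller window is
 * copied into the coalescing buffer, which is flushed first if the copy
 * would not fit. The windows themselves are never retained.
 */
static void filter_pass(coalescer *c, sink *s, const window *w)
{
    for (; w != NULL; w = w->next) {
        if (w->len >= COALESCE_THRESHOLD) {
            coalescer_flush(c, s);
            sink_write(s, w->data, w->len);
        } else {
            if (c->used + w->len > COALESCE_THRESHOLD) {
                coalescer_flush(c, s);
            }
            memcpy(c->buf + c->used, w->data, w->len);
            c->used += w->len;
        }
    }
}
```

With the chain from the example in this thread (a 9k window followed by
two 10-byte windows), calling filter_pass and then coalescer_flush produces
two writes to the sink: the 9k segment untouched, then the 20 coalesced
bytes. Because the windows own nothing, they can live on the caller's stack
and need no cleanup of their own.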
