On 06/02/2014 11:20 PM, Amos Jeffries wrote: > On 3/06/2014 6:39 a.m., Alex Rousskov wrote: >> On 06/01/2014 02:37 AM, Amos Jeffries wrote: >>> Lets take a step aside and cover why the buf2 is a pointer instead of an >>> separate SBuf. >> >> I do not think those reasons are important enough in this context. It is >> like suggesting "let's examine the benefits of using uninitialized >> variables and creating dangling pointers" -- whatever the benefits are, >> they are insufficient to knowingly add such code to Squid. >> >> Giving Comm a raw pointer to an internal job member is very similar to >> "using uninitialized variables and creating dangling pointers". No >> amount of resulting optimizations can justify adding that kind of API IMO.
> The caller Job has always been providing IoCallback with a buffer to > fill. Before it was a shared char*. Now it is a shared SBuf*. I can argue that screwing up an SBuf pointer (which actually points to something else!) is a lot easier than screwing up a basic char pointer, but my main argument is that adding a broken API is a bad idea, regardless of whether we already have similar broken APIs. There have been bugs where not canceling comm_read (which is still not possible to do reliably!) led to disasters. There is no compelling reason to duplicate that experience now IMO. >>> * simple solution and ony way to guarantee that data-copy is not >>> performed is to have comm_read operate on a pointer to sb1. >> >> AFAICT, among correct solutions, there is no simple solution that avoids >> data copy. We can either remain "simple" and copy OR add "complexity" to >> avoid copying. > > The solution Squid has always had: > Caller provides comm_read() with a buffer *pointer* and explicitly > cancels the read or closes the FD before it dies so the IoCallback /w > pointer will be cleaned up. Right. The above solution is both complex and wrong. We have to keep using it until a better solution is available. No need to double the problem by adding buf2 pointer to the buf pointer. >>> comm_write() is a slightly different story. It *is* possible to take sb2 >>> as a copy then the write handler sb1.consume(len) for the written bytes. Yes, read and write needs are different and probably require a different optimal API. > All of the parser-ng progress is being done on the assumed basis that we > will have zero-copy to get data into SBuf somewhere between TCP read(2) > and parser. If we have to revert to TCP->char*->SBuf we incur +1 or +2 > data copies *per packet*. It is, of course, possible to eliminate copies. However, there should be no assumption that such elimination is done using raw SBuf pointers. I suspect performance gains from better parsers will outweigh extra read copies, but I may be wrong, and it is possible to add a good API that eliminates those extra copies. > Optimizing performance using SBuf to manage 3 bytes of "GET" method > makes no sense if squid has to re-allocate and memmove() several > up-to-64KB chunks to parse them in the first place. > > IMO it will be worth some time now sorting this out properly before we > add more processing code using SBuf. Agreed. The best time for sorting this out was before buf2 commit, but going through the motions now is better than doing it later. > If you want an IRC meet-up to make this faster and save some time I can > do that any of the next few days. Sure, I will ping you. Cheers, Alex.