On 06/03/2014 08:22 AM, Amos Jeffries wrote:
> On 4/06/2014 1:08 a.m., Alex Rousskov wrote:
>> On 06/03/2014 04:46 AM, Amos Jeffries wrote:
>>
>>> This changes SBuf::trim() to use a CharacterSet instead of an SBuf
>>> list of characters and memchr().
>>>
>>> It seems to be faster to do a CharacterSet lookup than repeated memchr
>>> calls, but I'm not certain of that. It certainly makes for simpler
>>> parser code to trim with a predefined CharacterSet than with a static
>>> SBuf set of chars.
>>
>> I agree that a CharacterSet membership test should be faster than
>> repeated memchr() calls.
>>
>> No objections to this patch, although I suspect that any code calling
>> SBuf::trim() should actually use Tokenizer instead; we may be optimizing
>> code that should not be used in the first place.
>
> The use-case that brought this up is scanning mime header lines.
>
>   Tokenizer tok(...);
>   SBuf line;
>   while (tok.prefix(line, CharacterSet::LF)) {
>     // drop optional trailing CR* sequence
>     line.trim(CharacterSet::CR, false, true);
The above does not make sense to me: after tok.prefix(), "line" will
contain LF characters only. There will be no CR characters to trim.

Alex.
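
For illustration only, here is a rough sketch of how such a loop could be
written so that "line" receives the body of each header line rather than a
run of LF characters. It assumes the Tokenizer, SBuf, and CharacterSet
interfaces roughly as used in the quoted code; the lineChars set, the way
it is built, and the tok.skip('\n') call are assumptions made for the sake
of the example, not code from the patch or from this thread.

  // Hypothetical "everything except LF" set; built with add() because no
  // ready-made complement of CharacterSet::LF is assumed to exist here.
  CharacterSet lineChars("lineChars", "");
  for (int c = 1; c <= 255; ++c) {
      if (c != '\n')
          lineChars.add(static_cast<unsigned char>(c));
  }

  Tokenizer tok(buf);
  SBuf line;
  // prefix() extracts a run of characters *from* the given set, so to get
  // a whole line we ask for non-LF characters rather than for LF itself.
  while (tok.prefix(line, lineChars)) {
      tok.skip('\n');                           // consume the LF terminator
      line.trim(CharacterSet::CR, false, true); // drop optional trailing CR*
      // ... process line ...
  }
  // Blank lines (which end a mime header block) are left out of this sketch.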