On 4/06/2014 1:08 a.m., Alex Rousskov wrote:
> On 06/03/2014 04:46 AM, Amos Jeffries wrote:
>
>> This replaces the SBuf::trim() to use CharacterSet instead of an SBuf()
>> list of characters and memchr()
>>
>> It seems to be faster for CharacterSet lookup than repeated memchr
>> calls, but I'm not certain of that. It certainly makes simpler parser
>> code with trim and a predefined CharacterSet than static SBuf set of chars.
>
> I agree that CharacterSet membership test should be faster than repeated
> memchr() calls.
>
> No objections to this patch, although I suspect that any code calling
> SBuf::trim() should actually use Tokenizer instead; we may be optimizing
> code that should not be used in the first place.

The use-case that brought this up is scanning MIME header lines.

  Tokenizer tok(...);
  SBuf line;
  while (tok.prefix(line, CharacterSet::LF)) {
      // drop optional trailing CR* sequence
      line.trim(CharacterSet::CR, false, true);
      // ... use line content for something.
      // ... skip tok past the LF, and repeat.
  }

If there are no other objections I will drop this in at some point prior
to the parser-ng merge. No rush on approval until then.

Amos
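
PS: for anyone following along without the patch open, here is a rough standalone sketch of the two lookup strategies being compared. It is not the patch itself. ByteSet, trimWithSet and trimWithMemchr are hypothetical names, std::string_view stands in for SBuf, and the (atBeginning, atEnd) flag order is assumed from the trim() call in the loop above.

```cpp
#include <array>
#include <cassert>
#include <cstring>
#include <string_view>

// Hypothetical stand-in for CharacterSet: one flag per possible octet,
// so a membership test is a single array index.
struct ByteSet {
    std::array<bool, 256> member{};

    explicit ByteSet(std::string_view chars) {
        for (unsigned char c : chars)
            member[c] = true;
    }

    bool contains(char c) const {
        return member[static_cast<unsigned char>(c)];
    }
};

// Table-based trim: each examined byte costs one table lookup.
std::string_view trimWithSet(std::string_view s, const ByteSet &set,
                             bool atBeginning = true, bool atEnd = true)
{
    if (atBeginning) {
        while (!s.empty() && set.contains(s.front()))
            s.remove_prefix(1);
    }
    if (atEnd) {
        while (!s.empty() && set.contains(s.back()))
            s.remove_suffix(1);
    }
    return s;
}

// List-based trim: each examined byte costs a memchr() scan of the
// whole list of characters to remove.
std::string_view trimWithMemchr(std::string_view s, std::string_view toRemove,
                                bool atBeginning = true, bool atEnd = true)
{
    auto listed = [&](char c) {
        return !toRemove.empty() &&
               std::memchr(toRemove.data(), static_cast<unsigned char>(c),
                           toRemove.size()) != nullptr;
    };
    if (atBeginning) {
        while (!s.empty() && listed(s.front()))
            s.remove_prefix(1);
    }
    if (atEnd) {
        while (!s.empty() && listed(s.back()))
            s.remove_suffix(1);
    }
    return s;
}

// Usage check: both variants drop a trailing CR* run and nothing else.
inline void trimSelfCheck()
{
    const ByteSet cr("\r");
    assert(trimWithSet("Host: example\r\r", cr, false, true) == "Host: example");
    assert(trimWithMemchr("Host: example\r\r", "\r", false, true) == "Host: example");
}
```

With a single-character set like CR the two should behave identically; any speed difference would only be expected to show up as the set of characters grows, and I have not measured it here.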