Adrian Chadd wrote:
(My pre-breakfast 2c, so forgive me if I'm less clear than normal.)
2008/8/27 Kinkie <[EMAIL PROTECTED]>:
My thoughts: \0 is special, and would only be significant when strings
need to be exported from the memory-managed code onto nonmanaged code.
Generally speaking, the safest way to do so is by copy rather than by
reference, but I'd rather also keep the ability to export by reference
- hoping the caller knows what they're doing. In that case the \0 is a
must-have safeguard, which in some cases might require copying.
Unfortunate but unavoidable.
Although plenty of current code assumes a NUL-terminated string, it's
assumed primarily for two things:
* debug(); which can be replaced with %.*s or whatever it is, to pass
in a length before the string buffer;
Not relevant in Squid-3. debugs() uses stream operator of String class
which can do exactly whatever it wants to produce a sequence of bytes.
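
To make the two debug-output options above concrete, here is a minimal sketch of printing a length-delimited buffer without relying on a trailing NUL. ByteView, formatWithPrintf() and formatWithStream() are hypothetical names made up for illustration; they are not the Squid String class or its debugs() machinery.

```cpp
#include <cstddef>
#include <cstdio>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical length-delimited view (an assumption, not Squid's String).
struct ByteView {
    const char *data;
    std::size_t len;
};

// Stream operator: writes exactly len bytes, so no NUL terminator is needed.
std::ostream &operator<<(std::ostream &os, const ByteView &v) {
    return os.write(v.data, static_cast<std::streamsize>(v.len));
}

// printf-style equivalent: "%.*s" takes an int precision (the maximum number
// of bytes to print) as the argument just before the buffer pointer.
std::string formatWithPrintf(const ByteView &v) {
    char out[64];
    std::snprintf(out, sizeof(out), "[%.*s]", static_cast<int>(v.len), v.data);
    return out;
}

// Stream-style equivalent, using the operator<< defined above.
std::string formatWithStream(const ByteView &v) {
    std::ostringstream os;
    os << '[' << v << ']';
    return os.str();
}
```

One difference worth noting: "%.*s" still stops early at an embedded NUL, while the stream write() emits all len bytes.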
* iterating/parsing; which can be replaced by using the length
parameter in pointer arithmetic (you can toss the pointer arithmetic
too in like 99% of the cases; the parser is about where the possible
speed boosts from pointer arithmetic would even matter)
That's the kicker, as Henrik pointed out. It requires the pre-filled
buffer to be broken into Strings by the parser, which will need some
custom replacement for strtok() (actually faster, but more bug prone).
Both of which can be eliminated without too much trouble. In fact, I
ended up with NUL terminated strings as a special flag case during
transition work so the existing code assuming NULs could still work
whilst I converted stuff over.
Well, tokenising should be replaced by substringing really.. it could
mean having to drop strtok().
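
As a rough illustration of "substringing" in place of strtok(), the sketch below splits a buffer into { pointer, length } tokens using only the length for bounds, and never writes a NUL into the buffer. Span and nextToken() are hypothetical names, not anything from the Squid tree.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical { pointer, length } substring; the bytes are not owned.
struct Span {
    const char *data;
    std::size_t len;
};

// Returns the leading token of `rest` up to (not including) `delim` and
// advances `rest` past the delimiter. The underlying buffer is never
// modified. If no delimiter is left, the whole remainder is the token.
Span nextToken(Span &rest, char delim) {
    const void *hit = std::memchr(rest.data, delim, rest.len);
    if (!hit) {
        Span token = rest;
        rest = Span{rest.data + rest.len, 0};
        return token;
    }
    const char *p = static_cast<const char *>(hit);
    const std::size_t tokenLen = static_cast<std::size_t>(p - rest.data);
    Span token{rest.data, tokenLen};
    rest = Span{p + 1, rest.len - tokenLen - 1};
    return token;
}

// Copy-out helper, for when a caller really needs an owned string.
std::string toStdString(Span s) { return std::string(s.data, s.len); }
```

Unlike strtok(), this version does not collapse runs of delimiters and keeps no hidden state, so a caller that wants strtok()-like behaviour has to skip empty tokens itself.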
.. and in reality, writing replacement str*() routines for your String
class instead of using C string.h functions makes everything much
easier. Including the above.
Kinkie, s27_adri has a whole lot of additional String.c functions for
manipulating strings.
Append operation on String/MemoryRegion objects is easy in this model,
but if the region is not at the end of the MemoryBlob or if the result
gets too large then it will need to trigger a copy to a new MemoryBlob of
sufficient size.
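
A minimal sketch of that append rule, assuming a hypothetical Blob with a fixed capacity and a Region that references part of it (neither is the real MemoryBlob/MemoryRegion code): the append happens in place only when the region ends at the blob's tail and the result fits, otherwise the region's bytes are copied to a new blob.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for MemoryBlob: fixed-capacity storage plus a
// count of how many bytes are in use.
struct Blob {
    std::vector<char> bytes;
    std::size_t used = 0;
    explicit Blob(std::size_t capacity) : bytes(capacity) {}
};

// Hypothetical stand-in for MemoryRegion: { blob, offset, length }.
struct Region {
    std::shared_ptr<Blob> blob;
    std::size_t offset;
    std::size_t length;
};

// Appends n bytes to the region. Returns true if done in place, false if
// the region had to be copied into a freshly allocated blob.
bool append(Region &r, const char *src, std::size_t n) {
    assert(r.blob && r.offset + r.length <= r.blob->used);
    if (n == 0)
        return true;
    const bool atTail = (r.offset + r.length == r.blob->used);
    const bool fits = (r.blob->used + n <= r.blob->bytes.size());
    if (atTail && fits) {
        // Cheap case: other regions on this blob never look past their
        // own length, so growing the tail does not disturb them.
        std::memcpy(r.blob->bytes.data() + r.blob->used, src, n);
        r.blob->used += n;
        r.length += n;
        return true;
    }
    // Expensive case: copy this region's bytes plus the new data.
    const std::size_t capacity = std::max<std::size_t>(16, (r.length + n) * 2);
    auto fresh = std::make_shared<Blob>(capacity);
    std::memcpy(fresh->bytes.data(), r.blob->bytes.data() + r.offset, r.length);
    std::memcpy(fresh->bytes.data() + r.length, src, n);
    fresh->used = r.length + n;
    r = Region{fresh, 0, fresh->used};
    return false;
}
```

The doubling of the new capacity is an arbitrary choice for the sketch; a real allocator would pick its own growth policy.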
Yes.
Which won't happen in like >99% of the cases.
It depends: I expect a rather common case to be when only one String
owns a Buf/MemoryBlob. In that case modifications are cheap.
Actually, the most common operation for Squid once you've fully
reworked the whole environment to use this model is "lots of Strings
referencing a large buffer" (ie, the request and reply socket buffer;
the URL strings once those are converted over.) Almost all of the
strings in-play are the http header entry strings, and most of -those-
are never modified.
Most of the -rest- are one String referencing an entire buffer.
In any case, I agree with the general model of:
* Memory: some chunk of contiguous memory somewhere;
* MemoryRegion: some reference to { Memory, offset, length }
* String: a MemoryRegion and some routines to manipulate it
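
For illustration only, the three layers might look roughly like the sketch below. The type names follow the list above, but every member and method (substr(), copyOut(), sharesMemoryWith(), rawBytes()) is an assumption made for the example, not the proposed or actual Squid API. It shows several Strings referencing one buffer without copying, with a copy made only on export.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Memory: some chunk of contiguous memory somewhere.
struct Memory {
    std::vector<char> bytes;
    Memory(const char *src, std::size_t n) : bytes(src, src + n) {}
};

// MemoryRegion: a reference to { Memory, offset, length }.
struct MemoryRegion {
    std::shared_ptr<const Memory> mem;
    std::size_t offset;
    std::size_t length;
};

// String: a MemoryRegion plus routines to manipulate it.
class String {
public:
    explicit String(MemoryRegion r) : region_(std::move(r)) {
        assert(region_.mem &&
               region_.offset + region_.length <= region_.mem->bytes.size());
    }
    std::size_t size() const { return region_.length; }
    // Not NUL-terminated: callers must honour size().
    const char *rawBytes() const {
        return region_.mem->bytes.data() + region_.offset;
    }
    // Substring by reference: same Memory, narrower region, no copy.
    String substr(std::size_t pos, std::size_t n) const {
        assert(pos <= region_.length && n <= region_.length - pos);
        return String(MemoryRegion{region_.mem, region_.offset + pos, n});
    }
    bool sharesMemoryWith(const String &o) const {
        return region_.mem == o.region_.mem;
    }
    // Export by copy: the result owns its bytes and is NUL-terminated
    // when read through c_str().
    std::string copyOut() const { return std::string(rawBytes(), region_.length); }
private:
    MemoryRegion region_;
};
```

In this sketch String only ever touches the MemoryRegion fields, so whatever sits behind Memory could be swapped for a pooled allocator without changing String.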
Adrian
True, BUT everything behind MemoryRegion is memory
allocator/management business and should not be involved with String.
Only the MemoryRegion API affects String.
Amos
--
Please use Squid 2.7.STABLE4 or 3.0.STABLE8