Kinkie wrote:
On Thu, Sep 18, 2008 at 7:26 PM, Alex Rousskov
<[EMAIL PROTECTED]> wrote:
On Thu, 2008-09-18 at 14:10 +0200, Kinkie wrote:

The feature branch is at https://code.launchpad.net/~kinkie/squid/stringng
Here is another round of comments, based on the current SBuf.h in the
above repository. I have marked my comments with these prefixes:

*B1*: a blocking issue. I do not recommend proceeding much further until
this issue is resolved because the code may not be accepted or would
have to be changed significantly to address the issue.

*B2*: I am not sure yet, but this may become a blocker for me. If you
are not certain, please consider save us some time by implementing the
requested changes. If you are certain that I am wrong, a discussion
would be nice and may save us from trouble later.

*B3*: Everything else.


*B1* I still think that string-manipulation interfaces should be moved
into a String class.

I could agree to that if I could see an equally efficient way to implement this.
I've called the class SBuf because I see it as being a bit of a String
and a bit of a Buf.
You may notice that there are NO functions which deal with encoding,
for instance. I don't plan to add them to this class.

I kind of fuzzily disagree, the point of this is to replace MemBuf + String with SBuf. Not implement both again independently duplicating stuff.


*B2* I still think that having NULL Strings is a bad idea.

Lazy instantiation?
I can agree to a name change. But then it's also a pretty
straightforward porting of a  MemBuf construct with the same name.
Reasoning goes, if it's in MemBuf there may be a valid reason for it.


Agnostic. One of the open bugs is to clear up handling of NULL strings. They are needed in abstract for some use-cases such a URI handling, where parts of the URI may be missing. Whether they need to be NULL or 0-length is iffy.

*B2* I am not sure placing your classes in include/ is better than
placing them in src/. We have discussed similar questions before. I do
not know whether there was consensus, but I hope we agreed that only
Squid-independent portability/wrapper code should live outside of src/.
I think your code will (or should) be Squid-dependent. It will make a
lot of things easier. It is difficult to develop outside of src/ and it
does not buy us anything in this case.

I agree it should be Squid-dependent, as it will have to hook into
MemPools, Debug, CacheMgr and various other places.
I'm trying to follow the squid 3 coding guidelines as expressed in the
wiki, but maybe I misunderstood them.
I can move it to src/ no problem.

I looked at this for IPAddress. It's rather tricky to extend things in include/ safely, so src/ is preferred for everything not an OS compatibility code.

Hopefully in a new sub-folder (src/buffers/ ?) consistent with the SourceLayout plans.


*B2* class outOfBoundsException: public std::exception

It is better to inherit all your customized exceptions from the existing
TextException. Otherwise, the catching code will become unnecessary long
and the exception class code will duplicate a lot of things.
TextException itself should be polished and split or renamed to
Exception, but that is a different topic. Most generic catching code
should catch TextExceptions (which includes their children).

I've followed recommended c++ coding guidelines, but I have no issues
in implementing this.

*B1* Exception specifications: type method(...) throw(...);

Surprisingly (well, at least it was to me!), exception specifications
are a bad idea in virtually all cases except for empty specifications.
Empty specifications should also be avoided unless you Know What You Are
Doing and Absulutely Sure About That.

Please remove all your empty throw() exception specifications.

Please comment out (but do not remove) your non-empty exception
specifications. They are useful as documentation:
   type method(...); // throw(...)
or perhaps
   type method(...); // throws ...

Thinking I am nuts? Here is one of the authoritative references on this
fascinating subject:  http://www.gotw.ca/publications/mill22.htm

Ok. Leaving them documented is the most important thing to me.
I can also move them to doxygen comments so that they'll be more visible.

*B1* outOfBoundsException::what() returns a string that does not exist
outside of that method. One more reason to move non-trivial code outside
of the header where we do not have to review it (yet) :-).


*B2* Move helper class declarations outside of SBuf. Class is not a
namespace. Nested classes complicate the overall impression from the API
and tempt you to do things you really should not.

If possible at all, I'd prefer not to.
The reason why classes are placed like that is to enforce
accessibility constraints without using friend classes as much as
possible (see below)

*B1* dump() methods should be constant.

Trivial, will do.

*B3* If a class does not have both dump() and print() methods, let's
call the only printing method print(). This will make it easier to
manipulate printable objects for debugging.

The convention I followed is: dump() is for debugging, stats etc.
print() is for operator <<.
I can adhere to any other convention, as long as there's consensus -
ideally to the point where it's sanctioned in the coding guidelines.

The whole stats output dump() system IIRC was partially completed moving to sstream << . So a specific sstream<< and ostream << 'friends' might be better than print() and dump() anyways.


*B2* If SBufStore is refcounted, why is it not using the RefCountable
API we already have? If this is some kind of optimization, I would
prefer to see it done after the code is reviewed, merged, and debugged.

This code is very low-level. and it's also very self-contained. I aim
for maximum efficiency.

Ditto on the RefCountable. This was the API I was asuming you would use anyway. It's already well debugged, and tested, and low-level inline. Just inherit from RefCount and use I think.

If we find in testing that we need the efficiency, we can test the two.


*B2* Please make SBufStore data members private. They are too sensitive
to expose them like that. With the sensitive info open, we will not
catch all the abuses (or spend too much time explaining why a public
interface should not be used).

The whole SBufStore class is a protected member of SBuf. Noone except
for SBuf and its derived or friend classes can even see objects of
that class, let alone manipulate them.
In this case I've deliberately broken object layering to increase
performance and code compactness significantly; otherwise SBuf would
have to proxy most of its methods to the underlying SBufStore, which
then would lack a significant understanding of what's going on. It
would also make code much harder to follow and read as the logic would
be split in two parts.

*B3* Please rename SBufStore::init() to a more traditional allocate().
Even your own comment says that the block is not initialized!

Ok.

*B2* init() is missing asserts that the buffer has not been allocated
twice. If that code is meant to be called immediately from constructors
only, then at least make it private and add a comment about that. It is
important to note that the object is in invalid state until init() has
been called.

*B1* SBufStore::init() and SBufStore in general should not change the
reference counter. Reference counting is the prerogative of smart
pointers, not the objects they point to (even if the object stores the
counter). You need to add a smart pointer so that the code becomes clear
and manageable. Again, I suggest using RefCountable API to start with.
It will be easy to replace/optimize later, with no significant effect on
user code.

It's there as a shortcut, but I can move that out of that function no problem.

Shortcuts like that in refcounting are a debuggers nemesis. The counts need to be kept symetrical at the layer doing the counting/copying.

The auth memory leak bugs this year have all been due to refcount asymetry. I'm suspicious that the FD leaks still undiscovered are also.


Again, this is why I keep suggesting that a polished API is agreed on
first, before we start finding bugs in the unpolished API
implementation.


*B2* SBufStore constructors should be declared before init() and other
methods. Same for SBuf.

It's cosmetic, will do.

*B1* SBufStore(const SBuf &S) is not a copy constructor but the comment
says it is.

Ok.

*B2* SBufStore(const SBuf &S) makes implicit conversions possible. Do we
really need these? Doesn't creating "store" from a buffer that uses that
"store" make you feel uncomfortable?

No, it doesn't. See my previous comments about who's the only ones who
can do that.
If that was a public API, then I'd agree with you 100%; but it's not.

*B1* SBufStore is missing a copy constructor and an assignment operator.
You do not have to actually implement them, but you have to declare them
to make sure the defaults are not used. Make them private if you think
they should never be used.

With a fatal assert or loud DBG_CRITICAL debugs() complaint to catch bad behavior.


*B3* It is probably too early to review statistics, but "alloc" does not
seem to contain the "number of allocation operations". This is probably
because it is maintained in the code that does not allocate anything
(SBuf).

IIRC it counts the number of SBuf objects allocated, which may or may
not allocate a SBufStore.


Agree. Nice if you want to report:
  N SBuf using X memory.
But the store still needs to do counting and link somehow to produce that report.

*B1* SBuf assignment does not handle "assignment to self".

Right. Is it safe to implement that as a noop?

Worst case, do it with a fatal assertion and we will find out in testing.


*B1* Appending interface and/or comments do not make sense:

575      * Append to the supplied SBuf, <b>modifying</b> this.
         SBuf append(SBuf& S)
If we are appending TO the supplied SBuf, then the method should be
const (we would not be modifying this) and should be named appendTo. I
do not think we need such a method. If we are appending FROM the
supplied SBuf, then that SBuf should be const (we will not be modifying
it) and the comments should not talk about that supplied buffer
maintenance (again, because we do not modify it).

Comment is broken. I'll fix it.
*this is not const, except for const methods and for those marked as
*Copy methods.

The return value should be documented, especially if you return a copy
of "this" buffer. Why would you do that?

Hm.. should return a reference, not a copy. Reason is:

foo.append(bar).append(gazonk).append(baz).chop();

Most of the above comments apply to other SBuf::append() methods as
well.


*B1* Please review all methods for const-correctness. Comparison
operators, for example, should have constant arguments.

Ok.

*B2* I would remove numerous comments about what gets changed and what
does not. Const declarations already say as much!

I prefer being redundant than unclear, but it's fine.

*B3* Is dumpStats() needed? Is it replaceable with stats().dump() or
similar? "Smaller" APIs are often "better" and have fewer bugs.

stats classes are not visible, but it can be arranged, and probably
makes more sense.

*B3* slice() should probably be renamed to something like cut() or
chop() since you are not producing slices, you are cutting away
content.


+1. The parsing stuff I ran into last string change mostly uses copy-n-cut() semantics. less change later to name it so here now.


*B2* slice() should be implemented by calling other, more primitive
chopping methods. Same for other methods that simply combine existing
simpler actions. Please do not force us to review the same code many
times.

I'll check. In my opinion, where doing otherwise doesn't make things
significantly more efficient, slice() should be the low-level
primitive, and the other methods the shortcuts.

*B2* I am not sure search methods should throw exceptions when the thing
is not found. Exceptions are expensive and complex things. Not finding
something is a common case. The searcher calling find() should assume
that the thing may not be there. If we want to provide a confident
searcher a convenient but safe method, lets have two methods:

   // returns invalid position if not found (like std::npos)
   Position find(...) const;
   // throws if not found
   Position findOrThrow(...) const;

std::npos sounds like a good idea. Ignorance on my part, I didn't know
it existed.

I usually expect things like search should return a signed 'offset'. -N for errors and invalids.


This gives user a choice. A confident user will be able to use/pass
findOrThrow() results immediately, without checking, but safely.


*B1* Please move the Tokenizer class into its own header. FWIW, I may
not be able to review it until SBuf/String issues are resolved.

Sure, I have full intention of complying with the coding guidelines.

Regarding String, I recall that the aim of this class is to replace
MemBuf and char[] instances with something which increases efficiency
while retaining all the advantages of automatic memory management.


Thanks for taking the time and for being so thorough..


/2c followup. Dont have tiume to review the code itself today though sorry.

Amos
--
Please use Squid 2.7.STABLE4 or 3.0.STABLE9

Reply via email to