On Mon, 28 Feb 2011 14:55:59 -0700, Alex Rousskov wrote:
On 02/27/2011 06:12 PM, Amos Jeffries wrote:
This begins the libanyp.la SourceLayout changes by moving the
protocol_t
type code to stand-alone files inside its namespace.
On the most part there are no behaviour changes. The automatic
generation of the enum->text string produces an upper-case strings
array, so explicit down-casing is added for the obvious places such
as
scheme:// syntax.
+ static char out[BUFSIZ];
+ *out = '\0';
+ if (in != NULL) {
+ int p = 0;
+ for (; p < BUFSIZ && in[p] != '\0'; ++p)
+ out[p] = xtolower(in[p]);
+ out[p] = '\0';
The last line will write outside the "out" buffer boundary if "in" is
longer than BUFSIZ. The first assignment would be a waste of cycles
if
"in" has characters in it which is probably the common case. You
probably want to reshuffle to have just one (and correct) termination
assignment at position p, where p is always correct and smaller than
BUFSIZ.
Thanks. Fixed.
I could not find where this function is used in the patch. Is there
some
code missing from the patch?
Um, yes. This is shuffled to URLScheme method which is its only user.
+/// Display the Protocol Type (in upper case).
+inline std::ostream &
+operator <<(std::ostream &os, ProtocolType const &p)
+{
+ if (p < PROTO_UNKNOWN)
+ os << ProtocolType_str[p];
+ return os;
+}
Should we print something when p is PROTO_UNKNOWN or worse? Where do
we
use this operator?
Any debugs dumping objects using the type. Potentially any streamed
output of request/reply status lines.
If the protocol is unset or unknown the ProtocolType value is not
useful. The caller must do the output of any replacement data it has.
Added documentation.
- if (request->protocol == PROTO_HTTP)
+ else if (request->protocol == AnyP::PROTO_HTTP)
return httpCachable(method);
- if (request->protocol == PROTO_GOPHER)
+ else if (request->protocol == AnyP::PROTO_GOPHER)
return gopherCachable(request);
- if (request->protocol == PROTO_CACHEOBJ)
+ else if (request->protocol == AnyP::PROTO_CACHEOBJ)
Your call, but I would not change "if" to "else if" here. The change
makes understanding the code more difficult and is inconsistent with
the
rest of that function.
Okay. Removed the else's.
- CbDataList<protocol_t> *q = new CbDataList<protocol_t>
(urlParseProtocol(t));
- *(Tail) = q;
- Tail = &q->next;
+ for (int p = AnyP::PROTO_NONE; p < AnyP::PROTO_UNKNOWN;
++p) {
+ if (strcasecmp(t, AnyP::ProtocolType_str[p]) != 0) {
+ CbDataList<AnyP::ProtocolType> *q = new
CbDataList<AnyP::ProtocolType>(static_cast<AnyP::ProtocolType>(p));
The above changes the behavior from creating a PROTO_NONE-based
CbDataList to silently skipping some of the unknown protocols. Do we
check for the unknown protocols in the caller? If not, will this code
change cause any user-visible changes?
We never have checked for unknown protocols here. Just converted them
to matchable entries of NONE.
Adding a config warning to mention skipped protocols now.
Two possibly visible changes:
* the "FILE" protocol no longer accepted and silently converted to FTP
* unknown protocols no longer silently accepted and converted to NONE,
thus any potential for mis-matching unknowns and comment text is
removed.
Also, urlParseProtocol() does not recognize some of the protocols for
which the strcasecmp test will succeed, right? Should
urlParseProtocol()
be extended to be consistent with the new enumeration logic above?
Yes. This is done as part of the next step moving URL parsing into
class URL. It is moved to use the same for-loop pattern to fill
URLSchemes.
Meanwhile we will have protocols accepted by ACL proto which are not
yet matchable. This does not seem to be a problem since other parts of
the parser will reject unknown protocols at present.
When a similar change was done for request methods, we ended up
creating
an HttpRequestMethod class with a few useful methods. IIRC, the
primary
motivation was correct handling of unknown methods -- we needed a
place
to store the unknown method image/string. Do we need any need to
handle
unknown protocols by storing their image/string?
Good point. At present URLScheme class fills that purpose but only does
the registered protocols right now. The URL updates plan to change it to
handle unknowns as string images.
In the final URL and protocol design we have AnyP::ProtocolType which
enumerates the registered protocols accepted in URLs (lower case) and
the request protocol field (upper case). Then URLScheme which expands
that to allow other protocol names accepted in URLs.
Thank you for checking this.
Amos